High Performance PostgreSQL for Rails: Confusing `SUBSTR` arguments in `SCRUB_EMAIL` (page 52)

Hi @douglasshuang. Thanks for the detailed write-up again here. I don’t remember for sure whether I knew functions like substring() had 1-based indexes and not 0-based.

I studied that documentation page and didn’t explicitly see it mentioned, however it can be deduced from the examples, for example:

substring('Thomas' from 2 for 3)hom

1,2,3,4,5,6
----------
T,h,o,m,a,s

We can see it starts at position 2 “h” and goes for a count of 3 (inclusive of “m”) to get “hom”.

With that in mind, I agree the code/sql/scrub_email_function_full.sql function would be more sensible to start from 1 and go for a count of “5”.

I guess before with a count of 0 it doesn’t raise an error, but isn’t a valid index position, so that’s why it was set to 6 to get “5” characters plus 1 invalid position.

Start at 1, go for 5 is more clear!

Thanks again for finding this. I’m submitting this fix and hope it goes into a new version.

1 Like