Search Postgresql Archives

Re: atoi-like function: is there a better way to do this?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mar 4, 2012, at 22:31, Chris Angelico <rosuav@xxxxxxxxx> wrote:

> (Hoping you meant for that reply to be on-list as I'm here responding on-list.)
> 
> On Mon, Mar 5, 2012 at 2:16 PM, A.M. <agentm@xxxxxxxxxxxxxxxxxxxxx> wrote:
>> 
>> On Mar 4, 2012, at 9:13 PM, Chris Angelico wrote:
>> 
>>> One of our tables has a few columns that may be interpreted as strings
>>> or may be numbers (data type is varchar, numbers are stored as
>>> decimal). Generally, operations are performed on the string, but
>>> sometimes we need to parse out a number - without it failing on error.
>>> I wrote the following function to approximate to the semantics of
>>> atoi:
>> 
>> I would recommend against such a schema since different data types should warrant their own columns, but if you are left with no choice...
> 
> The values have to be strings for other reasons (eg '' is valid
> everywhere, and this is subsequently processed by a script that
> expects all strings). So yeah, no choice there. But I agree that
> normally you DO want integers stored in integer columns, and we're
> paying a performance penalty for this.
> 
>>> 
>>> create or replace function str2int(val varchar) returns bigint immutable as $$
>>> begin
>>>       val=substring(val from '[0-9]*');
>>>       if length(val) between 1 and 19 then return val::bigint; end if;
>>>       return 0;
>>> end;
>>> $$ language plpgsql;
>> 
>> This can be written as:
>> select substring('35345345.45645654' from '\d{1,19}')::bigint;
>> 
>> Be aware that this does not account for:
>> 3dogs
>> 3 dogs
>> 3,dogs
>> 3.5.6.7
>> 
>> all of which will return 3::bigint, but I assume that the column is not completely free-form or maybe this is what you want. (?)
>> 
>> Cheers,
>> M
> 
> That is in fact the behaviour I want. Trouble is, that simpler version
> returns NULL if given 'asdf' as its input - I want it to return 0.

COALESCE is your friend


> It's also about the same performance (slightly slower in a quick test)
> than the original; it's still doing a regular expression parse. I was
> hoping very much to avoid the regex altogether.
> 
> ChrisA
> 

Any efficient, non-RegEx, alternative would require more context to evaluate than you provide.  Mainly, would it be faster to have a separate field to store the parsed (at input) number and then query that field directly (even if it is a text field as well)?  Basically cache the parse.

David J..



-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux