Re: Issues with generate_series using integer boundaries

Thom Brown <thom@xxxxxxxxx> · Thu, 3 Feb 2011 11:34:45 +0000

On 3 February 2011 11:31, Thom Brown <thom@xxxxxxxxx> wrote:
> On 1 February 2011 23:08, Thom Brown <thom@xxxxxxxxx> wrote:
>> On 1 February 2011 21:32, Alban Hertroys
>> <dalroi@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>>> On 1 Feb 2011, at 21:26, Thom Brown wrote:
>>>
>>>> On 1 February 2011 01:05, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
>>>>> Thom Brown <thom@xxxxxxxxx> writes:
>>>>>> I've noticed that if I try to use generate_series to include the upper
>>>>>> boundary of int4, it never returns:
>>>>>
>>>>> I'll bet it's testing "currval > bound" without considering the
>>>>> possibility that incrementing currval caused an overflow wraparound.
>>>>> We fixed a similar problem years ago in plpgsql FOR-loops...
>>>>
>>>> Yes, you're right.  Internally, the current value is checked against
>>>> the finish.  If it hasn't yet passed it, the current value is
>>>> increased by the step.  When it reaches the upper bound, since it
>>>> hasn't yet exceeded the finish, it proceeds to increment it again,
>>>> resulting in the iterator wrapping past the upper bound to become the
>>>> lower bound.  This then keeps it looping from the lower bound upward,
>>>> so the current value stays well below the end.
>>>
>>>
>>> That could actually be used as a feature to create a repeating series. A bit more control would be useful though :P
>>
>> I don't quite understand why the code works.  As I see it, it always
>> returns a set with values 1 higher than the corresponding result.  So
>> requesting 1 to 5 actually returns 2 to 6 internally, but somehow it
>> correctly shows 1 to 5 in the query output.  If there were no such
>> discrepancy, the upper-bound/lower-bound problem wouldn't exist, so
>> not sure how those output values result in the correct query result
>> values.
>
> Okay, I've attached a patch which fixes it.  It allows ranges up to
> upper and down to lower bounds as well as accounting for the
> possibility for the step to cause misalignment of the iterating value
> with the end value.  The following now works which would usually get
> stuck in a loop:
>
> postgres=# SELECT x FROM generate_series(2147483643::int4,
> 2147483647::int4) AS a(x);
>     x
> ------------
>  2147483643
>  2147483644
>  2147483645
>  2147483646
>  2147483647
> (5 rows)
>
> postgres=# SELECT x FROM generate_series(2147483642::int4,
> 2147483647::int4, 2) AS a(x);
>     x
> ------------
>  2147483642
>  2147483644
>  2147483646
> (3 rows)
>
> postgres=# SELECT x FROM generate_series(2147483643::int4,
> 2147483647::int4, 6) AS a(x);
>     x
> ------------
>  2147483643
> (1 row)
>
>
> It's probably safe to assume the changes in the patch aren't up to
> scratch and it's supplied for demonstration purposes only, so could
> someone please use the same principals and code in the appropriate
> changes?
>
> Thanks
>

And I see I accidentally included a doc change in there.  Removed and
reattached:

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935
Attachment:
generate_series_fix.v2.patch

Description: Binary data
-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general