Search Postgresql Archives

Re: to_date() and to_timestamp() with negative years

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 11/4/21 10:29 AM, Bryn Llewellyn wrote:
/adrian.klaver@xxxxxxxxxxx <mailto:adrian.klaver@xxxxxxxxxxx> wrote:/

Not sure how this can be handled in a totally predictable way given the unpredictable ways in which datetime strings are formatted?

The only thing I can say it is it points out that when working with datetimes settling on a standard format is your best defense against unpredictable results.

Thank you very much, again, for your help with my seemingly endless nitpicking questions on this matter, Adrian. Here's the most favorable conclusion that I can draw:


3. The rules are hard to understand and the PG doc gives insufficient detail to allow the outcomes in corner cases like you just showed us to be predicted confidently. Some users seek to understand the rules by reading PG's source code.

I would say that is because datetimes in string formats are often hard to understand as a result of folks inventing their own formats.

As an example a commit message of mine from some years ago:

"
Fix issue with date parsing of Javascript dates coming from browser
on Windows. This occurred in both Firefox and Chrome. The issue being
that the date had a timezone of Pacific Standard Time instead of PST like it does from a Linux machine. dateutils choked on that timezone and therefore the date was not parsed. The fix was to add fuzzy=True to the dateutils.parse(). This allows dateutils to skip over anything it does not understand and parse the rest. This works as the date does include the correct tz offset. For the record the date format of concern is-Tue Dec 08 2015 00:00:00 GMT-0800 (Pacific Standard Time)
"

Can't remember what version of Windows this was. The dates ended up in a Postgres database via Python code in a Django app. This fix is one of the reasons I really like the Python dateutils library. The solution being:

from dateutil.parser import parse

parse('Tue Dec 08 2015 00:00:00 GMT-0800 (Pacific Standard Time)', fuzzy=True)
datetime.datetime(2015, 12, 8, 0, 0, tzinfo=tzoffset(None, 28800))

To see what it is doing:

parse('Tue Dec 08 2015 00:00:00 GMT-0800 (Pacific Standard Time)', fuzzy_with_tokens=True)

(datetime.datetime(2015, 12, 8, 0, 0, tzinfo=tzoffset(None, 28800)),
 (' ', ' ', ' ', ' (Pacific Standard Time)'))

where everything after the datetime are tokens it ignored.

I include this mainly as illustration that data clean up maybe more practical before it ever hits the database and in a library that is specialized for the task at hand.


4. Certainly, the rules set a significant parsing challenge. You hint that they might even prevent a complete practical solution to be implemented.

Yes that would depend on someone coding an AI that can fully understand people and what there intentions where from a string value. Given the answers I get when asking people directly what they intended, I'm not holding my breath.


5. None of this matters when the app designer has the freedom to define how date-time values will be provided, as text values, by user interfaces or external systems. In these cases, the complexity can be controlled by edict and correct solutions can be painlessly implemented and tested. Not a day goes by that I don't have to enter a date value at a UI. And in every case, a UI gadget constrains my input and makes its individual fields available to the programmer without the need for parsing—so there's reason to be optimistic. The programmer can easily build the text representation of the date-time value to conform to the simple rules that the overall application design specified.

Yep, constraining the imagination of the end user solves a lot of problems.


6. In some cases text data that's to be converted arrives in a format that cannot be influenced. And it might contain locutions like we've been discussing ("zero" meaning "1 BC", unfavorable paradigms for separating tokens, and the like). In these cases, the diligent programmer might, just, be able to use the full arsenal of available tools to implement a scheme that faultlessly parses the input. But the likelihood of bugs seems to be pretty big.

From my experience that pretty much defines all aspects of programming.

I'll say "case closed, now" — from my side, at least.


--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx





[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux