On Fri, 21 Jan 2005, Bruno Wolff III wrote:
On Fri, Jan 21, 2005 at 12:02:09 +0100,

If you are going to another system that uses the same floating point representation, you should get the same number. pg_dump writes out enough digits that the exact number can be recovered when the dump has been reloaded. This has been the case since 7.3.
If you move the data to a machine with a different floating point representation you might get a different number even if the original number could be represented exactly in the new representation.
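To illustrate the "enough digits" point, here is a minimal sketch in Python. It is not pg_dump's actual code; the helper names dump_float and load_float are made up for this example. It assumes both sides use IEEE 754 doubles, where 17 significant decimal digits are enough to recover the exact value.

```python
import struct

def dump_float(x: float) -> str:
    # 17 significant digits uniquely identify any IEEE 754 double.
    return format(x, ".17g")

def load_float(s: str) -> float:
    # Parsing the decimal text yields the nearest representable double.
    return float(s)

value = 0.1 + 0.2
text = dump_float(value)
restored = load_float(text)

# Compare the raw bit patterns, not just the printed values.
same_bits = struct.pack("<d", value) == struct.pack("<d", restored)
```

On a platform with a different floating point representation the parsed value may be rounded differently, which is the case described above.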
So the same pg_dump file _may_ lead to different databases on different platforms, even right now. So the issue of 'identical' databases is not serious.
Note that the float case is worse than the multiline text one. With multiline text there is always a way to convert it w/o loss or change of information. All you need is to treat it as a "sequence of lines". "sequences of lines" are differently represented on different platforms, just like floats. Whenever you use it on a platform, use the platform dependent format. When you export it, use a defined external format. On import, convert the defined external format to the internal one. Does this lead to different internal formats on different platforms? Yeah but what's the problem?
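The export/import scheme described above could be sketched like this in Python. This is only an illustration under assumptions: the external format is taken to be "\n", and the names to_external and to_internal are hypothetical, not part of PostgreSQL or any client library.

```python
import re

# Assumption for this sketch: the defined external format uses "\n".
EXTERNAL_NEWLINE = "\n"

# Matches Windows (\r\n), old Mac (\r) and Unix (\n) line endings.
_ANY_NEWLINE = re.compile(r"\r\n|\r|\n")

def to_external(text: str) -> str:
    """Export: convert whatever line endings the platform used to the external format."""
    return _ANY_NEWLINE.sub(EXTERNAL_NEWLINE, text)

def to_internal(text: str, platform_newline: str) -> str:
    """Import: convert external-format text to the given platform convention."""
    return text.replace(EXTERNAL_NEWLINE, platform_newline)
```

A Windows client would call to_external() before sending and to_internal(text, "\r\n") after receiving; a unix client would pass "\n". The stored form is then the same regardless of which client wrote it.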
The problem originated from a Windows application storing a multiline
text (python function body) and this text being handed to a unix
program that expects a multiline text as input (the python interpreter).
This is a particular case only. _Any_ windows client that inserts
a multiline text is likely to use \r\n as separator, while any unix
client is likely to insert text with \n. For the same input (same
sequence of lines typed by the user), the result is different.
There's no way to write a server-side application that handles that
correctly, right now. Of course 'Hello\r\nWorld\r\n' is different
from 'Hello\nWorld\n', as far as the server is concerned. But if
you think of what the users typed, you realize they should be equal. It _is_ the same data (line 1 is 'Hello' and line 2 is 'World'), just
in different formats. Either the client library should handle that
transparently (converting to an on-the-wire format), or the server should
be aware of what convention the client is using.
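The difference between the two views of the same data can be shown with a small Python sketch. The function name same_lines is made up for this example. Note that str.splitlines() also splits on a few other separators (form feed, for instance), so this is an approximation of "what the user typed", not a definitive rule.

```python
def same_lines(a: str, b: str) -> bool:
    # Compare two texts as sequences of lines, ignoring the separator used.
    return a.splitlines() == b.splitlines()

windows_text = "Hello\r\nWorld\r\n"
unix_text = "Hello\nWorld\n"

# As opaque data the two values differ...
bytes_equal = windows_text == unix_text
# ...but as sequences of lines they are the same.
lines_equal = same_lines(windows_text, unix_text)
```

Here bytes_equal is False while lines_equal is True, which is exactly the mismatch a server-side function cannot resolve without knowing the client's convention.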
Right now the application developer should take care of it, since PostgreSQL (including client library) treats text as opaque binary data.
(I'm not arguing we should change that. I'm just saying it's not a python bug.)
.TM.
-- 
Marco Colombo
Technical Manager
ESI s.r.l.
Colombo@xxxxxx