Re: Multiline plpython procedure

Marco Colombo <pgsql@xxxxxxxxxx> · Fri, 21 Jan 2005 19:19:25 +0100 (CET)

On Fri, 21 Jan 2005, Bruno Wolff III wrote:

On Fri, Jan 21, 2005 at 12:02:09 +0100,
If you are going to another system that uses the same floating point
representation, you should get the same number. pg_dump writes out
enough digits that the exact number can be recovered when the dump
has been reloaded. This has been the case since 7.3.

If you move the data to a machine with a different floating point
representation you might get a different number even if the original number
could be represented exactly in the new representation.

So the same pg_dump file _may_ lead to different databases on different
platforms, even right now. So the issue of 'identical' databases is
not serious.
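
The "enough digits" point in the quoted text can be checked from Python. This is only a sketch: it assumes the local float is an IEEE 754 double (true on common platforms), and it says nothing about how pg_dump itself formats numbers. The helper name roundtrip_text is made up for illustration.

```python
import struct

def roundtrip_text(x: float, digits: int = 17) -> float:
    """Write x as decimal text with `digits` significant digits, then parse it back."""
    return float(format(x, f".{digits}g"))

x = 0.1 + 0.2  # a double that is not a "round" decimal number

# 17 significant digits are enough to recover an IEEE 754 double exactly.
same_with_17 = roundtrip_text(x, 17) == x

# 15 digits are not always enough: the text is then '0.3', a different double.
same_with_15 = roundtrip_text(x, 15) == x

# A different (here: narrower) representation may not hold the value at all.
as_single = struct.unpack("f", struct.pack("f", x))[0]
same_as_single = as_single == x
```

On the same representation the text form is lossless if enough digits are written; once the representation changes, no text format can guarantee the same number.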

Note that the float case is worse than the multiline text one. With multiline
text there is always a way to convert it without loss or change of information:
all you need is to treat it as a "sequence of lines".
"Sequences of lines" are represented differently on different platforms,
just like floats. Whenever you use one on a platform, use the
platform-dependent format. When you export it, use a defined external format.
On import, convert the defined external format to the internal one.
Does this lead to different internal formats on different platforms?
Yes, but what's the problem?
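
The export/import scheme just described could look roughly like this. It is a sketch under assumptions of mine, not anything PostgreSQL does: the external format is taken to be "\n"-separated, a lone "\r" is treated as a line break too, and the names EXTERNAL_SEP, export_text and import_text are hypothetical.

```python
EXTERNAL_SEP = "\n"  # assumed external (dump / on-the-wire) line separator

def export_text(text: str) -> str:
    """Convert text in a platform convention to the assumed external format."""
    # Replace "\r\n" first so it is not turned into two line breaks.
    return text.replace("\r\n", EXTERNAL_SEP).replace("\r", EXTERNAL_SEP)

def import_text(external: str, platform_sep: str) -> str:
    """Convert text in the external format to the given platform convention."""
    return external.replace(EXTERNAL_SEP, platform_sep)

windows_internal = "Hello\r\nWorld\r\n"
on_the_wire = export_text(windows_internal)        # "\n"-separated
unix_internal = import_text(on_the_wire, "\n")     # as stored on a unix host
back_on_windows = import_text(on_the_wire, "\r\n")  # as stored on a Windows host
```

The internal formats differ per platform, but the sequence of lines survives every conversion unchanged.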

The problem originated from a Windows application storing a multiline
text (python function body) and this text being handed to a unix
program that expects a multiline text as input (the python interpreter).
This is a particular case only. _Any_ Windows client that inserts
a multiline text is likely to use \r\n as separator, while any unix
client is likely to insert text with \n. For the same input (same
sequence of lines typed by the user), the result is different.

There's no way to write a server-side application that handles that
correctly, right now. Of course 'Hello\r\nWorld\r\n' is different
from 'Hello\nWorld\n', as far as the server is concerned. But if
you think of what the users typed, you realize they should be equal.
It _is_ the same data (line 1 is 'Hello' and line 2 is 'World'), just
in different formats. Either the client library should handle that
transparently (converting to an on-the-wire format), or the server should
be aware of what convention the client is using.

Right now the application developer should take care of it, since
PostgreSQL (including client library) treats text as opaque binary data.

(I'm not arguing we should change that. I'm just saying it's not a python bug.)
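
For what it's worth, "taking care of it" in the application can be as small as the sketch below. It shows the two strings comparing unequal as opaque data but equal as sequences of lines, and a function body being normalized before it is handed to the interpreter. The helper names are made up, and this is not how PL/Python or the client library behaves.

```python
def same_lines(a: str, b: str) -> bool:
    """Compare two texts as sequences of lines, ignoring the separator used.

    Note: str.splitlines() also splits on a few rarer separators
    (form feed, vertical tab, ...), which is acceptable for this comparison.
    """
    return a.splitlines() == b.splitlines()

def to_unix_newlines(text: str) -> str:
    """Normalize \r\n line endings to \n (application-side choice, an assumption)."""
    return text.replace("\r\n", "\n")

windows_text = "Hello\r\nWorld\r\n"
unix_text = "Hello\nWorld\n"

opaque_equal = windows_text == unix_text            # what the server sees
line_equal = same_lines(windows_text, unix_text)    # what the users typed

# A function body as a Windows client might send it.
body = "def greet():\r\n    return 'Hello'\r\n"
namespace = {}
exec(compile(to_unix_newlines(body), "<function body>", "exec"), namespace)
greeting = namespace["greet"]()
```

Normalizing on insert keeps the stored text in one convention, so any server-side consumer sees the same thing whichever client wrote it.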

.TM.
--
      ____/  ____/   /
     /      /       /			Marco Colombo
    ___/  ___  /   /		      Technical Manager
   /          /   /			 ESI s.r.l.
 _____/ _____/  _/		       Colombo@xxxxxx
