On Wed, 19 Jan 2005, Martijn van Oosterhout wrote:
On Wed, Jan 19, 2005 at 12:20:23PM +0100, Marco Colombo wrote:
I think you're missing that vendors define what a 'text file' is on their platform, not Guido. Guido just says that a Python program is a text file, which is a very sound decision, since it makes perfect sense to be able to edit it with native tools (text editors which do not support alien text file formats).
Sure, some text editors don't. Some text editors do. But the C compiler accepts programs in any of these formats. And consider multiple machines working off the same file server. There is no "standard" text format and everyone should just get along.
Exactly. Or, one could say: the "standard" text format is the one dictated by the platform you are running on, which is what Python does. Multiple machines working off a file server had better agree on what a text file is, or do runtime conversions, or let the server do that. The issue affects _any_ text file (this email, to name one), not only Python programs. [Aside: for e-mail there actually is a well-defined "on the wire" format, and applications are expected to make the conversion when needed.]
The C standard explicitly defines \r and \n as whitespace, thus neatly avoiding the entire issue. Many other languages do the same. The fact is that Python is the odd one out.
You're missing the point. The C source file is not a text file, it's a binary sequence of bytes (which is quite unfortunate: you may download a .c file and not be able to view or read it on your platform, while the C compiler groks it happily). There's no _line_ separator in C. If you've ever heard of the obfuscated C contests, you know that you can write a complete C program that actually does something on one line, since C uses a _statement_ separator (';') and not a line separator. So C is precisely an example of what you should not do: use a binary file as source, pretending it's a text file. This may actually make sense historically, but it's definitely against the Python attitude. Python source files are, like it or not, well-formed text files, and the parser even requires correct indentation.
"Be very picky in what you accept"... after all, you're a _formal_
language. You already put a thousand requirements (a whole grammar)
on what you receive, so why not add a few more that force
improved readability? Think of how hard it is for newbies to spot
a missing ; in C. Compare that to how easy it is to spot a missing line
break (actually, I think any newbie gets line breaking naturally
right from the start). Having the source of your programs be
line-oriented (as opposed to statement-oriented) is a big win for a
language designer. And correctly indented from the start is even
better.
You may not agree with the last statements, but that's the python
way, a design (and general attitude) decision. There's no point
in sending a bug report about it.
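
To make "line-oriented" concrete, here is a small sketch using the standard tokenize module. It only lists the line-structure tokens (NEWLINE, INDENT, DEDENT) that the tokenizer produces; the helper name structural_tokens and the sample source are made up for illustration.

```python
import io
import tokenize

# Sample program, invented for this illustration.
source = "if x:\n    y = 1\nz = 2\n"


def structural_tokens(text):
    """Return the names of the line-structure tokens found in text."""
    wanted = {tokenize.NEWLINE, tokenize.INDENT, tokenize.DEDENT}
    tokens = tokenize.generate_tokens(io.StringIO(text).readline)
    return [tokenize.tok_name[tok.type] for tok in tokens if tok.type in wanted]


# Line breaks and indentation show up as tokens of their own,
# so the grammar really does depend on where the lines end.
print(structural_tokens(source))
# ['NEWLINE', 'INDENT', 'NEWLINE', 'DEDENT', 'NEWLINE']
```

In C there is nothing comparable for statements: a line break there is just one more whitespace character.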
Be liberal in what you receive. After all, what's the benefit of having Python source that's not runnable on every computer without conversion?
Python source is of course runnable on every computer, provided that the source file is a real text file for that platform. If you download any text file (not just Python source files) by the _wrong_ means (e.g. FTP binary mode from a Unix server) on Windows, you'll have problems handling it. You cannot view it (Notepad) and, very likely, you cannot print it. (Yes, your <insert favorite 3rd party editor> may be able to perform both operations, but that's not the point.) Are you expecting your Python interpreter on Windows to be able to handle it? Why? It's not a text file, it's binary garbage, the same you see with Notepad or on your printer when you try to print it. See the point? (It's subtle: Python somehow requires a program to be human readable, and that means it has to be a text file, correctly formatted for the platform.)
I can see only two ways to address the issue:
- convert the string that represents the python program to a correct multi-line string (according to the rules of the platform we're running on) before we pass it to the python interpreter;
- explicitly set one format as the right one for our purpose ("embedded python in PostgreSQL"), and have the python interpreter we use comply, regardless of the platform we're running on.
Of course, setting the rule:
- python scripts should be correctly formatted multi-line strings according to _server_ platform,
will work as well, but places extra burden on the clients (and/or users).
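
The first option (convert the string before handing it to the interpreter) could look roughly like the sketch below. It assumes the target convention is a bare LF, and the names normalize_newlines and run_script are invented here; this is not how any existing PL/Python code does it.

```python
def normalize_newlines(source: str) -> str:
    """Convert DOS (CR LF) and old Mac (CR) line endings to a bare LF."""
    # Handle CR LF first, otherwise each pair would become two line breaks.
    return source.replace("\r\n", "\n").replace("\r", "\n")


def run_script(source: str, namespace: dict) -> dict:
    """Normalize the script text, then compile and execute it."""
    code = compile(normalize_newlines(source), "<embedded>", "exec")
    exec(code, namespace)
    return namespace


# A script as it might arrive from a Windows client (hypothetical input).
dos_source = "def add(a, b):\r\n    return a + b\r\nresult = add(2, 3)\r\n"
ns = run_script(dos_source, {})
print(ns["result"])
```

One caveat: a blanket replace also rewrites any literal CR character sitting inside a multi-line string in the script. Escape sequences written as backslash-r are left alone, since they are ordinary characters in the source text.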
Note that an option or env. variable like:
$ python -T dos file.py
$ export PYTHONTEXTFORMAT=dos
$ python file.py
would be great to have, of course (and that can be suggested).
-- 
Marco Colombo
Technical Manager
ESI s.r.l.
Colombo@xxxxxx