Re: Spreadsheet_Excel_Reader problem

Ashley Sheridan <ash@xxxxxxxxxxxxxxxxxxxx> · Thu, 18 Mar 2010 17:00:24 +0000

On Thu, 2010-03-18 at 12:57 -0400, Paul M Foster wrote:

> On Thu, Mar 18, 2010 at 04:15:33PM +0000, Ashley Sheridan wrote:
> 
> > On Thu, 2010-03-18 at 12:12 -0400, Paul M Foster wrote:
> > 
> >     On Thu, Mar 18, 2010 at 08:57:00AM -0700, Tommy Pham wrote:
> > 
> >     <snip>
> > 
> >     >
> >     > Personally, I find working with fixed widths is best.  The text file
> >     > might be larger but I don't have to worry about escaping any type of
> >     > characters ;)
> > 
> >     I find this impossible, since I never know the largest width of all the
> >     fields in a file. And a simple explode() call allows pulling all the
> >     fields into an array, based on a common delimiter.
> > 
> >     Paul
> > 
> >     --
> >     Paul M. Foster
> > 
> > 
> > 
> > Explode won't work in the case of a comma in a field value.
> 
> That's why I convert the files to tab-delimited first. explode() does
> work in that case.
> 
> > 
> > Also, newlines can exist within a field value, so a line in the file doesn't
> > equate to a row of data.
> 
> I've never seen this in the files I receive.
> 
> > 
> > The best way is just to start parsing at the beginning of the file and break it
> > into fields one by one from there.
> > 
> > The bit I don't like about characters other than a comma being used in a "comma
> > separated values" file is that you can't automatically tell what character has
> > been used as the delimiter. That's why spreadsheet programs ask what the
> > delimiter is when a comma doesn't produce what they recognise as valid fields.
> 
> I've honestly never seen a "CSV" or "Comma-separated Values" file which
> used tabs for delimiters. At that point, it's really not a *comma*
> separated value file.
> 
> My application for all this is accepting mailing lists from customers
> which I have to convert into DBFs for a commercial mailing list program.
> Because most of my customers can barely find the on/off switch on their
> computers, I never know what I'm going to get. So before I string
> together the filters to process the file, I have to actually look at and
> analyze the file to find out what it is. Could be a fixed-field length
> file, a CSV, a tab-delimited file, or anything in between. Once I've
> selected the filters, the sequence they will be put together in, and the
> fields from the file I want to capture, I hit the button. After it's all
> done, I now have to look at the result to ensure that the requested
> fields ended up where they were supposed to.
> 
> Paul
> 
> -- 
> Paul M. Foster
> 

But surely whatever character is used as the delimiter could be part of
a field's value?
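
As a rough sketch of what I mean (the sample line is made up, and I'm
assuming PHP 5.3+ so that str_getcsv() is available), a tab inside a
quoted field trips up explode() just as a comma would:

```php
<?php
// Hypothetical sample: tab-delimited, with a tab inside the quoted second field.
$line = "Smith\t\"Flat 2\tHigh Street\"\tLondon";

// explode() splits on every tab, including the one inside the quotes,
// so the address is cut in two and the quote characters are left in place.
$naive = explode("\t", $line);

// str_getcsv() honours the enclosure character, so the quoted field
// comes back as a single value with the quotes stripped.
$parsed = str_getcsv($line, "\t", '"', "\\");

var_dump(count($naive));   // 4 pieces
var_dump(count($parsed));  // 3 fields
```

This only helps if whatever produced the file quoted the field in the
first place, which I can't promise for every export.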

I hadn't even known that newlines could exist within fields until one
broke a script of mine!
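
Something along these lines shows the problem (a minimal sketch with
made-up data; the in-memory stream just stands in for a real file):

```php
<?php
// Hypothetical CSV: a header row plus one record whose address
// field contains a newline inside the quotes.
$csv = "name,address\n\"Smith\",\"1 High Street\nLondon\"\n";

// Splitting on newlines sees three "lines" for two rows of data.
$lines = explode("\n", trim($csv));

// fgetcsv() keeps reading past the newline while it is inside
// an enclosure, so it returns one array per actual record.
$fh = fopen('php://memory', 'r+');
fwrite($fh, $csv);
rewind($fh);

$rows = [];
while (($row = fgetcsv($fh, 0, ',', '"', "\\")) !== false) {
    $rows[] = $row;
}
fclose($fh);

var_dump(count($lines)); // 3
var_dump(count($rows));  // 2
```

So reading the file line by line and treating each line as a row only
works as long as nobody has pressed Enter inside a cell.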

And I believe that when MS Office saves a CSV out with a character other
than a comma as the delimiter, it still saves it as a .csv by default.
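
Since the extension tells you nothing, about the best you can do is
guess. Here is a rough heuristic, not anything a spreadsheet program is
known to use: guessDelimiter() and its candidate list are my own
hypothetical names, and it simply picks the candidate that gives the
same field count (more than one) on every sample line.

```php
<?php
// Hypothetical helper: guess a delimiter from a few sample lines.
// Returns the candidate that splits every line into the same number
// of fields (more than one), preferring the highest count, or null.
function guessDelimiter(array $lines, array $candidates = [",", "\t", ";", "|"])
{
    if (count($lines) === 0) {
        return null;
    }

    $best = null;
    $bestCount = 1;

    foreach ($candidates as $delimiter) {
        $counts = [];
        foreach ($lines as $line) {
            // Parse with quoting respected, then count the fields.
            $counts[] = count(str_getcsv($line, $delimiter, '"', "\\"));
        }

        // Accept only a consistent count across all sample lines.
        if (count(array_unique($counts)) === 1 && $counts[0] > $bestCount) {
            $best = $delimiter;
            $bestCount = $counts[0];
        }
    }

    return $best;
}

// Semicolon-delimited sample with a decimal comma in one field.
var_dump(guessDelimiter(["a;b;c", "1;2,5;3"])); // string(1) ";"
```

It treats each sample line as a whole record, so it will be fooled by
the newline-in-a-field case above, and you'd still want to eyeball the
result.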

Thanks,
Ash
http://www.ashleysheridan.co.uk