in place file splitter

Igor Gueths staggered into view and mumbled:
>
>Hi Chuck. I think you're probably right, as the file contents will have to
>be stored in RAM until written to outfiles.


I am about to take this a bit off topic for speakup, so if you are
not interested in a programming technique, you might want to delete
this article now--sorry if I stepped on anyone's toes with this
discussion.

Actually, a technique can be used to read the input file a chunk at a
time, truncating it as you go.  It requires more disk I/O, but only one
chunk has to be held in memory at a time instead of the whole file.  It
is not as necessary nowadays as it used to be, given the low cost and
large size of RAM available, but it might be of some use somewhere.
Here is an algorithm which describes the basics of how it works:

Open the input file.
Open an output file.
Read a chunk of the input file.
while the end of the file has not been reached, do:
  Write the chunk to the output file.
  Close the output file.
  Move all of the remaining input file data to the beginning of the
    input file.
  Get the current position in the input file (the new end of the data).
  Close the input file.
  Truncate the input file at the current position.
  Open the input file.
  Open an output file.
  Read a chunk of data from the input file.
End of while loop.
If a chunk of data has been read which has not been written do:
  Write the chunk of data to the output file.
  Close the output file.
Else
  Close the empty output file.
  Delete the empty output file.
End of if-else statement.
Close the input file.
Delete the remainder of the input file.
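
The steps above can be sketched in Python roughly as follows.  This is
only a minimal sketch, not a tested tool: the function name
split_in_place, the output naming scheme (path.000, path.001, ...) and
the move_buf parameter are made up for illustration.  It also keeps the
input file open and truncates it through the open handle rather than
closing and reopening it on every pass, which is a small departure from
the pseudocode.  Memory use is bounded by chunk_size plus move_buf.

```python
import os


def split_in_place(path, chunk_size, move_buf=1 << 16):
    """Split `path` into numbered pieces, shrinking the input as it goes.

    Returns the list of output file names.  The input file is deleted
    once all of its data has been written out.
    """
    outputs = []
    index = 0
    with open(path, "r+b") as src:
        while True:
            # Read the next chunk from the front of the input file.
            src.seek(0)
            chunk = src.read(chunk_size)
            if not chunk:
                break  # nothing left, so no empty output file is created

            # Write the chunk out and make sure it is on disk before
            # the input copy of that data is destroyed.
            out_name = f"{path}.{index:03d}"
            with open(out_name, "wb") as out:
                out.write(chunk)
                out.flush()
                os.fsync(out.fileno())
            outputs.append(out_name)
            index += 1

            # Move the remaining data to the beginning of the input
            # file, one small block at a time.
            read_pos = len(chunk)
            write_pos = 0
            while True:
                src.seek(read_pos)
                block = src.read(move_buf)
                if not block:
                    break
                src.seek(write_pos)
                src.write(block)
                read_pos += len(block)
                write_pos += len(block)

            # Truncate the input file at the new end of the data.
            src.truncate(write_pos)

    # The input is empty by now; delete what is left of it.
    os.remove(path)
    return outputs
```

Something like split_in_place("big.dat", 10 * 1024 * 1024) would then
leave big.dat.000, big.dat.001 and so on in place of big.dat.  Note that
a crash in the middle of the shifting step leaves the input file in a
scrambled state, so this is not something to try on the only copy of
data you care about.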

See the man page for the C function `truncate'.  Once working
properly, this technique will chop the input file down in chunks equal
to the amount of data written to the output files.  Because the input
file overwrites itself over and over again in ever shrinking amounts,
lots of disk I/O will be necessary, especially for large files which
are to be split into many smaller ones.  All of this disk I/O will of
course require much more time than loading the entire input file into
memory and writing the output files from there, but any size input
files can be handled this way even if memory size is limited.  You
may or may not find this technique useful.  I am not too sure what
this technique has to do with speakup though;);).
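
To put a rough number on the extra disk I/O, here is a small sketch
that adds up how many bytes get moved toward the front of the input
file over the whole run.  The function name bytes_shifted is made up
for illustration, and it assumes the remaining data is shifted exactly
once after each chunk is written, as in the algorithm above.

```python
def bytes_shifted(file_size, chunk_size):
    """Total bytes moved toward the front of the input file."""
    total = 0
    remaining = file_size
    while remaining > chunk_size:
        # One chunk has been written out; everything after it is
        # shifted to the beginning of the file.
        remaining -= chunk_size
        total += remaining
    return total


# A 1 GiB file split into 1 MiB pieces.
gib = 1 << 30
mib = 1 << 20
print(bytes_shifted(gib, mib) / gib)  # prints 511.5
```

So splitting a 1 GiB file into 1 MiB pieces shifts about 511.5 GiB of
data, and every shifted byte is both read and written.  The total grows
roughly as the file size squared divided by twice the chunk size, so
larger chunks cut the cost quickly.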

Have a _great_ day!

-- 
Ralph.  N6BNO.  Wisdom comes from central processing, not from I/O.
rreid at sunset.net  http://personalweb.sunset.net/~rreid
Opinions herein are either mine or they are flame bait.
SEC (x) / COSEC (x) = (TAN (x) / COTAN (x)) ^ 2



