and see duplicate files...
http://cluster.biodiversitylibrary.org/n/naturalistslibra30jardrich/

P

On Wed, Oct 27, 2010 at 12:23 PM, phil cryer <phil at cryer.us> wrote:
> We're building our cluster of data, downloading book data from
> Internet Archive. I've come across one that looks like this:
> http://cluster.biodiversitylibrary.org/n/naturwissenschaft19deut/
>
> Almost all the files appear to be there twice, but have the same name,
> timestamp and inode! What could be causing this, and how can we fix
> it? At issue is space; it appears that we're using far more space than
> we should, and a `du -h` or `ls -lsh` both say this directory takes
> 3.9G when it should really be about 1/2 that. If it has done this on
> many of the directories, it could explain how we're using 78T of 97T
> of space already.
>
> P
> --
> http://philcryer.com
>

--
http://philcryer.com
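
One way to narrow down the question above (is the space really used twice, or is the listing just returning each entry twice?) is to compare sizes counted per listed entry against sizes counted once per inode. The following is only a minimal sketch in Python, not something from this thread: the function names `find_repeated_names` and `summarize_directory` are hypothetical, it uses `st_size` rather than allocated blocks, and it assumes the directory is readable through a normal mount point.

```python
import os
from collections import Counter


def find_repeated_names(names):
    """Return the sorted names that occur more than once in a listing."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


def summarize_directory(path):
    """Return (repeated_names, bytes_per_entry, bytes_per_inode) for path.

    bytes_per_entry adds st_size once for every entry the listing returns.
    bytes_per_inode adds st_size once for every distinct (device, inode).
    """
    names = os.listdir(path)
    bytes_per_entry = 0
    size_by_inode = {}
    for name in names:
        # lstat so that symlinks are measured themselves, not their targets
        st = os.lstat(os.path.join(path, name))
        bytes_per_entry += st.st_size
        size_by_inode[(st.st_dev, st.st_ino)] = st.st_size
    return (find_repeated_names(names),
            bytes_per_entry,
            sum(size_by_inode.values()))
```

Calling `summarize_directory` on one of the affected directories would show whether any name is listed more than once, and whether the per-entry total is roughly double the per-inode total. If it is, the files are probably stored once and the listing is what is doubled.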