On Sun, Jan 22, 2012 at 2:56 PM, Joseph L. Casale <jcasale@xxxxxxxxxxxxxxxxx> wrote:

> >I have a CentOS 5.7 machine hosting a 16 TB XFS partition used to house
> >backups. The backups are run via rsync/rsnapshot and are large in terms of
> >the number of files: over 10 million each.
> >
> >Now the machine is not particularly powerful: it is a 64-bit machine, dual
> >core CPU, 3 GB RAM. So perhaps this is a factor in why I am having the
> >following problem: once in a while that XFS partition starts generating
> >multiple I/O errors, files that had content become 0 byte, directories
> >disappear, etc. Every time a reboot fixes that, however. So far I've looked
> >at logs but could not find a cause or precipitating event.
> >
> >Hence the question: has anyone experienced anything along those lines? What
> >could be the cause of this?
>
> In every situation like this that I have seen, it was hardware that never had
> adequate memory provisioned.
>
> Another consideration is you almost certainly won't be able to run a repair
> on that fs with so little RAM.
>
> Finally, it would be interesting to know how you architected the storage
> hardware. Hardware RAID, BBC, drive cache status, barrier status etc...
>

Joseph,

If I remember correctly, I pretty much went with the defaults when I created
this XFS on top of a 16-drive RAID6 configuration.

Now, as far as memory goes, I think that for the purposes of XFS repair, RAM
and swap ought to be equivalent, and I've got plenty of swap on this system.
I also host a 5 TB XFS in a file there; I ran XFS repair on it and it finished
in no more than 5 minutes. That one is roughly 30% the size of the larger XFS.

I should try to collect the info you mentioned, though. That was a good
thought; some clue might well be contained in there.

Thanks for your input.

Boris.
_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos
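
As a starting point for collecting the barrier status asked about above, here is a minimal sketch in Python. It assumes the usual Linux /proc/mounts layout (device, mount point, fs type, options, ...) and a kernel of that era, where XFS barriers are on by default and only show up in the options when turned off with `nobarrier`. The function name `xfs_mount_report` is made up for illustration. It says nothing about the RAID controller's cache or battery, which would have to come from the controller's own tools.

```python
def xfs_mount_report(mounts_text):
    """Return a list of (mount_point, barriers_disabled) for XFS mounts.

    mounts_text is the content of /proc/mounts: one mount per line, with
    whitespace-separated fields (device, mount point, fs type, options, ...).
    """
    report = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and anything that is not an XFS mount.
        if len(fields) < 4 or fields[2] != "xfs":
            continue
        options = fields[3].split(",")
        # Assumption: barriers are disabled only when "nobarrier" is listed.
        report.append((fields[1], "nobarrier" in options))
    return report


# Typical use on the backup host (reads the live mount table):
#   with open("/proc/mounts") as f:
#       for mount_point, no_barrier in xfs_mount_report(f.read()):
#           print(mount_point, "nobarrier set" if no_barrier else "nobarrier not set")
```

The parsing is kept separate from the file read so it can be checked against a pasted copy of /proc/mounts from another machine.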