On 10/10/12 8:17 AM, Stan Hoeppner wrote:
> On 10/10/2012 3:51 AM, Marcin Deranek wrote:
>> Hi,
>>
>> We are running an XFS filesystem on one of our machines which is a big
>> store (~3TB) of different data files (mostly images). Quite recently we
>> experienced some performance problems - the machine wasn't able to keep
>> up with updates. After some investigation it turned out that open()
>> syscalls (open for writing) were taking significantly more time than
>> they should, e.g. 15-20ms vs 100-150us.
>>
>> Some more info about our workload, as I think it's important here:
>> our XFS filesystem is used exclusively as a data store, so we only
>> read and write our data (we mostly write). When a new update comes in,
>> it's written to a temporary file, e.g.
>>
>> /mountpoint/some/path/.tmp/file
>>
>> When the file is completely stored we move it to its final location, e.g.
>>
>> /mountpoint/some/path/different/subdir/newname
>>
>> That means we create lots of files in the /mountpoint/some/path/.tmp
>> directory, but the directory stays empty, as they are moved (rename()
>> syscall) shortly after file creation to a different directory on the
>> same filesystem.
>>
>> The workaround I have found so far is to remove that directory
>> (/mountpoint/some/path/.tmp in our case) with its contents and re-create
>> it. After this operation the open() syscall goes back down to 100-150us.
>> Is this a known problem?
>>
>> Information regarding our system:
>> CentOS 5.8 / kernel 2.6.18-308.el5 / kmod-xfs-0.4-2
>>
>> Let me know if you need to know anything more.
>
> Hi Marcin,
>
> I'll begin where you ended: kmod-xfs. DO NOT USE THAT. Use the kernel
> driver. Eric Sandeen can point you to the why. AIUI that XFS module
> hasn't been supported for many, many years.

Yep. Ditch that; it overrides the maintained module that comes with the
kernel itself. See if that helps first, I suppose. I've been asking
CentOS for a while to find some way to deprecate that, but it's like
night of the living dead xfs modules. (modinfo xfs will tell you for
sure which xfs.ko is getting loaded, I suppose.)

> Regarding your problem, I can't state some of the following with
> authority, though it might read that way. I'm making an educated guess
> based on what I do know of XFS and the behavior you're seeing. Dave
> will clobber and correct me if I'm wrong here. ;)
>
> XFS filesystems are divided into multiple equal-sized allocation groups
> (AGs) on the underlying storage device (single disk, RAID, LVM volume,
> etc). With inode32, each directory that is created has its files stored
> in only one AG, with some exceptions, which you appear to be bumping up
> against. If you're using inode64, the directories, along with their
> files, go into the AGs round robin.

Agreed that it would be good to know whether inode64 is in use. Let's
start there (and with a modern xfs.ko) before we speculate further.
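(For reference, the create-in-.tmp-then-rename pattern described above
boils down to roughly the sketch below. The paths are the placeholders
from Marcin's mail and the buffer is just a stand-in for real image
data; it's only meant to make the workload concrete.)

/* Sketch of the workload: write into .tmp on the same filesystem,
 * then rename() into the final location. Paths are placeholders. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    const char *tmp = "/mountpoint/some/path/.tmp/file";
    const char *dst = "/mountpoint/some/path/different/subdir/newname";
    char buf[4096] = { 0 };          /* stand-in for real image data */

    /* This is the open() that reportedly shows the 15-20ms latency. */
    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }
    if (write(fd, buf, sizeof(buf)) < 0)   /* short-write handling omitted */
        perror("write");
    close(fd);

    /* Same filesystem, so this is purely a namespace operation. */
    if (rename(tmp, dst) < 0) { perror("rename"); return 1; }
    return 0;
}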
> Educated guessing: When you use rename(2) to move the files, the file
> contents are not being moved, only the directory entry, as with EXTx
> etc. Thus the file data is still in the ".tmp" directory's AG, but that
> AG is no longer its home. Once this temp dir AG gets full of these
> "phantom" file contents (you can only see them with XFS tools), the AG
> spills over. At that point XFS starts moving the phantom contents of
> the rename(2)'d files into the AG which owns the directory of the
> rename(2) target. I believe this is the source of your additional
> latency. Each time you do an open(2) call to write a new file, XFS is
> moving a file's contents (extents) to its new/correct parent AG,
> causing much additional IO, especially if these are large files.

Nope, don't think so ;) Nothing is going to be moving file contents
behind your back on a rename.

<snip>

-Eric
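P.S. A minimal sketch if you want to convince yourself of that: after a
same-filesystem rename(2) the file still has the same inode; only the
directory entries change. (This only demonstrates the inode is
unchanged, it says nothing about extent placement.) The paths below are
made up:

/* Hypothetical demo: same inode number before and after rename(2). */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

int main(void)
{
    struct stat st;
    int fd = open("/tmp/demo.tmp", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }
    fstat(fd, &st);
    printf("before rename: ino=%llu\n", (unsigned long long)st.st_ino);
    close(fd);

    if (rename("/tmp/demo.tmp", "/tmp/demo.final") < 0) {
        perror("rename");
        return 1;
    }

    stat("/tmp/demo.final", &st);
    printf("after  rename: ino=%llu\n", (unsigned long long)st.st_ino);
    return 0;
}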