On Sun, 2011-03-13 at 11:48 +1100, Dave Chinner wrote:

Thanks for your response, Dave.

<snip>

> As I said before, the debug check is known to be racy. Having it
> trigger is not necessarily a sign of a problem. I have only ever
> tripped it once since the way the check operates was changed.
> There's no point in spending time trying to analyse it and explain
> it as we already know why and how it can trigger in a racy manner.

Oh, maybe I misunderstood. In your earlier reply you mentioned that you
wanted to know whether the problem was consistently reproducible. Since
it was, I went on to debug it. If it is not an issue, it would be a good
idea to reduce that ASSERT to a WARN_ON_ONCE(), as you mentioned.

> > Then I started comparing the behavioral differences between the two
> > arches, and I found that on POWER I see more threads at a time (a
> > max of 4) in the function xlog_grant_log_space(), whereas on x86_64
> > I see a max of only two, and mostly just one.
> >
> > I also noted that on POWER test case 011 takes about 8 seconds,
> > whereas on x86_64 it takes about 165 seconds.
> >
> > So I ventured into the core of test case 011, dirstress, and found
> > that simply creating thousands of files under a directory takes a
> > very long time on x86_64 compared to POWER (1m15s vs 2s).
>
> On my x86-64 boxes, test 011 takes 3s with CONFIG_XFS_DEBUG=y, all
> lock checking turned on, memory poisoning active, etc. With a
> production kernel, it usually takes 1s. Even on a single SATA drive.
>
> So, without knowing anything about your x86-64 machine, I'd say
> there's something wrong with it or its configuration. Try turning
> off barriers and seeing if that makes it go faster...

The slowness happened on two x86_64 blades. On the blade where the
storage is an SSD device, nobarrier helped drastically:

==========
[root@test27 chandra]# mount -o nobarrier /dev/disk/by-id/wwn-0x5000a7203002f7e4-part1 /mnt/xfsMntPt/
[root@test27 chandra]# time ./b /mnt/xfsMntPt/d1/ 10000 1 i 0

real    0m1.983s
user    0m0.026s
sys     0m1.365s
==========

Whereas on the blade where the storage is a SAN disk, it didn't help
much. Note that I verified the disk performs fine by running the same
test on an ext4 filesystem:

==========
[root@test65 chandra]# mount /dev/sdb1 /mnt/xfs
[root@test65 chandra]# mount /dev/sdb2 /mnt/ext4
[root@test65 chandra]# tail -2 /proc/mounts
/dev/sdb1 /mnt/xfs xfs rw,seclabel,relatime,attr2,noquota 0 0
/dev/sdb2 /mnt/ext4 ext4 rw,seclabel,relatime,barrier=1,data=ordered 0 0
[root@test65 chandra]# time ./b /mnt/ext4/d1 10000 1 i 0

real    0m0.332s
user    0m0.006s
sys     0m0.264s

[root@test65 chandra]# time ./b /mnt/xfs/d1 10000 1 i 0

real    1m35.620s
user    0m0.012s
sys     0m0.735s

[root@test65 chandra]# mount -o nobarrier /dev/sdb1 /mnt/xfs
[root@test65 chandra]# tail -2 /proc/mounts
/dev/sdb2 /mnt/ext4 ext4 rw,seclabel,relatime,barrier=1,data=ordered 0 0
/dev/sdb1 /mnt/xfs xfs rw,seclabel,relatime,attr2,nobarrier,noquota 0 0
[root@test65 chandra]# time ./b /mnt/xfs/d1 10000 1 i 0

real    1m6.772s
user    0m0.011s
sys     0m0.739s
==========

What else could affect the behavior like this? Also note that on POWER
I get the fast performance with barriers on.

Thanks,
chandra

<snip>
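
P.S. For the record, the downgrade you suggested would be a one-line
change along these lines. This is only a sketch -- "free_bytes" is a
placeholder name, not a quote of the actual condition in the XFS log
grant code:

	/* Before: debug builds trip over the known-racy check */
	ASSERT(free_bytes >= 0);

	/* After: warn once in dmesg and keep going */
	WARN_ON_ONCE(free_bytes < 0);

WARN_ON_ONCE() still leaves a stack trace in the log the first time the
race fires, so the information isn't lost; it just stops killing debug
runs.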
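
P.P.S. In case anyone wants to reproduce the create-heavy workload
without the full xfstests harness: the loop below is roughly the kind
of thing my "./b" test exercises. This is a reconstruction for
illustration, not the actual program -- the argument handling and the
file-naming scheme are assumptions:

/* createloop.c: time the creation of <count> empty files in <dir>.
 * Build with: gcc -O2 -o createloop createloop.c  (add -lrt on old glibc)
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	struct timespec start, end;
	char path[4096];
	int i, fd, count;

	if (argc != 3) {
		fprintf(stderr, "usage: %s <dir> <count>\n", argv[0]);
		return 1;
	}
	count = atoi(argv[2]);

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < count; i++) {
		/* each create is a separate metadata transaction */
		snprintf(path, sizeof(path), "%s/f%06d", argv[1], i);
		fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0644);
		if (fd < 0) {
			perror("open");
			return 1;
		}
		close(fd);
	}
	clock_gettime(CLOCK_MONOTONIC, &end);

	printf("created %d files in %.3f seconds\n", count,
	       (end.tv_sec - start.tv_sec) +
	       (end.tv_nsec - start.tv_nsec) / 1e9);
	return 0;
}

With barriers on, each log flush for those create transactions turns
into a cache flush on the device, which is where I suspect the SAN disk
is losing its time.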