On Wed, 2011-06-29 at 11:04 +1000, Dave Chinner wrote:
> On Tue, Jun 28, 2011 at 04:39:21PM -0700, Chandra Seetharaman wrote:
> > On Thu, 2011-06-16 at 16:29 -0500, Alex Elder wrote:
> > > On Tue, 2011-06-14 at 11:51 -0700, Chandra Seetharaman wrote:
> > > > Hello All,
> > > >
> > > > test case 180 fails often (4 out of 5) in my x86_64 system.
> > > > Any suggestions on how to proceed to debug ?
> > >
> > > I have been seeing failures like that sometimes
> > > (more often recently I think) for a while. I
> > > have not had the chance to really chase it down.
> > >
> > > If you can reproduce it pretty reliably you could
> > > use "git bisect" to try to find out whether the
> > > failures started to occur after a particular
> > > commit.
> >
> > I tried git bisect and it ended up in a qla2xxx fix (and I do not even
> > have qlogic card in that system).
> >
> > I did it couple more times and landed on different patches.
>
> That indicates your test case is not 100% reliable. :/
>

Agreed. That is why I tested the "supposedly" (thru git bisect) good one
for about 500 iterations to verify the failing patch.

OTOH, can you suggest a test that does what 180 does in a reliable way ?

> I haven't seen a failure in 180 on any of my test machines for some
> time (32 or 64 bit).
>
> > My latest (fourth or fifth, I forgot :) bisect landed on the patch with
> > commit 546a1924224078c6f582e68f890b05b387b42653 ( writeback:
> > write_cache_pages doesn't terminate at nr_to_write <= 0)
>
> That was merged in 2.6.36-rc2, and shouldn't have any sync
> implications at all....
>
> > I verified that this is valid patch by running the test script 180 for
> > nearly 500 times on the tree just prior to this patch.
>
> Ok, more details about your test setup is needed. What kernel are
> you running? What storage are you using? How much RAM/CPU, etc?
>

Kernel: mainline with up to commit
#546a1924224078c6f582e68f890b05b387b42653
Storage: 2TB megaraid (IBM ServeRAID M1015) local storage.
Partition: only 20GB
RAM: 25GB
Proc: Intel(R) Xeon(R) CPU E5607 @ 2.27GHz
#of procs: 4

> Also, what are the sizes of the files that had reported incorrect
> size?

It failed with varied sizes. Here are the 10 failures from 3.0.0-rc5
kernel:

+file /mnt/xfsScratchMntPt/966 has incorrect size - sync failed
+-rw-------. 1 root root 8663040 Jun 29 13:46 /mnt/xfsScratchMntPt/966
+file /mnt/xfsScratchMntPt/644 has incorrect size - sync failed
+-rw-------. 1 root root 8724480 Jun 29 13:53 /mnt/xfsScratchMntPt/644
+file /mnt/xfsScratchMntPt/381 has incorrect size - sync failed
+-rw-------. 1 root root 10096640 Jun 29 14:03 /mnt/xfsScratchMntPt/381
+file /mnt/xfsScratchMntPt/569 has incorrect size - sync failed
+-rw-------. 1 root root 10383360 Jun 29 14:04 /mnt/xfsScratchMntPt/569
+file /mnt/xfsScratchMntPt/650 has incorrect size - sync failed
+-rw-------. 1 root root 9216000 Jun 29 14:04 /mnt/xfsScratchMntPt/650
+file /mnt/xfsScratchMntPt/947 has incorrect size - sync failed
+-rw-------. 1 root root 8663040 Jun 29 14:04 /mnt/xfsScratchMntPt/947
+file /mnt/xfsScratchMntPt/569 has incorrect size - sync failed
+-rw-------. 1 root root 7761920 Jun 29 14:10 /mnt/xfsScratchMntPt/569
+file /mnt/xfsScratchMntPt/905 has incorrect size - sync failed
+-rw-------. 1 root root 8417280 Jun 29 14:11 /mnt/xfsScratchMntPt/905
+file /mnt/xfsScratchMntPt/617 has incorrect size - sync failed
+-rw-------. 1 root root 10403840 Jun 29 14:13 /mnt/xfsScratchMntPt/617
+file /mnt/xfsScratchMntPt/654 has incorrect size - sync failed
+-rw-------. 1 root root 9216000 Jun 29 14:15 /mnt/xfsScratchMntPt/654
+file /mnt/xfsScratchMntPt/569 has incorrect size - sync failed
+-rw-------. 1 root root 7802880 Jun 29 14:17 /mnt/xfsScratchMntPt/569
+file /mnt/xfsScratchMntPt/740 has incorrect size - sync failed
+-rw-------. 1 root root 9216000 Jun 29 14:17 /mnt/xfsScratchMntPt/740
+file /mnt/xfsScratchMntPt/574 has incorrect size - sync failed
+-rw-------. 1 root root 10260480 Jun 29 14:26 /mnt/xfsScratchMntPt/574
+file /mnt/xfsScratchMntPt/655 has incorrect size - sync failed
+-rw-------. 1 root root 9216000 Jun 29 14:26 /mnt/xfsScratchMntPt/655
+file /mnt/xfsScratchMntPt/952 has incorrect size - sync failed
+-rw-------. 1 root root 8663040 Jun 29 14:27 /mnt/xfsScratchMntPt/952
+file /mnt/xfsScratchMntPt/575 has incorrect size - sync failed
+-rw-------. 1 root root 10260480 Jun 29 14:28 /mnt/xfsScratchMntPt/575
+file /mnt/xfsScratchMntPt/656 has incorrect size - sync failed
+-rw-------. 1 root root 9216000 Jun 29 14:28 /mnt/xfsScratchMntPt/656
+file /mnt/xfsScratchMntPt/926 has incorrect size - sync failed
+-rw-------. 1 root root 8663040 Jun 29 14:29 /mnt/xfsScratchMntPt/926
+file /mnt/xfsScratchMntPt/941 has incorrect size - sync failed
+-rw-------. 1 root root 8417280 Jun 29 14:31 /mnt/xfsScratchMntPt/941
+file /mnt/xfsScratchMntPt/544 has incorrect size - sync failed
+-rw-------. 1 root root 7413760 Jun 29 14:35 /mnt/xfsScratchMntPt/544

>
> Cheers,
>
> Dave.
>
> PS: Please don't top post replies. Please quote and reply inline so
> that the thread flow is easy to follow.

sorry :(

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
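
For anyone wanting to tabulate failure output like the listing above, here is a minimal Python sketch. It only parses the `ls -l` style lines and reports how far each file falls short of an expected size. The names `parse_failures` and `summarize` are hypothetical helpers, and the 10 MiB expected size is an assumption about what test 180 writes, not something stated in this thread; adjust it to match the actual test.

```python
import re

# ASSUMPTION: each file is expected to be 10 MiB. This value is not taken
# from the thread; change it to whatever the test really writes.
EXPECTED_SIZE = 10 * 1024 * 1024

# Matches lines such as:
# "+-rw-------. 1 root root 8663040 Jun 29 13:46 /mnt/xfsScratchMntPt/966"
LS_LINE = re.compile(
    r"^\+?-[rwx-]{9}\.?\s+\d+\s+\S+\s+\S+\s+(\d+)\s+\S+\s+\d+\s+\S+\s+(\S+)$"
)


def parse_failures(text):
    """Return a list of (path, size_in_bytes) for every ls-style line."""
    results = []
    for line in text.splitlines():
        match = LS_LINE.match(line.strip())
        if match:
            results.append((match.group(2), int(match.group(1))))
    return results


def summarize(failures, expected=EXPECTED_SIZE):
    """Attach the shortfall and 4 KiB alignment of each reported size."""
    return [
        {
            "path": path,
            "size": size,
            "short_by": expected - size,
            "aligned_4k": size % 4096 == 0,
        }
        for path, size in failures
    ]


if __name__ == "__main__":
    import sys

    # Usage: python summarize_180.py < 180.out.bad
    for row in summarize(parse_failures(sys.stdin.read())):
        print(row["path"], row["size"], row["short_by"], row["aligned_4k"])
```

Whether the sizes cluster or are aligned may hint at where writeback stopped, but the script itself only reports the numbers and draws no conclusion.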