On 07/02/2012 07:51 PM, Dave Chinner wrote:
> On Mon, Jul 02, 2012 at 09:33:24AM -0400, Brian Foster wrote:
>> On 07/01/2012 08:07 PM, Dave Chinner wrote:
>>> On Thu, Jun 28, 2012 at 06:52:56AM -0400, Brian Foster wrote:
>>>> xfsaild idle mode logic currently leads to a couple hangs:
>>>>
>>>> 1.) If xfsaild is rescheduled in during an incremental scan
>>>>     (i.e., tout != 0) and the target has been updated since
>>>>     the previous run, we can hit the new target and go into
>>>>     idle mode with a still populated ail.
>>>> 2.) A wake up is only issued when the target is pushed forward.
>>>>     The wake up can race with xfsaild if it is currently in the
>>>>     process of entering idle mode, causing future wake up
>>>>     events to be lost.
>>>>
>>>> These hangs have been reproduced and verified as fixed by
>>>> running xfstests 273 in a loop on a slightly modified upstream
>>>> kernel. The kernel is modified to re-enable idle mode as
>>>> previously implemented (when count == 0) and with a revert of
>>>> commit 670ce93f, which includes performance improvements that
>>>> make this harder to reproduce.
>>>>
>>>> The solution, the algorithm for which has been outlined by
>>>> Dave Chinner, is to modify xfsaild to enter idle mode only when
>>>> the ail is empty and the push target has not been moved forward
>>>> since the last push.
>>>>
>>>> Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
>>>
>>> Looks OK to me, and hasn't caused any problems here.
>>>
>>> Final question - did you confirm with powertop that the xfsaild is
>>> no longer causing wakeups a minute or two after you stop writing to
>>> the filesystem? (I haven't yet)
>>>
>>
>> I hadn't tested with powertop, but I had some tracepoints hacked in
>> around the idle/wake cases to verify the thread was actually scheduling
>> out.
>
> If you've added tracepoints that were useful for
> debugging/verification, then send that as a patch as well. If users
> have trouble then simply asking them for event traces is very easy
> to do and gives us much better insight into what is happening....
>
> You can't have enough tracepoints when things are going wrong ;)
>

Ok, duly noted. What I have right now is scattered about a few branches
and not immediately presentable. When I get some time I'll fix them up
and post. If I remember correctly, I had covered: xfsaild end (count,
skip, target, etc.), xfsaild idle, xa_target update (xfs_ail_push()) and
xfsaild wake (which might be extraneous at this point).

Brian

>> FWIW, I just gave powertop a quick test and it appears to work as
>> expected...
>>
>> With current upstream on my rhel6.3 VM, I see the following after
>> running a 'touch /mnt/file;sync' and letting the fs idle for a bit:
>>
>> 0.5% ( 19.9) xfsaild/vdb1 : xfsaild (process_timeout)
>>
>> and this drops off completely with the patch applied. Thanks for the tip.
>
> Great, then it is working exactly as expected.
>
> Cheers,
>
> Dave.
>

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
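
For anyone reading this thread later: the idle rule from the quoted commit message (idle only when the ail is empty and the push target has not moved forward since the last push) can be sketched in user-space C roughly as below. This is only an illustrative model, not the patch itself. The struct ail_model type and the ail_model_*() helpers are hypothetical names made up for the sketch, and it is single-threaded, so it does not model the locking or the wake-up race described in item 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the AIL state; not the kernel structure. */
struct ail_model {
    uint64_t target;       /* current push target */
    uint64_t target_prev;  /* target sampled by the most recent push */
    size_t   nitems;       /* items still sitting in the AIL */
};

/* Pusher side: the target only ever moves forward.
 * Returns true when the target actually changed. */
bool ail_model_set_target(struct ail_model *ail, uint64_t lsn)
{
    assert(ail != NULL);
    if (lsn <= ail->target)
        return false;
    ail->target = lsn;
    return true;
}

/* Worker side: a push samples the target it is going to work toward. */
void ail_model_push(struct ail_model *ail)
{
    assert(ail != NULL);
    ail->target_prev = ail->target;
}

/* Idle only if nothing is left in the AIL AND nobody has moved the
 * target since the last push sampled it. */
bool ail_model_may_idle(const struct ail_model *ail)
{
    assert(ail != NULL);
    return ail->nitems == 0 && ail->target == ail->target_prev;
}
```

The point of checking both conditions is the first hang above: reaching the target is not enough to idle if items remain, and an empty ail is not enough if the target moved after the last push sampled it.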