On Tue, Mar 25, 2014 at 09:05:04AM -0700, Eric Sandeen wrote:
> >> Out of curiosity, is there any major reason we don't use 0 here
> >> unconditionally? Are we worried about I/O completing before we have a
> >> chance to decrement the reference?
> >
> > I think this should unconditionally avoid the schedule, and while we're
> > at it we should kill _xfs_buf_ioend and opencode it here and at the
> > other callsite.
>
> And then remove the flag from xfs_buf_ioend which is always 0 at that
> point ...

Is it?  xfs_buf_bio_end_io should still be passing 1; the bio end_io
handler is the place we really need the workqueue for anyway.

> Yeah I have a patch to do that as well; I wanted to separate the
> bugfix from the more invasive cleanup, though - and I wanted to
> get the fix out for review sooner.

Sure, feel free to leave all the cleanups to another patch.

> But yeah, I was unsure about whether or not to schedule at all here.
> We come here from a lot of callsites and I'm honestly not sure what
> the implications are yet.

I think the delayed completion is always wrong from the submission
path.  The error path is just a special case of a completion happening
before _xfs_buf_ioapply returns.  The combination of incredibly fast
hardware and bad preemption could cause the same bug you observed.
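
To make the argument concrete, here is a rough userspace sketch of the
pattern under discussion.  It is not the XFS code: every name in it
(model_buf, model_submit, model_bio_end_io, ...) is made up for
illustration, and it assumes only what the thread describes, namely a
reference count held by the submitter plus one reference per in-flight
bio, and a completion that runs either inline or via a workqueue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a buffer with an I/O reference count. */
struct model_buf {
	int  io_remaining;  /* submitter's reference + one per in-flight bio */
	bool queued;        /* completion deferred to a simulated workqueue */
	bool done;          /* completion has actually run */
};

/* Run the completion inline, or defer it to the simulated workqueue. */
static void model_ioend(struct model_buf *bp, bool schedule)
{
	if (schedule)
		bp->queued = true;
	else
		bp->done = true;
}

/* Stand-in for the worker thread picking up a deferred completion. */
static void model_run_workqueue(struct model_buf *bp)
{
	if (bp->queued) {
		bp->queued = false;
		bp->done = true;
	}
}

/* Drop one I/O reference; whoever drops the last one completes the buffer. */
static void model_io_put(struct model_buf *bp, bool schedule)
{
	assert(bp->io_remaining > 0);
	if (--bp->io_remaining == 0)
		model_ioend(bp, schedule);
}

/* Simulated bio end_io handler: interrupt context, so it always defers. */
static void model_bio_end_io(struct model_buf *bp)
{
	model_io_put(bp, true);
}

/*
 * Simulated submission path.  bios_finish_early models "incredibly fast
 * hardware and bad preemption": every bio completes before the submitter
 * drops its own reference.  submit_schedules selects whether that final
 * drop defers the completion (the behaviour argued against above).
 */
static void model_submit(struct model_buf *bp, int nr_bios,
			 bool bios_finish_early, bool submit_schedules)
{
	int i;

	bp->io_remaining = 1;		/* submitter's reference */
	bp->queued = false;
	bp->done = false;

	for (i = 0; i < nr_bios; i++)
		bp->io_remaining++;	/* one reference per bio */

	if (bios_finish_early)
		for (i = 0; i < nr_bios; i++)
			model_bio_end_io(bp);

	/* If this is the last reference, the submitter runs the completion. */
	model_io_put(bp, submit_schedules);
}
```

In this model, when the bios finish early (or none were issued at all,
as on the error path), the submitter's drop is the final one.  With
submit_schedules set, model_submit() returns while the completion is
still only queued, so the caller can carry on with the buffer before
the completion has run.  With it cleared, the completion has already
run by the time model_submit() returns.  Completions arriving later via
model_bio_end_io() still go through the workqueue either way.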