On Fri, Sep 15, 2017 at 12:41 AM, Andreas Dilger <adilger@xxxxxxxxx> wrote:
> I don't think a reproducer is needed. It looks like the fsync callpath
> is happening from an IRQ context due to IO completion, and then re-entering
> the filesystem while a transaction is already started. It looks like the
> original IO was submitted with AIO based on the functions on the IRQ stack,
> which is likely why nobody has hit it (AIO isn't very commonly used).
>
> That said, I don't follow the reasoning behind the convoluted series of AIO
> callbacks that has IO _completion_ calling vfs_fsync_range() and re-entering
> the filesystem to flush out more data?

Thanks for the analysis. I do think the syzkaller reproducer (in fact, the
log) may also answer your question and help pinpoint exactly what triggers
the issue. Moreover, I am not experienced enough to analyze such a complex
problem from the call trace and code alone :)

- ChunYu