> Il giorno 25 apr 2018, alle ore 19:06, Jens Axboe <axboe@xxxxxxxxx> ha scritto: > > On 4/25/18 11:03 AM, Paolo Valente wrote: >> >> >>> Il giorno 25 apr 2018, alle ore 18:50, Jens Axboe <axboe@xxxxxxxxx> ha scritto: >>> >>> Hi Linus, >>> >>> I ended up sitting on this about a week longer than I wanted to, >>> since we were hashing out details with a timeout change. I've now >>> killed that patch, so we can flush the existing queue in due time. >>> >>> This pull request contains: >>> >>> - Fix for an old regression, where entering the queue can be disturbed >>> by a signal to the process. This can cause spurious EIO. Fix from Alan >>> Jenkins. >>> >>> - cdrom information leak fix from Dan. >>> >>> - Trivial helper for testing queue FUA from Dave Chinner, part of his >>> O_DIRECT FUA series. >>> >>> - Series of swim fixes from Finn that actually makes it work again. >>> >>> - Loop O_DIRECT corruption fix, which caused data corruption in >>> production for us. From me. >>> >>> - BFQ crash fix from me. >>> >> >> For what it's worth, I disagree with this patch. This change is based >> on an apparently buggy motivation, as I wrote in my reply in the >> thread where Jens proposed it. As such, I think it might even bury >> more deeply the actual bug that causes the crash (although of course >> this patch does eliminate the crash for the use case reported in that >> thread). > > The patch fixes the issue and I've explained why. Definitely, but I wrote you that your explanation seems wrong. > If you have a > motivation to fix it differently, for whatever reason, then by all > means submit a patch. So far I haven't seen it, and we still have > the known crash that people are actually hitting. > Unfortunately I don't have a solution, because I don't know what the bug is. I only know that there's a bug in your explanation for the bug you want to fix. > I'll be happy to work with you on that later in the week, when > the LSFMM conference has wound down. > I do thank you for that. If you can, please start by answering my concerns on your explanation. Maybe your explanation can be fixed (or I'm simply wrong), and all will be fine. Or, if your explanation is really buggy and we don't find the actual bug in the code, we can just enrich your proposed change with a comment, and state that this is a crutch to walk on, while waiting for the actual bug to show up. Thanks, Paolo > -- > Jens Axboe