Hi Jack,

I tested it on 4.9.102 and also checked the latest code on elixir (versions 4.19 and 4.20); the bug is still present there.

More on the scenario and the bug: I experienced data corruption in my application (NVMe-based storage). The corruption itself was caused by faulty hardware, but the real problem is that io_getevents returned the correct number of bytes, so the application had no way to recognize the error. Looking at /var/log/messages, I saw the following errors at the time of the corruption:

Oct 11 14:55:15 block01-node05 kernel: [19272.951015] blk_update_request: I/O error, dev nvme2n3, sector 117359360
Oct 11 14:55:15 block01-node05 kernel: [19272.952786] blk_update_request: I/O error, dev nvme2n3, sector 117359872
Oct 11 14:55:16 block01-node05 kernel: [19273.544374] blk_update_request: I/O error, dev nvme2n3, sector 117360384
...

So the block layer does receive the error, but I never see it in the application. After running ftrace and reading the code, I found that the dio error status is overridden. In dio_complete() the error is propagated from dio->io_error into ret, but even when dio->io_error is non-zero, on the async write path ret is unconditionally overwritten: generic_write_sync() is called with transferred and, on success, returns the byte count, which replaces the error (a possible fix is sketched after the trace below).

static ssize_t dio_complete(struct dio *dio, ssize_t ret, bool is_async)
{
...
	if (ret == 0)
		ret = dio->page_errors;
	if (ret == 0)
		ret = dio->io_error;
	if (ret == 0)
		ret = transferred;
...
	if (is_async) {
		/*
		 * generic_write_sync expects ki_pos to have been updated
		 * already, but the submission path only does this for
		 * synchronous I/O.
		 */
		dio->iocb->ki_pos += transferred;

		if (dio->op == REQ_OP_WRITE)
			ret = generic_write_sync(dio->iocb, transferred);
		dio->iocb->ki_complete(dio->iocb, ret, 0);

For your convenience I am attaching the ftrace log for easier tracking of the flow through the code:

 26)               |  nvme_complete_rq [nvme_core]() {
 26)               |    blk_mq_end_request() {
 26)               |      blk_update_request() {            <---- log is from here
 26)   0.563 us    |        blk_account_io_completion();
 26)   0.263 us    |        bio_advance();
 26)               |        bio_endio() {
 26)               |          dio_bio_end_aio() {
 26)               |            dio_bio_complete() {
 26)               |              bio_check_pages_dirty() {
 26)               |                bio_put() {
 26)               |                  bio_free() {
 26)               |                    __bio_free() {
 26)   0.045 us    |                      bio_disassociate_task();
 26)   0.497 us    |                    }
 26)   0.042 us    |                    bvec_free();
 26)               |                    mempool_free() {
 26)               |                      mempool_free_slab() {
 26)   0.264 us    |                        kmem_cache_free();
 26)   0.606 us    |                      }
 26)   1.125 us    |                    }
 26)   2.588 us    |                  }
 26)   2.920 us    |                }
 26)   3.979 us    |              }
 26)   4.712 us    |            }
 26)   0.040 us    |            _raw_spin_lock_irqsave();
 26)   0.048 us    |            _raw_spin_unlock_irqrestore();
 26)               |            dio_complete() {            <-- dio_complete(dio, 0, true)
 26)               |              aio_complete() {          <-- dio->iocb->ki_complete(dio->iocb, ret, 0)
                                                                <<aio_complete(struct kiocb *kiocb, long res, long res2)>>
 26)   0.073 us    |                _raw_spin_lock_irqsave();
 26)   0.114 us    |                refill_reqs_available();
 26)   0.048 us    |                _raw_spin_unlock_irqrestore();
 26)               |                kiocb_free() {
 26)   0.171 us    |                  fput();
 26)   0.102 us    |                  kmem_cache_free();
 26)   0.902 us    |                }
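Going back to the dio_complete() excerpt above: one way to keep the error would be to call generic_write_sync() only when no error has been recorded, and to pass it ret instead of the raw transferred count. This is a minimal sketch of that idea against the code above, my own reconstruction for illustration, not necessarily the exact patch from the linked thread:

	if (is_async) {
		/*
		 * generic_write_sync expects ki_pos to have been updated
		 * already, but the submission path only does this for
		 * synchronous I/O.
		 */
		dio->iocb->ki_pos += transferred;

		/*
		 * Sketch of a fix: skip generic_write_sync() once an error
		 * is recorded in ret, and hand it ret rather than
		 * transferred, so a negative dio->io_error is no longer
		 * clobbered by the byte count returned on success.
		 */
		if (ret > 0 && dio->op == REQ_OP_WRITE)
			ret = generic_write_sync(dio->iocb, ret);
		dio->iocb->ki_complete(dio->iocb, ret, 0);
	}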
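Regarding your question about the test case: any aio + O_DIRECT writer that checks the completion code will do, combined with a device that fails writes (we hit it with genuinely faulty hardware; error injection should work as well). Below is a minimal sketch using libaio; the device path and block size are placeholders for illustration. With the bug above, ev.res still reports the full byte count even while blk_update_request logs I/O errors, so the error check never fires:

/* Minimal libaio sketch: submit one O_DIRECT write and check the
 * completion code. Build with: gcc -O2 -o aio_write_check aio_write_check.c -laio
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BUF_SIZE 4096           /* multiple of the logical block size */
#define DEVICE   "/dev/nvme2n3" /* placeholder: a device that fails writes */

int main(void)
{
	io_context_t ctx = 0;
	struct iocb cb, *cbs[1] = { &cb };
	struct io_event ev;
	void *buf;
	long res;
	int fd, rc;

	fd = open(DEVICE, O_WRONLY | O_DIRECT);
	if (fd < 0) { perror("open"); return 1; }

	/* O_DIRECT requires an aligned buffer */
	if (posix_memalign(&buf, 4096, BUF_SIZE)) { perror("posix_memalign"); return 1; }
	memset(buf, 0xab, BUF_SIZE);

	rc = io_setup(1, &ctx);	/* libaio returns -errno, it does not set errno */
	if (rc < 0) { fprintf(stderr, "io_setup: %s\n", strerror(-rc)); return 1; }

	io_prep_pwrite(&cb, fd, buf, BUF_SIZE, 0);
	rc = io_submit(ctx, 1, cbs);
	if (rc != 1) { fprintf(stderr, "io_submit: %s\n", strerror(-rc)); return 1; }

	rc = io_getevents(ctx, 1, 1, &ev, NULL);
	if (rc != 1) { fprintf(stderr, "io_getevents: %s\n", strerror(-rc)); return 1; }

	/*
	 * ev.res is a negative errno on failure, otherwise the number of
	 * bytes transferred. With the dio_complete() bug this path prints
	 * "write acknowledged" even though the write failed on disk.
	 */
	res = (long)ev.res;
	if (res < 0)
		fprintf(stderr, "write failed: %s\n", strerror(-res));
	else if (res != BUF_SIZE)
		fprintf(stderr, "short write: %ld of %d bytes\n", res, BUF_SIZE);
	else
		printf("write acknowledged: %ld bytes\n", res);

	io_destroy(ctx);
	close(fd);
	free(buf);
	return 0;
}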
On Tue, Nov 6, 2018 at 9:29 AM Jack Wang <jack.wang.usish@xxxxxxxxx> wrote:
>
> Gregory Shapiro <shapiro.gregory@xxxxxxxxx> wrote on Mon, Nov 5, 2018 at 4:19 PM:
> >
> > Hello, my name is Gregory Shapiro and I am a newbie on this list.
> > I recently encountered data corruption: the kernel acknowledged a
> > write (the "io_getevents" system call returned the correct number of
> > bytes) but the underlying write to disk failed.
> > After investigating the problem I found it is identical to the issue
> > in direct-io.c discussed in the thread below.
> > https://lore.kernel.org/lkml/20160921141539.GA17898@xxxxxxxxxxxxx/
> > Is there a reason the proposed patch was not applied to the kernel?
> > When can I expect it to be applied?
> > Thanks,
> > Gregory
>
> Hi Gregory,
>
> Thanks for your info.
> Have you tried a kernel newer than 4.7? Is the problem still there?
>
> Could you share your test case?
>
> Regards,
> Jack Wang