On Tue, Apr 30, 2019 at 3:41 PM Guilherme G. Piccoli <gpiccoli@xxxxxxxxxxxxx> wrote:
>
> On 19/04/2019 14:08, Song Liu wrote:
> > [...]
> > I read through the discussion in V1, and I would agree with Neil that
> > current behavior is reasonable.
> >
> > For the following example:
> >
> > fd = open("file", "w");
> > write(fd, buf, size);
> > ret = fsync(fd);
> >
> > If "size" is big enough, the write is not expected to be atomic for
> > md or other drives. If we remove the underlying block device
> > after write() and before fsync(), the file could get corrupted. This
> > is the same for md or NVMe/SCSI drives.
> >
> > The application needs to check "ret" from fsync(), the data is safe
> > only when fsync() returns 0.
> >
> > Does this make sense?
> >
>
> Hi Song, thanks for your quick response, and sorry for my delay.
> I noticed that after v4.18 the kernel started to crash when we remove
> one raid0 member while writing, so I was investigating this
> before performing your test (in fact, I found 2 issues [0]), hence my
> delay.
>
> Your test does make sense; in fact I've tested your scenario with the
> following code (with the patches from [0]):
> https://pastebin.ubuntu.com/p/cyqpDqpM7x/
>
> Indeed, fsync returns -1 in this case.
> Interestingly, when I do a "dd if=<some_file> of=<raid0_mount>" and try
> to "sync -f <some_file>" and "sync", it succeeds and the file is
> written, although corrupted.

I guess this is some issue with the sync command, but I haven't had time
to look into it. How about running dd with oflag=sync or oflag=direct?

>
> Do you think this behavior is correct? In other devices, like a pure
> SCSI disk or NVMe, the 'dd' write fails.
> Also, what about the status of the raid0 array in mdadm - it shows as
> "clean" even after the member is removed, should we change that?

I guess this is because the kernel hasn't detected that the array is
gone? In that case, I think reducing the latency would be useful for
some use cases.

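To make the fsync() point above concrete, here is a minimal sketch of the
check I mean. It is only an illustration: write_and_sync() is a made-up
helper name, it is not taken from the patches or from the pastebin test,
and the open flags and file mode are assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): write "size" bytes from "buf"
 * to "path" and report success only once fsync() has returned 0.
 * Returns 0 on success, -1 on failure with errno describing the error.
 */
static int write_and_sync(const char *path, const void *buf, size_t size)
{
    const char *p = buf;
    size_t left = size;
    int fd, saved;

    assert(path != NULL && buf != NULL);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* write() may be short or interrupted, so loop until all data is queued. */
    while (left > 0) {
        ssize_t n = write(fd, p, left);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            goto fail;
        }
        p += n;
        left -= (size_t)n;
    }

    /* A successful write() is not enough; the data is safe only if fsync() returns 0. */
    if (fsync(fd) != 0)
        goto fail;

    return close(fd);

fail:
    /* Preserve the original error across close(). */
    saved = errno;
    close(fd);
    errno = saved;
    return -1;
}
```

A caller would treat the file as durable only when
write_and_sync("file", buf, size) returns 0; any other result means the
contents cannot be trusted, whether the device is md, NVMe, or SCSI.
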
Thanks,
Song

>
> >
> > Also, could you please highlight changes from V1 (if more than
> > just rebase)?
>
> No changes other than rebase. Worth mentioning here that a kernel bot
> (and Julia Lawall) found an issue in my patch; I forgot a
> "mutex_lock(&mddev->open_mutex);" in line 6053, which caused the first
> caveat (hung mdadm and persistent device in /dev). Thanks for pointing
> out this silly mistake of mine! In case this patch gets some traction,
> I'll re-submit with that fixed.
>
> Cheers,
>
>
> Guilherme
>
> [0] https://marc.info/?l=linux-block&m=155666385707413
>
> >
> > Thanks,
> > Song
> >