On 1/12/23 16:33, Johannes Thumshirn wrote: > On 12.01.23 06:18, Damien Le Moal wrote: >> On 1/12/23 07:08, Damien Le Moal wrote: >>> On 1/11/23 21:23, Johannes Thumshirn wrote: >>>> On 10.01.23 14:08, Damien Le Moal wrote: >>>>> Using REQ_OP_ZONE_APPEND operations for synchronous writes to sequential >>>>> files succeeds regardless of the zone write pointer position, as long as >>>>> the target zone is not full. This means that if an external (buggy) >>>>> application writes to the zone of a sequential file underneath the file >>>>> system, subsequent file write() operation will succeed but the file size >>>>> will not be correct and the file will contain invalid data written by >>>>> another application. >>>>> >>>>> Modify zonefs_file_dio_append() to check the written sector of an append >>>>> write (returned in bio->bi_iter.bi_sector) and return -EIO if there is a >>>>> mismatch with the file zone wp offset field. This change triggers a call >>>>> to zonefs_io_error() and a zone check. Modify zonefs_io_error_cb() to >>>>> not expose the unexpected data after the current inode size when the >>>>> errors=remount-ro mode is used. Other error modes are correctly handled >>>>> already. >>>> >>>> This only happens on ZNS and null_blk, doesn't it? On SCSI the Zone Append >>>> emulation should catch this error before. >>> >>> Yes. The zone append will fail with scsi sd emulation so this change is >>> not useful in that case. But null_blk and ZNS drives will not fail the >>> zone append, resulting in a bad file size without this check. >> >> Let me retract that... For scsi sd, the zone append emulation will do its >> job if an application writes to a zone under the file system. Then zonefs >> issuing a zone append will also succeed and we end up with the bad inode size. >> >> We would get a zone append failure in zonefs with scsi sd if the >> corruption to the zone was done with a passthrough command as these will >> not result in sd_zbc zone write pointer tracking doing its job. >> > > But then the error recovery procedures in sd_zbc should come into place. Yes, for the passthrough case, because it ends up generating a failed write, which zonefs catches and handles. All good in that case. It remains that the patch is also necessary for scsi if the corruption happens with something like dd, which will be seem by sd_zbc. So subsequent zone append from zonefs will succeed, even though zonefs is not aware that the wp changed... > > Anyways for ZNS and null_blk this is an improvement: > Reviewed-by: Johannes Thumshirn <johannes.thumshirn@xxxxxxx> -- Damien Le Moal Western Digital Research