On Wed 24-11-21 17:01:12, yangerkun wrote:
> On 2021/11/23 17:27, Jan Kara wrote:
> > Hello,
> >
> > On Sun 26-09-21 19:35:01, yangerkun wrote:
> > > Rethink about this problem. Should we consider other place which call
> > > ext4_issue_zeroout? Maybe it can trigger the problem too(in theory, not
> > > really happened)...
> > >
> > > How about include follow patch which not only transfer ENOSPC to EIO. But
> > > also stop to overwrite the error return by ext4_ext_insert_extent in
> > > ext4_split_extent_at.
> > >
> > > Besides, 308c57ccf431 ("ext4: if zeroout fails fall back to splitting the
> > > extent node") can work together with this patch.
> >
> > I've got back to this. The ext4_ext_zeroout() calls in
> > ext4_split_extent_at() seem to be there as fallback when insertion of a new
> > extent fails due to ENOSPC / EDQUOT. If even ext4_ext_zeroout(), then I
> > think returning an error as the code does now is correct and we don't have
> > much other option. Also we are really running out of disk space so I think
> > returning ENOSPC is fine. What exact scenario are you afraid of?
>
> I am afraid about the EDQUOT from ext4_ext_insert_extent may be overwrite by
> ext4_ext_zeroout with ENOSPC. And this may lead to dead loop since
> ext4_writepages will retry once get ENOSPC? Maybe I am wrong...

OK, so passing back the original error instead of the error from
ext4_ext_zeroout() makes sense. But I don't think doing much more is needed.

Firstly, ENOSPC or EDQUOT should not happen in ext4_split_extent_at()
called from ext4_writepages() because we should have reserved enough space
for extent splits when writing data. So hitting that is already unexpected.
A committing transaction holding blocks that are expected to be free is the
most likely reason for us seeing ENOSPC, and returning EIO in that case
would be a bug.

Secondly, returning EIO instead of ENOSPC is IMO a bit confusing for upper
layers and makes it harder to analyze where the real problem is...
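
To illustrate just the error propagation I mean, here is a rough userspace
sketch. It is not the actual ext4 code and not a proposed patch: struct
split_model and the model_*() helpers are hypothetical stand-ins for
ext4_ext_insert_extent(), ext4_ext_zeroout() and ext4_split_extent_at(),
and everything except the error handling is left out.

```c
#include <errno.h>

/*
 * Hypothetical model, not kernel code: each field is the value (0 or a
 * negative errno) that the modelled helper returns.
 */
struct split_model {
	int insert_err;		/* stands in for ext4_ext_insert_extent() */
	int zeroout_err;	/* stands in for ext4_ext_zeroout() */
};

static int model_insert_extent(const struct split_model *m)
{
	return m->insert_err;
}

static int model_zeroout(const struct split_model *m)
{
	return m->zeroout_err;
}

/* Models only the error handling of the split path discussed above. */
static int model_split_extent_at(const struct split_model *m)
{
	int err = model_insert_extent(m);

	/* Success, or an error for which zeroout is not a fallback. */
	if (err != -ENOSPC && err != -EDQUOT)
		return err;

	/* Fallback: zero out the range instead of splitting the extent. */
	if (model_zeroout(m) == 0)
		return 0;

	/* Zeroout failed as well: report the original insertion error. */
	return err;
}
```

With this shape, an EDQUOT from the insertion is what the caller sees even
when the zeroout fallback fails with ENOSPC, so the caller is not misled
into retrying on ENOSPC.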

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR