Re: [PATCH] ext4: if zeroout fails fall back to splitting the extent node

On 2021/11/24 18:37, Jan Kara wrote:
On Wed 24-11-21 17:01:12, yangerkun wrote:
On 2021/11/23 17:27, Jan Kara wrote:
Hello,

On Sun 26-09-21 19:35:01, yangerkun wrote:
Rethinking this problem: should we also consider the other places that call
ext4_issue_zeroout? They may trigger the problem too (in theory; it has not
actually happened)...

How about including the following patch, which not only converts ENOSPC to
EIO but also stops overwriting the error returned by ext4_ext_insert_extent
in ext4_split_extent_at?

Besides, 308c57ccf431 ("ext4: if zeroout fails fall back to splitting the
extent node") can work together with this patch.

I've got back to this. The ext4_ext_zeroout() calls in
ext4_split_extent_at() seem to be there as fallback when insertion of a new
extent fails due to ENOSPC / EDQUOT. If even ext4_ext_zeroout() fails, then I
think returning an error as the code does now is correct and we don't have
much other option. Also we are really running out of disk space so I think
returning ENOSPC is fine. What exact scenario are you afraid of?

I am afraid that the EDQUOT from ext4_ext_insert_extent may be overwritten
with ENOSPC by ext4_ext_zeroout. Could this lead to an endless loop, since
ext4_writepages will retry once it gets ENOSPC? Maybe I am wrong...

OK, so passing back original error instead of the error from
ext4_ext_zeroout() makes sense. But I don't think doing much more is needed
- firstly, ENOSPC or EDQUOT should not happen in ext4_split_extent_at()
called from ext4_writepages() because we should have reserved enough
space for extent splits when writing data. So hitting that is already

ext4_da_write_begin
  ext4_da_get_block_prep
    ext4_insert_delayed_block
      ext4_da_reserve_space

It seems we only reserve space for data, not for metadata...


unexpected. Committing transaction holding blocks that are expected to be
free is the most likely reason for us seeing ENOSPC and returning EIO in
that case would be a bug.

Agreed. An EIO from ext4_ext_zeroout that overwrites the ENOSPC from
ext4_ext_insert_extent seems buggy too. Maybe we should ignore the error
from ext4_ext_zeroout and return the error from ext4_ext_insert_extent
whenever ext4_ext_zeroout in ext4_split_extent_at fails. Something
like this:

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 0ecf819bf189..56cc00ee42a1 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3185,6 +3185,7 @@ static int ext4_split_extent_at(handle_t *handle,
        struct ext4_extent *ex2 = NULL;
        unsigned int ee_len, depth;
        int err = 0;
+       int err1;

        BUG_ON((split_flag & (EXT4_EXT_DATA_VALID1 | EXT4_EXT_DATA_VALID2)) ==
               (EXT4_EXT_DATA_VALID1 | EXT4_EXT_DATA_VALID2));
@@ -3255,7 +3256,7 @@ static int ext4_split_extent_at(handle_t *handle,
        if (EXT4_EXT_MAY_ZEROOUT & split_flag) {
                if (split_flag & (EXT4_EXT_DATA_VALID1|EXT4_EXT_DATA_VALID2)) {
                        if (split_flag & EXT4_EXT_DATA_VALID1) {
-                               err = ext4_ext_zeroout(inode, ex2);
+                               err1 = ext4_ext_zeroout(inode, ex2);
                                zero_ex.ee_block = ex2->ee_block;
                                zero_ex.ee_len = cpu_to_le16(
                                                ext4_ext_get_actual_len(ex2));
@@ -3270,7 +3271,7 @@ static int ext4_split_extent_at(handle_t *handle,
                                                      ext4_ext_pblock(ex));
                        }
                } else {
-                       err = ext4_ext_zeroout(inode, &orig_ex);
+                       err1 = ext4_ext_zeroout(inode, &orig_ex);
                        zero_ex.ee_block = orig_ex.ee_block;
                        zero_ex.ee_len = cpu_to_le16(
                                                ext4_ext_get_actual_len(&orig_ex));
@@ -3278,7 +3279,7 @@ static int ext4_split_extent_at(handle_t *handle,
                                              ext4_ext_pblock(&orig_ex));
                }

-               if (!err) {
+               if (!err1) {
                        /* update the extent length and mark as initialized */
                        ex->ee_len = cpu_to_le16(ee_len);
                        ext4_ext_try_to_merge(handle, inode, path, ex);
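
The error precedence discussed above can be sketched in plain userspace C.
This is only an illustrative model, not kernel code: split_extent_error()
and its two parameters are hypothetical stand-ins for the results of
ext4_ext_insert_extent() and ext4_ext_zeroout(), and the sketch assumes the
zeroout fallback is attempted only after ENOSPC or EDQUOT, as described
earlier in the thread. It does not model EXT4_EXT_MAY_ZEROOUT or any extent
tree state.

```c
#include <errno.h>

/*
 * Hypothetical model of the proposed error handling.
 * insert_err:  result of the extent insertion (0 or negative errno)
 * zeroout_err: result of the zeroout fallback (0 or negative errno)
 */
int split_extent_error(int insert_err, int zeroout_err)
{
	int err = insert_err;
	int err1;

	/* Success, or an error for which no fallback is attempted. */
	if (err != -ENOSPC && err != -EDQUOT)
		return err;

	/* Fallback result is kept separate so it cannot clobber err. */
	err1 = zeroout_err;
	if (!err1)
		return 0;	/* fallback succeeded */

	return err;		/* fallback failed: report the original error */
}
```

With this shape, an insertion failure of -EDQUOT followed by a zeroout
failure of -ENOSPC is reported as -EDQUOT, so the caller never sees an
ENOSPC that did not come from the insertion itself.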



Secondly, returning EIO instead of ENOSPC is IMO a
bit confusing for upper layers and makes it harder to analyze where the
real problem is...

								Honza



