Re: [PATCH] ext4: if zeroout fails fall back to splitting the extent node

yangerkun <yangerkun@xxxxxxxxxx> · Wed, 24 Nov 2021 17:01:12 +0800

On 2021/11/23 17:27, Jan Kara wrote:
Hello,

On Sun 26-09-21 19:35:01, yangerkun wrote:
Rethink about this problem. Should we consider other place which call
ext4_issue_zeroout? Maybe it can trigger the problem too(in theory, not
really happened)...

How about include follow patch which not only transfer ENOSPC to EIO. But
also stop to overwrite the error return by ext4_ext_insert_extent in
ext4_split_extent_at.

Besides, 308c57ccf431 ("ext4: if zeroout fails fall back to splitting the
extent node") can work together with this patch.

I've got back to this. The ext4_ext_zeroout() calls in
ext4_split_extent_at() seem to be there as fallback when insertion of a new
extent fails due to ENOSPC / EDQUOT. If even ext4_ext_zeroout(), then I
think returning an error as the code does now is correct and we don't have
much other option. Also we are really running out of disk space so I think
returning ENOSPC is fine. What exact scenario are you afraid of?

I am afraid about the EDQUOT from ext4_ext_insert_extent may be 
overwrite by ext4_ext_zeroout with ENOSPC. And this may lead to dead 
loop since ext4_writepages will retry once get ENOSPC? Maybe I am wrong...


								Honza

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index c0de30f25185..66767ede235f 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3218,16 +3218,18 @@ static int ext4_split_extent_at(handle_t *handle,
                 goto out;

         if (EXT4_EXT_MAY_ZEROOUT & split_flag) {
+               int ret = 0;
+
                 if (split_flag &
(EXT4_EXT_DATA_VALID1|EXT4_EXT_DATA_VALID2)) {
                         if (split_flag & EXT4_EXT_DATA_VALID1) {
-                               err = ext4_ext_zeroout(inode, ex2);
+                               ret = ext4_ext_zeroout(inode, ex2);
                                 zero_ex.ee_block = ex2->ee_block;
                                 zero_ex.ee_len = cpu_to_le16(

ext4_ext_get_actual_len(ex2));
                                 ext4_ext_store_pblock(&zero_ex,

ext4_ext_pblock(ex2));
                         } else {
-                               err = ext4_ext_zeroout(inode, ex);
+                               ret = ext4_ext_zeroout(inode, ex);
                                 zero_ex.ee_block = ex->ee_block;
                                 zero_ex.ee_len = cpu_to_le16(

ext4_ext_get_actual_len(ex));
@@ -3235,7 +3237,7 @@ static int ext4_split_extent_at(handle_t *handle,
                                                       ext4_ext_pblock(ex));
                         }
                 } else {
-                       err = ext4_ext_zeroout(inode, &orig_ex);
+                       ret = ext4_ext_zeroout(inode, &orig_ex);
                         zero_ex.ee_block = orig_ex.ee_block;
                         zero_ex.ee_len = cpu_to_le16(

ext4_ext_get_actual_len(&orig_ex));
@@ -3243,7 +3245,7 @@ static int ext4_split_extent_at(handle_t *handle,
                                               ext4_ext_pblock(&orig_ex));
                 }

-               if (!err) {
+               if (!ret) {
                         /* update the extent length and mark as initialized
*/
                         ex->ee_len = cpu_to_le16(ee_len);
                         ext4_ext_try_to_merge(handle, inode, path, ex);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d18852d6029c..95b970581864 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -427,6 +427,9 @@ int ext4_issue_zeroout(struct inode *inode, ext4_lblk_t
lblk, ext4_fsblk_t pblk,
         if (ret > 0)
                 ret = 0;

+       if (ret == -ENOSPC)
+               ret = -EIO;
+
         return ret;
  }



在 2021/8/14 5:27, Theodore Ts'o 写道:
If the underlying storage device is using thin-provisioning, it's
possible for a zeroout operation to return ENOSPC.

Commit df22291ff0fd ("ext4: Retry block allocation if we have free blocks
left") added logic to retry block allocation since we might get free block
after we commit a transaction. But the ENOSPC from thin-provisioning
will confuse ext4, and lead to an infinite loop.

Since using zeroout instead of splitting the extent node is an
optimization, if it fails, we might as well fall back to splitting the
extent node.

Reported-by: yangerkun <yangerkun@xxxxxxxxxx>
Signed-off-by: Theodore Ts'o <tytso@xxxxxxx>
---

I've run this through my battery of tests, and it doesn't cause any
regressions.  Yangerkun, can you test this and see if this works for
you?

   fs/ext4/extents.c | 5 +++--
   1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 92ad64b89d9b..501516cadc1b 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3569,7 +3569,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
   				split_map.m_len - ee_block);
   			err = ext4_ext_zeroout(inode, &zero_ex1);
   			if (err)
-				goto out;
+				goto fallback;
   			split_map.m_len = allocated;
   		}
   		if (split_map.m_lblk - ee_block + split_map.m_len <
@@ -3583,7 +3583,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
   						      ext4_ext_pblock(ex));
   				err = ext4_ext_zeroout(inode, &zero_ex2);
   				if (err)
-					goto out;
+					goto fallback;
   			}
   			split_map.m_len += split_map.m_lblk - ee_block;
@@ -3592,6 +3592,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
   		}
   	}
+fallback:
   	err = ext4_split_extent(handle, inode, ppath, &split_map, split_flag,
   				flags);
   	if (err > 0)