Re: pvresize will cause a meta-data corruption with error message "Error writing device at 4096 length 512"

The problem in bcache_flush is related to cache->errored.

Here is my fix. I believe there may be a better solution than mine.

Solution:
Keep cache->errored, but use this list only to hold the blocks whose writes
failed; that data is never resent.
bcache_flush then checks cache->errored: when the errored list is not empty,
bcache_flush returns false, which triggers the caller (upper layer) to do the cleanup.
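
To make the intended contract concrete, here is a minimal toy model of it. This is only a sketch: the toy_* names, the fixed-size arrays and the write_ok flag are hypothetical stand-ins and are not the lvm2 structures or API. It shows a flush that writes each dirty block once, parks failures on an errored list without retrying them, and keeps returning false until the caller has dropped them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_BLOCKS 8

/* Hypothetical stand-in for a cached block; not the lvm2 struct block. */
struct toy_block {
	int fd;         /* device the block belongs to */
	bool write_ok;  /* simulated outcome of writing this block */
};

/* Hypothetical stand-in for struct bcache, reduced to two lists. */
struct toy_cache {
	struct toy_block dirty[TOY_MAX_BLOCKS];
	size_t nr_dirty;
	struct toy_block errored[TOY_MAX_BLOCKS];
	size_t nr_errored;
};

/* Queue a block for writeback. Returns false if the toy cache is full. */
bool toy_mark_dirty(struct toy_cache *c, int fd, bool write_ok)
{
	assert(c != NULL);
	/* Cap dirty + errored so the errored list can never overflow. */
	if (c->nr_dirty + c->nr_errored >= TOY_MAX_BLOCKS)
		return false;
	c->dirty[c->nr_dirty].fd = fd;
	c->dirty[c->nr_dirty].write_ok = write_ok;
	c->nr_dirty++;
	return true;
}

/*
 * Write every dirty block once. Failed blocks are parked on the errored
 * list; blocks already parked there are not retried. Returns false while
 * the errored list is non-empty, so the caller knows it must clean up.
 */
bool toy_flush(struct toy_cache *c)
{
	assert(c != NULL);
	for (size_t i = 0; i < c->nr_dirty; i++) {
		if (!c->dirty[i].write_ok) {
			assert(c->nr_errored < TOY_MAX_BLOCKS);
			c->errored[c->nr_errored++] = c->dirty[i];
		}
	}
	c->nr_dirty = 0;
	return c->nr_errored == 0;
}

/*
 * Caller-side cleanup: drop the failed blocks that belong to one fd,
 * for example before that fd is closed. Returns how many were dropped.
 */
size_t toy_drop_errored(struct toy_cache *c, int fd)
{
	size_t kept = 0, dropped = 0;

	assert(c != NULL);
	for (size_t i = 0; i < c->nr_errored; i++) {
		if (c->errored[i].fd == fd)
			dropped++;
		else
			c->errored[kept++] = c->errored[i];
	}
	c->nr_errored = kept;
	return dropped;
}
```

In this model a second toy_flush() call after a failure still returns false without touching the failed block again; only toy_drop_errored() for the affected fd makes the next flush succeed.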

```
commit 17e959c0ba58edc67b6caa7669444ecffa40a16f (HEAD -> master)
Author: Zhao Heming <heming.zhao@xxxxxxxx>
Date:   Mon Oct 14 10:57:54 2019 +0800

     The fd of a block on cache->errored may already be closed before
     bcache_flush is called, so bcache_flush shouldn't rewrite data in
     cache->errored. The current solution is to return an error to the caller
     when cache->errored is not empty; the caller should then do all the cleanup.
     
     Signed-off-by: Zhao Heming <heming.zhao@xxxxxxxx>

diff --git a/lib/device/bcache.c b/lib/device/bcache.c
index cfe01bac2f..2eb3f0ee34 100644
--- a/lib/device/bcache.c
+++ b/lib/device/bcache.c
@@ -897,16 +897,20 @@ static bool _wait_io(struct bcache *cache)
   * High level IO handling
   *--------------------------------------------------------------*/
  
-static void _wait_all(struct bcache *cache)
+static bool _wait_all(struct bcache *cache)
  {
+       bool ret = true;
         while (!dm_list_empty(&cache->io_pending))
-               _wait_io(cache);
+               if (!_wait_io(cache)) ret = false;
+       return ret;
  }
  
-static void _wait_specific(struct block *b)
+static bool _wait_specific(struct block *b)
  {
+       bool ret = true;
         while (_test_flags(b, BF_IO_PENDING))
-               _wait_io(b->cache);
+               if (!_wait_io(b->cache)) ret = false;
+       return ret;
  }
  
  static unsigned _writeback(struct bcache *cache, unsigned count)
@@ -1262,10 +1266,7 @@ void bcache_put(struct block *b)
  
  bool bcache_flush(struct bcache *cache)
  {
-       // Only dirty data is on the errored list, since bad read blocks get
-       // recycled straight away.  So we put these back on the dirty list, and
-       // try and rewrite everything.
-       dm_list_splice(&cache->dirty, &cache->errored);
+       bool ret = true;
  
         while (!dm_list_empty(&cache->dirty)) {
                 struct block *b = dm_list_item(_list_pop(&cache->dirty), struct block);
@@ -1275,11 +1276,18 @@ bool bcache_flush(struct bcache *cache)
                 }
  
                 _issue_write(b);
+               if (b->error) ret = false;
         }
  
-       _wait_all(cache);
+       if (!_wait_all(cache)) ret = false;
  
-       return dm_list_empty(&cache->errored);
+       // Merge the errored list into dirty and return false so that the
+       // caller cleans these blocks up.
+       if (!dm_list_empty(&cache->errored)) {
+               dm_list_splice(&cache->dirty, &cache->errored);
+               ret = false;
+       }
+       return ret;
  }
  
  //----------------------------------------------------------------
```
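
Since the wait helpers now return a result from a loop, it matters whether an early failure survives a later success. A minimal sketch contrasting the two loop shapes follows; toy_pending and the toy_wait_* functions are hypothetical stand-ins for illustration, not lvm2 code, and the recorded outcomes simply simulate what each wait call would return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queue of pending IO completions. */
struct toy_pending {
	const bool *results;  /* results[i] is the outcome of the i-th wait */
	size_t count;
	size_t next;
};

/* Hypothetical stand-in for one wait call: consumes one completion. */
static bool toy_wait_one(struct toy_pending *p)
{
	assert(p != NULL && p->next < p->count);
	return p->results[p->next++];
}

/* Assigns on every iteration, so only the final wait's result is kept. */
bool toy_wait_all_last(struct toy_pending *p)
{
	bool ret = true;

	while (p->next < p->count)
		ret = toy_wait_one(p);
	return ret;
}

/* Only ever clears ret, so any failure seen along the way is remembered. */
bool toy_wait_all_sticky(struct toy_pending *p)
{
	bool ret = true;

	while (p->next < p->count)
		if (!toy_wait_one(p))
			ret = false;
	return ret;
}
```

With simulated outcomes of success, failure, success, the first variant reports success and the second reports failure.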


_______________________________________________
linux-lvm mailing list
linux-lvm@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/


