Re: [PATCH v4 0/6] BTT error clearing rework

On Mon, 2017-07-31 at 23:15 +0000, Kani, Toshimitsu wrote:
> On Wed, 2017-07-26 at 17:35 -0600, Vishal Verma wrote:
>  :
> > 
> > Clearing errors or badblocks during a BTT write requires sending an
> > ACPI DSM, which means potentially sleeping. Since a BTT IO happens
> > in
> > atomic context (preemption disabled, spinlocks may be held), we
> > cannot perform error clearing in the course of an IO. Due to this
> > error clearing for BTT IOs has hitherto been disabled.
> > 
> > This series fixes these problems by moving the error clearing out of
> > the atomic sections in the BTT.
> > 
> > Also fix a potential deadlock that can occur while clearing errors
> > from either BTT or pmem due to memory allocations in the IO path.
> 
> Hi Vishal,
> 
> I just tested the series (sorry for the delay).  It works nicely when
> doing I/Os to a block device directly.  But I am seeing a lot of write
> errors with filesystem.
> 
> Here is what I did for the testing.
> 
> 1. 'mkfs.ext /dev/pmem0s' and 'mount /dev/pmem0s /mnt/pmem0s'.
> 2. Inject an error to somewhere in the pmem0s device, but not in the
> metadata area at beginning.
> 3. Run the following script.
> ===
> DEV=pmem0s
> set -x
> dd if=/dev/zero of=/mnt/$DEV/1Gfile bs=1M count=1024
> while true; do
> cp /mnt/$DEV/1Gfile /mnt/$DEV/file-1
> cp /mnt/$DEV/1Gfile /mnt/$DEV/file-2
> cp /mnt/$DEV/1Gfile /mnt/$DEV/file-3
> cp /mnt/$DEV/1Gfile /mnt/$DEV/file-4
> cp /mnt/$DEV/1Gfile /mnt/$DEV/file-5
> cp /mnt/$DEV/1Gfile /mnt/$DEV/file-6
> cp /mnt/$DEV/1Gfile /mnt/$DEV/file-7
> cp /mnt/$DEV/1Gfile /mnt/$DEV/file-8
> cp /mnt/$DEV/1Gfile /mnt/$DEV/file-9
> cp /mnt/$DEV/1Gfile /mnt/$DEV/file-10
> done
> ===
> 
> Step 3 clears an error and runs fine with raw and memory modes.  With
> sector mode, however, it ends up with continuous write errors like
> below and does not clear the error.  Do you have any thoughts?
> 
>  EXT4-fs warning (device pmem0s): ext4_end_bio:322: I/O error 10
> writing to inode 17 (offset 1023410176 size 8388608 starting block
> 1834752)
>  Buffer I/O error on device pmem0s, logical block 1834752
>  Buffer I/O error on device pmem0s, logical block 1834753
>  Buffer I/O error on device pmem0s, logical block 1834754
>  :
>  nd_pmem btt0.0: io error in WRITE sector 14680064, len 4096,
>  EXT4-fs warning (device pmem0s): ext4_end_bio:322: I/O error 10
> writing to inode 17 (offset 1031798784 size 1052672 starting block
> 1835008)
>  nd_pmem btt0.0: io error in WRITE sector 14682112, len 4096,
>  EXT4-fs warning (device pmem0s): ext4_end_bio:322: I/O error 10
> writing to inode 17 (offset 1031798784 size 2101248 starting block
> 1835264)
>  :
>  nd_pmem btt0.0: io error in WRITE sector 14698496, len 4096,
>  nd_pmem btt0.0: io error in WRITE sector 14700544, len 4096,
>  nd_pmem btt0.0: io error in WRITE sector 14702592, len 4096,
>  nd_pmem btt0.0: io error in WRITE sector 14704640, len 4096,
>  :

Thanks for the test, Toshi. I will try to reproduce it.
My first guess: are the injected errors potentially in the BTT
metadata area towards the end?

->rw_bytes can only clear errors on properly aligned writes, and the BTT
metadata writes will be too small to clear metadata errors.
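
To illustrate the alignment condition, here is a rough sketch in shell.
It is not the kernel code: the helper name can_clear_error, the 512-byte
granularity, and the small "metadata-sized" write in the second call are
all assumptions made for illustration only.

```shell
#!/bin/sh
# Hypothetical helper, not a kernel or ndctl interface: succeeds when a
# write of SIZE bytes at byte OFFSET meets an assumed 512-byte alignment
# requirement for error clearing.
can_clear_error() {
    offset=$1
    size=$2
    align=512   # assumed clearing granularity, for illustration only
    [ "$size" -gt 0 ] &&
        [ $((offset % align)) -eq 0 ] &&
        [ $((size % align)) -eq 0 ]
}

# A 4096-byte data write at a sector-aligned offset qualifies.
can_clear_error 7516192768 4096 && echo "aligned: clearing possible"

# A small update at an unaligned offset (values made up) does not.
can_clear_error 4100 16 || echo "unaligned: clearing not possible"
```

If the injected error does sit under a small metadata field, a check of
this kind would never pass for the writes that touch it, which would
match the error never being cleared.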

> 
> Thanks,
> -Toshi



