Re: Bad/Unaligned block number requested

On 2018/5/8 6:41 AM, Eric Wheeler wrote:
> On Sun, 6 May 2018, Coly Li wrote:
> 
>> On 2018/5/6 3:28 AM, Eric Wheeler wrote:
>>> On Sun, 6 May 2018, Coly Li wrote:
>>>
>>>> On 2018/4/27 4:32 AM, Eric Wheeler wrote:
>>>>>> Hi Coly,
>>>>>>
>>>>>> I'm sure you've been busy with the v4.17 merge but I thought I 
>>>>>> would check in:  
>>>>>>
>>>>>> Have you had a chance to look at this?  It is an opportunity to fix this 
>>>>>> 4k bug for the future since we can still reproduce the error.
>>>>>>
>>>>>>>>> bcache: bch_count_io_errors() dm-6: IO error on reading from cache, recovering <<<
>>>>>>
>>>>>> What do you think, is there data corruption exposure here since 4.1.49 
>>>>>> still has the dirty-cache-recovery bug?
>>>>>
>>>>> I just noticed that "bcache: only permit to recovery read error when cache 
>>>>> device is clean" is in v4.1.49, but would this recover gracefully in the 
>>>>> 4k error situation?
>>>>>
>>>>>> Also, would your failure-recovery patch series address this type of 
>>>>>> failure?
>>>>
>>>> Hi Eric,
>>>>
>>>> I just found a time slot to compose a patch that checks the 4K alignment
>>>> of I/Os to the backing device. After testing it and taking a first look at
>>>> the messages, I will post it.
>>>>
>>>
>>> Thank you Coly, I appreciate your help!  
>>
>> Hi Eric,
>>
>> Please check the attached patch. It checks the bcache key offset, and if
>> the offset is not 4KB aligned, a call trace is printed.
>>
>> I also ran it on my own hardware; here is some information I can share.
>>
>> It seems bcache tries to cache a bio at any offset, with no 4K alignment
>> required. When I use fio with direct I/O, a non-4K-aligned bio can be sent
>> into the bcache code and it is simply cached.
>>
>> I can see numerous call traces when I use direct I/O with a 512B/1KB/2KB
>> block size. But if I use a 4KB block size in fio, or set the block
>> alignment to 4KB, no warning call trace is printed at all.
>>
>> Then when I set the block alignment to 2K in fio, even with a 4KB block
>> size, I can see the non-4k-aligned warning.
>>
>> Therefore I guess the most probable reason is that the upper layer code
>> sends non-4k-aligned bios into the bcache code.
>>
>> I also tried setting the bcache block size to 4K with make-bcache -w. When
>> the fio blocksize is >= 4KB, there is no non-4k-aligned warning. But fio
>> does not work if its blocksize < bcache block size, so I am not sure
>> whether setting the bcache block size to 4K works for your situation.
>>
> 
> Hi Coly,
> 
> We are already running 4k blocks with bcache. We tried your new patch, but 
> there aren't any new backtraces. This is the only information we get:
> 
> [ 1174.675229] sd 0:0:0:1: [sdb] Unaligned block number requested: sector_size=4096, block=15724561783, blk_rq=41
> [ 1174.676077] sd 0:0:0:2: [sdc] Unaligned block number requested: sector_size=4096, block=353041024, blk_rq=23
> [ 1174.676958] bcache: bch_count_io_errors() dm-6: IO error on reading from cache, recovering
> [ 1174.710818] block drbd8065: read: error=-5 s=19281408s
> [ 1174.711654] block drbd8065: Local IO failed in drbd_endio_read_sec_final.
> [ 1175.278348] sd 0:0:0:1: [sdb] Unaligned block number requested: sector_size=4096, block=15725385535, blk_rq=97
> [ 1175.279612] sd 0:0:0:2: [sdc] Unaligned block number requested: sector_size=4096, block=387084760, blk_rq=31
> [ 1175.280834] bcache: bch_count_io_errors() dm-6: IO error on reading from cache, recovering
> [ 1175.282232] block drbd8065: read: error=-5 s=19391360s
> [ 1175.283335] block drbd8065: Local IO failed in drbd_endio_read_sec_final.
> 
> 
> Note that we run this without your original patch, so the original 
> backtrace still stands.
> 
> Does this mean there is something in my cache that is less than 4k?

Hi Eric,

The last patch I posted checks LBA alignment for requests to the
backing device. It seems that none of the places it checks found a
non-4k-aligned LBA.
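
For reference, the alignment of the four sd requests quoted above can be
worked out directly from the log. The following Python sketch is only an
illustration and not part of any patch. It assumes that block= and blk_rq=
are both counted in 512-byte units (the log does not state the unit), and
the names LOG_RE and analyze are made up for this example.

```python
import re

# ASSUMPTION: block= and blk_rq= are in 512-byte units. The log lines
# themselves do not state the unit, so treat the results accordingly.
UNIT_BYTES = 512

# Hypothetical helper pattern matching the sd messages quoted in this thread.
LOG_RE = re.compile(
    r"\[(?P<dev>sd[a-z]+)\] Unaligned block number requested: "
    r"sector_size=(?P<sector_size>\d+), block=(?P<block>\d+), "
    r"blk_rq=(?P<blk_rq>\d+)"
)


def analyze(line):
    """Return start/length/end remainders relative to the device sector size,
    or None if the line is not an 'Unaligned block number' message."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    sector_size = int(m.group("sector_size"))
    block = int(m.group("block"))
    blk_rq = int(m.group("blk_rq"))
    units_per_sector = sector_size // UNIT_BYTES  # 8 for 4096-byte sectors
    return {
        "dev": m.group("dev"),
        "start_rem": block % units_per_sector,
        "len_rem": blk_rq % units_per_sector,
        "end_rem": (block + blk_rq) % units_per_sector,
    }


# The four sd messages from the log quoted in this thread.
LOG_LINES = [
    "[ 1174.675229] sd 0:0:0:1: [sdb] Unaligned block number requested: "
    "sector_size=4096, block=15724561783, blk_rq=41",
    "[ 1174.676077] sd 0:0:0:2: [sdc] Unaligned block number requested: "
    "sector_size=4096, block=353041024, blk_rq=23",
    "[ 1175.278348] sd 0:0:0:1: [sdb] Unaligned block number requested: "
    "sector_size=4096, block=15725385535, blk_rq=97",
    "[ 1175.279612] sd 0:0:0:2: [sdc] Unaligned block number requested: "
    "sector_size=4096, block=387084760, blk_rq=31",
]

for line in LOG_LINES:
    print(analyze(line))
```

Under that assumption, both sdb requests start 7 units past a 4K boundary
and end on one, while both sdc requests start on a 4K boundary and have a
length that is 7 modulo 8.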

Let me compose an updated version that checks 4K alignment of the LBA when
a bio is submitted to the backing device. Then let's see what happens.

Thanks for the information!

Coly Li