Re: [PATCH V3 1/2] RAID1: a new I/O barrier implementation to remove resync window

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2017/2/25 上午1:17, Shaohua Li wrote:
> On Sat, Feb 25, 2017 at 01:06:22AM +0800, Coly Li wrote:
>> On 2017/2/24 上午7:14, NeilBrown wrote:
>>> On Thu, Feb 23 2017, Coly Li wrote:
>>>
>>>>
>>>> I tried to set up a 4 layer stacked md raid1, and reduce I/O
>>>> barrier bucket size to 8MB, running for 10 hours, there is no
>>>> deadlock observed,
>>>
>>> Try setting BARRIER_BUCKETS_NR to '1' and BARRIER_UNIT_SECTOR_BITS
>>> to 3 and make sure the write requests are larger than 1 page (and
>>> have resync happen at the same time as writes).
>>
>> Hi Neil,
>>
>> Yes, the above method triggers deadlock easily. After come to
>> understand how bios are handled in stacked raid1 and the relationship
>> between current->bio_list, plug->pending and conf->pending_bio_list, I
>> think I come to understand what you worried and the meaning of your fix.
>>
>> I totally agree and understand there will be hash conflict sooner or
>> later now. Yes we need this fix.
>>
>> Thanks to you and Shaohua, explaining the details to me, and help me
>> to catch up your mind :-)
> 
> I'm confused. So the deadlock is real? How is it triggered?

Let me explain,

There is no deadlock now, because,
1) If there is hash conflict existing, a deadlock is possible.
2) In current Linux kernel, hash conflict won't happen in real life
   2.1) regular bio maximum size is 2MB, it can only be split into 2
bios in raid1_make_request() of my new I/O barrier patch
   2.2) DISCARD bio maximum size is 4GB, it can be split into 65 bios in
raid1_make_request() of my new I/O barrier patch.
   2.3) I verified that, for any consecutive  512 integers in [0,
1<<63], there is no hash conflict by calling sector_to_idx().
   2.4) Currently there is almost no device provides LBA range exceeds
(1<<63) bytes. So in current Linux kernel with my new I/O barrier patch,
no dead lock will happen. The patch in current Linux kernel is deadlock
clean from all conditions we discussed before.

The reason why I suggest to still have your patch is,
1) If one day bi_iter.bi_size is extended from 32bit (unsigned int) to
64bit (unsigned long), a DISCARD bio will be split to more than 1024
smaller bios.
2) If a DISCARD bio is split into more then 1024 smaller bios, that
means sector_to_idx() is called by 1024+ consecutive integers. It is
100% possible to have hash conflict.
3) If hash conflict exists, the deadlock described by Neil will be passible.


What I mean is, currently there is no deadlock, because bi_iter.bi_size
is 32 bit; if in future bi_iter.bi_size extended to 64 bit, we will have
deadlock. Your fix almost does not hurt performance, improves code
readability and will avoid a potential deadlock in future (when
bi_iter.bi_size extended to 64 bit), why not have it ?

Coly
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux