Re: [PATCH v5 3/4] md: raid10 add nowait support

On 12/16/21 9:45 AM, Vishal Verma wrote:
> 
> On 12/16/21 9:42 AM, Jens Axboe wrote:
>> On 12/15/21 5:30 PM, Vishal Verma wrote:
>>> On 12/15/21 3:20 PM, Vishal Verma wrote:
>>>> On 12/15/21 1:42 PM, Song Liu wrote:
>>>>> On Tue, Dec 14, 2021 at 10:09 PM Vishal Verma
>>>>> <vverma@xxxxxxxxxxxxxxxx> wrote:
>>>>>> This adds nowait support to the RAID10 driver, very similar to the
>>>>>> raid1 driver changes. It makes the RAID10 driver return EAGAIN
>>>>>> in situations where it would otherwise wait, e.g.:
>>>>>>
>>>>>> - Waiting for the barrier,
>>>>>> - Too many pending I/Os to be queued,
>>>>>> - Reshape operation,
>>>>>> - Discard operation.
>>>>>>
>>>>>> wait_barrier() is modified to return bool so that callers can
>>>>>> detect when a wait was skipped. It returns true if it waited or if
>>>>>> no wait was required, and false if a wait was required but not
>>>>>> performed because nowait was requested.
>>>>>>
>>>>>> Signed-off-by: Vishal Verma <vverma@xxxxxxxxxxxxxxxx>
>>>>>> ---
>>>>>>    drivers/md/raid10.c | 57 +++++++++++++++++++++++++++++++++++----------
>>>>>>    1 file changed, 45 insertions(+), 12 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>>>>>> index dde98f65bd04..f6c73987e9ac 100644
>>>>>> --- a/drivers/md/raid10.c
>>>>>> +++ b/drivers/md/raid10.c
>>>>>> @@ -952,11 +952,18 @@ static void lower_barrier(struct r10conf *conf)
>>>>>>           wake_up(&conf->wait_barrier);
>>>>>>    }
>>>>>>
>>>>>> -static void wait_barrier(struct r10conf *conf)
>>>>>> +static bool wait_barrier(struct r10conf *conf, bool nowait)
>>>>>>    {
>>>>>>           spin_lock_irq(&conf->resync_lock);
>>>>>>           if (conf->barrier) {
>>>>>>                   struct bio_list *bio_list = current->bio_list;
>>>>>> +
>>>>>> +               /* Return false when nowait flag is set */
>>>>>> +               if (nowait) {
>>>>>> +                       spin_unlock_irq(&conf->resync_lock);
>>>>>> +                       return false;
>>>>>> +               }
>>>>>> +
>>>>>>                   conf->nr_waiting++;
>>>>>>                   /* Wait for the barrier to drop.
>>>>>>                    * However if there are already pending
>>>>>> @@ -988,6 +995,7 @@ static void wait_barrier(struct r10conf *conf)
>>>>>>           }
>>>>>>           atomic_inc(&conf->nr_pending);
>>>>>>           spin_unlock_irq(&conf->resync_lock);
>>>>>> +       return true;
>>>>>>    }
>>>>>>
>>>>>>    static void allow_barrier(struct r10conf *conf)
>>>>>> @@ -1101,17 +1109,25 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule)
>>>>>>    static void regular_request_wait(struct mddev *mddev, struct r10conf *conf,
>>>>>>                                    struct bio *bio, sector_t sectors)
>>>>>>    {
>>>>>> -       wait_barrier(conf);
>>>>>> +       /* Bail out if REQ_NOWAIT is set for the bio */
>>>>>> +       if (!wait_barrier(conf, bio->bi_opf & REQ_NOWAIT)) {
>>>>>> +               bio_wouldblock_error(bio);
>>>>>> +               return;
>>>>>> +       }
>>>>> I think we also need regular_request_wait to return bool and handle
>>>>> it properly.
>>>>>
>>>>> Thanks,
>>>>> Song
>>>>>
>>>> Ack, will fix it. Thanks!
>>> Ran into this while running io_uring with the current v5 raid10
>>> patch on top of the md-next branch:
>>> ./t/io_uring -a 0 -d 256 </dev/raid10>
>>>
>>> It didn't trigger with aio (-a 1)
>>>
>>> [  248.128661] BUG: kernel NULL pointer dereference, address:
>>> 00000000000000b8
>>> [  248.135628] #PF: supervisor read access in kernel mode
>>> [  248.140762] #PF: error_code(0x0000) - not-present page
>>> [  248.145903] PGD 0 P4D 0
>>> [  248.148443] Oops: 0000 [#1] PREEMPT SMP NOPTI
>>> [  248.152800] CPU: 49 PID: 9461 Comm: io_uring Kdump: loaded Not
>>> tainted 5.16.0-rc3+ #2
>>> [  248.160629] Hardware name: Dell Inc. PowerEdge R650xs/0PPTY2, BIOS
>>> 1.3.8 08/31/2021
>>> [  248.168279] RIP: 0010:raid10_end_read_request+0x74/0x140 [raid10]
>>> [  248.174373] Code: 48 60 48 8b 58 58 48 c1 e2 05 49 03 55 08 48 89 4a
>>> 10 40 84 f6 75 48 f0 41 80 4c 24 18 01 4c 89 e7 e8 e0 b8 ff ff 49 8b 4d
>>> 00 <48> 8b 83 b8 00 00 00 f0 ff 8b f0 00 00 00 0f 94 c2 a8 01 74 04 84
>>> [  248.193120] RSP: 0018:ffffb1c38d598ce8 EFLAGS: 00010086
>>> [  248.198344] RAX: ffff8e5da2a1a100 RBX: 0000000000000000 RCX:
>>> ffff8e5d89747000
>>> [  248.205479] RDX: 000000008040003a RSI: 0000000080400039 RDI:
>>> ffff8e1e00044900
>>> [  248.212611] RBP: ffffb1c38d598d30 R08: 0000000000000000 R09:
>>> 0000000000000001
>>> [  248.219744] R10: ffff8e5da2a1ae00 R11: 000000411bab9000 R12:
>>> ffff8e5da2a1ae00
>>> [  248.226877] R13: ffff8e5d8973fc00 R14: 0000000000000000 R15:
>>> 0000000000001000
>>> [  248.234009] FS:  00007fc26b07d700(0000) GS:ffff8e9c6e600000(0000)
>>> knlGS:0000000000000000
>>> [  248.242096] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [  248.247843] CR2: 00000000000000b8 CR3: 00000040b25d4005 CR4:
>>> 0000000000770ee0
>>> [  248.254973] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>> 0000000000000000
>>> [  248.262107] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>>> 0000000000000400
>>> [  248.269240] PKRU: 55555554
>>> [  248.271953] Call Trace:
>>> [  248.274406]  <IRQ>
>>> [  248.276425]  bio_endio+0xf6/0x170
>>> [  248.279743]  blk_update_request+0x12d/0x470
>>> [  248.283931]  ? sbitmap_queue_clear_batch+0xc7/0x110
>>> [  248.288809]  blk_mq_end_request_batch+0x76/0x490
>>> [  248.293429]  ? dma_direct_unmap_sg+0xdd/0x1a0
>>> [  248.297786]  ? smp_call_function_single_async+0x46/0x70
>>> [  248.303015]  ? mempool_kfree+0xe/0x10
>>> [  248.306680]  ? mempool_kfree+0xe/0x10
>>> [  248.310345]  nvme_pci_complete_batch+0x26/0xb0
>>> [  248.314792]  nvme_irq+0x298/0x2f0
>>> [  248.318110]  ? nvme_unmap_data+0xf0/0xf0
>>> [  248.322038]  __handle_irq_event_percpu+0x3f/0x190
>>> [  248.326744]  handle_irq_event_percpu+0x33/0x80
>>> [  248.331190]  handle_irq_event+0x39/0x60
>>> [  248.335028]  handle_edge_irq+0xbe/0x1e0
>>> [  248.338869]  __common_interrupt+0x6b/0x110
>>> [  248.342967]  common_interrupt+0xbd/0xe0
>>> [  248.346808]  </IRQ>
>>> [  248.348912]  <TASK>
>>> [  248.351018]  asm_common_interrupt+0x1e/0x40
>>> [  248.355206] RIP: 0010:_raw_spin_unlock_irqrestore+0x1e/0x37
>>> [  248.360780] Code: 02 5d c3 0f 1f 44 00 00 5d c3 66 90 0f 1f 44 00 00
>>> 55 48 89 e5 c6 07 00 0f 1f 40 00 f7 c6 00 02 00 00 74 01 fb bf 01 00 00
>>> 00 <e8> ed 8e 5b ff 65 8b 05 66 7e 52 78 85 c0 74 02 5d c3 0f 1f 44 00
>>>
>>> [  248.379525] RSP: 0018:ffffb1c3a429b958 EFLAGS: 00000206
>>> [  248.384749] RAX: 0000000000000001 RBX: ffff8e5d8973fd08 RCX:
>>> ffff8e5d8973fd10
>>> [  248.391884] RDX: 0000000000000001 RSI: 0000000000000246 RDI:
>>> 0000000000000001
>>> [  248.399017] RBP: ffffb1c3a429b958 R08: 0000000000000000 R09:
>>> ffffb1c3a429b970
>>> [  248.406148] R10: 0000000000000c00 R11: 0000000000000001 R12:
>>> 0000000000000001
>>> [  248.413280] R13: 0000000000000246 R14: 0000000000000000 R15:
>>> 0000000000000003
>>> [  248.420415]  __wake_up_common_lock+0x8a/0xc0
>>> [  248.424686]  __wake_up+0x13/0x20
>>> [  248.427919]  raid10_make_request+0x101/0x170 [raid10]
>>> [  248.432971]  md_handle_request+0x179/0x1e0
>>> [  248.437071]  ? submit_bio_checks+0x1f6/0x5a0
>>> [  248.441345]  md_submit_bio+0x6d/0xa0
>>> [  248.444924]  __submit_bio+0x94/0x140
>>> [  248.448504]  submit_bio_noacct+0xe1/0x2a0
>>> [  248.452515]  submit_bio+0x48/0x120
>>> [  248.455923]  blkdev_direct_IO+0x220/0x540
>>> [  248.459935]  ? __fsnotify_parent+0xff/0x330
>>> [  248.464121]  ? __fsnotify_parent+0x10f/0x330
>>> [  248.468393]  ? common_interrupt+0x73/0xe0
>>> [  248.472408]  generic_file_read_iter+0xa5/0x160
>>> [  248.476852]  blkdev_read_iter+0x38/0x70
>>> [  248.480693]  io_read+0x119/0x420
>>> [  248.483923]  ? sbitmap_queue_clear_batch+0xc7/0x110
>>> [  248.488805]  ? blk_mq_end_request_batch+0x378/0x490
>>> [  248.493684]  io_issue_sqe+0x7ec/0x19c0
>>> [  248.497436]  ? io_req_prep+0x6a9/0xe60
>>> [  248.501190]  io_submit_sqes+0x2a0/0x9f0
>>> [  248.505030]  ? __fget_files+0x6a/0x90
>>> [  248.508693]  __x64_sys_io_uring_enter+0x1da/0x8c0
>>> [  248.513401]  do_syscall_64+0x38/0x90
>>> [  248.516979]  entry_SYSCALL_64_after_hwframe+0x44/0xae
>>> [  248.522033] RIP: 0033:0x7fc26b19b89d
>>> [  248.525611] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa
>>> 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f
>>> 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c3 f5 0c 00 f7 d8 64 89 01 48
>>> [  248.544360] RSP: 002b:00007fc26b07ce98 EFLAGS: 00000246 ORIG_RAX:
>>> 00000000000001aa
>>> [  248.551925] RAX: ffffffffffffffda RBX: 00007fc26b3f2fc0 RCX:
>>> 00007fc26b19b89d
>>> [  248.559058] RDX: 0000000000000020 RSI: 0000000000000020 RDI:
>>> 0000000000000004
>>> [  248.566189] RBP: 0000000000000020 R08: 0000000000000000 R09:
>>> 0000000000000000
>>> [  248.573322] R10: 0000000000000001 R11: 0000000000000246 R12:
>>> 00005623a4b7a2a0
>>> [  248.580456] R13: 0000000000000020 R14: 0000000000000020 R15:
>>> 0000000000000020
>>> [  248.587591]  </TASK>
>> Do you have:
>>
>> commit 75feae73a28020e492fbad2323245455ef69d687
>> Author: Pavel Begunkov <asml.silence@xxxxxxxxx>
>> Date:   Tue Dec 7 20:16:36 2021 +0000
>>
>>      block: fix single bio async DIO error handling
>>
>> in your tree?
>>
> Nope. I will get it in and test. Thanks!

Might be worth re-running with KASAN enabled in your config to see if
that triggers anything.

-- 
Jens Axboe
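
For readers following the review comment on regular_request_wait(), the
bool-returning nowait pattern can be sketched in plain userspace C. This is
only an illustrative model under stated assumptions, not the raid10 code:
struct conf_model and every *_model function are hypothetical stand-ins,
there is no locking, and the "sleep" is simulated by clearing the barrier.
It shows one possible shape in which a false return from the barrier check
propagates up so the submission path stops with -EAGAIN.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct r10conf; only what the sketch needs. */
struct conf_model {
	int barrier;    /* non-zero while a resync barrier is raised */
	int nr_pending; /* regular requests that passed the barrier */
};

/*
 * Same contract as the patched wait_barrier(): true if the barrier was
 * passed (after waiting or without needing to), false if a wait was
 * required but skipped because the caller asked for nowait.
 */
static bool wait_barrier_model(struct conf_model *conf, bool nowait)
{
	if (conf->barrier) {
		if (nowait)
			return false;
		/* The driver would sleep here; the model just drops the barrier. */
		conf->barrier = 0;
	}
	conf->nr_pending++;
	return true;
}

/* Returns bool, as suggested in review, so the caller can bail out. */
static bool regular_request_wait_model(struct conf_model *conf, bool nowait)
{
	return wait_barrier_model(conf, nowait);
}

/* Models the submission path: 0 on success, -EAGAIN if it would block. */
static int make_request_model(struct conf_model *conf, bool nowait)
{
	if (!regular_request_wait_model(conf, nowait))
		return -EAGAIN; /* stop here; do not continue with the request */
	/* ... the request would be mapped and submitted here ... */
	return 0;
}
```

With barrier set, make_request_model(&conf, true) returns -EAGAIN and leaves
nr_pending untouched, while make_request_model(&conf, false) "waits" and then
proceeds.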



