Re: [PATCH v5 3/4] md: raid10 add nowait support

On 12/15/21 5:30 PM, Vishal Verma wrote:
> 
> On 12/15/21 3:20 PM, Vishal Verma wrote:
>>
>> On 12/15/21 1:42 PM, Song Liu wrote:
>>> On Tue, Dec 14, 2021 at 10:09 PM Vishal Verma <vverma@xxxxxxxxxxxxxxxx> wrote:
>>>> This adds nowait support to the RAID10 driver, very similar to the
>>>> raid1 driver changes. It makes the RAID10 driver return EAGAIN
>>>> in situations where it would otherwise wait, e.g.:
>>>>
>>>> - Waiting for the barrier,
>>>> - Too many pending I/Os to be queued,
>>>> - Reshape operation,
>>>> - Discard operation.
>>>>
>>>> wait_barrier() is modified to return bool so that callers can tell
>>>> when it did not wait. It returns true if it waited or if no wait was
>>>> required, and false if a wait was required but skipped because
>>>> nowait was requested.
>>>>
>>>> Signed-off-by: Vishal Verma <vverma@xxxxxxxxxxxxxxxx>
>>>> ---
>>>>   drivers/md/raid10.c | 57 +++++++++++++++++++++++++++++++++++----------
>>>>   1 file changed, 45 insertions(+), 12 deletions(-)
>>>>
>>>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>>>> index dde98f65bd04..f6c73987e9ac 100644
>>>> --- a/drivers/md/raid10.c
>>>> +++ b/drivers/md/raid10.c
>>>> @@ -952,11 +952,18 @@ static void lower_barrier(struct r10conf *conf)
>>>>          wake_up(&conf->wait_barrier);
>>>>   }
>>>>
>>>> -static void wait_barrier(struct r10conf *conf)
>>>> +static bool wait_barrier(struct r10conf *conf, bool nowait)
>>>>   {
>>>>          spin_lock_irq(&conf->resync_lock);
>>>>          if (conf->barrier) {
>>>>                  struct bio_list *bio_list = current->bio_list;
>>>> +
>>>> +               /* Return false when nowait flag is set */
>>>> +               if (nowait) {
>>>> +                       spin_unlock_irq(&conf->resync_lock);
>>>> +                       return false;
>>>> +               }
>>>> +
>>>>                  conf->nr_waiting++;
>>>>                  /* Wait for the barrier to drop.
>>>>                   * However if there are already pending
>>>> @@ -988,6 +995,7 @@ static void wait_barrier(struct r10conf *conf)
>>>>          }
>>>>          atomic_inc(&conf->nr_pending);
>>>>          spin_unlock_irq(&conf->resync_lock);
>>>> +       return true;
>>>>   }
>>>>
>>>>   static void allow_barrier(struct r10conf *conf)
>>>> @@ -1101,17 +1109,25 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule)
>>>>   static void regular_request_wait(struct mddev *mddev, struct r10conf *conf,
>>>>                                   struct bio *bio, sector_t sectors)
>>>>   {
>>>> -       wait_barrier(conf);
>>>> +       /* Bail out if REQ_NOWAIT is set for the bio */
>>>> +       if (!wait_barrier(conf, bio->bi_opf & REQ_NOWAIT)) {
>>>> +               bio_wouldblock_error(bio);
>>>> +               return;
>>>> +       }
>>> I think we also need regular_request_wait() to return bool, and its
>>> callers to handle the failure properly.
>>>
>>> Thanks,
>>> Song
>>>
>> Ack, will fix it. Thanks!
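
For reference, the point Song raises above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: every
model_* name below is hypothetical, there is no locking or sleeping, and
none of it is the actual raid10.c code. It shows why the bool has to
propagate: in the hunk quoted above regular_request_wait() returns void, so
after bio_wouldblock_error() its caller cannot tell that it should stop.

```c
/* Userspace model only: NOT the kernel's structures, locking, or API. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct model_conf {
	bool barrier;     /* a resync barrier is currently raised */
	int nr_pending;   /* requests that have passed the barrier */
};

struct model_bio {
	bool nowait;      /* stands in for bio->bi_opf & REQ_NOWAIT */
	int status;       /* 0, or a negative errno once failed */
};

/* True if the request may proceed, false if it would have had to wait. */
static bool model_wait_barrier(struct model_conf *conf, bool nowait)
{
	if (conf->barrier) {
		if (nowait)
			return false;
		/* A real driver sleeps until the barrier drops; the model clears it. */
		conf->barrier = false;
	}
	conf->nr_pending++;
	return true;
}

static void model_allow_barrier(struct model_conf *conf)
{
	assert(conf->nr_pending > 0);
	conf->nr_pending--;
}

/* Returns bool so the caller knows the bio was already failed. */
static bool model_regular_request_wait(struct model_conf *conf,
				       struct model_bio *bio)
{
	if (!model_wait_barrier(conf, bio->nowait)) {
		bio->status = -EAGAIN;   /* models bio_wouldblock_error() */
		return false;
	}
	return true;
}

static bool model_read_request(struct model_conf *conf, struct model_bio *bio)
{
	if (!model_regular_request_wait(conf, bio))
		return false;   /* bio already completed, must not be reused */
	/* Mapping and submission would happen here; completion is collapsed. */
	model_allow_barrier(conf);
	return true;
}
```

With the barrier raised, a bio marked nowait fails with -EAGAIN and leaves
nr_pending untouched, while a blocking bio goes through as before.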
> 
> I ran into this while testing with io_uring, using the current v5 raid10
> patch on top of the md-next branch:
> ./t/io_uring -a 0 -d 256 </dev/raid10>
> 
> It didn't trigger with aio (-a 1)
> 
> [  248.128661] BUG: kernel NULL pointer dereference, address: 
> 00000000000000b8
> [  248.135628] #PF: supervisor read access in kernel mode
> [  248.140762] #PF: error_code(0x0000) - not-present page
> [  248.145903] PGD 0 P4D 0
> [  248.148443] Oops: 0000 [#1] PREEMPT SMP NOPTI
> [  248.152800] CPU: 49 PID: 9461 Comm: io_uring Kdump: loaded Not 
> tainted 5.16.0-rc3+ #2
> [  248.160629] Hardware name: Dell Inc. PowerEdge R650xs/0PPTY2, BIOS 
> 1.3.8 08/31/2021
> [  248.168279] RIP: 0010:raid10_end_read_request+0x74/0x140 [raid10]
> [  248.174373] Code: 48 60 48 8b 58 58 48 c1 e2 05 49 03 55 08 48 89 4a 
> 10 40 84 f6 75 48 f0 41 80 4c 24 18 01 4c 89 e7 e8 e0 b8 ff ff 49 8b 4d 
> 00 <48> 8b 83 b8 00 00 00 f0 ff 8b f0 00 00 00 0f 94 c2 a8 01 74 04 84
> [  248.193120] RSP: 0018:ffffb1c38d598ce8 EFLAGS: 00010086
> [  248.198344] RAX: ffff8e5da2a1a100 RBX: 0000000000000000 RCX: 
> ffff8e5d89747000
> [  248.205479] RDX: 000000008040003a RSI: 0000000080400039 RDI: 
> ffff8e1e00044900
> [  248.212611] RBP: ffffb1c38d598d30 R08: 0000000000000000 R09: 
> 0000000000000001
> [  248.219744] R10: ffff8e5da2a1ae00 R11: 000000411bab9000 R12: 
> ffff8e5da2a1ae00
> [  248.226877] R13: ffff8e5d8973fc00 R14: 0000000000000000 R15: 
> 0000000000001000
> [  248.234009] FS:  00007fc26b07d700(0000) GS:ffff8e9c6e600000(0000) 
> knlGS:0000000000000000
> [  248.242096] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  248.247843] CR2: 00000000000000b8 CR3: 00000040b25d4005 CR4: 
> 0000000000770ee0
> [  248.254973] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
> 0000000000000000
> [  248.262107] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
> 0000000000000400
> [  248.269240] PKRU: 55555554
> [  248.271953] Call Trace:
> [  248.274406]  <IRQ>
> [  248.276425]  bio_endio+0xf6/0x170
> [  248.279743]  blk_update_request+0x12d/0x470
> [  248.283931]  ? sbitmap_queue_clear_batch+0xc7/0x110
> [  248.288809]  blk_mq_end_request_batch+0x76/0x490
> [  248.293429]  ? dma_direct_unmap_sg+0xdd/0x1a0
> [  248.297786]  ? smp_call_function_single_async+0x46/0x70
> [  248.303015]  ? mempool_kfree+0xe/0x10
> [  248.306680]  ? mempool_kfree+0xe/0x10
> [  248.310345]  nvme_pci_complete_batch+0x26/0xb0
> [  248.314792]  nvme_irq+0x298/0x2f0
> [  248.318110]  ? nvme_unmap_data+0xf0/0xf0
> [  248.322038]  __handle_irq_event_percpu+0x3f/0x190
> [  248.326744]  handle_irq_event_percpu+0x33/0x80
> [  248.331190]  handle_irq_event+0x39/0x60
> [  248.335028]  handle_edge_irq+0xbe/0x1e0
> [  248.338869]  __common_interrupt+0x6b/0x110
> [  248.342967]  common_interrupt+0xbd/0xe0
> [  248.346808]  </IRQ>
> [  248.348912]  <TASK>
> [  248.351018]  asm_common_interrupt+0x1e/0x40
> [  248.355206] RIP: 0010:_raw_spin_unlock_irqrestore+0x1e/0x37
> [  248.360780] Code: 02 5d c3 0f 1f 44 00 00 5d c3 66 90 0f 1f 44 00 00 
> 55 48 89 e5 c6 07 00 0f 1f 40 00 f7 c6 00 02 00 00 74 01 fb bf 01 00 00 
> 00 <e8> ed 8e 5b ff 65 8b 05 66 7e 52 78 85 c0 74 02 5d c3 0f 1f 44 00
> 
> [  248.379525] RSP: 0018:ffffb1c3a429b958 EFLAGS: 00000206
> [  248.384749] RAX: 0000000000000001 RBX: ffff8e5d8973fd08 RCX: 
> ffff8e5d8973fd10
> [  248.391884] RDX: 0000000000000001 RSI: 0000000000000246 RDI: 
> 0000000000000001
> [  248.399017] RBP: ffffb1c3a429b958 R08: 0000000000000000 R09: 
> ffffb1c3a429b970
> [  248.406148] R10: 0000000000000c00 R11: 0000000000000001 R12: 
> 0000000000000001
> [  248.413280] R13: 0000000000000246 R14: 0000000000000000 R15: 
> 0000000000000003
> [  248.420415]  __wake_up_common_lock+0x8a/0xc0
> [  248.424686]  __wake_up+0x13/0x20
> [  248.427919]  raid10_make_request+0x101/0x170 [raid10]
> [  248.432971]  md_handle_request+0x179/0x1e0
> [  248.437071]  ? submit_bio_checks+0x1f6/0x5a0
> [  248.441345]  md_submit_bio+0x6d/0xa0
> [  248.444924]  __submit_bio+0x94/0x140
> [  248.448504]  submit_bio_noacct+0xe1/0x2a0
> [  248.452515]  submit_bio+0x48/0x120
> [  248.455923]  blkdev_direct_IO+0x220/0x540
> [  248.459935]  ? __fsnotify_parent+0xff/0x330
> [  248.464121]  ? __fsnotify_parent+0x10f/0x330
> [  248.468393]  ? common_interrupt+0x73/0xe0
> [  248.472408]  generic_file_read_iter+0xa5/0x160
> [  248.476852]  blkdev_read_iter+0x38/0x70
> [  248.480693]  io_read+0x119/0x420
> [  248.483923]  ? sbitmap_queue_clear_batch+0xc7/0x110
> [  248.488805]  ? blk_mq_end_request_batch+0x378/0x490
> [  248.493684]  io_issue_sqe+0x7ec/0x19c0
> [  248.497436]  ? io_req_prep+0x6a9/0xe60
> [  248.501190]  io_submit_sqes+0x2a0/0x9f0
> [  248.505030]  ? __fget_files+0x6a/0x90
> [  248.508693]  __x64_sys_io_uring_enter+0x1da/0x8c0
> [  248.513401]  do_syscall_64+0x38/0x90
> [  248.516979]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [  248.522033] RIP: 0033:0x7fc26b19b89d
> [  248.525611] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 
> 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 
> 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c3 f5 0c 00 f7 d8 64 89 01 48
> [  248.544360] RSP: 002b:00007fc26b07ce98 EFLAGS: 00000246 ORIG_RAX: 
> 00000000000001aa
> [  248.551925] RAX: ffffffffffffffda RBX: 00007fc26b3f2fc0 RCX: 
> 00007fc26b19b89d
> [  248.559058] RDX: 0000000000000020 RSI: 0000000000000020 RDI: 
> 0000000000000004
> [  248.566189] RBP: 0000000000000020 R08: 0000000000000000 R09: 
> 0000000000000000
> [  248.573322] R10: 0000000000000001 R11: 0000000000000246 R12: 
> 00005623a4b7a2a0
> [  248.580456] R13: 0000000000000020 R14: 0000000000000020 R15: 
> 0000000000000020
> [  248.587591]  </TASK>

Do you have:

commit 75feae73a28020e492fbad2323245455ef69d687
Author: Pavel Begunkov <asml.silence@xxxxxxxxx>
Date:   Tue Dec 7 20:16:36 2021 +0000

    block: fix single bio async DIO error handling

in your tree?

-- 
Jens Axboe



