Re: dm-crypt: fix softlockup in dmcrypt_write

On Tue, 28 Feb 2023, yangerkun wrote:

> 
> 
> On 2023/2/28 2:06, Mike Snitzer wrote:
> > On Mon, Feb 27 2023 at  1:03P -0500,
> > Mike Snitzer <snitzer@xxxxxxxxxx> wrote:
> > 
> >> On Mon, Feb 27 2023 at 12:55P -0500,
> >> Mike Snitzer <snitzer@xxxxxxxxxx> wrote:
> >>
> >>> On Sun, Feb 26 2023 at  8:31P -0500,
> >>> yangerkun <yangerkun@xxxxxxxxxxxxxxx> wrote:
> >>>
> >>>>
> >>>>
> >>>> On 2023/2/26 10:01, Bart Van Assche wrote:
> >>>>> On 2/22/23 19:19, yangerkun wrote:
> >>>>>> @@ -1924,6 +1926,10 @@ static int dmcrypt_write(void *data)
> >>>>>>            BUG_ON(rb_parent(write_tree.rb_node));
> >>>>>> +        if (time_is_before_jiffies(start_time + HZ)) {
> >>>>>> +            schedule();
> >>>>>> +            start_time = jiffies;
> >>>>>> +        }
> >>>>>
> >>>>> Why schedule() instead of cond_resched()?
> >>>>
> >>>> cond_resched() may not actually schedule, which may trigger the problem
> >>>> too, but it seems that after 1 second that may never happen?
> >>>
> >>> I had the same question as Bart when reviewing your homegrown
> >>> conditional schedule().  Hopefully you can reproduce this issue?  If
> >>> so, please see if simply using cond_resched() fixes the issue.
> >>
> >> This seems like a more appropriate patch:
> >>
> >> diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
> >> index 87c5706131f2..faba1be572f9 100644
> >> --- a/drivers/md/dm-crypt.c
> >> +++ b/drivers/md/dm-crypt.c
> >> @@ -1937,6 +1937,7 @@ static int dmcrypt_write(void *data)
> >>   			io = crypt_io_from_node(rb_first(&write_tree));
> >>   			rb_erase(&io->rb_node, &write_tree);
> >>   			kcryptd_io_write(io);
> >> +			cond_resched();
> >>   		} while (!RB_EMPTY_ROOT(&write_tree));
> >>   		blk_finish_plug(&plug);
> >>   	}
> > 
> > 
> > or:
> > 
> > diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
> > index 87c5706131f2..3ba2fd3e4358 100644
> > --- a/drivers/md/dm-crypt.c
> > +++ b/drivers/md/dm-crypt.c
> > @@ -1934,6 +1934,7 @@ static int dmcrypt_write(void *data)
> >   		 */
> >   		blk_start_plug(&plug);
> >   		do {
> > +			cond_resched();
> >   			io = crypt_io_from_node(rb_first(&write_tree));
> >   			rb_erase(&io->rb_node, &write_tree);
> >   			kcryptd_io_write(io);
> 
> Hi,
> 
> Thanks a lot for your review!
> 
> That is OK as a fix for the softlockup, but for async write encryption, 
> kcryptd_crypt_write_io_submit will add the bio to write_tree, and if we 
> call cond_resched before every kcryptd_io_write, write performance 
> may be poor under high CPU usage.

Hi

To fix this problem, find the PID of the "dmcrypt_write" process and 
change its nice value to -20, for example "renice -n -20 -p 34748".

This is the proper way to fix it; locking up the process for one 
second is not.
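
For anyone who wants to script this rather than look up the PID by hand, 
here is a rough user-space sketch of the same renice step. The helper 
names find_pid_by_comm_prefix() and renice_pid() are made up for this 
example. It assumes the thread's comm starts with "dmcrypt_write" (some 
kernels may append a device suffix, hence the prefix match), and it 
returns only the first match, so with several dm-crypt devices you would 
need to loop over all of them.

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/types.h>

/*
 * Hypothetical helper: return the PID of the first task whose
 * /proc/<pid>/comm starts with `prefix`, or -1 if none is found.
 */
pid_t find_pid_by_comm_prefix(const char *prefix)
{
	DIR *proc = opendir("/proc");
	struct dirent *ent;
	pid_t found = -1;

	if (!proc)
		return -1;

	while (found == -1 && (ent = readdir(proc)) != NULL) {
		char path[64], comm[64];
		char *end;
		long pid = strtol(ent->d_name, &end, 10);
		FILE *f;

		if (*end != '\0' || pid <= 0)
			continue;	/* not a numeric PID directory */

		snprintf(path, sizeof(path), "/proc/%ld/comm", pid);
		f = fopen(path, "r");
		if (!f)
			continue;	/* task exited or is not readable */

		if (fgets(comm, sizeof(comm), f)) {
			comm[strcspn(comm, "\n")] = '\0';
			if (strncmp(comm, prefix, strlen(prefix)) == 0)
				found = (pid_t)pid;
		}
		fclose(f);
	}
	closedir(proc);
	return found;
}

/*
 * Hypothetical helper: same effect as "renice -n <nice_value> -p <pid>".
 * Returns 0 on success, -1 with errno set on failure.
 */
int renice_pid(pid_t pid, int nice_value)
{
	return setpriority(PRIO_PROCESS, (id_t)pid, nice_value);
}
```

A caller would do something like 
renice_pid(find_pid_by_comm_prefix("dmcrypt_write"), -20) after checking 
that the lookup did not return -1. Lowering the nice value below its 
current setting needs root (or CAP_SYS_NICE).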

We used to have high-priority workqueues by default, but that caused audio 
playback skipping, so we had to revert it - see commit 
f612b2132db529feac4f965f28a1b9258ea7c22b.

Perhaps we should add an option to have high-priority kernel threads?

Mikulas

> kcryptd_crypt_write_io_submit will wake up write_thread whenever 
> write_tree is empty, and dmcrypt_write will peel off the old write_tree to 
> submit its bios, so there cannot be too many bios in write_tree. So I 
> chose to yield the CPU before the 'while' loop that submits the bios...
> 
> Thanks,
> Kun.
> 
> --
> dm-devel mailing list
> dm-devel@xxxxxxxxxx
> https://listman.redhat.com/mailman/listinfo/dm-devel
> 