On Tue, 28 Feb 2023, yangerkun wrote:

> On 2023/2/28 2:06, Mike Snitzer wrote:
> > On Mon, Feb 27 2023 at  1:03P -0500,
> > Mike Snitzer <snitzer@xxxxxxxxxx> wrote:
> >
> >> On Mon, Feb 27 2023 at 12:55P -0500,
> >> Mike Snitzer <snitzer@xxxxxxxxxx> wrote:
> >>
> >>> On Sun, Feb 26 2023 at  8:31P -0500,
> >>> yangerkun <yangerkun@xxxxxxxxxxxxxxx> wrote:
> >>>
> >>>> On 2023/2/26 10:01, Bart Van Assche wrote:
> >>>>> On 2/22/23 19:19, yangerkun wrote:
> >>>>>> @@ -1924,6 +1926,10 @@ static int dmcrypt_write(void *data)
> >>>>>>  			BUG_ON(rb_parent(write_tree.rb_node));
> >>>>>> +			if (time_is_before_jiffies(start_time + HZ)) {
> >>>>>> +				schedule();
> >>>>>> +				start_time = jiffies;
> >>>>>> +			}
> >>>>>
> >>>>> Why schedule() instead of cond_resched()?
> >>>>
> >>>> cond_resched may not really schedule, which may trigger the problem too,
> >>>> but it seems that after 1 second this may never happen?
> >>>
> >>> I had the same question as Bart when reviewing your homegrown
> >>> conditional schedule().  Hopefully you can reproduce this issue?  If
> >>> so, please see if simply using cond_resched() fixes the issue.
> >>
> >> This seems like a more appropriate patch:
> >>
> >> diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
> >> index 87c5706131f2..faba1be572f9 100644
> >> --- a/drivers/md/dm-crypt.c
> >> +++ b/drivers/md/dm-crypt.c
> >> @@ -1937,6 +1937,7 @@ static int dmcrypt_write(void *data)
> >>  			io = crypt_io_from_node(rb_first(&write_tree));
> >>  			rb_erase(&io->rb_node, &write_tree);
> >>  			kcryptd_io_write(io);
> >> +			cond_resched();
> >>  		} while (!RB_EMPTY_ROOT(&write_tree));
> >>  		blk_finish_plug(&plug);
> >>  	}
> >
> > or:
> >
> > diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
> > index 87c5706131f2..3ba2fd3e4358 100644
> > --- a/drivers/md/dm-crypt.c
> > +++ b/drivers/md/dm-crypt.c
> > @@ -1934,6 +1934,7 @@ static int dmcrypt_write(void *data)
> >  		 */
> >  		blk_start_plug(&plug);
> >  		do {
> > +			cond_resched();
> >  			io = crypt_io_from_node(rb_first(&write_tree));
> >  			rb_erase(&io->rb_node, &write_tree);
> >  			kcryptd_io_write(io);
>
> Hi,
>
> Thanks a lot for your review!
>
> That is OK for fixing the softlockup, but for async write encryption,
> kcryptd_crypt_write_io_submit will add bios to write_tree, and if we
> call cond_resched before every kcryptd_io_write, write performance
> may be poor under high CPU usage.

Hi

To fix this problem, find the PID of the process "dmcrypt_write" and
change its priority to -20, for example "renice -n -20 -p 34748".

This is the proper way to fix it; locking up the process for one
second is not.

We used to have high-priority workqueues by default, but that caused audio
playback skipping, so we had to revert it - see
f612b2132db529feac4f965f28a1b9258ea7c22b.

Perhaps we should add an option to have high-priority kernel threads?

Mikulas

> kcryptd_crypt_write_io_submit will wake up write_thread once there is an
> empty write_tree, and dmcrypt_write will peel off the old write_tree to
> submit its bios, so there cannot be too many bios in write_tree. That is
> why I chose to yield the CPU before the 'while' loop that submits bios...
>
> Thanks,
> Kun.
>
> --
> dm-devel mailing list
> dm-devel@xxxxxxxxxx
> https://listman.redhat.com/mailman/listinfo/dm-devel
>
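
For anyone who would rather script the renice step mentioned above than run
it by hand, here is a minimal userspace sketch in C. It is only an
illustration, not part of any patch: the helper names set_nice_value() and
get_nice_value() are hypothetical, and the PID still has to be looked up
separately (34748 above was just an example). It assumes the standard
setpriority()/getpriority() calls, and lowering the value to -20 normally
requires root or CAP_SYS_NICE.

```c
#include <errno.h>
#include <sys/resource.h>
#include <sys/types.h>

/*
 * Hypothetical helper: set the nice value of process `pid`
 * (0 means the calling process).
 * Returns 0 on success or a negative errno value on failure.
 * Lowering the nice value (e.g. to -20) normally needs root
 * or CAP_SYS_NICE.
 */
static int set_nice_value(pid_t pid, int nice_value)
{
	if (setpriority(PRIO_PROCESS, (id_t)pid, nice_value) == -1)
		return -errno;
	return 0;
}

/*
 * Hypothetical helper: read the nice value of process `pid`
 * into *nice_value.
 * getpriority() can legitimately return -1, so errno is cleared
 * before the call and checked afterwards.
 */
static int get_nice_value(pid_t pid, int *nice_value)
{
	int ret;

	errno = 0;
	ret = getpriority(PRIO_PROCESS, (id_t)pid);
	if (ret == -1 && errno != 0)
		return -errno;
	*nice_value = ret;
	return 0;
}
```

Run as root, set_nice_value(34748, -20) would have the same effect as the
renice command quoted above, and get_nice_value() can be used afterwards to
confirm that the change took effect.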