On Wed, Mar 02, 2022 at 08:24:48AM +0000, Shinichiro Kawasaki wrote:
> On Mar 02, 2022 / 06:26, Sidong Yang wrote:
> > On Wed, Mar 02, 2022 at 04:43:35AM +0000, Shinichiro Kawasaki wrote:
> >
> > Hi, Shinichiro.
> >
> > Thanks for reply!
> >
> > > Hi Sidong,
> > >
> > > I tried this patch and observed that it recreates the hang and confirms the fix.
> > > Thanks. Here's my comments for improvements.
> > >
> > > On Mar 01, 2022 / 15:19, Sidong Yang wrote:
> > > > Test enabling/disable quota and creating/destroying qgroup repeatedly
> > >
> > > nit: gerund (...ing) and base form are mixed. Base form would be the better to
> > > be same as the code comment.
> >
> > Yeah, 'disable' should be disabling.
> >
> > >
> > > > in asynchronous and confirm it does not cause kernel hang. This is a
> > > > regression test for the problem reported to linux-btrfs list [1].
> > > >
> > > > The hang was recreated using the test case and fixed by kernel patch
> > > > titled
> > > >
> > > >   btrfs: qgroup: fix deadlock between rescan worker and remove qgroup
> > > >
> > > > [1] https://lore.kernel.org/linux-btrfs/20220228014340.21309-1-realwakka@xxxxxxxxx/
> > > >
> > > > Signed-off-by: Sidong Yang <realwakka@xxxxxxxxx>
> > > > ---
> > (snip)
> > > > +	done
> > > > +
> > > > +	for pid in "${pids[@]}"; do
> > > > +		wait $pid
> > > > +	done
> > >
> > > I think simple "wait" command does what the for loop does.
> >
> > I didn't know that "wait" command with no parameter waits all background
> > processes to finish. So it seems that we don't need pids it can be
> > deleted. Thanks.
> >
> > Actually I've been agony about this. Does it needs timeout? When I tried
> > to command like this "timeout 10s wait", This command couldn't be
> > executed becase "wait" command is not binary. How can I insert timeout?
>
> I think recent discussion on the list is a good reference [1]. A patch was
> posted to add timeout to btrfs/255.
>
> More importantly, it was discussed that such timeout of user space program will
> not help. Eryu pointed out that once "the kernel already deadlocked, and
> filesystem and/or device can't be used by next test either". IMHO, your new case
> will not require timeout either with same reasoning.
>
> [1] https://lore.kernel.org/fstests/20220223171126.GQ12643@xxxxxxxxxxxxx/T/#me349d62ff367a0a6a28076bdd5b89263fc8109c0
>

Thanks for the good example. I understand that a userspace timeout does not
help once the kernel has deadlocked. In this case, a plain "wait" is enough.

Thanks,
Sidong

> --
> Best Regards,
> Shin'ichiro Kawasaki
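
For reference, a minimal sketch of the two shell points discussed in this
thread. It is not the code from the patch: the worker function, the log file,
and the job names are hypothetical stand-ins for the test's background loops.
It shows that a bare "wait" covers what the per-pid loop did, and that the
shell reports "wait" as a builtin, which is consistent with timeout(1) having
no program to execute.

```shell
# Hypothetical stand-in for one background loop of the test.
worker() {
	sleep 1
	echo "$1" >> "$log"
}

log=$(mktemp)

# Start three background jobs without recording their pids.
for i in 1 2 3; do
	worker "job$i" &
done

# A bare "wait" blocks until all background children of this shell
# have exited, so the pids array and the per-pid loop are not needed.
wait

# Count the lines written by the workers; all of them should be there.
finished=$(grep -c . "$log")
echo "finished jobs: $finished"

# Ask the shell what "wait" is. timeout(1) runs its argument as a
# separate program, so a builtin alone gives it nothing to run.
kind=$(type wait)
echo "$kind"

rm -f "$log"
```

If a time limit were still wanted, it would have to wrap a separate process
rather than the builtin, but as noted above that does not help once the
kernel itself has deadlocked.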