On Thu, Jul 13, 2023 at 11:33:24AM -0400, Joel Fernandes wrote: > On Thu, Jul 13, 2023 at 10:34 AM Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote: > > On 2023/7/13 22:07, Joel Fernandes wrote: > > > On Thu, Jul 13, 2023 at 12:59 AM Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote: > > >> On 2023/7/13 12:52, Paul E. McKenney wrote: > > >>> On Thu, Jul 13, 2023 at 12:41:09PM +0800, Gao Xiang wrote: > > >> > > >> ... > > >> > > >>>> > > >>>> There are lots of performance issues here and even a plumber > > >>>> topic last year to show that, see: > > >>>> > > >>>> [1] https://lore.kernel.org/r/20230519001709.2563-1-tj@xxxxxxxxxx > > >>>> [2] https://lore.kernel.org/r/CAHk-=wgE9kORADrDJ4nEsHHLirqPCZ1tGaEPAZejHdZ03qCOGg@xxxxxxxxxxxxxx > > >>>> [3] https://lore.kernel.org/r/CAB=BE-SBtO6vcoyLNA9F-9VaN5R0t3o_Zn+FW8GbO6wyUqFneQ@xxxxxxxxxxxxxx > > >>>> [4] https://lpc.events/event/16/contributions/1338/ > > >>>> and more. > > >>>> > > >>>> I'm not sure if it's necessary to look info all of that, > > >>>> andSandeep knows more than I am (the scheduling issue > > >>>> becomes vital on some aarch64 platform.) > > >>> > > >>> Hmmm... Please let me try again. > > >>> > > >>> Assuming that this approach turns out to make sense, the resulting > > >>> patch will need to clearly state the performance benefits directly in > > >>> the commit log. > > >>> > > >>> And of course, for the approach to make sense, it must avoid breaking > > >>> the existing lockdep-RCU debugging code. > > >>> > > >>> Is that more clear? > > >> > > >> Personally I'm not working on Android platform any more so I don't > > >> have a way to reproduce, hopefully Sandeep could give actually > > >> number _again_ if dm-verity is enabled and trigger another > > >> workqueue here and make a comparsion why the scheduling latency of > > >> the extra work becomes unacceptable. > > >> > > > > > > Question from my side, are we talking about only performance issues or > > > also a crash? It appears z_erofs_decompress_pcluster() takes > > > mutex_lock(&pcl->lock); > > > > > > So if it is either in an RCU read-side critical section or in an > > > atomic section, like the softirq path, then it may > > > schedule-while-atomic or trigger RCU warnings. > > > > > > z_erofs_decompressqueue_endio > > > -> z_erofs_decompress_kickoff > > > ->z_erofs_decompressqueue_work > > > ->z_erofs_decompress_queue > > > -> z_erofs_decompress_pcluster > > > -> mutex_lock > > > > > > > Why does the softirq path not trigger a workqueue instead? > > I said "if it is". I was giving a scenario. mutex_lock() is not > allowed in softirq context or in an RCU-reader. > > > > Per Sandeep in [1], this stack happens under RCU read-lock in: > > > > > > #define __blk_mq_run_dispatch_ops(q, check_sleep, dispatch_ops) \ > > > [...] > > > rcu_read_lock(); > > > (dispatch_ops); > > > rcu_read_unlock(); > > > [...] > > > > > > Coming from: > > > blk_mq_flush_plug_list -> > > > blk_mq_run_dispatch_ops(q, > > > __blk_mq_flush_plug_list(q, plug)); > > > > > > and __blk_mq_flush_plug_list does this: > > > q->mq_ops->queue_rqs(&plug->mq_list); > > > > > > This somehow ends up calling the bio_endio and the > > > z_erofs_decompressqueue_endio which grabs the mutex. > > > > > > So... I have a question, it looks like one of the paths in > > > __blk_mq_run_dispatch_ops() uses SRCU. Where are as the alternate > > > path uses RCU. Why does this alternate want to block even if it is not > > > supposed to? Is the real issue here that the BLK_MQ_F_BLOCKING should > > > be set? It sounds like you want to block in the "else" path even > > > though BLK_MQ_F_BLOCKING is not set: > > > > BLK_MQ_F_BLOCKING is not a flag that a filesystem can do anything with. > > That is block layer and mq device driver stuffs. filesystems cannot set > > this value. > > > > As I said, as far as I understand, previously, > > .end_io() can only be called without RCU context, so it will be fine, > > but I don't know when .end_io() can be called under some RCU context > > now. > > >From what Sandeep described, the code path is in an RCU reader. My > question is more, why doesn't it use SRCU instead since it clearly > does so if BLK_MQ_F_BLOCKING. What are the tradeoffs? IMHO, a deeper > dive needs to be made into that before concluding that the fix is to > use rcu_read_lock_any_held(). How can this be solved? 1. Always use a workqueue. Simple, but is said to have performance issues. 2. Pass a flag in that indicates whether or not the caller is in an RCU read-side critical section. Conceptually simple, but might or might not be reasonable to actually implement in the code as it exists now. (You tell me!) 3. Create a function in z_erofs that gives you a decent approximation, maybe something like the following. 4. Other ideas here. The following is untested, and is probably quite buggy, but it should provide you with a starting point. static bool z_erofs_wq_needed(void) { if (IS_ENABLED(CONFIG_PREEMPTION) && rcu_preempt_depth()) return true; // RCU reader if (IS_ENABLED(CONFIG_PREEMPT_COUNT) && !preemptible()) return true; // non-preemptible if (!IS_ENABLED(CONFIG_PREEMPT_COUNT)) return true; // non-preeemptible kernel, so play it safe return false; } You break it, you buy it! ;-) Thanx, Paul