Re: [PATCH] zram: do not lookup algorithm in backends table

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On (22/06/22 12:19), Andrew Morton wrote:
> > On (22/06/22 11:35), Sergey Senozhatsky wrote:
> > > Always use crypto_has_comp() so that crypto can lookup module,
> > > call usermodhelper to load the modules, wait for usermodhelper
> > > to finish and so on. Otherwise crypto will do all of these steps
> > > under CPU hot-plug lock and this looks like too much stuff to
> > > handle under the CPU hot-plug lock. Besides this can end up in
> > > a deadlock when usermodhelper triggers a code path that attempts
> > > to lock the CPU hot-plug lock, that zram already holds.
> > 
> > And we think that we (not exactly "we", our partners) actually
> > see a deadlock. It goes something like this:
> > 
> > - path A. zram grabs CPU hot-plug lock, execs /sbin/modprobe from crypto
> >   and waits for modprobe to finish
> 
> Nope, can't do that.
> 
> > disksize_store
> >  zcomp_create
> >   __cpuhp_state_add_instance
> >    __cpuhp_state_add_instance_cpuslocked
> >     zcomp_cpu_up_prepare
> >      crypto_alloc_base
> >       crypto_alg_mod_lookup
> >        call_usermodehelper_exec
> >         wait_for_completion_killable
> >          do_wait_for_common
> >           schedule
> 
> The usermode helper is free to do anything it wants, including
> operations that take the CPU hotplug lock.  Or operations which might
> in the future be changed to take that lock.

Agreed.

> > - path B. async work kthread that brings in scsi device. It wants to
> >   register CPUHP states at some point, and it needs the CPU hot-plug
> >   lock for that, which is owned by zram.
> > 
> > async_run_entry_fn
> >  scsi_probe_and_add_lun
> >   scsi_mq_alloc_queue
> >    blk_mq_init_queue
> >     blk_mq_init_allocated_queue
> >      blk_mq_realloc_hw_ctxs
> >       __cpuhp_state_add_instance
> >        __cpuhp_state_add_instance_cpuslocked
> >         mutex_lock
> >          schedule
> > 
> > - path C. modprobe sleeps, waiting for all aync works to finish.
> > 
> > load_module
> >  do_init_module
> >   async_synchronize_full
> >    async_synchronize_cookie_domain
> >     schedule
> > 
> > And none can make any progress.
> > 
> > So I think we need to move crypto_alg_mod_lookup()->call_usermodehelper_exec()
> > out of CPU hot-plug lock and pre-load modules in advance, before we grab the
> > hot-plug lock.
> 
> If the locking is fixed, why is there still a need to preload modules?

We "fix" locking by doing initial crypto compression algorithm lookup
outside of hot-plug lock (pre-load).

Crypto API handles a list of preloaded modules internally. What we do
currently, we call crypto_alloc_base() under hot-plug lock, which calls
crypto_alg_mod_lookup(), which figures out that crypto modules list does
not contain that module yet so then it modprobes it.

With this patch we do the first crypto_alg_mod_lookup() outside of
hot-plug lock, so that it safely modprobes compression module. Then
when we grab the hot-plug lock and setup per-CPU streams,
crypto_alloc_base()->crypto_alg_mod_lookup() figures that module is
already on the list so no modprobe is needed.



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux