On Tue, Oct 05, 2021 at 01:55:35PM -0700, Kees Cook wrote:
> On Mon, Sep 27, 2021 at 09:38:04AM -0700, Luis Chamberlain wrote:
> > Provide a simple state machine to fix races between driver exit, where
> > we remove the CPU multistate callbacks, and re-initialization / creation
> > of new per-CPU instances, which should be managed by these callbacks.
> >
> > The zram driver makes use of CPU hotplug multistate support, whereby it
> > associates a struct zcomp per CPU. Each struct zcomp represents a
> > compression algorithm in charge of managing compression streams per
> > CPU. Although a compiled zram driver only supports a fixed set of
> > compression algorithms, each zram device gets a struct zcomp allocated
> > per CPU. The "multi" in CPU hotplug multistate refers to these per-CPU
> > struct zcomp instances. Each of these will have the CPU hotplug
> > callback called for it on CPU plug / unplug. The kernel's CPU hotplug
> > multistate keeps a linked list of these different structures so that
> > it will iterate over them on CPU transitions.
> >
> > By default at driver initialization we will create just one zram device
> > (num_devices=1) and a zcomp structure then set for the now-default
> > lzo-rle compression algorithm. At driver removal we first remove each
> > zram device, and so we destroy the associated struct zcomp per CPU. But
> > since we expose sysfs attributes to create new devices or reset /
> > initialize existing zram devices, we can easily end up re-initializing
> > a struct zcomp for a zram device before the exit routine of the module
> > removes the CPU hotplug callback. When this happens the kernel's CPU
> > hotplug will detect that at least one instance (struct zcomp for us)
> > exists.
> > This can happen in the following situation:
> >
> > CPU 1                            CPU 2
> >
> > disksize_store(...);
> >                                  class_unregister(...);
> >                                  idr_for_each(...);
> >                                  zram_debugfs_destroy();
> >
> > idr_destroy(...);
> >                                  unregister_blkdev(...);
> >                                  cpuhp_remove_multi_state(...);
>
> So this is strictly separate from the sysfs/module unloading race?

It is only related in the sense that the sysfs / module unloading race
happened *after* this other issue, but addressing these through separate
threads created a break in conversation and focus. For instance, a
theoretical race was mentioned in one thread, which I worked to prove or
disprove, and in the end I showed it was not possible. But at this
point, yes, this is a purely separate issue, and this patch *should* be
picked up already.

Andrew, can you merge this? It already has the respective maintainer's
Ack, and I can continue to work on the rest of the patches. The only
issue I can think of would be a conflict with the last patch, but that's
a one-liner; I think chances are low that it would create a conflict if
it's all merged separately, and if so, it should be an easy fix for a
merge conflict.

  Luis