On Wed, 7 Nov 2018 15:43:36 -0800 Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Tue, 23 Oct 2018 08:30:04 +0800 kernel test robot <rong.a.chen@xxxxxxxxx> wrote:
>
> > Greetings,
> >
> > 0day kernel testing robot got the below dmesg and the first bad commit is
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> >
> > commit d50d82faa0c964e31f7a946ba8aba7c715ca7ab0
> > Author: Mikulas Patocka <mpatocka@xxxxxxxxxx>
> > AuthorDate: Wed Jun 27 23:26:09 2018 -0700
> > Commit: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> > CommitDate: Thu Jun 28 11:16:44 2018 -0700
> >
> >     slub: fix failure when we delete and create a slab cache
>
> This is ugly. Is there an alternative way of fixing the race which
> Mikulas attempted to address? Possibly cancel the work and reuse the
> existing sysfs file, or is that too stupid to live?
>
> 3b7b314053d021 ("slub: make sysfs file removal asynchronous") was
> pretty lame, really. As mentioned,
>
> : It'd be the cleanest to deal with the issue by removing sysfs files
> : without holding slab_mutex before the rest of shutdown; however, given
> : the current code structure, it is pretty difficult to do so.
>
> Would be a preferable approach.
>
> >
> >     This uncovered a bug in the slub subsystem - if we delete a cache and
> >     immediatelly create another cache with the same attributes, it fails
> >     because of duplicate filename in /sys/kernel/slab/. The slub subsystem
> >     offloads freeing the cache to a workqueue - and if we create the new
> >     cache before the workqueue runs, it complains because of duplicate
> >     filename in sysfs.

Alternatively, could we flush the workqueue before attempting to
(re)create the sysfs file? Extra points for only doing this if the
first (re)creation attempt returned -EEXIST?
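
To make the control flow of that last suggestion concrete, here is a rough
userspace sketch. It is a toy model and not kernel code: every name in it
(model_sysfs_add(), model_queue_removal(), model_flush_removals(),
model_sysfs_add_retry()) is made up for illustration, the "sysfs directory"
is just an array of strings, and the "workqueue" is a list of pending names.
In the kernel the flush step would presumably be a flush_work() or
flush_workqueue() call on whatever runs the deferred removal, and locking
(slab_mutex) is ignored here entirely.

```c
/* Toy userspace model, NOT kernel code. All names are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MAX_ENTRIES 8
#define NAME_LEN    32

/* Stand-in for the names currently visible under /sys/kernel/slab/. */
static char entries[MAX_ENTRIES][NAME_LEN];
static int nr_entries;

/* Stand-in for removals queued on a workqueue but not yet run. */
static char pending[MAX_ENTRIES][NAME_LEN];
static int nr_pending;

int find_entry(const char *name)
{
	for (int i = 0; i < nr_entries; i++)
		if (strcmp(entries[i], name) == 0)
			return i;
	return -1;
}

/* Models creating the sysfs file: a duplicate name fails with -EEXIST. */
int model_sysfs_add(const char *name)
{
	if (find_entry(name) >= 0)
		return -EEXIST;
	if (nr_entries == MAX_ENTRIES || strlen(name) >= NAME_LEN)
		return -ENOMEM;
	strcpy(entries[nr_entries++], name);
	return 0;
}

/* Models cache destruction: the name stays visible until the work runs. */
void model_queue_removal(const char *name)
{
	assert(nr_pending < MAX_ENTRIES && strlen(name) < NAME_LEN);
	strcpy(pending[nr_pending++], name);
}

/* Models flushing the workqueue: run every deferred removal now. */
void model_flush_removals(void)
{
	for (int i = 0; i < nr_pending; i++) {
		int idx = find_entry(pending[i]);

		if (idx >= 0) {
			/* Move the last entry into the freed slot. */
			nr_entries--;
			if (idx != nr_entries)
				strcpy(entries[idx], entries[nr_entries]);
		}
	}
	nr_pending = 0;
}

/* Try once; flush and retry only if the first attempt hit -EEXIST. */
int model_sysfs_add_retry(const char *name)
{
	int err = model_sysfs_add(name);

	if (err == -EEXIST) {
		model_flush_removals();
		err = model_sysfs_add(name);
	}
	return err;
}
```

In this model the common case pays nothing for the flush, a name that is
only waiting on a deferred removal succeeds on the second attempt, and a
genuine duplicate still comes back as -EEXIST after the flush.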