Re: [PATCH v4] sysfs: fix kobject refcount to address races with kobject removal

On Thu, Jul 22, 2021 at 02:31:37PM -0700, Luis Chamberlain wrote:
> On Wed, Jul 21, 2021 at 01:30:29PM +0200, Greg KH wrote:
> > On Thu, Jul 01, 2021 at 03:48:16PM -0700, Luis Chamberlain wrote:
> > > On Fri, Jun 25, 2021 at 02:56:03PM -0700, Luis Chamberlain wrote:
> > > > On Thu, Jun 24, 2021 at 01:09:03PM +0200, Greg KH wrote:
> > > > > thanks for making this change and sticking with it!
> > > > > 
> > > > > Oh, and with this change, does your modprobe/rmmod crazy test now work?
> > > > 
> > > > It does, but I wrote a test_sysfs driver and I believe I see an issue with
> > > > this. I'll debug a bit more and see what it was, and I'll then also use
> > > > the driver to demo the issue more clearly, and then verification can be
> > > > an easy selftest test.
> > > 
> > > OK my conclusion based on a new selftest driver I wrote is we can drop
> > > this patch safely. The selftest will cover this corner case well now.
> > > 
> > > In short: the kernfs active reference will ensure the store operation
> > > still exists. The kernfs mutex is not enough, but if the driver removes
> > > the operation prior to getting the active reference, the write will just
> > > fail. The dereferencing inside of the sysfs operation is opaque to
> > > kernfs, and while kernfs can't do anything to prevent a driver from
> > > doing something stupid, it can at least ensure that, for an open file,
> > > the op is not removed until the operation completes.
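
For readers following the thread, the active-reference behaviour described
in the quoted paragraph above can be modelled in userspace. This is only a
rough sketch and not kernel code: struct node, node_get_active(),
node_put_active(), node_deactivate(), node_drained(), model_store() and
demo_store() are all hypothetical names invented for illustration, and the
single biased counter is an assumption about how such a scheme might look.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace model of an "active reference"; not kernel code. */
#define NODE_DEACTIVATED_BIAS INT_MIN

struct node {
	/* >= 0: number of in-flight operations; < 0: node is being removed */
	atomic_int active;
};

static void node_init(struct node *n)
{
	atomic_init(&n->active, 0);
}

/* Take an active reference; fails once removal has started. */
static bool node_get_active(struct node *n)
{
	int v = atomic_load(&n->active);

	while (v >= 0) {
		/* On failure, v is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&n->active, &v, v + 1))
			return true;
	}
	return false;
}

static void node_put_active(struct node *n)
{
	int prev = atomic_fetch_sub(&n->active, 1);

	/* Catch a put without a matching get. */
	assert(prev != 0 && prev != NODE_DEACTIVATED_BIAS);
}

/* Removal side: block new references, existing ones stay valid. */
static void node_deactivate(struct node *n)
{
	atomic_fetch_add(&n->active, NODE_DEACTIVATED_BIAS);
}

/* True once removal has started and every in-flight operation finished. */
static bool node_drained(struct node *n)
{
	return atomic_load(&n->active) == NODE_DEACTIVATED_BIAS;
}

/* A write either runs under an active reference or fails cleanly. */
static int model_store(struct node *n, int (*store)(const char *),
		       const char *buf)
{
	int ret;

	if (!node_get_active(n))
		return -ENODEV;
	ret = store(buf);
	node_put_active(n);
	return ret;
}

/* Stand-in for a driver's store op: returns the number of bytes consumed. */
static int demo_store(const char *buf)
{
	return (int)strlen(buf);
}
```

In this model a writer that arrives after node_deactivate() gets -ENODEV,
while a writer that already holds a reference keeps the node from being
considered drained until it calls node_put_active(). What the store op
itself dereferences is outside the model, as the quoted text notes.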
> > 
> > Ok, so all is good?
> 
> It would seem to be the case.
> 
> > Then why is your zram test code blowing up so badly?
> 
> I checked the logs for the backtrace where the crash did happen
> and we did see clear evidence of the race we feared here. The *first*
> bug that happened was the CPU hotplug race:
> 
> [132004.787099] Error: Removing state 61 which has instances left.
> [132004.787124] WARNING: CPU: 17 PID: 9307 at ../kernel/cpu.c:1879 __cpuhp_remove_state_cpuslocked+0x1c4/0x1d0

I do not understand what this issue is. Is it fixed?  Why is a cpu being
hot unplugged at the same time as zram?

thanks,

greg k-h


