> > On Tue, Feb 16, 2021 at 7:26 PM Tomas Winkler
> > <tomas.winkler@xxxxxxxxx> wrote:
> > > Because the graphic card may undergo reset at any time and basically
> > > hot unplug all its child devices, this series also provides a fix to
> > > the mtd framework to make the reset graceful.
> >
> > Well, just because MTD does not work as you expect, it is not broken.
> > :-)
>
> I'm not saying it's broken by design; it just didn't fit this use case.
>
> > In your case i915_spi_remove() blindly removes the MTD, this is not
> > allowed.
> > You may remove the MTD only if there are no more users.
>
> I'm not sure it's a good idea to stall the removal on user space.
> That is just asking for a deadlock, as user space is not getting what
> it needs and may stall. I think it's better that user space fails
> gracefully; the HW is not accessible at that stage anyway.
>
> > The current model in MTD is that the driver is in charge of all life
> > cycle management.
> > Using ->_get_device() and ->_put_device() a driver can implement
> > refcounting and deny new users if the MTD is about to disappear.
>
> Please note that the use case you are describing is still valid. I
> haven't removed the _get_device()/_put_device() handlers, so you can
> still stall the removal of the MTD. If it doesn't work that way, it's
> a bug.
>
> > In the upcoming MUSE driver that mechanism is used too.
> > MUSE allows implementing an MTD in userspace. So the FUSE server can
> > disappear at *any* time. Just like in your case. Even worse, it can
> > be hostile.
> > In MUSE the MTD life time is tied to the FUSE connection object,
> > muse_mtd_get_device() increments the FUSE connection refcount, and
> > muse_mtd_put_device() decrements it.
> > That means if the FUSE server disappears all of a sudden but the MTD
> > still has users, the MTD will stay. But in this state no new
> > references are allowed and all MTD operations of existing users will
> > fail with -ENOTCONN (via FUSE).
> > As soon as the last user is gone (can be userspace via /dev/mtd* or
> > an in-kernel user such as UBIFS), the MTD will be removed.
>
> But in our case the whole i915 is taken hostage; it cannot reset
> because of misbehaving user space.
>
> > For the full details, please see:
> > https://git.kernel.org/pub/scm/linux/kernel/git/rw/misc.git/tree/fs/fuse/muse.c?h=muse_v3#n1034
> >
> > Is it in your case *really* not possible to do it that way?
>
> Maybe it's possible, but I don't think it's good to stall i915 removal.
> Also, it's very easy to crash the kernel.
> I've posted a snippet to the mailing list that tried to do that, and
> the kernel still crashed. Can you look at it?
>
> > On the other hand, your last patch moves some part of the life cycle
> > management into MTD core.
> > The MTD will stay as long as it has users.
> > But that's only one part. The driver is still in charge to make sure
> > that all operations fail immediately and that no new users arrive.
>
> I think in that case I would need to validate every HW access to make
> sure it's still valid.
>
> > If we want to do all in MTD core we'd have to do it like SCSI disks.
> > That means having device states such as SDEV_RUNNING, SDEV_CANCEL,
> > SDEV_OFFLINE, ....
> > That way the MTD could be shut down gracefully: first no new users
> > are allowed, then ongoing operations will be cancelled, next all
> > operations will fail with -EIO or such, then the device is being
> > removed from sysfs and finally, if the last user is gone, the MTD
> > can be removed.
>
> Isn't it already that way? You cannot open a new handle. Here I would
> need more of your insights.
>
> > I'm not sure whether we want to take that path.

Hi Richard, is there any way we can try to unclutter this?

Thanks
Tomas

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
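
For readers following the thread, the graceful shutdown sequence described
above (no new users, existing operations fail, removal once the last user
is gone) can be sketched as a small userspace C model. This is only an
illustration under stated assumptions: struct model_mtd, model_get(),
model_put(), model_read() and model_request_removal() are hypothetical
names, not MTD, MUSE or SCSI APIs; the error codes are picked for the
example; and a real driver would need locking, which is left out here.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <errno.h>

enum model_state {
    MODEL_RUNNING,  /* normal operation, new users are accepted */
    MODEL_OFFLINE,  /* parent is gone: no new users, I/O fails */
    MODEL_REMOVED   /* last user dropped, object could be freed */
};

struct model_mtd {
    enum model_state state;
    int users;      /* open references, standing in for a refcount */
};

static void model_init(struct model_mtd *m)
{
    m->state = MODEL_RUNNING;
    m->users = 0;
}

/* Similar in spirit to a ->_get_device() hook: deny new users once the
 * device is about to disappear. -ENODEV is an assumed error code. */
static int model_get(struct model_mtd *m)
{
    if (m->state != MODEL_RUNNING)
        return -ENODEV;
    m->users++;
    return 0;
}

/* Similar in spirit to ->_put_device(): the last put after a removal
 * request completes the removal. */
static void model_put(struct model_mtd *m)
{
    assert(m->users > 0);
    m->users--;
    if (m->users == 0 && m->state == MODEL_OFFLINE)
        m->state = MODEL_REMOVED;
}

/* Every operation checks the state first, so existing users fail
 * immediately instead of touching hardware that is no longer there. */
static int model_read(struct model_mtd *m)
{
    if (m->state != MODEL_RUNNING)
        return -EIO;
    return 0;
}

/* Called when the parent device (for example a GPU under reset) goes
 * away. It never waits for user space. */
static void model_request_removal(struct model_mtd *m)
{
    m->state = (m->users == 0) ? MODEL_REMOVED : MODEL_OFFLINE;
}
```

In this model the removal request returns immediately even while a user
still holds the device open; that user only sees errors from then on, and
the object reaches MODEL_REMOVED when it finally calls model_put().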