> On Tue, Feb 16, 2021 at 7:26 PM Tomas Winkler <tomas.winkler@xxxxxxxxx>
> wrote:
> > Because the graphic card may undergo reset at any time and basically
> > hot unplug all its child devices, this series also provides a fix to
> > the mtd framework to make the reset graceful.
>
> Well, just because MTD does not work as you expect, it is not broken. :-)

I'm not saying it's broken by design; it just didn't fit this use case.

> In your case i915_spi_remove() blindly removes the MTD, this is not allowed.
> You may remove the MTD only if there are no more users.

I'm not sure it's a good idea to stall the removal on user space. This is
just asking for a deadlock, as user space is not getting what it needs and
may stall. I think it's better that user space fails gracefully; the HW is
not accessible at that stage anyway.

> The current model in MTD is that the driver is in charge of all life cycle
> management.
> Using ->_get_device() and ->_put_device() a driver can implement
> refcounting and deny new users if the MTD is about to disappear.

Please note that the use case you are describing is still valid. I haven't
removed the _get_device() and _put_device() handlers; you can still stall
the removal of the MTD. If that is not the case, it's a bug.

> In the upcoming MUSE driver that mechanism is used too.
> MUSE allows to implement a MTD in userspace. So the FUSE server can
> disappear at *any* time. Just like in your case. Even worse, it can be
> hostile.
> In MUSE the MTD life time is tied to the FUSE connection object,
> muse_mtd_get_device() increments the FUSE connection refcount, and
> muse_mtd_put_device() decrements it.
> That means if the FUSE server disappears all of a sudden but the MTD still has
> users, the MTD will stay. But in this state no new references are allowed and
> all MTD operations of existing users will fail with -ENOTCONN (via FUSE).
> As soon the last user is gone (can be userspace via /dev/mtd* or a in-kernel
> user such as UBIFS), the MTD will be removed.

But in our case the whole i915 is taken hostage; it cannot reset because of
misbehaving user space.

> For the full details, please see:
> https://git.kernel.org/pub/scm/linux/kernel/git/rw/misc.git/tree/fs/fuse/muse.c?h=muse_v3#n1034
>
> Is in your case *really* not possible to do it that way?

Maybe it's possible, but I don't think it's good to stall i915 removal.
Also, it's very easy to crash the kernel. I've posted a snippet to the
mailing list that tried to do that, and the kernel still crashed. Can you
take a look at it?

> On the other hand, your last patch moves some part of the life cycle
> management into MTD core.
> The MTD will stay as long it has users.
> But that's only one part. The driver is still in charge to make sure that all
> operations fail immediately and that no new users arrive.

I think in that case I would need to validate every HW access to make sure
it's still valid.

> If we want to do all in MTD core we'd have to do it like SCSI disks.
> That means having devices states such as SDEV_RUNNING, SDEV_CANCEL,
> SDEV_OFFLINE, ....
> That way the MTD could be shutdown gracefully, first no new users are
> allowed, then ongoing operations will be cancelled, next all operation will fail
> with -EIO or such, then the device is being removed from sysfs and finally if
> the last user is gone, the MTD can be removed.

Isn't it already that way? You cannot open a new handle.
On that I would need more of your insight.

> I'm not sure whether we want to take that path.

Thanks
Tomas
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx