On Fri, Dec 09, 2022 at 03:23:12PM +0100, Sergei Shtepa wrote: > + blk_mq_freeze_queue(bdev->bd_queue); > + blk_mq_quiesce_queue(bdev->bd_queue); I don't think we need the quiesce here (or in the detach path). quiesce works as the low-level blk-mq dispatch level, and we sit way above that. > +EXPORT_SYMBOL(bdev_filter_attach); Please use EXPORT_SYMBOL_GPL for new low-level block layer features. > +int bdev_filter_detach(struct block_device *bdev) > +{ > + int ret = 0; > + struct bdev_filter *flt = NULL; > + > + blk_mq_freeze_queue(bdev->bd_queue); > + blk_mq_quiesce_queue(bdev->bd_queue); > + > + flt = bdev->bd_filter; > + if (flt) > + bdev->bd_filter = NULL; > + else > + ret = -ENOENT; Not having a filter is a grave error that the caller can't do anything about. So pleas just WARN_ON_ONCE for that case, and change the function to a void return. > + if (bio->bi_bdev->bd_filter && !bio_flagged(bio, BIO_FILTERED)) { > + bool pass; > + > + pass = bio->bi_bdev->bd_filter->fops->submit_bio_cb(bio); > + bio_set_flag(bio, BIO_FILTERED); > + if (!pass) { > + bio->bi_status = BLK_STS_OK; > + bio_endio(bio); > + return; > + } Instead of ending the bio here for the !pass case it seems to me it would make more sense to just pass ownership to the filter driver and let the driver complete it. That would allow error returns for that case, or handling it from a workqueue. I'd also change the polarity so that true means "I've taken ownership". I.e.: if (bio->bi_bdev->bd_filter && !bio_flagged(bio, BIO_FILTERED)) { bio_set_flag(bio, BIO_FILTERED); if (bio->bi_bdev->bd_filter->fops->submit_bio_cb(bio)) return; } > +struct bdev_filter_operations { > + bool (*submit_bio_cb)(struct bio *bio); > + void (*release_cb)(struct kref *kref); > +}; Nit: I don't think these _cb postfixes are very useful. All methods in a method table are sort of callbacks. > +/** > + * bdev_filter_get - Increment reference counter. > + * @flt: > + * Pointer to the &struct bdev_filter. > + * > + * Allows to ensure that the filter will not be released as long as there are > + * references to it. > + */ > +static inline void bdev_filter_get(struct bdev_filter *flt) > +{ > + kref_get(&flt->kref); > +} Looking at the callers in blksnap I'm not sure this works. The pattern seems to be the driver has a block device reference, and then uses bdev_filter_get to grab a filter reference. But what prevents the filter from going away just bdev_filter_get is called? At a minimum we'd need a something: static inline struct bdev_filter *bdev_filter_get(struct block_device *bdev) { struct bdev_filter *flt; rcu_read_lock(); flt = rcu_dereference(bdev->bd_filter); if (!refcount_inc_not_zero(&flt->refs)) flt = NULL; rcu_read_unlock(); return flt; } with bd_filter switched to __rcu annotation, the kref replaced with a refcount_t, updates switched to cmpxchg and a rcu_synchronize() after clearing bd_filter.