On Sat, Nov 19, 2022 at 02:19:43AM +0000, Al Viro wrote:
> On Mon, Nov 14, 2022 at 05:26:32AM +0100, Christoph Hellwig wrote:
> > Hi Jens,
> >
> > this series cleans up the registration of the "queue/" kobject, and given
> > untangles it from the request_queue refcounting.
> >
> > Changes since v1:
> >  - also change the blk_crypto_sysfs_unregister prototype
> >  - add two patches to fix the error handling in blk_register_queue
>
> Umm...  Do we ever want access to queue parameters of the stuff that has
> a queue, but no associated gendisk?  SCSI tape, for example...
>
> Re refcounting: AFAICS, blk_mq_alloc_disk_for_queue() is broken.
> [snip]
> can't be right - we might fail in blk_get_queue(), returning NULL with
> unchanged refcount, we might succeed and return the new gendisk that
> has consumed the extra reference grabbed by blk_get_queue() *OR*
> we might grab an extra reference, fail in __alloc_disk_node() and
> return NULL with refcount on q bumped.  No way for caller to tell these
> failure modes from each other...  The callers (both sd and sr) treat
> both as "no reference grabbed", i.e. leak the queue refcount if they
> fail past grabbing the queue.

Speaking of leaks, how can this

	q = blk_mq_init_queue(&sdev->host->tag_set);
	if (IS_ERR(q)) {
		/* release fn is set up in scsi_sysfs_device_initialise, so
		 * have to free and put manually here */
		put_device(&starget->dev);
		kfree(sdev);
		goto out;
	}
	kref_get(&sdev->host->tagset_refcnt);
	sdev->request_queue = q;
	q->queuedata = sdev;
	__scsi_init_queue(sdev->host, q);

	depth = sdev->host->cmd_per_lun ?: 1;

	/*
	 * Use .can_queue as budget map's depth because we have to
	 * support adjusting queue depth from sysfs. Meantime use
	 * default device queue depth to figure out sbitmap shift
	 * since we use this queue depth most of times.
	 */
	if (scsi_realloc_sdev_budget_map(sdev, depth)) {
		put_device(&starget->dev);
		kfree(sdev);
		goto out;
	}
	...
out:
	if (display_failure_msg)
		printk(ALLOC_FAILURE_MSG, __func__);
	return NULL;

in scsi_alloc_sdev() possibly avoid leaking sdev->request_queue on the second
failure exit?  AFAICS scsi_realloc_sdev_budget_map() will see NULL in
sdev->budget_map.map, attempt

	ret = sbitmap_init_node(&sdev->budget_map,
				scsi_device_max_queue_depth(sdev),
				new_shift, GFP_KERNEL,
				sdev->request_queue->node, false, true);

and if that fails - return without having even looked at sdev->request_queue.
Then we drop starget->dev (which has no way to observe sdev or anything in it)
and kfree sdev, which leaves q the only place where we have the address of
queue.  And we don't look at q after that point...

Shouldn't we do blk_mq_destroy_queue()/blk_put_queue() on that failure exit?
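
To make the leak on that failure exit concrete, here is a minimal userspace
model of it.  This is not kernel code and not a proposed patch: every mock_*
name and the live_queues counter are hypothetical stand-ins, mock_put_queue()
merely stands in for the blk_mq_destroy_queue()/blk_put_queue() pair, and
everything else scsi_alloc_sdev() does (starget, the tagset reference) is
ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct request_queue: only a refcount. */
struct mock_queue {
	int refcount;
};

/* Hypothetical stand-in for struct scsi_device. */
struct mock_sdev {
	struct mock_queue *request_queue;
};

/* Number of queues allocated and not yet freed; nonzero at exit == leak. */
int live_queues;

/* Models blk_mq_init_queue(): new queue holding one reference. */
struct mock_queue *mock_init_queue(void)
{
	struct mock_queue *q = calloc(1, sizeof(*q));

	if (!q)
		return NULL;
	q->refcount = 1;
	live_queues++;
	return q;
}

/* Models dropping a queue reference; frees on the last put. */
void mock_put_queue(struct mock_queue *q)
{
	assert(q->refcount > 0);
	if (--q->refcount == 0) {
		live_queues--;
		free(q);
	}
}

/*
 * Models the tail of scsi_alloc_sdev().  budget_map_fails simulates
 * scsi_realloc_sdev_budget_map() returning an error; drop_queue selects
 * whether the failure exit releases the queue before freeing sdev.
 */
struct mock_sdev *mock_alloc_sdev(bool budget_map_fails, bool drop_queue)
{
	struct mock_sdev *sdev = calloc(1, sizeof(*sdev));
	struct mock_queue *q;

	if (!sdev)
		return NULL;

	q = mock_init_queue();
	if (!q) {
		free(sdev);
		return NULL;
	}
	sdev->request_queue = q;

	if (budget_map_fails) {
		if (drop_queue)
			mock_put_queue(q);	/* the release asked about above */
		free(sdev);	/* from here on only the local q knows the queue */
		return NULL;
	}
	return sdev;
}
```

Calling mock_alloc_sdev(true, true) leaves live_queues unchanged, while
mock_alloc_sdev(true, false) returns NULL with live_queues bumped by one and
nothing left that could ever drop it - the same shape as the second failure
exit quoted above.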