On 2021/02/17 15:42, Johannes Thumshirn wrote:
> On 17/02/2021 00:33, Damien Le Moal wrote:
>> On 2021/02/17 4:42, Dan Carpenter wrote:
>>> Hello Johannes Thumshirn,
>>>
>>> The patch 5795eb443060: "scsi: sd_zbc: emulate ZONE_APPEND commands"
>>> from May 12, 2020, leads to the following static checker warning:
>>>
>>> 	drivers/scsi/sd_zbc.c:741 sd_zbc_revalidate_zones()
>>> 	error: kvmalloc() only makes sense with GFP_KERNEL
>>>
>>> drivers/scsi/sd_zbc.c
>>>    721          /*
>>>    722           * There is nothing to do for regular disks, including host-aware disks
>>>    723           * that have partitions.
>>>    724           */
>>>    725          if (!blk_queue_is_zoned(q))
>>>    726                  return 0;
>>>    727
>>>    728          /*
>>>    729           * Make sure revalidate zones are serialized to ensure exclusive
>>>    730           * updates of the scsi disk data.
>>>    731           */
>>>    732          mutex_lock(&sdkp->rev_mutex);
>>>    733
>>>    734          if (sdkp->zone_blocks == zone_blocks &&
>>>    735              sdkp->nr_zones == nr_zones &&
>>>    736              disk->queue->nr_zones == nr_zones)
>>>    737                  goto unlock;
>>>    738
>>>    739          sdkp->zone_blocks = zone_blocks;
>>>    740          sdkp->nr_zones = nr_zones;
>>>    741          sdkp->rev_wp_offset = kvcalloc(nr_zones, sizeof(u32), GFP_NOIO);
>>>                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> We're passing GFP_NOIO here so it just defaults to kcalloc() and will
>>> not vmalloc() the memory.
>>
>> Indeed... And the allocation can get a little too big for kmalloc().
>>
>> Johannes, I think we need to move that allocation before the rev_mutex locking,
>> using a local var for the allocated address, and then using GFP_KERNEL should be
>> safe... But not entirely sure. Using kmalloc would be simpler but on large SMR
>> drives, that allocation will soon need to be 400K or so (i.e. 100,000 zones or
>> even more), too large for kmalloc to succeed reliably.
>>
>
>
> No I don't think so. A mutex isn't a spinlock so we can sleep on the allocation.
> We can't use GFP_KERNEL as we're about to do I/O. blk_revalidate_disk_zones() called
> a few line below also does the memalloc_noio_{save,restore}() dance.

Yes, but blk_revalidate_disk_zones() only allocates the zone bitmaps and these
are much smaller. So kmalloc is fine and GFP_NOIO is natural. For the wp array,
I think we really need kvmalloc() due to the potential very large size (and
growing with new drive models) and GFP_NOIO does not work for that. Not sure if
memalloc_noio_{save,restore}() can change that in vmalloc context (I do not
think so).

> Would a kmem_cache for these revalidations help us in any way?

I am not familiar with that... Would need to dig into it.

For this to be safe, we only need to guarantee forward progress, and in this
case this means not causing problems if a GFP_KERNEL allocation causes us to
reenter the scsi driver for I/Os. Since (I think) revalidation never happens in
FS or I/O context, GFP_KERNEL allocation should be safe if done outside of the
rev_mutex lock. Not 100% sure here, just a hunch... We may need to check the
block layer level to check if there are any locks being held when revalidation
triggers.

-- 
Damien Le Moal
Western Digital Research
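
To illustrate the "allocate before taking rev_mutex, using a local var" idea
discussed above, here is a rough userspace sketch. It is not a patch and not
kernel code: struct disk_state, disk_state_init(), wp_array_bytes() and
revalidate_zones() are hypothetical names, calloc() merely stands in for
kvcalloc(..., GFP_KERNEL), and a pthread mutex stands in for rev_mutex.
Whether GFP_KERNEL is actually safe in that position is exactly the open
question in this thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the zone fields of struct scsi_disk. */
struct disk_state {
	pthread_mutex_t rev_mutex;
	uint32_t nr_zones;
	uint32_t zone_blocks;
	uint32_t *rev_wp_offset;
};

static void disk_state_init(struct disk_state *d)
{
	assert(d != NULL);
	pthread_mutex_init(&d->rev_mutex, NULL);
	d->nr_zones = 0;
	d->zone_blocks = 0;
	d->rev_wp_offset = NULL;
}

/* One 32-bit write pointer offset per zone: 100,000 zones need 400,000 bytes. */
static size_t wp_array_bytes(uint32_t nr_zones)
{
	return (size_t)nr_zones * sizeof(uint32_t);
}

/*
 * Allocate into a local variable before locking, then publish the new
 * array under the lock. Returns 0 on success, -1 if allocation fails.
 */
static int revalidate_zones(struct disk_state *d, uint32_t nr_zones,
			    uint32_t zone_blocks)
{
	uint32_t *new_wp, *old_wp;

	assert(d != NULL);

	/* Assumption: calloc() plays the role of kvcalloc(..., GFP_KERNEL). */
	new_wp = calloc(nr_zones, sizeof(*new_wp));
	if (!new_wp)
		return -1;

	pthread_mutex_lock(&d->rev_mutex);
	if (d->zone_blocks == zone_blocks && d->nr_zones == nr_zones) {
		/* Nothing changed: keep the current array, drop the new one. */
		old_wp = new_wp;
	} else {
		old_wp = d->rev_wp_offset;
		d->zone_blocks = zone_blocks;
		d->nr_zones = nr_zones;
		d->rev_wp_offset = new_wp;
	}
	pthread_mutex_unlock(&d->rev_mutex);

	/* Free whichever array is no longer needed, outside the lock. */
	free(old_wp);
	return 0;
}
```

The cost of this ordering is one wasted allocation whenever the geometry turns
out to be unchanged, in exchange for never allocating while the mutex is held.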