On Wed, Feb 19, 2020 at 03:38:00PM +0900, Damien Le Moal wrote:
> The block layer generic blk_revalidate_disk_zones() checks the validity
> of zone descriptors reported by a disk using the
> blk_revalidate_zone_cb() callback function executed for each zone
> descriptor. If a ZBC disk reports invalid zone descriptors,
> blk_revalidate_disk_zones() returns an error and sd_zbc_read_zones()
> changes the disk capacity to 0, which in turn results in the gendisk
> structure capacity being set to 0. This all works well for the first
> revalidate pass on a disk and the block layer detects the capacity
> change.
>
> On the second revalidate pass, blk_revalidate_disk_zones() is called
> again and sd_zbc_report_zones() is executed to check the zones a second
> time. However, for this second pass, the gendisk capacity is now 0,
> which causes sd_zbc_report_zones() to do nothing and to report
> success and no zones. blk_revalidate_disk_zones() in turn returns
> success and sets the disk queue chunk_sectors limit to zero as
> no zones were checked, causing an oops to trigger on the
> BUG_ON(!is_power_of_2(chunk_sectors)) in blk_queue_chunk_sectors().
>
> Fix this by using the sdkp capacity field rather than the gendisk
> capacity for the report zones loop in sd_zbc_report_zones(). Also add a
> check to immediately return an error if the sdkp capacity is 0.
> With this fix, scanning an invalid/buggy ZBC disk does not trigger an
> oops and the disk is exposed with a capacity of 0. This change also
> preserves the chance for the disk to be correctly revalidated on the
> second revalidate pass, as the scsi disk structure capacity field is
> always set to the disk reported value when sd_zbc_report_zones() is
> called.
>
> Fixes: d41003513e61 ("block: rework zone reporting")
> Cc: <stable@xxxxxxxxxxxxxxx> # v5.5
> Signed-off-by: Damien Le Moal <damien.lemoal@xxxxxxx>

Looks good,

Reviewed-by: Christoph Hellwig <hch@xxxxxx>
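
For readers following along without the diff, here is a minimal userspace sketch of the failure mode and the fix described in the commit message. It is not the kernel code: the model_* types, both report_zones_* functions and the -ENODEV return value are hypothetical stand-ins chosen for illustration, and the real sd_zbc_report_zones() issues REPORT ZONES commands rather than counting zones in a loop.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the gendisk and scsi_disk structures. */
struct model_gendisk {
	uint64_t capacity;	/* set to 0 once zone revalidation fails */
};

struct model_sdkp {
	uint64_t capacity;	/* always the capacity the device reported */
	struct model_gendisk disk;
};

/* Same test as the kernel's is_power_of_2(): 0 is not a power of 2. */
bool model_is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Behavior before the fix: the loop is bounded by the gendisk capacity.
 * With a capacity of 0 the loop body never runs, so the function reports
 * success with zero zones.
 */
int report_zones_old(const struct model_sdkp *sdkp, uint64_t zone_sectors,
		     unsigned int *nr_zones)
{
	uint64_t sector;

	assert(sdkp != NULL && nr_zones != NULL && zone_sectors != 0);

	*nr_zones = 0;
	for (sector = 0; sector < sdkp->disk.capacity; sector += zone_sectors)
		(*nr_zones)++;
	return 0;
}

/*
 * Behavior after the fix: fail early when the sdkp capacity is 0 and
 * bound the loop by the sdkp capacity instead of the gendisk capacity.
 * The error code is an assumption made for this sketch.
 */
int report_zones_new(const struct model_sdkp *sdkp, uint64_t zone_sectors,
		     unsigned int *nr_zones)
{
	uint64_t sector;

	assert(sdkp != NULL && nr_zones != NULL && zone_sectors != 0);

	*nr_zones = 0;
	if (!sdkp->capacity)
		return -ENODEV;

	for (sector = 0; sector < sdkp->capacity; sector += zone_sectors)
		(*nr_zones)++;
	return 0;
}
```

With a device capacity of 1024 sectors, 256-sector zones and a gendisk capacity already reset to 0, report_zones_old() returns success with no zones (the state that leads to a chunk_sectors value of 0, which fails the power-of-2 test), while report_zones_new() still walks all four zones, so the second revalidate pass gets a real chance to check them.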