Darrick,

On 2019/12/24 13:28, Darrick J. Wong wrote:
>> [...]
>>>> +
>>>> +static int zonefs_get_zone_info(struct zonefs_zone_data *zd)
>>>> +{
>>>> +	struct block_device *bdev = zd->sb->s_bdev;
>>>> +	int ret;
>>>> +
>>>> +	zd->zones = kvcalloc(blkdev_nr_zones(bdev->bd_disk),
>>>> +			     sizeof(struct blk_zone), GFP_KERNEL);
>>>
>>> Hmm, so one 64-byte blk_zone structure for each zone on the disk?
>>>
>>> I have a 14TB SMR disk with ~459,000x 32M zones on it. That's going to
>>> require a contiguous 30MB memory allocation to hold all the zone
>>> information. Even your 15T drive from the commit message will need a
>>> contiguous 3.8MB memory allocation for all the zone info.
>>>
>>> I wonder if each zone should really be allocated separately and then
>>> indexed with an xarray or something like that to reduce the chance of
>>> failure when memory is fragmented or tight.
>>>
>>> That could be subsequent work though, since in the meantime that just
>>> makes zonefs mounts more likely to run out of memory and fail. I
>>> suppose you don't hang on to the huge allocation for very long.
>>
>> No, this memory allocation is only for mount. It is dropped as soon as
>> all the zone file inodes are created. Furthermore, this allocation is a
>> kvalloc, not a kmalloc. So there is no memory continuity requirement.
>> This is only an array of structures and that is not used to do IOs for
>> the report zone itself.
>>
>> I debated trying to optimize (I mean reducing the mount temporary memory
>> use) by processing mount in small chunks of zones instead of all zones
>> in one go. I kept simple, but rather brutal, approach to keep the code
>> simple. This can be rewritten and optimized at any time if we see
>> problems appearing.
>
> <nod> vmalloc space is quite limited on 32-bit platforms, so that's the
> most likely place you'll get complaints.

Yes, agreed.
But since the main use case for host-managed zoned drives (HDDs or SSDs) is
enterprise servers, 32-bit architectures are unlikely to be an issue. So for
now, if there is no strong opposition, I would like to keep the initialization
as it is and revisit it later if problems are reported.

-- 
Damien Le Moal
Western Digital Research
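
For reference, the temporary allocation sizes discussed in the quoted text can be checked with a small userspace sketch. This is not zonefs code: the helper names `nr_zones_for()` and `zone_array_bytes()` are hypothetical, the 64-byte descriptor size is taken from the "64-byte blk_zone" figure quoted above, and the 256 MiB zone size for the 15T case is an assumption that is not stated in this thread.

```c
#include <stdint.h>

/* Per-zone descriptor size, per the "64-byte blk_zone" figure quoted above. */
#define ZONE_DESC_BYTES 64ULL

/* Hypothetical helper: number of zones on a device, rounding up so that a
 * smaller last zone is still counted. */
static uint64_t nr_zones_for(uint64_t capacity_bytes, uint64_t zone_bytes)
{
	return (capacity_bytes + zone_bytes - 1) / zone_bytes;
}

/* Hypothetical helper: bytes needed for one flat array holding a descriptor
 * for every zone, i.e. what a single kvcalloc() call would have to provide. */
static uint64_t zone_array_bytes(uint64_t nr_zones)
{
	return nr_zones * ZONE_DESC_BYTES;
}
```

Taking the capacities as binary units, 14 TiB with 32 MiB zones gives 458,752 zones and a 28 MiB array, in line with the "~459,000" and "30MB" estimates above. Under the assumed 256 MiB zone size, 15 TiB gives 61,440 zones and a 3.75 MiB array.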