Re: [RFC PATCH v2] bcache: export zoned information for backing device

Hi Coly,

On 2020/05/11 2:24, Coly Li wrote:
> On 2020/5/11 00:52, Coly Li wrote:
>> This is very basic zoned device support. With this patch, the bcache
>> device is able to,
>> - Export zoned device attributes via sysfs
>> - Respond to report zones requests, e.g. by command 'blkzone report'
>> But the bcache device is still NOT able to,
>> - Respond to any zoned device management request or IOCTL command
>>
>> Here are the testings I have done,
>> - read /sys/block/bcache0/queue/zoned, content is 'host-managed'
>> - read /sys/block/bcache0/queue/nr_zones, content is number of zones
>>   including all zone types.
>> - read /sys/block/bcache0/queue/chunk_sectors, content is zone size
>>   in sectors.
>> - run 'blkzone report /dev/bcache0', all zones information displayed.
>> - run 'blkzone reset /dev/bcache0', operation is rejected with error
>>   information: "blkzone: /dev/bcache0: BLKRESETZONE ioctl failed:
>>   Operation not supported"
>> - Sequential writes by dd, I can see some zones' write pointer 'wptr'
>>   values updated.
>>
>> All of these are very basic testings, if you have better testing
>> tools or cases, please offer me hint.
>>
>> Thanks in advance for your review and comments.
>>
>> Signed-off-by: Coly Li <colyli@xxxxxxx>
>> CC: Hannes Reinecke <hare@xxxxxxxx>
>> CC: Damien Le Moal <damien.lemoal@xxxxxxx>
>> CC: Johannes Thumshirn <johannes.thumshirn@xxxxxxx>
>> ---
> 
> Hi Damien and Johannes,
> 
> With this patch the bcache device with a SMR drive can export the zone
> information and format zonefs on top of it. Writeback mode does not work
> at this moment (it requires on-disk format change and on my to-do list),
> writethrough and writearound mode can be used on the bcache device to
> accelerate random read when hitting.
> 
> During my testing, I found 2 things that need fixing.
> 
> 1, mkzonefs reports that the first zone size does not match.
>    Because bcache occupies the first 8KB of the backing SMR drive,
> the first zone size is 8KB less. By ignoring the unmatched zone 0 size,
> mkzonefs works and the bcache device is formatted.

Hannes was faster than me to comment on this. I can only repeat: this will not
work, as there is an assumption that all zones are the same size, except
possibly the last one. I am even surprised that mkzonefs worked... Looks like I
have some patching to do :)
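
To make the assumption concrete, here is a minimal userspace sketch of an
equal-zone-size check. It is not the kernel or mkzonefs code: the function
name zone_sizes_uniform() and the use of zone 0 as the reference size are
only for illustration, and the 524288-sector (256 MiB) zone size in the usage
note is an assumed value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return true if every zone has the same length as
 * zone 0, with the last zone allowed to be smaller.
 * Zone lengths are given in 512-byte sectors.
 */
bool zone_sizes_uniform(const uint64_t *zone_len, size_t nr_zones)
{
	assert(zone_len != NULL || nr_zones == 0);

	for (size_t i = 1; i < nr_zones; i++) {
		bool last = (i == nr_zones - 1);

		if (zone_len[i] == zone_len[0])
			continue;
		/* Only the last zone may be a smaller "runt" zone. */
		if (last && zone_len[i] < zone_len[0])
			continue;
		return false;
	}
	return true;
}
```

With a first zone that is 8KB (16 sectors) short, e.g. lengths
{524272, 524288, 524288}, the check fails at zone 1, which is the situation
described above.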

The simplest solution is, as Hannes pointed out, to use the first zone in its
entirety for the bcache super block and nothing else, and to not show this zone
in the zone report so that the bcache device user cannot reset or overwrite it.
As Hannes also pointed out, that means the report zones device method for
bcache will need to remap zone start and wp sectors to start from sector 0.
Basically, subtracting the zone size from every zone start, and from the zone
wp of non-full zones, will do.
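
A rough userspace sketch of that remapping, to show the arithmetic only. The
struct zone type, the zone_cond enum and remap_zone_report() are hypothetical
stand-ins, not the kernel's struct blk_zone or the bcache code, and what wp
should be reported for full zones is left untouched here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified zone descriptor (all values in sectors). */
enum zone_cond { ZONE_COND_EMPTY, ZONE_COND_OPEN, ZONE_COND_FULL };

struct zone {
	uint64_t start;		/* zone start sector */
	uint64_t len;		/* zone length */
	uint64_t wp;		/* write pointer position */
	enum zone_cond cond;	/* zone condition */
};

/*
 * Copy the backing device report from in[] to out[], hiding the first
 * zone (reserved for the super block) and shifting the remaining zones
 * down by one zone size so that the exported report starts at sector 0.
 * Returns the number of zones written to out[].
 */
size_t remap_zone_report(const struct zone *in, size_t nr_in,
			 uint64_t zone_size, struct zone *out)
{
	size_t nr_out = 0;

	assert(zone_size > 0);

	for (size_t i = 0; i < nr_in; i++) {
		/* Skip the zone holding the super block. */
		if (in[i].start < zone_size)
			continue;

		out[nr_out] = in[i];
		out[nr_out].start -= zone_size;
		/* Only non-full zones get their wp shifted. */
		if (in[i].cond != ZONE_COND_FULL)
			out[nr_out].wp -= zone_size;
		nr_out++;
	}
	return nr_out;
}
```

The write path would need the inverse translation (adding the zone size back)
before submitting I/O to the backing device.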

> 2, Direct I/O on files under seq/ directory can not be accessed.
>    I need help here. It seems direct I/O write fails with -EINVAL. I
> found the failure happens in fs/iomap/direct-io.c:iomap_dio_bio_actor(),
> 211         if ((pos | length | align) & ((1 << blkbits) - 1))
> 212                 return -EINVAL;
> 
> When I write to seq/1 file on offset 0 with 4096 bytes, in the above
> line, align is 205427296, and  (pos | length | align) & ((1 << blkbits)
> - 1) is non-zero. Then all writes to files under seq/ fail with -EINVAL.

205427296 is about 195.9 MiB, so offset 0 in file seq/1 does not align to the
zone 1 start sector. This is why you see the error (unaligned write). This is
due to the first zone not being the same size as the others, which zonefs
assumes.
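
For reference, the quoted check can be reproduced in userspace with the values
from your report. This is only a sketch: dio_alignment_check() is a made-up
wrapper around the same expression, and blkbits is not given in your mail, so
9 (512-byte blocks) and 12 (4096-byte blocks) are assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical wrapper around the expression quoted above: reject the
 * I/O if pos, length or align has any bit set below the block size.
 */
int dio_alignment_check(uint64_t pos, uint64_t length, uint64_t align,
			unsigned int blkbits)
{
	assert(blkbits < 64);

	uint64_t mask = ((uint64_t)1 << blkbits) - 1;

	if ((pos | length | align) & mask)
		return -EINVAL;
	return 0;
}
```

With pos = 0, length = 4096 and align = 205427296, the masked value is
non-zero for both assumed block sizes (205427296 is not a multiple of 512), so
the function returns -EINVAL, matching what you observed.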

If you could do all that, it probably also means that you forgot to call
blk_revalidate_disk_zones() when setting up the bcache device. That is necessary
to set all queue limits and check that the zone sizes are equal.

I will check your patch shortly.

> I guess there should be something missing when I do the direct I/O
> write. Could you all give me some hint ?
> 
> Thanks in advance.
> 
> Coly Li
> 


-- 
Damien Le Moal
Western Digital Research



