Re: [PATCH 00/13] bcache patches for Linux v5.13 -- 2nd wave


On 4/16/21 8:02 PM, Jens Axboe wrote:
> On 4/13/21 11:46 PM, Coly Li wrote:
>> Hi Jens,
>>
>> This is the 2nd wave of bcache patches for Linux v5.13. This series
>> contains patches to use NVDIMM to store the bcache journal, which is
>> the first effort to support NVDIMM for bcache [EXPERIMENTAL].
>>
>> All concerns from the Linux v5.12 merge window are fixed, especially
>> the data type defined in include/uapi/linux/bcache-nvm.h. In this
>> series, all the lists defined in the bcache-nvm.h uapi file are stored
>> and accessed directly on NVDIMM as memory objects.
>>
>> Intel developers Jianpeng Ma and Qiaowei Ren composed the initial code
>> of nvm-pages; the related patches are:
>> - bcache: initialize the nvm pages allocator
>> - bcache: initialization of the buddy
>> - bcache: bch_nvm_alloc_pages() of the buddy
>> - bcache: bch_nvm_free_pages() of the buddy
>> - bcache: get allocated pages from specific owner
>> All the code depends on the Linux libnvdimm and dax drivers; the bcache
>> nvm-pages allocator can be treated as a user of these two drivers.
>>
>> The nvm-pages allocator is a buddy-like allocator, which allocates
>> power-of-2 numbers of pages from the NVDIMM namespace. The user space
>> tool 'bcache' has a newly added '-M' option to format an NVDIMM
>> namespace and register it via the sysfs interface as a bcache meta
>> device. The nvm-pages kernel code does a DAX mapping to map the whole
>> namespace into the system's memory address range, and allocates the
>> pages to requesters like a typical buddy allocator does. The major
>> difference is that the nvm-pages allocator tracks the pages allocated
>> to each requester in an owner list, which is stored on NVDIMM too. The
>> owner list of each requester is identified by a pre-defined UUID. All
>> the pages tracked in all owner lists are treated as allocated busy
>> pages and won't be initialized into the buddy system after the system
>> reboots.
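
As an illustration of the paragraph above, here is a minimal user-space
sketch of the two ideas it describes: rounding a request up to a
power-of-2 number of pages, and consulting a per-owner list to decide
which pages must stay out of the free lists after a reboot. This is not
the code from the series. Every name in it (sketch_owner, sketch_rec,
sketch_order_for_pages() and so on) is hypothetical, the fixed-size
record array is an assumption made to keep the example short, and the
real on-NVDIMM layout is whatever include/uapi/linux/bcache-nvm.h
defines.

```c
#include <assert.h>

#define SKETCH_MAX_RECS 8	/* hypothetical capacity, illustration only */

/* Hypothetical record: one allocated extent of 2^order pages. */
struct sketch_rec {
	unsigned long pgoff;	/* first page offset inside the namespace */
	unsigned int order;	/* extent covers 2^order pages */
};

/* Hypothetical owner list: all extents held by one requester. */
struct sketch_owner {
	unsigned char uuid[16];	/* identifies the requester */
	unsigned int nr_recs;
	struct sketch_rec recs[SKETCH_MAX_RECS];
};

/* Smallest order such that 2^order >= nr_pages. */
static unsigned int sketch_order_for_pages(unsigned long nr_pages)
{
	unsigned int order = 0;

	/* Upper bound is arbitrary; it only keeps the shift well defined. */
	assert(nr_pages > 0 && nr_pages <= (1UL << 30));
	while ((1UL << order) < nr_pages)
		order++;
	return order;
}

/* Record an allocated extent in the owner list; -1 if the list is full. */
static int sketch_owner_add(struct sketch_owner *o, unsigned long pgoff,
			    unsigned int order)
{
	if (o->nr_recs >= SKETCH_MAX_RECS)
		return -1;
	o->recs[o->nr_recs].pgoff = pgoff;
	o->recs[o->nr_recs].order = order;
	o->nr_recs++;
	return 0;
}

/*
 * After a reboot, a page covered by any owner record is treated as busy
 * and is not handed back to the free lists.
 */
static int sketch_page_is_busy(const struct sketch_owner *o, unsigned long pg)
{
	unsigned int i;

	for (i = 0; i < o->nr_recs; i++) {
		unsigned long start = o->recs[i].pgoff;
		unsigned long end = start + (1UL << o->recs[i].order);

		if (pg >= start && pg < end)
			return 1;
	}
	return 0;
}
```

In this sketch a request for 3 pages is rounded up to order 2 (4 pages),
and a rebuild pass would skip every page for which
sketch_page_is_busy() returns non-zero.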
>>
>> I modified the bcache code to recognize the nvm meta device feature,
>> initialize the journal on NVDIMM, and do journal I/Os on NVDIMM in the
>> following patches:
>> - bcache: add initial data structures for nvm pages
>> - bcache: use bucket index to set GC_MARK_METADATA for journal buckets
>>   in bch_btree_gc_finish()
>> - bcache: add BCH_FEATURE_INCOMPAT_NVDIMM_META into incompat feature set
>> - bcache: initialize bcache journal for NVDIMM meta device
>> - bcache: support storing bcache journal into NVDIMM meta device
>> - bcache: read jset from NVDIMM pages for journal replay
>> - bcache: add sysfs interface register_nvdimm_meta to register NVDIMM
>>   meta device
>> - bcache: use div_u64() in init_owner_info()
>>
>> The bcache journal code may request a block of power-of-2 size pages
>> from the nvm-pages allocator; normally it is a contiguous range of
>> 256MB or 512MB. During metadata journaling, the in-memory jsets are
>> copied to the calculated NVDIMM page locations by the kernel memcpy
>> routine. So the journaling I/Os no longer go to a block device (e.g.
>> SSD); the writes and reads of journal jsets happen on NVDIMM.
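
To make the journaling path above concrete, here is a rough user-space
sketch of writing and reading a journal entry with memcpy at a computed
offset inside a mapped region. It is only an approximation under stated
assumptions: an ordinary byte array stands in for the DAX-mapped NVDIMM
range, struct sketch_jset is a made-up stand-in for the real jset, the
4KB page size is assumed, and all function names are hypothetical. A
real persistent-memory write path also has to make sure the copied data
actually reaches persistence, which this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4KB pages */

/* Hypothetical stand-in for an in-memory journal set. */
struct sketch_jset {
	unsigned long long seq;
	unsigned int keys;
	unsigned char payload[48];
};

/* Number of pages needed to cover 'bytes', rounded up. */
static unsigned long sketch_pages_for_bytes(unsigned long long bytes)
{
	return (unsigned long)((bytes + SKETCH_PAGE_SIZE - 1) /
			       SKETCH_PAGE_SIZE);
}

/* Copy a jset into the mapped region at byte offset 'off'. */
static int sketch_journal_write(unsigned char *base, size_t region_len,
				size_t off, const struct sketch_jset *js)
{
	assert(base != NULL && js != NULL);
	if (off > region_len || sizeof(*js) > region_len - off)
		return -1;	/* would run past the end of the region */
	memcpy(base + off, js, sizeof(*js));
	return 0;
}

/* Copy a jset back out of the mapped region, e.g. for journal replay. */
static int sketch_journal_read(const unsigned char *base, size_t region_len,
			       size_t off, struct sketch_jset *out)
{
	assert(base != NULL && out != NULL);
	if (off > region_len || sizeof(*out) > region_len - off)
		return -1;
	memcpy(out, base + off, sizeof(*out));
	return 0;
}
```

With the assumed 4KB page size, a 256MB journal area is 65536 pages
(order 16) and a 512MB area is 131072 pages (order 17), so both sizes
fit a power-of-2 allocation exactly.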
>>
>> The whole series has been tested for a while and all reported issues
>> are verified to be fixed. Now it is time to consider this series as the
>> initial code base of a community cooperation and have it in bcache
>> upstream for future development.
>>
>> Thanks in advance for taking this. 
> 
> Applied, with 13/13 folded in.
> 

Thank you for doing this.

Coly Li


