Re: [PATCH 00/13] bcache patches for Linux v5.13 -- 2nd wave

On 4/13/21 11:46 PM, Coly Li wrote:
> Hi Jens,
> 
> This is the 2nd wave of bcache patches for Linux v5.13. This series
> contains the patches that use NVDIMM to store the bcache journal, which
> is the first effort to support NVDIMM for bcache [EXPERIMENTAL].
> 
> All concerns from the Linux v5.12 merge window are fixed, especially
> the data types defined in include/uapi/linux/bcache-nvm.h. In this
> series, all the lists defined in the bcache-nvm.h uapi file are stored
> and accessed directly on NVDIMM as memory objects.
> 
> Intel developers Jianpeng Ma and Qiaowei Ren composed the initial code
> of nvm-pages; the related patches are,
> - bcache: initialize the nvm pages allocator
> - bcache: initialization of the buddy
> - bcache: bch_nvm_alloc_pages() of the buddy
> - bcache: bch_nvm_free_pages() of the buddy
> - bcache: get allocated pages from specific owner
> All the code depends on the Linux libnvdimm and dax drivers; the bcache
> nvm-pages allocator can be treated as a user of these two drivers.
> 
> The nvm-pages allocator is a buddy-like allocator which allocates
> power-of-2 numbers of pages from the NVDIMM namespace. The user space
> tool 'bcache' has a newly added '-M' option to format an NVDIMM
> namespace and register it via the sysfs interface as a bcache meta
> device. The nvm-pages kernel code does a DAX mapping to map the whole
> namespace into the system's memory address range, and hands out pages
> to requesters as a typical buddy allocator does. The major difference
> is that the nvm-pages allocator tracks the pages allocated to each
> requester in an owner list, which is stored on NVDIMM too. The owner
> list of each requester is identified by a pre-defined UUID; all the
> pages tracked in the owner lists are treated as allocated busy pages
> and won't be initialized into the buddy system after a system reboot.
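> 
> As a rough sketch only (the exact prototypes live in the nvm-pages
> patches, and the UUID below is a made-up placeholder), a requester
> might obtain and release NVDIMM-backed pages roughly like this,
> assuming bch_nvm_alloc_pages()/bch_nvm_free_pages() take an allocation
> order and the owner UUID:
> 
> 	/* illustrative sketch, not the code from the patches */
> 	static int example_use_nvm_pages(void)
> 	{
> 		/* each requester's owner list is keyed by a pre-defined UUID */
> 		static const char owner_uuid[] = "example-owner-uuid";
> 		int order = 6;			/* 2^6 = 64 pages */
> 		void *base;
> 
> 		base = bch_nvm_alloc_pages(order, owner_uuid);
> 		if (!base)
> 			return -ENOMEM;
> 
> 		/* the namespace is DAX mapped, so the pages are plain memory */
> 		memset(base, 0, (1UL << order) * PAGE_SIZE);
> 
> 		bch_nvm_free_pages(base, order, owner_uuid);
> 		return 0;
> 	}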
> 
> I modified the bcache code to recognize the NVDIMM meta device feature,
> initialize the journal on NVDIMM, and do the journal I/Os on NVDIMM in
> the following patches,
> - bcache: add initial data structures for nvm pages
> - bcache: use bucket index to set GC_MARK_METADATA for journal buckets
>   in bch_btree_gc_finish()
> - bcache: add BCH_FEATURE_INCOMPAT_NVDIMM_META into incompat feature set
> - bcache: initialize bcache journal for NVDIMM meta device
> - bcache: support storing bcache journal into NVDIMM meta device
> - bcache: read jset from NVDIMM pages for journal replay
> - bcache: add sysfs interface register_nvdimm_meta to register NVDIMM
>   meta device
> - bcache: use div_u64() in init_owner_info()
> 
> The bcache journal code may request a block of power-of-2 sized pages
> from the nvm-pages allocator, normally a contiguous range of 256MB or
> 512MB. During metadata journaling, the in-memory jsets are copied to
> the calculated NVDIMM page locations by the kernel memcpy routine. So
> the journaling I/Os no longer go to a block device (e.g. SSD); the
> writes and reads of journal jsets happen directly on NVDIMM.
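> 
> The idea, sketched very roughly (the example_journal_write_nvdimm()
> helper and its parameters are invented for illustration, and a real
> implementation also has to persist the copied range), is that a journal
> write becomes a memcpy into the DAX-mapped journal area instead of a
> bio submission:
> 
> 	/* illustrative sketch, not the code from the patches */
> 	static void example_journal_write_nvdimm(struct jset *w,
> 						 void *journal_base,
> 						 size_t offset)
> 	{
> 		size_t bytes = set_bytes(w);	/* bytes used by this jset */
> 
> 		/* journal pages are DAX-mapped memory, so no bio is needed */
> 		memcpy(journal_base + offset, w, bytes);
> 
> 		/* a real implementation would flush/persist the range here */
> 	}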
> 
> The whole series has been tested for a while and all addressed issues
> are verified to be fixed. Now it is time to consider this series as an
> initial code base for community cooperation and to have it in bcache
> upstream for future development.
> 
> Thanks in advance for taking this. 

Applied, with 13/13 folded in.

-- 
Jens Axboe



