Re: [RFC PATCH 00/48] Target cluster implementation over DLM

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Aug 03, 2022 at 12:36:56PM -0500, Mike Christie wrote:
> 
> On 8/3/22 11:04 AM, Dmitry Bogdanov wrote:
> > Hi linux target comminity.
> >
> > Let's me present RFC of an implementation of cluster features for Target
> > Core that needs for backstore devices shared through cluster nodes.
> >
> > The patchset is big and of several subsets, but it contains some arguable
> > things and it would take too much time to discsuss them separatelly.
> >
> > Patches 1-9:
> > Make RTPI be part of se_tpg instead of se_lun. That is a must because
> > there is no possibility to assign RTPI on a LUN.
> > That data model is different from SCST and current in LIO but still does
> > not contradict with SAM and even is more according to SAM - a whole TCM
> > is a SCSI Device, and all its ports are SCSI Ports with unique RTPIs.
> >  + unique identification of TPG through the cluster.
> >  + possibility of assignment of RPTI.
> >  - number of all TPGs will be limited to 65535.
> > This patchset was published first time 2 years ago [1]. In previous
> > version the peers RTPIs were put in <device>/alua/... folder. In this
> > version the peers RTPIs are part of TPGs on the remote fabric (patch 35).
> >
> > Patches 10-29:
> > Fixes some bugs and deviations from the standard in PR code.
> > Undepend pr_reg from se_nacl and se_tpg to be just a registration holder.
> > Make APTPL registrations (not linked to se_dev_entry) be full-fledged
> > registrations.
> 
> 
> What are the arguable parts? Do you think it will be the DLM part
> and coordinating it with nvmet developers? Or was it patches 1-9
> and the multi-node support? Or both :)
In fact every subset can be a subject to argue :) 
* RTPI patchset - changing data model from RTPI-set on backstore device
to RTPI-set on a whole node.
* PR refactoring - to much changes, may be APTPL changes are not
  backward compatible
* remote/dummy fabric - name 
* DLM_CKV - name, place and even a meaning of the module
* tcm_cluster - too much new exported symbols, not resistant to
  node death in between of storing PR data in DLM_CKV and other error
  cases.

> Is it possible and would it be valuable to at least kind of break this
> up a little?
> 
> I would break this up and post the fixes in one set. I'll help you get
> them in as soon as possible.
After approve of the idea I can break the patch set to several ones
and start to post it without RFC prefix. The only problem is that they
all depend on previous ones. So I have to post each after the previous
gets merged.
> 
> For patches 1-9, I think I remember you posting them before, but I was in
> the middle of starting a new job so I didn't review them. I really needed
> something like that at my last 2 jobs so I think it's a valuable feature
> and I'll review that as well.
> 
> If we could at least get those 2 chunks separated then it would make the DLM
> parts below easier to get eyeballs on. I'm ok with the idea in general. I
> think every nvmet developer will see the massive patchset and not even look at
> this first 0/48 email :)
I am not going to share this patchset to nvmet dev list :)
nvmet does not yet have a local version of CompareAndWrite and
Reservations features, so it is too early for them.
> 
> 
> >
> > Patches 30-34:
> > DLM_CKV module that uses DLM and provides:
> >  * Cluster Lock service (pure wrapper over DLM).
> >  * Cluster Key-Value service in memory storage.
> >  * Cluster Notification service with a blocking acknowledge.
> >  * Cluster membership callbacks.
> > This module is supposed to be used by TCM and nvmet to implement cluster
> > operations.
> >
> > Patch 35:
> > New 'remote' (in fact dummy) fabric module. Configuration on this fabric will
> > provide to TCM a view of TPG/LUN/ACL configuration on a peer nodes.
> >
> > Patche 36:
> > Introduce cluster ops and functions to register a cluster ops
> > implementation modules. There could be a several different modules.
> > The device attrib cluster_impl regulates which implementation to use
> > for that device. 'single' is for default (no cluster) implementation.
> >
> > Patches 37-48:
> > TCM Cluster over DLM module implementation inspired by SCST.
> >  * Use DLM_CKV Lock service to serialize order of PR OUT commands
> >  * Use DLM_CKV Key-Value storage service to store PR cluster data.
> > Sync it after successful execution of PR OUT command.
> >  * Use DLM_CKV Notification service to notify (in blocking manner) other
> > nodes to fetch PR cluster data. The handling of PR OUT command is
> > blocked until other nodes read the cluster PR data.
> >
> > It provides:
> >  * Cluster lock per LBA for Compare And Write.
> >  * Full support of SCSI-3 Persistent Reservations including
> >    PREEMPT AND ABORT and REGISTER AND MOVE.
> >  * Normal PR APTPL imlementation (persistanse over power loss)
> >  * Shared LUN RESET
> >  * Shared SCSI-2 Reservations.
> >  * Unit Attentions for all TPGs in cluster
> >



[Index of Archives]     [Linux SCSI]     [Kernel Newbies]     [Linux SCSI Target Infrastructure]     [Share Photos]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Device Mapper]

  Powered by Linux