On 8/3/22 11:04 AM, Dmitry Bogdanov wrote:
> Hi linux target community.
>
> Let me present an RFC of an implementation of cluster features for Target
> Core that is needed for backstore devices shared through cluster nodes.
>
> The patchset is big and consists of several subsets, but it contains some
> arguable things and it would take too much time to discuss them separately.
>
> Patches 1-9:
> Make RTPI be part of se_tpg instead of se_lun. That is a must because
> there is no possibility to assign RTPI on a LUN.
> That data model is different from SCST and current in LIO but still does
> not contradict with SAM and even is more according to SAM - a whole TCM
> is a SCSI Device, and all its ports are SCSI Ports with unique RTPIs.
> + unique identification of TPG through the cluster.
> + possibility of assignment of RTPI.
> - number of all TPGs will be limited to 65535.
> This patchset was published first time 2 years ago [1]. In previous
> version the peers RTPIs were put in <device>/alua/... folder. In this
> version the peers RTPIs are part of TPGs on the remote fabric (patch 35).
>
> Patches 10-29:
> Fixes some bugs and deviations from the standard in PR code.
> Undepend pr_reg from se_nacl and se_tpg to be just a registration holder.
> Make APTPL registrations (not linked to se_dev_entry) be full-fledged
> registrations.

What are the arguable parts? Do you think it will be the DLM part and
coordinating it with nvmet developers? Or was it patches 1-9 and the
multi-node support? Or both :)

Is it possible and would it be valuable to at least kind of break this up
a little?

I would break this up and post the fixes in one set. I'll help you get
them in as soon as possible.

For patches 1-9, I think I remember you posting them before, but I was in
the middle of starting a new job so I didn't review them. I really needed
something like that at my last 2 jobs so I think it's a valuable feature
and I'll review that as well.
If we could at least get those 2 chunks separated then it would make the
DLM parts below easier to get eyeballs on. I'm ok with the idea in
general. I think every nvmet developer will see the massive patchset and
not even look at this first 0/48 email :)

>
> Patches 30-34:
> DLM_CKV module that uses DLM and provides:
> * Cluster Lock service (pure wrapper over DLM).
> * Cluster Key-Value service in memory storage.
> * Cluster Notification service with a blocking acknowledge.
> * Cluster membership callbacks.
> This module is supposed to be used by TCM and nvmet to implement cluster
> operations.
>
> Patch 35:
> New 'remote' (in fact dummy) fabric module. Configuration on this fabric will
> provide to TCM a view of TPG/LUN/ACL configuration on peer nodes.
>
> Patch 36:
> Introduce cluster ops and functions to register cluster ops
> implementation modules. There could be several different modules.
> The device attrib cluster_impl regulates which implementation to use
> for that device. 'single' is for default (no cluster) implementation.
>
> Patches 37-48:
> TCM Cluster over DLM module implementation inspired by SCST.
> * Use DLM_CKV Lock service to serialize order of PR OUT commands
> * Use DLM_CKV Key-Value storage service to store PR cluster data.
> Sync it after successful execution of PR OUT command.
> * Use DLM_CKV Notification service to notify (in blocking manner) other
> nodes to fetch PR cluster data. The handling of PR OUT command is
> blocked until other nodes read the cluster PR data.
>
> It provides:
> * Cluster lock per LBA for Compare And Write.
> * Full support of SCSI-3 Persistent Reservations including
> PREEMPT AND ABORT and REGISTER AND MOVE.
> * Normal PR APTPL implementation (persistence over power loss)
> * Shared LUN RESET
> * Shared SCSI-2 Reservations.
> * Unit Attentions for all TPGs in cluster
>
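
For anyone skimming, the PR OUT flow described for patches 37-48 (take the
cluster lock, execute, sync the PR data to the key-value store, then block
until the other nodes have fetched it) can be pictured with a toy
user-space model like the one below. This is only a sketch of how I read
the description, not code from the patchset: Cluster, Node,
notify_and_wait, on_notify and pr_out_register are hypothetical names, a
threading.Lock and a dict stand in for the DLM_CKV lock and key-value
services, and the "notification" is a plain function call.

```python
import threading


class Cluster:
    """Toy stand-in for the DLM_CKV services named in the cover letter."""

    def __init__(self):
        self.lock = threading.Lock()  # stands in for the Cluster Lock service
        self.kv = {}                  # stands in for the Key-Value service
        self.nodes = []               # stands in for cluster membership

    def notify_and_wait(self, sender, key):
        # Notification with blocking acknowledge: return only after every
        # other node has handled the notification.
        acks = [peer.on_notify(key) for peer in self.nodes if peer is not sender]
        return all(acks)


class Node:
    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster
        self.pr_cache = {}            # this node's local view of the PR data
        cluster.nodes.append(self)

    def on_notify(self, key):
        # Peer side: fetch the updated PR data, then acknowledge.
        self.pr_cache[key] = dict(self.cluster.kv.get(key, {}))
        return True

    def pr_out_register(self, device, initiator, res_key):
        c = self.cluster
        with c.lock:                                # 1. serialize PR OUT
            regs = dict(c.kv.get(device, {}))
            regs[initiator] = res_key               # 2. execute the command
            c.kv[device] = regs                     # 3. sync PR cluster data
            self.pr_cache[device] = dict(regs)
            return c.notify_and_wait(self, device)  # 4. block until peers ack
```

With two nodes in one Cluster, a registration made through either node
shows up in the other node's pr_cache before pr_out_register() returns,
which is the ordering the blocking notification is meant to guarantee.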