> On 26 Jan 2018, at 10.54, Matias Bjørling <mb@xxxxxxxxxxx> wrote:
>
> On 01/26/2018 09:30 AM, Javier Gonzalez wrote:
>>> On 25 Jan 2018, at 22.02, Matias Bjørling <mb@xxxxxxxxxxx> wrote:
>>>
>>> On 01/25/2018 04:26 PM, Javier Gonzalez wrote:
>>>> Hi,
>>>> There are some topics that I would like to discuss at LSF/MM:
>>>> - In the past year we have discussed at length how we can integrate the
>>>> Open-Channel SSD (OCSSD) spec with zoned devices (SMR). This
>>>> discussion is both at the interface level and at an in-kernel level.
>>>> Now that Damien's and Hannes' patches are upstreamed in good shape,
>>>> it would be a good moment to discuss how we can integrate the
>>>> LightNVM subsystem with the existing code.
>>>
>>> The ZBC-OCSSD patches
>>> (https://github.com/OpenChannelSSD/linux/tree/zbc-support) that I made
>>> last year are a good starting point.

>> Yes, these patches are a good place to start, but as mentioned below, they
>> do not address how we would expose the parallelism on report_zone.
>> The way I see it, zoned devices impose write constraints to gain capacity;
>> OCSSD does that to enable the parallelism of the device.
>
> Also capacity for OCSSDs, as most raw flash is exposed. It is up to
> the host to decide if over-provisioning is needed.
>

This is a good point. Actually, if we declare a _necessary_ OP area,
users doing GC could use this OP space to do their job. For
journaled-only areas, no extra GC will be necessary. For random areas,
pblk can do the job (in a host-managed solution).

>> This then can
>> be used by different users to either lower media wear, reach a
>> stable state at a very early stage or guarantee tight latencies. That
>> depends on how it is used. We can use an OCSSD as a zoned device and it
>> will work, but it is coming back to using an interface that will narrow
>> down the OCSSD scope (at least in its current format).

>>>> Specifically, in ALPSS'17
>>>> we had discussions on how we can extend the kernel zoned device
>>>> interface with the notion of parallel units that the OCSSD geometry
>>>> builds upon. We are now bringing the OCSSD spec. to standardization,
>>>> but we have time to incorporate feedback and changes into the spec.
>>>
>>> Which spec? The OCSSD 2 spec that I have copyright on? I don't believe
>>> it has been submitted to or is under consideration by any standards body
>>> yet, and I don't currently plan to do that.
>>>
>>> You might have meant "to be finalized". As you know, I am currently
>>> soliciting feedback and change requests from vendors and partners with
>>> respect to the specification and am planning on closing it soon. If
>>> CNEX is doing their own new specification, please be open about it,
>>> and don't put it under the OCSSD name.

>> As you know, there is a group of cloud providers and vendors that is
>> starting to work on the standardization process with the current state of
>> the 2.0 spec as the starting point - you have been part of these
>> discussions... The goal for this group is to collect the feedback from
>> all parties and come up with a spec. that is useful and covers cloud
>> needs. Exactly so that - as you imply - the spec. is not tied to an
>> organization and/or individual. My hope is that this spec is very
>> similar to the OCSSD 2.0 that _we_ all have worked hard on putting
>> together.
>
> Yes, that is my point. The workgroup device specification you are
> discussing may or may not be OCSSD 2.0 similar/compatible and is not
> tied to the process that is currently being run for the OCSSD 2.0
> specification.
> Please keep OCSSD out of the discussions until the
> device specification from the workgroup has been completed and made
> public. Hopefully the device specification turns out to be OCSSD 2.0
> compatible and the bits can be added to the 2.0 (2.1) specification.
> If not, it has to be stand-alone, with its own implementation.
>

Then we agree. The reason to open the discussion is to ensure that
feedback comes from different places. Many times we have experienced a
mismatch between what is discussed in the standards bodies (e.g., NVMe
working groups) and the reality of Linux. Ideally, we can avoid this. I
_really_ hope that we can sit down and align OCSSD 2.X, since it really
makes no sense to have different flavours of the same thing in the
wild...

>> Later on, we can try to do checks on LBA "batches", defined by these same
>> write restrictions. But you are right that having a fully random LBA
>> vector will require individual checks and that is both expensive and
>> intrusive. This can be isolated by flagging the nature of the bvec,
>> something a la (sequential, batched, random).
>
> I think it must still be checked. One cannot trust that the LBAs are
> as expected. For example, the case where LBAs are out of bounds and
> access another partition.
>

Fair point. (See the sketch at the end of this mail for the kind of
per-LBA check this implies.)

>>> For example supported natively in the NVMe specification.
>> Then we agree that aiming at a standards body is the goal, right?
>
> Vector I/O is orthogonal to proposing a zone/ocssd proposal to the
> NVMe workgroup.

Sure. But since both are related to the OCSSD proposal, I would expect
them to be discussed in the same context. I personally don't see much
value in an OCSSD used as a zoned device (same as I don't see the value
of using an OCSSD only with pblk) - these are building blocks to enable
adoption. The real value comes from exposing the parallelism, and down
the road the vector I/O is a more generic way of doing it.

Javier
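PS: To make the "individual checks" point above concrete, here is a
minimal sketch of the kind of per-LBA validation and bvec-nature flag
discussed in the thread. The type and function names are made up for
illustration (this is not lightnvm or pblk code), and it assumes the
vector carries one LBA per entry, to be checked against a single target
range (e.g. a partition):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical classification of an LBA vector, mirroring the
 * (sequential, batched, random) flags mentioned above. */
enum lba_vec_nature {
        LBA_VEC_SEQUENTIAL,     /* lbas[i] == lbas[0] + i */
        LBA_VEC_BATCHED,        /* strictly increasing, gaps allowed */
        LBA_VEC_RANDOM,         /* no ordering guarantees */
};

/* Hypothetical target range, e.g. the partition the I/O is issued to. */
struct lba_range {
        uint64_t start;         /* first valid LBA (inclusive) */
        uint64_t len;           /* number of LBAs in the range */
};

enum lba_vec_nature classify_lba_vec(const uint64_t *lbas, size_t n)
{
        bool seq = true;
        size_t i;

        for (i = 1; i < n; i++) {
                if (lbas[i] <= lbas[i - 1])
                        return LBA_VEC_RANDOM;
                if (lbas[i] != lbas[i - 1] + 1)
                        seq = false;
        }
        return seq ? LBA_VEC_SEQUENTIAL : LBA_VEC_BATCHED;
}

/*
 * Return true if every LBA in the vector falls inside the target range.
 * Sequential/batched vectors only need their first and last entries
 * checked; a random vector needs the expensive per-entry walk.
 */
bool lba_vec_in_range(const uint64_t *lbas, size_t n,
                      const struct lba_range *r)
{
        uint64_t end = r->start + r->len;       /* exclusive */
        size_t i;

        if (n == 0)
                return true;

        switch (classify_lba_vec(lbas, n)) {
        case LBA_VEC_SEQUENTIAL:
        case LBA_VEC_BATCHED:
                /* Entries are increasing: checking both ends is enough. */
                return lbas[0] >= r->start && lbas[n - 1] < end;
        case LBA_VEC_RANDOM:
        default:
                for (i = 0; i < n; i++)
                        if (lbas[i] < r->start || lbas[i] >= end)
                                return false;
                return true;
        }
}

int main(void)
{
        const uint64_t lbas[] = { 100, 101, 102, 250 };  /* batched */
        struct lba_range part = { .start = 0, .len = 256 };

        printf("in range: %d\n", lba_vec_in_range(lbas, 4, &part));
        return 0;
}

With something like this, the expensive per-entry walk is only paid for
the random case, which is the cost Matias is pointing at; the flag on
the bvec would let the common sequential/batched cases stay cheap.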