Re: [PATCH 13/13] lightnvm: Inherit mdts from the parent nvme device

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> On 4 Mar 2019, at 15.24, Hans Holmberg <hans@xxxxxxxxxxxxx> wrote:
> 
> On Mon, Mar 4, 2019 at 2:44 PM Javier González <javier@xxxxxxxxxxx> wrote:
>>> On 4 Mar 2019, at 14.25, Matias Bjørling <mb@xxxxxxxxxxx> wrote:
>>> 
>>> On 3/4/19 2:19 PM, Javier González wrote:
>>>>> On 4 Mar 2019, at 13.22, Hans Holmberg <hans@xxxxxxxxxxxxx> wrote:
>>>>> 
>>>>> On Mon, Mar 4, 2019 at 12:44 PM Javier González <javier@xxxxxxxxxxx> wrote:
>>>>>>> On 4 Mar 2019, at 12.30, Hans Holmberg <hans@xxxxxxxxxxxxx> wrote:
>>>>>>> 
>>>>>>> On Mon, Mar 4, 2019 at 10:05 AM Javier González <javier@xxxxxxxxxxx> wrote:
>>>>>>>>> On 27 Feb 2019, at 18.14, Igor Konopko <igor.j.konopko@xxxxxxxxx> wrote:
>>>>>>>>> 
>>>>>>>>> Current lightnvm and pblk implementation does not care
>>>>>>>>> about NVMe max data transfer size, which can be smaller
>>>>>>>>> than 64*K=256K. This patch fixes issues related to that.
>>>>>>> 
>>>>>>> Could you describe *what* issues you are fixing?
>>>>>>> 
>>>>>>>>> Signed-off-by: Igor Konopko <igor.j.konopko@xxxxxxxxx>
>>>>>>>>> ---
>>>>>>>>> drivers/lightnvm/core.c      | 9 +++++++--
>>>>>>>>> drivers/nvme/host/lightnvm.c | 1 +
>>>>>>>>> include/linux/lightnvm.h     | 1 +
>>>>>>>>> 3 files changed, 9 insertions(+), 2 deletions(-)
>>>>>>>>> 
>>>>>>>>> diff --git a/drivers/lightnvm/core.c b/drivers/lightnvm/core.c
>>>>>>>>> index 5f82036fe322..c01f83b8fbaf 100644
>>>>>>>>> --- a/drivers/lightnvm/core.c
>>>>>>>>> +++ b/drivers/lightnvm/core.c
>>>>>>>>> @@ -325,6 +325,7 @@ static int nvm_create_tgt(struct nvm_dev *dev, struct nvm_ioctl_create *create)
>>>>>>>>>    struct nvm_target *t;
>>>>>>>>>    struct nvm_tgt_dev *tgt_dev;
>>>>>>>>>    void *targetdata;
>>>>>>>>> +     unsigned int mdts;
>>>>>>>>>    int ret;
>>>>>>>>> 
>>>>>>>>>    switch (create->conf.type) {
>>>>>>>>> @@ -412,8 +413,12 @@ static int nvm_create_tgt(struct nvm_dev *dev, struct nvm_ioctl_create *create)
>>>>>>>>>    tdisk->private_data = targetdata;
>>>>>>>>>    tqueue->queuedata = targetdata;
>>>>>>>>> 
>>>>>>>>> -     blk_queue_max_hw_sectors(tqueue,
>>>>>>>>> -                     (dev->geo.csecs >> 9) * NVM_MAX_VLBA);
>>>>>>>>> +     mdts = (dev->geo.csecs >> 9) * NVM_MAX_VLBA;
>>>>>>>>> +     if (dev->geo.mdts) {
>>>>>>>>> +             mdts = min_t(u32, dev->geo.mdts,
>>>>>>>>> +                             (dev->geo.csecs >> 9) * NVM_MAX_VLBA);
>>>>>>>>> +     }
>>>>>>>>> +     blk_queue_max_hw_sectors(tqueue, mdts);
>>>>>>>>> 
>>>>>>>>>    set_capacity(tdisk, tt->capacity(targetdata));
>>>>>>>>>    add_disk(tdisk);
>>>>>>>>> diff --git a/drivers/nvme/host/lightnvm.c b/drivers/nvme/host/lightnvm.c
>>>>>>>>> index b759c25c89c8..b88a39a3cbd1 100644
>>>>>>>>> --- a/drivers/nvme/host/lightnvm.c
>>>>>>>>> +++ b/drivers/nvme/host/lightnvm.c
>>>>>>>>> @@ -991,6 +991,7 @@ int nvme_nvm_register(struct nvme_ns *ns, char *disk_name, int node)
>>>>>>>>>    geo->csecs = 1 << ns->lba_shift;
>>>>>>>>>    geo->sos = ns->ms;
>>>>>>>>>    geo->ext = ns->ext;
>>>>>>>>> +     geo->mdts = ns->ctrl->max_hw_sectors;
>>>>>>>>> 
>>>>>>>>>    dev->q = q;
>>>>>>>>>    memcpy(dev->name, disk_name, DISK_NAME_LEN);
>>>>>>>>> diff --git a/include/linux/lightnvm.h b/include/linux/lightnvm.h
>>>>>>>>> index 5d865a5d5cdc..d3b02708e5f0 100644
>>>>>>>>> --- a/include/linux/lightnvm.h
>>>>>>>>> +++ b/include/linux/lightnvm.h
>>>>>>>>> @@ -358,6 +358,7 @@ struct nvm_geo {
>>>>>>>>>    u16     csecs;          /* sector size */
>>>>>>>>>    u16     sos;            /* out-of-band area size */
>>>>>>>>>    bool    ext;            /* metadata in extended data buffer */
>>>>>>>>> +     u32     mdts;           /* Max data transfer size*/
>>>>>>>>> 
>>>>>>>>>    /* device write constrains */
>>>>>>>>>    u32     ws_min;         /* minimum write size */
>>>>>>>>> --
>>>>>>>>> 2.17.1
>>>>>>>> 
>>>>>>>> I see where you are going with this and I partially agree, but none of
>>>>>>>> the OCSSD specs define a way to define this parameter. Thus, adding this
>>>>>>>> behavior taken from NVMe in Linux can break current implementations. Is
>>>>>>>> this a real life problem for you? Or this is just for NVMe “correctness”?
>>>>>>>> 
>>>>>>>> Javier
>>>>>>> 
>>>>>>> Hmm.Looking into the 2.0 spec what it says about vector reads:
>>>>>>> 
>>>>>>> (figure 28):"The number of Logical Blocks (NLB): This field indicates
>>>>>>> the number of logical blocks to be read. This is a 0’s based value.
>>>>>>> Maximum of 64 LBAs is supported."
>>>>>>> 
>>>>>>> You got the max limit covered, and the spec  does not say anything
>>>>>>> about the minimum number of LBAs to support.
>>>>>>> 
>>>>>>> Matias: any thoughts on this?
>>>>>>> 
>>>>>>> Javier: How would this patch break current implementations?
>>>>>> 
>>>>>> Say an OCSSD controller that sets mdts to a value under 64 or does not
>>>>>> set it at all (maybe garbage). Think you can get to one pretty quickly...
>>>>> 
>>>>> So we cant make use of a perfectly good, standardized, parameter
>>>>> because some hypothetical non-compliant device out there might not
>>>>> provide a sane value?
>>>> The OCSSD standard has never used NVMe parameters, so there is no
>>>> compliant / non-compliant. In fact, until we changed OCSSD 2.0 to
>>>> get the sector and OOB sizes from the standard identify
>>>> command, we used to have them in the geometry.
>>> 
>>> What the hell? Yes it has. The whole OCSSD spec is dependent on the
>>> NVMe spec. It is using many commands from the NVMe specification,
>>> which is not defined in the OCSSD specification.
>> 
>> First, lower the tone.
>> 
>> Second, no, it has not and never has, starting with all the write
>> constrains, continuing with the vector commands, etc. You cannot choose
>> what you want to be compliant with and what you do not. OCSSD uses the
>> NVMe protocol but it is self sufficient with its geometry for all the
>> read / write / erase paths - it even depends on different PCIe class
>> codes to be identified… To do this in the way the rest of the spec is
>> defined, we either add a filed to the geometry or explicitly mention
>> that MDTS is used, as we do with the sector and metadata sizes.
>> 
>> Third, as a maintainer of this subsystem you should care about devices
>> in the field that might break due to such a change (supported by the
>> company you work for or not) - even if you can argue whether the change
>> is compliant or not.
>> 
>> And Hans, as a representative of a company that has such devices out
>> there, you should care too.
> 
> If you worry about me doing my job, you need not to.
> I test. So far I have not found any regressions in this patchset.
> 
> Please keep your open source hat on Javier.

Ok. You said it.

Then please apply the patch :)

Reviewed-by: Javier González <javier@xxxxxxxxxxx>

Attachment: signature.asc
Description: Message signed with OpenPGP


[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux