Re: [PATCH 4/5] lightnvm: pblk: add padding distribution sysfs attribute

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 6, 2018 at 11:50 AM, Matias Bjørling <mb@xxxxxxxxxxx> wrote:
> On 02/06/2018 10:27 AM, Hans Holmberg wrote:
>>
>> Hi Matias,
>>
>> On Wed, Jan 31, 2018 at 9:44 AM, Matias Bjørling <mb@xxxxxxxxxxx> wrote:
>>>
>>> On 01/31/2018 03:06 AM, Javier González wrote:
>>>>
>>>>
>>>> From: Hans Holmberg <hans.holmberg@xxxxxxxxxxxx>
>>>>
>>>> When pblk receives a sync, all data up to that point in the write buffer
>>>> must be comitted to persistent storage, and as flash memory comes with a
>>>> minimal write size there is a significant cost involved both in terms
>>>> of time for completing the sync and in terms of write amplification
>>>> padded sectors for filling up to the minimal write size.
>>>>
>>>> In order to get a better understanding of the costs involved for syncs,
>>>> Add a sysfs attribute to pblk: padded_dist, showing a normalized
>>>> distribution of sectors padded. In order to facilitate measurements of
>>>> specific workloads during the lifetime of the pblk instance, the
>>>> distribution can be reset by writing 0 to the attribute.
>>>>
>>>> Do this by introducing counters for each possible padding:
>>>> {0..(minimal write size - 1)} and calculate the normalized distribution
>>>> when showing the attribute.
>>>>
>>>> Signed-off-by: Hans Holmberg <hans.holmberg@xxxxxxxxxxxx>
>>>> Signed-off-by: Javier González <javier@xxxxxxxxxxxx>
>>>> ---
>>>>    drivers/lightnvm/pblk-init.c  | 16 ++++++++--
>>>>    drivers/lightnvm/pblk-rb.c    | 15 +++++-----
>>>>    drivers/lightnvm/pblk-sysfs.c | 68
>>>> +++++++++++++++++++++++++++++++++++++++++++
>>>>    drivers/lightnvm/pblk.h       |  6 +++-
>>>>    4 files changed, 95 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
>>>> index 7eedc5daebc8..a5e3510c0f38 100644
>>>> --- a/drivers/lightnvm/pblk-init.c
>>>> +++ b/drivers/lightnvm/pblk-init.c
>>>> @@ -921,6 +921,7 @@ static void pblk_free(struct pblk *pblk)
>>>>    {
>>>>          pblk_luns_free(pblk);
>>>>          pblk_lines_free(pblk);
>>>> +       kfree(pblk->pad_dist);
>>>>          pblk_line_meta_free(pblk);
>>>>          pblk_core_free(pblk);
>>>>          pblk_l2p_free(pblk);
>>>> @@ -998,11 +999,13 @@ static void *pblk_init(struct nvm_tgt_dev *dev,
>>>> struct gendisk *tdisk,
>>>>          pblk->pad_rst_wa = 0;
>>>>          pblk->gc_rst_wa = 0;
>>>>    +     atomic_long_set(&pblk->nr_flush, 0);
>>>> +       pblk->nr_flush_rst = 0;
>>>> +
>>>>    #ifdef CONFIG_NVM_DEBUG
>>>>          atomic_long_set(&pblk->inflight_writes, 0);
>>>>          atomic_long_set(&pblk->padded_writes, 0);
>>>>          atomic_long_set(&pblk->padded_wb, 0);
>>>> -       atomic_long_set(&pblk->nr_flush, 0);
>>>>          atomic_long_set(&pblk->req_writes, 0);
>>>>          atomic_long_set(&pblk->sub_writes, 0);
>>>>          atomic_long_set(&pblk->sync_writes, 0);
>>>> @@ -1034,10 +1037,17 @@ static void *pblk_init(struct nvm_tgt_dev *dev,
>>>> struct gendisk *tdisk,
>>>>                  goto fail_free_luns;
>>>>          }
>>>>    +     pblk->pad_dist = kzalloc((pblk->min_write_pgs - 1) *
>>>> sizeof(atomic64_t),
>>>> +                                GFP_KERNEL);
>>>> +       if (!pblk->pad_dist) {
>>>> +               ret = -ENOMEM;
>>>> +               goto fail_free_line_meta;
>>>> +       }
>>>> +
>>>>          ret = pblk_core_init(pblk);
>>>>          if (ret) {
>>>>                  pr_err("pblk: could not initialize core\n");
>>>> -               goto fail_free_line_meta;
>>>> +               goto fail_free_pad_dist;
>>>>          }
>>>>          ret = pblk_l2p_init(pblk);
>>>> @@ -1097,6 +1107,8 @@ static void *pblk_init(struct nvm_tgt_dev *dev,
>>>> struct gendisk *tdisk,
>>>>          pblk_l2p_free(pblk);
>>>>    fail_free_core:
>>>>          pblk_core_free(pblk);
>>>> +fail_free_pad_dist:
>>>> +       kfree(pblk->pad_dist);
>>>>    fail_free_line_meta:
>>>>          pblk_line_meta_free(pblk);
>>>>    fail_free_luns:
>>>> diff --git a/drivers/lightnvm/pblk-rb.c b/drivers/lightnvm/pblk-rb.c
>>>> index 7044b5599cc4..264372d7b537 100644
>>>> --- a/drivers/lightnvm/pblk-rb.c
>>>> +++ b/drivers/lightnvm/pblk-rb.c
>>>> @@ -437,9 +437,7 @@ static int pblk_rb_may_write_flush(struct pblk_rb
>>>> *rb,
>>>> unsigned int nr_entries,
>>>>          if (bio->bi_opf & REQ_PREFLUSH) {
>>>>                  struct pblk *pblk = container_of(rb, struct pblk, rwb);
>>>>    -#ifdef CONFIG_NVM_DEBUG
>>>>                  atomic_long_inc(&pblk->nr_flush);
>>>> -#endif
>>>>                  if (pblk_rb_flush_point_set(&pblk->rwb, bio, mem))
>>>>                          *io_ret = NVM_IO_OK;
>>>>          }
>>>> @@ -620,14 +618,17 @@ unsigned int pblk_rb_read_to_bio(struct pblk_rb
>>>> *rb,
>>>> struct nvm_rq *rqd,
>>>>                          pr_err("pblk: could not pad page in write
>>>> bio\n");
>>>>                          return NVM_IO_ERR;
>>>>                  }
>>>> +
>>>> +               if (pad < pblk->min_write_pgs)
>>>> +                       atomic64_inc(&pblk->pad_dist[pad - 1]);
>>>> +               else
>>>> +                       pr_warn("pblk: padding more than min.
>>>> sectors\n");
>>>
>>>
>>>
>>> Curious, when would this happen? Would it be an implementation error or
>>> something a user did wrong?
>>
>>
>> This would be an implementation error - and this check is just a
>> safeguard to make sure we won't go out of bounds when updating the
>> statistics.
>>
>
> Ah, does it make sense to have such defensive programming, when the value is
> never persisted? An 64 bit integer is quite large, and if only used for
> workloads for a single instance, it would practically never trigger, and if
> it did, do one care?

Ah, sorry for being unclear,  it's not for checking if the pad
counters overflow  - I wanted to make sure that we won't index
pad_dist with negative numbers if we mess up the padding calculations
in the future.

>
> <snip>
>
>>>
>>> Looks good to me. Let me know if you want to send a new patch, or if I
>>> may
>>> make the change when I pick it up.
>>
>>
>> Thanks for the review, i saw that you already reworked the patch a bit
>> on your core branch.
>> However, i found a build issue when building for i386, so i'll fix
>> that and submit a V2(that includes your update)
>
>
> Ok. I assume this is the kbuild mail I asked about a couple of days ago.
> I'll wait for v2.

I did see that particular mail, but it's most likely the same issue
that the V2 will fix.

Thanks,
Hans




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux