Re: Re: question about striped_read

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



>On Wed, 31 Jul 2013, majianpeng wrote:
>> >On Tue, Jul 30, 2013 at 7:41 PM, majianpeng <majianpeng@xxxxxxxxx> wrote:
[snip]
>
>For ceph_osdc_readpages(),
>
>> A: ret = ENOENT
>
>From the original code, for this case we should zero the area.
Why?

Thanks!
Jianpeng Ma
>The object does not exist.
>
>> B: ret = 0
>
>The object exists but we read 0 bytes, which means we are past EOF or the 
>object has size 0 bytes.  Either way, we are either in a hole or past EOF.
>
>sage
>
>> 
>> Only we knowed this, we can handle exactly.
>> Sage, can you explain those meaning in detail?
>> 
>> Thanks!
>> Jianpeng Ma
>> 
>> >Regards
>> >Yan, Zheng
>> >
>> >
>> >>>> But i think i will add a parameter about hit_hole. It will make the code easy to understand.
>> >>>>
>> >>>
>> >>> i think 'was_short' is equal to 'hit_hole'
>> >>>
>> >[snip]
>> Thanks!
>> Jianpeng Ma
>> >On Tue, Jul 30, 2013 at 7:41 PM, majianpeng <majianpeng@xxxxxxxxx> wrote:
>> >>>>
>> >>>>>dd if=/dev/urandom bs=1M count=2 of=file_with_holes
>> >>>>>dd if=/dev/urandom bs=1M count=2 seek=4 of=file_with_holes conv=notrunc
>> >>>>>dd if=file_with_holes bs=8M >/dev/null
>> >>>>>
>> >>>> diff --git a/fs/ceph/file.c b/fs/ceph/file.c
>> >>>> index 2ddf061..22a98e5 100644
>> >>>> --- a/fs/ceph/file.c
>> >>>> +++ b/fs/ceph/file.c
>> >>>> @@ -349,17 +349,17 @@ more:
>> >>>>         dout("striped_read %llu~%u (read %u) got %d%s%s\n", pos, left, read,
>> >>>>              ret, hit_stripe ? " HITSTRIPE" : "", was_short ? " SHORT" : "");
>> >>>>
>> >>>> -       if (ret > 0) {
>> >>>> -               int didpages = (page_align + ret) >> PAGE_CACHE_SHIFT;
>> >>>> +       if (ret >= 0) {
>> >>>> +               int didpages = (page_align + this_len) >> PAGE_CACHE_SHIFT;
>> >>>>
>> >>>> -               if (read < pos - off) {
>> >>>> -                       dout(" zero gap %llu to %llu\n", off + read, pos);
>> >>>> -                       ceph_zero_page_vector_range(page_align + read,
>> >>>> -                                                   pos - off - read, pages);
>> >>>> +               if (was_short) {
>> >>>> +                       dout(" zero gap %llu to %llu\n", pos + ret, pos + this_len);
>> >>>> +                       ceph_zero_page_vector_range(page_align + ret,
>> >>>> +                                                   this_len - ret, pages);
>> >>>>                 }
>> >>>> -               pos += ret;
>> >>>> +               pos += this_len;
>> >>>>                 read = pos - off;
>> >>>> -               left -= ret;
>> >>>> +               left -= this_len;
>> >>>>                 page_pos += didpages;
>> >>>>                 pages_left -= didpages;
>> >>>>
>> >>>> This patch can do those case. It only add ret== 0 in judgement 'ret > 0".
>> >>>
>> >>>maybe we should add a  i_size check. stop reading next strip object
>> >>>when 'pos > i_size'
>> >>>
>> >> I think we can't do this because i_size may smaller than real size in ceph.
>> >>
>> >ceph_aio_read() calls ceph_do_getattr() when 'checkeof = true', it
>> >handles the case.
>> >
>> >We must do i_size check in striped_read(), otherwise user program always gets as
>> >much data as it requests. For example
>> >
>> >dd if=/dev/urandom bs=1M count=1 of=file_with_holes
>> >dd if=file_with_holes bs=64M iflag=direct of=/dev/null
>> >
>> >Regards
>> >Yan, Zheng
>> >
>> >
>> >>>> But i think i will add a parameter about hit_hole. It will make the code easy to understand.
>> >>>>
>> >>>
>> >>> i think 'was_short' is equal to 'hit_hole'
>> >>>
>> >[snip]
Thanks!
Jianpeng Ma
>On Wed, 31 Jul 2013, majianpeng wrote:
>> >On Tue, Jul 30, 2013 at 7:41 PM, majianpeng <majianpeng@xxxxxxxxx> wrote:
>> >>>>
>> >>>>>dd if=/dev/urandom bs=1M count=2 of=file_with_holes
>> >>>>>dd if=/dev/urandom bs=1M count=2 seek=4 of=file_with_holes conv=notrunc
>> >>>>>dd if=file_with_holes bs=8M >/dev/null
>> >>>>>
>> >>>> diff --git a/fs/ceph/file.c b/fs/ceph/file.c
>> >>>> index 2ddf061..22a98e5 100644
>> >>>> --- a/fs/ceph/file.c
>> >>>> +++ b/fs/ceph/file.c
>> >>>> @@ -349,17 +349,17 @@ more:
>> >>>>         dout("striped_read %llu~%u (read %u) got %d%s%s\n", pos, left, read,
>> >>>>              ret, hit_stripe ? " HITSTRIPE" : "", was_short ? " SHORT" : "");
>> >>>>
>> >>>> -       if (ret > 0) {
>> >>>> -               int didpages = (page_align + ret) >> PAGE_CACHE_SHIFT;
>> >>>> +       if (ret >= 0) {
>> >>>> +               int didpages = (page_align + this_len) >> PAGE_CACHE_SHIFT;
>> >>>>
>> >>>> -               if (read < pos - off) {
>> >>>> -                       dout(" zero gap %llu to %llu\n", off + read, pos);
>> >>>> -                       ceph_zero_page_vector_range(page_align + read,
>> >>>> -                                                   pos - off - read, pages);
>> >>>> +               if (was_short) {
>> >>>> +                       dout(" zero gap %llu to %llu\n", pos + ret, pos + this_len);
>> >>>> +                       ceph_zero_page_vector_range(page_align + ret,
>> >>>> +                                                   this_len - ret, pages);
>> >>>>                 }
>> >>>> -               pos += ret;
>> >>>> +               pos += this_len;
>> >>>>                 read = pos - off;
>> >>>> -               left -= ret;
>> >>>> +               left -= this_len;
>> >>>>                 page_pos += didpages;
>> >>>>                 pages_left -= didpages;
>> >>>>
>> >>>> This patch can do those case. It only add ret== 0 in judgement 'ret > 0".
>> >>>
>> >>>maybe we should add a  i_size check. stop reading next strip object
>> >>>when 'pos > i_size'
>> >>>
>> >> I think we can't do this because i_size may smaller than real size in ceph.
>> >>
>> >ceph_aio_read() calls ceph_do_getattr() when 'checkeof = true', it
>> >handles the case.
>> >
>> >We must do i_size check in striped_read(), otherwise user program always gets as
>> >much data as it requests. For example
>> >
>> >dd if=/dev/urandom bs=1M count=1 of=file_with_holes
>> >dd if=file_with_holes bs=64M iflag=direct of=/dev/null
>> >
>> Before doing that, we must know the meaning of return value.
>
>For ceph_osdc_readpages(),
>
>> A: ret = ENOENT
>
>The object does not exist.
>
>> B: ret = 0
>
>The object exists but we read 0 bytes, which means we are past EOF or the 
>object has size 0 bytes.  Either way, we are either in a hole or past EOF.
>
>sage
>
>> 
>> Only we knowed this, we can handle exactly.
>> Sage, can you explain those meaning in detail?
>> 
>> Thanks!
>> Jianpeng Ma
>> 
>> >Regards
>> >Yan, Zheng
>> >
>> >
>> >>>> But i think i will add a parameter about hit_hole. It will make the code easy to understand.
>> >>>>
>> >>>
>> >>> i think 'was_short' is equal to 'hit_hole'
>> >>>
>> >[snip]
>> Thanks!
>> Jianpeng Ma
>> >On Tue, Jul 30, 2013 at 7:41 PM, majianpeng <majianpeng@xxxxxxxxx> wrote:
>> >>>>
>> >>>>>dd if=/dev/urandom bs=1M count=2 of=file_with_holes
>> >>>>>dd if=/dev/urandom bs=1M count=2 seek=4 of=file_with_holes conv=notrunc
>> >>>>>dd if=file_with_holes bs=8M >/dev/null
>> >>>>>
>> >>>> diff --git a/fs/ceph/file.c b/fs/ceph/file.c
>> >>>> index 2ddf061..22a98e5 100644
>> >>>> --- a/fs/ceph/file.c
>> >>>> +++ b/fs/ceph/file.c
>> >>>> @@ -349,17 +349,17 @@ more:
>> >>>>         dout("striped_read %llu~%u (read %u) got %d%s%s\n", pos, left, read,
>> >>>>              ret, hit_stripe ? " HITSTRIPE" : "", was_short ? " SHORT" : "");
>> >>>>
>> >>>> -       if (ret > 0) {
>> >>>> -               int didpages = (page_align + ret) >> PAGE_CACHE_SHIFT;
>> >>>> +       if (ret >= 0) {
>> >>>> +               int didpages = (page_align + this_len) >> PAGE_CACHE_SHIFT;
>> >>>>
>> >>>> -               if (read < pos - off) {
>> >>>> -                       dout(" zero gap %llu to %llu\n", off + read, pos);
>> >>>> -                       ceph_zero_page_vector_range(page_align + read,
>> >>>> -                                                   pos - off - read, pages);
>> >>>> +               if (was_short) {
>> >>>> +                       dout(" zero gap %llu to %llu\n", pos + ret, pos + this_len);
>> >>>> +                       ceph_zero_page_vector_range(page_align + ret,
>> >>>> +                                                   this_len - ret, pages);
>> >>>>                 }
>> >>>> -               pos += ret;
>> >>>> +               pos += this_len;
>> >>>>                 read = pos - off;
>> >>>> -               left -= ret;
>> >>>> +               left -= this_len;
>> >>>>                 page_pos += didpages;
>> >>>>                 pages_left -= didpages;
>> >>>>
>> >>>> This patch can do those case. It only add ret== 0 in judgement 'ret > 0".
>> >>>
>> >>>maybe we should add a  i_size check. stop reading next strip object
>> >>>when 'pos > i_size'
>> >>>
>> >> I think we can't do this because i_size may smaller than real size in ceph.
>> >>
>> >ceph_aio_read() calls ceph_do_getattr() when 'checkeof = true', it
>> >handles the case.
>> >
>> >We must do i_size check in striped_read(), otherwise user program always gets as
>> >much data as it requests. For example
>> >
>> >dd if=/dev/urandom bs=1M count=1 of=file_with_holes
>> >dd if=file_with_holes bs=64M iflag=direct of=/dev/null
>> >
>> >Regards
>> >Yan, Zheng
>> >
>> >
>> >>>> But i think i will add a parameter about hit_hole. It will make the code easy to understand.
>> >>>>
>> >>>
>> >>> i think 'was_short' is equal to 'hit_hole'
>> >>>
>> >[snip]?韬{.n?????%??檩??w?{.n????u朕?Ф?塄}?财??j:+v??????2??璀??摺?囤??z夸z罐?+?????w棹f





[Index of Archives]     [CEPH Users]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux