>On Mon, Jul 29, 2013 at 11:00 AM, majianpeng <majianpeng@xxxxxxxxx> wrote: >> >> [snip] >> >I don't think the later was_short can handle the hole case. For the hole case, >> >we should try reading next strip object instead of return. how about >> >below patch. >> > >> Hi Yan, >> i uesed this demo to test hole case. >> dd if=/dev/urandom bs=4096 count=2 of=file_with_holes >> dd if=/dev/urandom bs=4096 seek=7 count=2 of=file_with_holes >> >> dd if=file_with_holes of=/dev/null bs=16k count=1 iflag=direct >> Using the dynamic_debug in striped_read, the message are: >> >[ 8743.663499] ceph: file.c:350 : striped_read 0~16384 (read 0) got 16384 >> >[ 8743.663502] ceph: file.c:390 : striped_read returns 16384 >> From the messages, we can see it can't hit the short-read. >> For the ceph-file-hole, how does the ceph handle? >> Or am i missing something? > >the default strip size is 4M, all data are written to the first object >in your test case. >could you try something like below. > >dd if=/dev/urandom bs=1M count=2 of=file_with_holes >dd if=/dev/urandom bs=1M count=2 seek=4 of=file_with_holes conv=notrunc >dd if=file_with_holes bs=8M >/dev/null > >From above test, i think your patch is right. Although, the original code can work but it call multi striped_read. As your said for stripe short-read,it doesn't make sense to return rather than reading next stripe. But can you add some comments for this? The short-read reasongs are two:EOF or hit-hole. But for hit-hole there are some differents case. For that i don't know. Thanks! Jianpeng Ma >Regards >Yan, Zheng > > >> >> >> Thanks! >> Jianpeng Ma >> >> >Regards >> >Yan, Zheng >> >--- >> >diff --git a/fs/ceph/file.c b/fs/ceph/file.c >> >index 271a346..6ca2921 100644 >> >--- a/fs/ceph/file.c >> >+++ b/fs/ceph/file.c >> >@@ -350,16 +350,17 @@ more: >> > ret, hit_stripe ? " HITSTRIPE" : "", was_short ? " SHORT" : ""); >> > >> > if (ret > 0) { >> >- int didpages = (page_align + ret) >> PAGE_CACHE_SHIFT; >> >+ int didpages = (page_align + this_len) >> PAGE_CACHE_SHIFT; >> > >> >- if (read < pos - off) { >> >- dout(" zero gap %llu to %llu\n", off + read, pos); >> >- ceph_zero_page_vector_range(page_align + read, >> >- pos - off - read, pages); >> >+ if (was_short) { >> >+ dout(" zero gap %llu to %llu\n", >> >+ pos + ret, pos + this_len); >> >+ ceph_zero_page_vector_range(page_align + ret, >> >+ this_len - ret, page_pos); >> > } >> >- pos += ret; >> >+ pos += this_len; >> > read = pos - off; >> >- left -= ret; >> >+ left -= this_len; >> > page_pos += didpages; >> > pages_left -= didpages; >> > Thanks! Jianpeng Ma >On Mon, Jul 29, 2013 at 11:00 AM, majianpeng <majianpeng@xxxxxxxxx> wrote: >> >> [snip] >> >I don't think the later was_short can handle the hole case. For the hole case, >> >we should try reading next strip object instead of return. how about >> >below patch. >> > >> Hi Yan, >> i uesed this demo to test hole case. >> dd if=/dev/urandom bs=4096 count=2 of=file_with_holes >> dd if=/dev/urandom bs=4096 seek=7 count=2 of=file_with_holes >> >> dd if=file_with_holes of=/dev/null bs=16k count=1 iflag=direct >> Using the dynamic_debug in striped_read, the message are: >> >[ 8743.663499] ceph: file.c:350 : striped_read 0~16384 (read 0) got 16384 >> >[ 8743.663502] ceph: file.c:390 : striped_read returns 16384 >> From the messages, we can see it can't hit the short-read. >> For the ceph-file-hole, how does the ceph handle? >> Or am i missing something? > >the default strip size is 4M, all data are written to the first object >in your test case. >could you try something like below. > >dd if=/dev/urandom bs=1M count=2 of=file_with_holes >dd if=/dev/urandom bs=1M count=2 seek=4 of=file_with_holes conv=notrunc >dd if=file_with_holes bs=8M >/dev/null > >Regards >Yan, Zheng > > >> >> >> Thanks! >> Jianpeng Ma >> >> >Regards >> >Yan, Zheng >> >--- >> >diff --git a/fs/ceph/file.c b/fs/ceph/file.c >> >index 271a346..6ca2921 100644 >> >--- a/fs/ceph/file.c >> >+++ b/fs/ceph/file.c >> >@@ -350,16 +350,17 @@ more: >> > ret, hit_stripe ? " HITSTRIPE" : "", was_short ? " SHORT" : ""); >> > >> > if (ret > 0) { >> >- int didpages = (page_align + ret) >> PAGE_CACHE_SHIFT; >> >+ int didpages = (page_align + this_len) >> PAGE_CACHE_SHIFT; >> > >> >- if (read < pos - off) { >> >- dout(" zero gap %llu to %llu\n", off + read, pos); >> >- ceph_zero_page_vector_range(page_align + read, >> >- pos - off - read, pages); >> >+ if (was_short) { >> >+ dout(" zero gap %llu to %llu\n", >> >+ pos + ret, pos + this_len); >> >+ ceph_zero_page_vector_range(page_align + ret, >> >+ this_len - ret, page_pos); >> > } >> >- pos += ret; >> >+ pos += this_len; >> > read = pos - off; >> >- left -= ret; >> >+ left -= this_len; >> > page_pos += didpages; >> > pages_left -= didpages; >> >?韬{.n?????%??檩??w?{.n????u朕?Ф?塄}?财??j:+v??????2??璀??摺?囤??z夸z罐?+?????w棹f