On Thu, Aug 1, 2013 at 2:30 PM, majianpeng <majianpeng@xxxxxxxxx> wrote: >>On Thu, Aug 1, 2013 at 9:45 AM, majianpeng <majianpeng@xxxxxxxxx> wrote: >>>>On Wed, Jul 31, 2013 at 3:32 PM, majianpeng <majianpeng@xxxxxxxxx> wrote: >>>>> >>>>> [snip] >>>>> Test case >>>>> A: touch file >>>>> dd if=file of=/dev/null bs=5M count=1 iflag=direct >>>>> B: [data(2M)|hole(2m)][data(2M)] >>>>> dd if=file of=/dev/null bs=8M count=1 iflag=direct >>>>> C: [data(4M)[hole(4M)][hole(4M)][data(2M)] >>>>> dd if=file of=/dev/null bs=16M count=1 iflag=direct >>>>> D: touch file;truncate -s 5M file >>>>> dd if=file of=/dev/null bs=8M count=1 iflag=direct >>>>> >>>>> Those cases can work. >>>>> Now i make different processing for short-read between 'ret > 0' and "ret =0". >>>>> For the short-read which ret > 0, it don't do read-page rather than zero the left area. >>>>> This means reduce one meaningless read operation. >>>>> >>>> >>>>This patch looks good. But I still hope not to duplicate code. >>>> >>>>how about change >>>> "hit_stripe = this_len < left;" >>>>to >>>> "hit_stripe = this_len < left && (ret == this_len || pos + this_len < >>>>inode->i_size);" >>>> >>> To make the code easy to understand, i don't apply your suggestion.But i add this check on the judgement of >>> whether read more contents. >>> The follow is the latest patch.Can you check? >>> >>> diff --git a/fs/ceph/file.c b/fs/ceph/file.c >>> index 2ddf061..3d8d14d 100644 >>> --- a/fs/ceph/file.c >>> +++ b/fs/ceph/file.c >>> @@ -349,44 +349,38 @@ more: >>> dout("striped_read %llu~%u (read %u) got %d%s%s\n", pos, left, read, >>> ret, hit_stripe ? " HITSTRIPE" : "", was_short ? " SHORT" : ""); >>> >>> - if (ret > 0) { >>> - int didpages = (page_align + ret) >> PAGE_CACHE_SHIFT; >>> - >>> - if (read < pos - off) { >>> - dout(" zero gap %llu to %llu\n", off + read, pos); >>> - ceph_zero_page_vector_range(page_align + read, >>> - pos - off - read, pages); >>> + if (ret >= 0) { >>> + int didpages; >>> + if (was_short && (pos + ret < inode->i_size)) { >>> + u64 tmp = min(this_len - ret, >>> + inode->i_size - pos - ret); >>> + dout(" zero gap %llu to %llu\n", >>> + pos + ret, pos + ret + tmp); >>> + ceph_zero_page_vector_range(page_align + read + ret, >>> + tmp, pages); >>> + ret += tmp; >>> } >>> + >>> + didpages = (page_align + ret) >> PAGE_CACHE_SHIFT; >>> pos += ret; >>> read = pos - off; >>> left -= ret; >>> page_pos += didpages; >>> pages_left -= didpages; >>> >>> - /* hit stripe? */ >>> - if (left && hit_stripe) >>> + /* hit stripe and need continue*/ >>> + if (left && hit_stripe && pos < inode->i_size) >>> goto more; >>> + >>> } >>> >>> - if (was_short) { >>> + if (ret >= 0) { >>> + ret = read; >>> /* did we bounce off eof? */ >>> if (pos + left > inode->i_size) >>> *checkeof = 1; >>> - >>> - /* zero trailing bytes (inside i_size) */ >>> - if (left > 0 && pos < inode->i_size) { >>> - if (pos + left > inode->i_size) >>> - left = inode->i_size - pos; >>> - >>> - dout("zero tail %d\n", left); >>> - ceph_zero_page_vector_range(page_align + read, left, >>> - pages); >>> - read += left; >>> - } >>> } >>> >>> - if (ret >= 0) >>> - ret = read; >> >>I think this line should be "if (read > 0) ret = read;". Other than >>this, your patch looks good. > Because you metioned this, I noticed for ceph_sync_read/write the result are && the every striped read/write. > That is if we met one error, the total result is error.It can't return partial result. This behavior is not correct. If we read/write some data, then meet an error, we should return the size we have read/written. I think all other FS behave like this. See generic_file_aio_read() and do_generic_file_read(). Regards Yan, Zheng > I think i should write anthor patch for that. >>You can add "Reviewed-by: Yan, Zheng <zheng.z.yan@xxxxxxxxx>" to your >>formal patch. > Ok ,thanks your times. > > Thanks! > Jianpeng Ma >> >>Regards >>Yan, Zheng >> >> >>> dout("striped_read returns %d\n", ret); >>> return ret; >>> } >>> >>> Thanks! >>> Jianpeng Ma > Thanks! > Jianpeng Ma >>On Thu, Aug 1, 2013 at 9:45 AM, majianpeng <majianpeng@xxxxxxxxx> wrote: >>>>On Wed, Jul 31, 2013 at 3:32 PM, majianpeng <majianpeng@xxxxxxxxx> wrote: >>>>> >>>>> [snip] >>>>> Test case >>>>> A: touch file >>>>> dd if=file of=/dev/null bs=5M count=1 iflag=direct >>>>> B: [data(2M)|hole(2m)][data(2M)] >>>>> dd if=file of=/dev/null bs=8M count=1 iflag=direct >>>>> C: [data(4M)[hole(4M)][hole(4M)][data(2M)] >>>>> dd if=file of=/dev/null bs=16M count=1 iflag=direct >>>>> D: touch file;truncate -s 5M file >>>>> dd if=file of=/dev/null bs=8M count=1 iflag=direct >>>>> >>>>> Those cases can work. >>>>> Now i make different processing for short-read between 'ret > 0' and "ret =0". >>>>> For the short-read which ret > 0, it don't do read-page rather than zero the left area. >>>>> This means reduce one meaningless read operation. >>>>> >>>> >>>>This patch looks good. But I still hope not to duplicate code. >>>> >>>>how about change >>>> "hit_stripe = this_len < left;" >>>>to >>>> "hit_stripe = this_len < left && (ret == this_len || pos + this_len < >>>>inode->i_size);" >>>> >>> To make the code easy to understand, i don't apply your suggestion.But i add this check on the judgement of >>> whether read more contents. >>> The follow is the latest patch.Can you check? >>> >>> diff --git a/fs/ceph/file.c b/fs/ceph/file.c >>> index 2ddf061..3d8d14d 100644 >>> --- a/fs/ceph/file.c >>> +++ b/fs/ceph/file.c >>> @@ -349,44 +349,38 @@ more: >>> dout("striped_read %llu~%u (read %u) got %d%s%s\n", pos, left, read, >>> ret, hit_stripe ? " HITSTRIPE" : "", was_short ? " SHORT" : ""); >>> >>> - if (ret > 0) { >>> - int didpages = (page_align + ret) >> PAGE_CACHE_SHIFT; >>> - >>> - if (read < pos - off) { >>> - dout(" zero gap %llu to %llu\n", off + read, pos); >>> - ceph_zero_page_vector_range(page_align + read, >>> - pos - off - read, pages); >>> + if (ret >= 0) { >>> + int didpages; >>> + if (was_short && (pos + ret < inode->i_size)) { >>> + u64 tmp = min(this_len - ret, >>> + inode->i_size - pos - ret); >>> + dout(" zero gap %llu to %llu\n", >>> + pos + ret, pos + ret + tmp); >>> + ceph_zero_page_vector_range(page_align + read + ret, >>> + tmp, pages); >>> + ret += tmp; >>> } >>> + >>> + didpages = (page_align + ret) >> PAGE_CACHE_SHIFT; >>> pos += ret; >>> read = pos - off; >>> left -= ret; >>> page_pos += didpages; >>> pages_left -= didpages; >>> >>> - /* hit stripe? */ >>> - if (left && hit_stripe) >>> + /* hit stripe and need continue*/ >>> + if (left && hit_stripe && pos < inode->i_size) >>> goto more; >>> + >>> } >>> >>> - if (was_short) { >>> + if (ret >= 0) { >>> + ret = read; >>> /* did we bounce off eof? */ >>> if (pos + left > inode->i_size) >>> *checkeof = 1; >>> - >>> - /* zero trailing bytes (inside i_size) */ >>> - if (left > 0 && pos < inode->i_size) { >>> - if (pos + left > inode->i_size) >>> - left = inode->i_size - pos; >>> - >>> - dout("zero tail %d\n", left); >>> - ceph_zero_page_vector_range(page_align + read, left, >>> - pages); >>> - read += left; >>> - } >>> } >>> >>> - if (ret >= 0) >>> - ret = read; >> >>I think this line should be "if (read > 0) ret = read;". Other than >>this, your patch looks good. >>You can add "Reviewed-by: Yan, Zheng <zheng.z.yan@xxxxxxxxx>" to your >>formal patch. >> >>Regards >>Yan, Zheng >> >> >>> dout("striped_read returns %d\n", ret); >>> return ret; >>> } >>> >>> Thanks! >>> Jianpeng Ma -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html