Re: Disable fiemap lead to Data In-balance between OSD

Thanks to Haomai for the suggested solutions. What about these commits:
https://github.com/mslovy/ceph/commit/539b7998fea16f8af3f6cbbbd243f6996f292acc
https://github.com/mslovy/ceph/commit/33240080f3324a70a288c79a77846688c1f29db5

As Haomai described, fiemap was disabled by default in previous versions
and may not be used in the newest version.
So is this fix really needed, and should we backport it? Any suggestions?

Ping Sam.

Regards
Ning Yao


2016-10-12 23:02 GMT+08:00 Haomai Wang <haomai@xxxxxxxx>:
> Thanks to Ning Yao. We have found Ceph's incorrect usage of fiemap on XFS.
>
> Actually, this reminds me that when I was looking into unaligned fiemap
> lookups, we also observed this case. Refer to
> http://www.spinics.net/lists/xfs/msg38001.html: if a file has more than
> 1364 extents, a single fiemap call will only return 1364. We need to
> check whether the last returned extent has the FIEMAP_EXTENT_LAST flag.
> If not, we need to continue calling fiemap.
>
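To make the continuation rule above concrete, here is a minimal Python
sketch. It is not the Ceph code: collect_extents, make_fake_fetch and the
(logical_offset, length, flags) tuple layout are hypothetical names chosen
for illustration, and the fetch function is a stand-in for a real
FS_IOC_FIEMAP ioctl call. Only the flag value FIEMAP_EXTENT_LAST = 0x1 is
taken from linux/fiemap.h.

```python
FIEMAP_EXTENT_LAST = 0x1  # flag value from linux/fiemap.h


def collect_extents(fetch, start, length):
    """Call fetch repeatedly until the extent flagged LAST has been seen.

    fetch(offset, length) returns a list of (logical_offset, extent_length,
    flags) tuples and may return fewer extents than actually exist.
    """
    end = start + length
    extents = []
    offset = start
    while offset < end:
        batch = fetch(offset, end - offset)
        if not batch:
            break  # nothing mapped in the remaining range
        extents.extend(batch)
        logical, ext_len, flags = batch[-1]
        if flags & FIEMAP_EXTENT_LAST:
            break  # the filesystem says this was the final extent
        next_offset = logical + ext_len
        if next_offset <= offset:
            break  # no forward progress; avoid looping forever
        offset = next_offset  # resume right after the last extent returned
    return extents


def make_fake_fetch(all_extents, per_call_limit):
    """Hypothetical stand-in for the ioctl: truncates every reply."""
    def fetch(offset, length):
        hits = [e for e in all_extents
                if e[0] + e[1] > offset and e[0] < offset + length]
        return hits[:per_call_limit]
    return fetch


# 3000 extents of 4 KiB, each followed by a 4 KiB hole; only the final
# extent carries FIEMAP_EXTENT_LAST.
fake_extents = [(i * 8192, 4096, FIEMAP_EXTENT_LAST if i == 2999 else 0)
                for i in range(3000)]
fake_fetch = make_fake_fetch(fake_extents, per_call_limit=1364)
single_call = fake_fetch(0, 3000 * 8192)
looped = collect_extents(fake_fetch, 0, 3000 * 8192)
```

With the fake backend, a single call yields only the first 1364 extents,
while the loop recovers all 3000.
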
> Fortunately, 1364 extents require at least an 8MB object, but rbd's
> default object size is 4MB. So if we don't change the object size, nothing
> happens. But I remember OpenStack Glance's default object size is 64MB,
> so it may be a problem in that case. Since I often advise rbd users
> to turn fiemap on, I hope no one hits this bug....
>
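One way to read the 8MB figure, as a rough sketch: assuming 4 KiB
filesystem blocks and the worst case where every block is its own extent
(both are my assumptions, not stated in the thread), an object cannot have
more extents than it has blocks. The names below are hypothetical.

```python
XFS_FIEMAP_PER_CALL = 1364  # per-call cap reported in the thread
ASSUMED_BLOCK_SIZE = 4096   # assumption: 4 KiB filesystem blocks


def max_extents(object_size, block_size=ASSUMED_BLOCK_SIZE):
    """Upper bound on extents: at most one extent per filesystem block."""
    return object_size // block_size


def can_exceed_cap(object_size, block_size=ASSUMED_BLOCK_SIZE):
    """True if an object of this size could have more extents than the cap."""
    return max_extents(object_size, block_size) > XFS_FIEMAP_PER_CALL


for size_mb in (4, 8, 64):
    size = size_mb * 1024 * 1024
    print(size_mb, "MB:", max_extents(size), can_exceed_cap(size))
```

Under these assumptions a 4MB object has at most 1024 extents and stays
under the cap, while 8MB (2048) and 64MB (16384) objects can exceed it.
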
> One way is to fix the fiemap usage in GenericFilesystemBackend; another
> is to abandon fiemap entirely in hammer. Or do we not need to do anything,
> since fiemap is disabled by default?
>
> Anyway, thanks Ning Yao again!
>
>
> On Fri, Sep 30, 2016 at 11:23 AM, Jeff Liu <jeff.liu@xxxxxxxxxxxx> wrote:
>> Could you please show your test cases for the fiemap issue on XFS?
>> I'd like to dig into it if it still exists in the upstream code base.
>>
>> On 2016-09-29 21:49, Ning Yao wrote:
>>
>> XFS limits the number of extents fiemap returns in the kernel, so if we
>> do not use seek_data/seek_hole, we will get a wrong fiemap (with some
>> extents missing) for a large object. Fiemap is actually not safe
>> before Jewel, where filestore_seek_data_hole can be enabled.
>> Regards
>> Ning Yao
>>
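For comparison, the seek_data/seek_hole approach can be sketched with
Python's os.lseek and the os.SEEK_DATA / os.SEEK_HOLE constants (available
where the platform supports them). This is only an illustration of the
technique, not the FileStore implementation; data_segments is a
hypothetical name.

```python
import errno
import os


def data_segments(fd):
    """Return [(offset, length), ...] for the data regions of an open file."""
    size = os.fstat(fd).st_size
    segments = []
    offset = 0
    while offset < size:
        try:
            # Jump to the next byte that is backed by data.
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:
                break  # no more data after this offset
            raise
        # Find where that data region ends; end of file counts as a hole.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        segments.append((start, end - start))
        offset = end
    return segments
```

On a filesystem without hole support the whole file is reported as a
single data segment, so the result is still a valid (if not sparse) map.
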
>>
>> 2016-09-29 10:27 GMT+08:00 Haomai Wang <haomai@xxxxxxxx>:
>>
>>> On Thu, Sep 29, 2016 at 10:25 AM, Haomai Wang <haomai@xxxxxxxx> wrote:
>>>> On Thu, Sep 29, 2016 at 12:26 AM, Ning Yao <zay11022@xxxxxxxxx> wrote:
>>>>> Hi,
>>>>>
>>>>> Because of the many fiemap issues on XFS, fiemap is now disabled by
>>>>> default, especially in Hammer, before seek_data/seek_hole was added.
>>>>>
>>>>> But disabling the fiemap feature causes a small sparse object to become
>>>>> a large full object during PushOps, which may lead to a notable data
>>>>> imbalance between OSDs, especially on newly added OSDs during data
>>>>> rebalancing. With those full objects, some OSDs may become full
>>>>> simultaneously.
>>>> So far, I don't know of any existing problem with fiemap enabled in
>>>> hammer. Although we found it may be a problem when cloning to an existing
>>>> overlapping data range, that won't occur in real cases.
>>> Hmm, I can't guarantee this... I only mean that if you want to have
>>> sparse objects, you can enable this. ....
>>>
>>>>> Furthermore, currently, it is impossible to make the full objects
>>>>> sparse again if we enable the fiemap feature in the future.
>>>>>
>>>>> So I wonder whether there is any way to make a full object sparse
>>>>> again. One idea is to check whether the object's content contains
>>>>> consecutive zeros and punch holes for those ranges during deep-scrub.
>>>>> Is that possible and reasonable?
>>>> Obviously it's a more complex thing than we expect.
>>>>
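The detection half of that idea (finding zero ranges worth punching) can
be sketched as below. This is only an illustration under assumptions: the
4 KiB block size and the name zero_runs are hypothetical, and the actual
hole punching (for example fallocate with FALLOC_FL_PUNCH_HOLE on Linux)
is deliberately left out.

```python
def zero_runs(data, block_size=4096):
    """Return [(offset, length), ...] of block-aligned all-zero runs."""
    runs = []
    run_start = None
    zero_block = bytes(block_size)
    for off in range(0, len(data), block_size):
        block = data[off:off + block_size]
        # The final block may be short, so compare against a same-size slice.
        if block == zero_block[:len(block)]:
            if run_start is None:
                run_start = off  # a new zero run begins here
        elif run_start is not None:
            runs.append((run_start, off - run_start))
            run_start = None
    if run_start is not None:
        runs.append((run_start, len(data) - run_start))
    return runs


# Example buffer: data, two zero blocks, data, one zero block.
sample = b"x" * 4096 + bytes(8192) + b"y" * 4096 + bytes(4096)
candidates = zero_runs(sample)
```

Each returned (offset, length) pair is a candidate range for hole
punching; whether doing this during deep-scrub is safe is exactly the open
question in this thread.
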
>>>>>
>>>>> Regards
>>>>> Ning Yao
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>> --
>> Cheers,
>>
>> Jeff Liu
>>