Re: Disable fiemap lead to Data In-balance between OSD

On Fri, Oct 14, 2016 at 1:06 AM, Ning Yao <zay11022@xxxxxxxxx> wrote:
> Thanks to Haomai for the suggested solutions. What about these:
> https://github.com/mslovy/ceph/commit/539b7998fea16f8af3f6cbbbd243f6996f292acc
> https://github.com/mslovy/ceph/commit/33240080f3324a70a288c79a77846688c1f29db5

Cool, the fix looks good to me.

>
> As Haomai described, fiemap is disabled by default in previous versions
> and may not be used in the newest version.
> So is this fix really needed, and should we backport it? Any suggestions?
>
> Ping Sam.

Sam is on vacation. @sage, your opinion?

>
> Regards
> Ning Yao
>
>
> 2016-10-12 23:02 GMT+08:00 Haomai Wang <haomai@xxxxxxxx>:
>> Thanks to Ning Yao, we have found Ceph's incorrect usage of XFS fiemap.
>>
>> Actually, this reminds me that when I was looking into unaligned fiemap
>> lookups, we also observed this case. Refer to
>> http://www.spinics.net/lists/xfs/msg38001.html: if a file has more than
>> 1364 extents, a single fiemap call returns only the first 1364. We need
>> to check whether the last returned extent has the FIEMAP_EXTENT_LAST flag
>> set. If not, we need to keep calling fiemap.
>>
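A minimal sketch of the retry loop described above, not taken from the linked commits. It simulates the kernel rather than issuing a real FS_IOC_FIEMAP ioctl: `fetch` and `make_fake_fetch` are hypothetical stand-ins for one fiemap call that returns at most a fixed number of extents. Only the `FIEMAP_EXTENT_LAST` value comes from linux/fiemap.h.

```python
from typing import Callable, List, NamedTuple

FIEMAP_EXTENT_LAST = 0x00000001  # value from linux/fiemap.h


class Extent(NamedTuple):
    logical: int  # byte offset of the extent within the file
    length: int   # extent length in bytes
    flags: int    # FIEMAP_EXTENT_* bits


# Hypothetical stand-in for one fiemap call: given (start, length), return
# the extents overlapping that range, possibly truncated by the kernel.
FetchFn = Callable[[int, int], List[Extent]]


def collect_all_extents(fetch: FetchFn, start: int, length: int) -> List[Extent]:
    """Call fetch until FIEMAP_EXTENT_LAST is seen or the range is exhausted."""
    end = start + length
    result: List[Extent] = []
    pos = start
    while pos < end:
        batch = fetch(pos, end - pos)
        if not batch:
            break  # no more extents in the requested range
        result.extend(batch)
        last = batch[-1]
        if last.flags & FIEMAP_EXTENT_LAST:
            break  # the final extent of the file has been returned
        pos = last.logical + last.length  # resume right after the last extent
    return result


def make_fake_fetch(extents: List[Extent], max_per_call: int) -> FetchFn:
    """Simulate a kernel that returns at most max_per_call extents per call."""
    def fetch(start: int, length: int) -> List[Extent]:
        hits = [e for e in extents
                if e.logical + e.length > start and e.logical < start + length]
        return hits[:max_per_call]
    return fetch
```

With a simulated limit of 1364 extents per call, a single `fetch` on a 3000-extent file returns only 1364 entries, while `collect_all_extents` returns all 3000.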
>> Fortunately, 1364 extents require at least an 8MB object, but rbd's
>> default object size is 4MB. So if we don't change the object size,
>> nothing happens. But I remember OpenStack Glance's default object size
>> is 64MB, so it may be a problem in that case. Since I often advise rbd
>> users to turn fiemap on, I hope no one has hit this bug....
>>
>> One way is to fix the fiemap usage in GenericFilesystemBackend; another
>> is to abandon fiemap entirely in hammer. Or do we not need to do anything,
>> since fiemap is disabled by default?
>>
>> Anyway, thanks Ning Yao again!
>>
>>
>> On Fri, Sep 30, 2016 at 11:23 AM, Jeff Liu <jeff.liu@xxxxxxxxxxxx> wrote:
>>> Could you please show your test cases for the fiemap issue on XFS?
>>> I'd like to dig into it if it still exists in the upstream code base.
>>>
>>> On 2016-09-29 21:49, Ning Yao wrote:
>>>
>>> XFS limits the number of fiemap extents returned per call in the kernel,
>>> so if we do not use seek_data/seek_hole, we get a wrong fiemap (with
>>> some extents missing) for a large object. So fiemap is actually not
>>> safe before Jewel, where filestore_seek_data_hole can be enabled.
>>> Regards
>>> Ning Yao
>>>
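For illustration, a minimal sketch of walking a file's data regions with SEEK_DATA/SEEK_HOLE instead of fiemap. This is not Ceph's filestore code; the function name `data_extents` is hypothetical, and the sketch assumes a platform where Python exposes `os.SEEK_DATA` and `os.SEEK_HOLE` (for example Linux). On a filesystem without hole support the whole file is simply reported as one data region.

```python
import errno
import os
from typing import List, Tuple


def data_extents(fd: int) -> List[Tuple[int, int]]:
    """Return (offset, length) pairs for the data regions of an open file."""
    size = os.fstat(fd).st_size
    extents: List[Tuple[int, int]] = []
    pos = 0
    while pos < size:
        try:
            data = os.lseek(fd, pos, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:
                break  # only a hole remains between pos and end of file
            raise
        # End of file counts as an implicit hole, so this always succeeds
        # and returns an offset greater than data.
        hole = os.lseek(fd, data, os.SEEK_HOLE)
        extents.append((data, hole - data))
        pos = hole
    return extents
```

Because each step asks only for the next data or hole boundary, there is no per-call extent limit to work around. Note that the loop moves the file offset of `fd` as a side effect.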
>>>
>>> 2016-09-29 10:27 GMT+08:00 Haomai Wang <haomai@xxxxxxxx>:
>>>
>>>> On Thu, Sep 29, 2016 at 10:25 AM, Haomai Wang <haomai@xxxxxxxx> wrote:
>>>>> On Thu, Sep 29, 2016 at 12:26 AM, Ning Yao <zay11022@xxxxxxxxx> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Because of the many fiemap issues on XFS, fiemap is now disabled by
>>>>>> default, especially in Hammer, which predates seek_data/seek_hole.
>>>>>>
>>>>>> But disabling fiemap causes a small sparse object to become a
>>>>>> large full object during PushOps, which may lead to a notable data
>>>>>> imbalance between OSDs, especially on newly added OSDs during data
>>>>>> rebalancing. With those full objects, several OSDs may become full
>>>>>> at the same time.
>>>>> So far, I don't know of any existing problem with fiemap enabled in
>>>>> hammer. We did find it may be a problem when cloning onto an existing
>>>>> overlapping data range, but that won't occur in real cases.
>>>> Hmm, I can't guarantee this... I only mean that if you want sparse
>>>> objects, you can enable this. ....
>>>>
>>>>>> Furthermore, currently, it is impossible to make the full objects
>>>>>> sparse again if we enable the fiemap feature in the future.
>>>>>>
>>>>>> So I wonder whether there is any way to turn a full object back into
>>>>>> a sparse object. One idea is to check whether the object's content
>>>>>> contains consecutive zeros and punch holes for those ranges during
>>>>>> deep-scrub. Is that possible and reasonable?
>>>>> Obviously it's more complex than what we would gain.
>>>>>
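To make the deep-scrub idea concrete, here is a rough sketch of only the detection half: finding runs of whole all-zero blocks in an object's content. The function name `zero_runs` and the 4096-byte block size are assumptions for illustration, and nothing here is Ceph code. Actually punching the holes (for instance with fallocate(2) and FALLOC_FL_PUNCH_HOLE) is left out.

```python
from typing import List, Optional, Tuple


def zero_runs(buf: bytes, block: int = 4096) -> List[Tuple[int, int]]:
    """Return (offset, length) runs of whole all-zero blocks in buf."""
    runs: List[Tuple[int, int]] = []
    run_start: Optional[int] = None
    nblocks = len(buf) // block  # a trailing partial block is never reported
    for i in range(nblocks):
        chunk = buf[i * block:(i + 1) * block]
        if not any(chunk):  # every byte in this block is zero
            if run_start is None:
                run_start = i * block
        elif run_start is not None:
            runs.append((run_start, i * block - run_start))
            run_start = None
    if run_start is not None:
        runs.append((run_start, nblocks * block - run_start))
    return runs
```

Only block-aligned ranges are reported, since a hole smaller than a filesystem block would not free any space. Reading every object in full is already part of deep-scrub, which is why the check was proposed there.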
>>>>>>
>>>>>> Regards
>>>>>> Ning Yao
>>>>>> --
>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>> --
>>> Cheers,
>>>
>>> Jeff Liu
>>>