Re: [potential issue, question] whiteout shows up in merged directory

On 9/4/23 10:07 PM, Amir Goldstein wrote:
> On Mon, Sep 4, 2023 at 4:27 PM Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:
>>
>>
>>
>> On 2023/9/4 20:49, Jingbo Xu wrote:
>>
>> ...
>>
>>>
>>> Thanks for the reply and it's really helpful to me.
>>>
>>> I can understand that in the normal use case, a whiteout cannot appear
>>> in a non-merged directory without the origin xattr unless it's
>>> hand-crafted.
>>>
>>> But indeed we suffer from this issue in the tarfs for erofs-utils we are
>>> developing. As described previously, in tarfs mode erofs-utils can
>>> convert each tar layer into one separate erofs image, and then merge
>>> these erofs images into one merged erofs image in an overlayfs-like model.
>>>
>>> Suppose:
>>>
>>> layer 0  +  layer 1   +  layer 2              -->  merged
>>>             /foo/bar     /foo/bar (whiteout)
>>>
>>>
>>> To speed up the merging process, we may merge the two top-most layers
>>> (layer 1 and layer 2) first, and then merge layer 0 into the final
>>> merged image as:
>>>
>>>
>>>
>>>             layer 1   +  layer 2              -->  merged-intermediate
>>>             /foo/bar     /foo/bar (whiteout)
>>>
>>> layer 0  +  merged-intermediate              -->  merged
>>
>>
>> I could add some more background to this. Assume layer 0 is a
>> baseos layer (e.g. almost all images use this layer), and layer 1 +
>> layer 2 belong to some specific workload images.
>>
>> Since layer 1 + layer 2 are always used together, we could merge
>> them into a new merged layer to avoid the extra overhead of too many
>> overlay layer dirs (to simplify, we only illustrate layer 1 and
>> layer 2 here; there could be layer 3, 4, ...). But layer 1 + layer 2
>> have no relationship with layer 0 in principle (the merge tool
>> doesn't need to know whether layer 0 or any underlying layer
>> exists).
>>
>> So if we merge layer 1 + layer 2 first, and then use layer 0 together
>> with the merged layer, it could generate the whiteout cases
>> described before.
>>
> ...
>>>
>>> Then there comes the problem: when merging layer 1 and layer 2, I need
>>> to keep the whiteout in the intermediate merged image even though the
>>> target of the whiteout has shown up in the underlying layer (/foo/bar
>>> in layer 1), because I have no idea whether "/foo/bar" exists in a
>>> further underlying layer (layer 0).  Reusing this logic, the whiteout
>>> is kept in the final merged image after merging layer 0 and
>>> merged-intermediate.
>>>
>>> Then if "/foo" is not a merged directory, the "/foo/bar" whiteout will
>>> be exposed in the overlayfs unexpectedly.
>>>
>>> Currently we work around this on the erofs-utils side.  Apart from
>>> setting the origin xattr on the parent directory of the whiteout, I'm
>>> not sure if the above use case is reasonable enough to fix this on the
>>> kernel side.
>>>
>> Anyway, we could work around this in the merge tool, but I'm not
>> sure if it's a design constraint of overlayfs.
>>
> 
> Let me put it this way:
> If there was an official offline tool to merge overlayfs layers
> I would expect that tool to mark the offline merged directories
> with an empty "trusted.overlay.origin", to be able to distinguish
> them from pure non-merge directories.
> 
> I do not consider dealing with this on the erofs-utils side a workaround;
> I consider it crafting layers in the expected overlayfs format.

Thanks for the suggestion.  I just tested it, and marking the parent
directory of the whiteout with the origin xattr indeed fixes this issue.
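
For illustration only, a minimal sketch of that marking step on an
unpacked layer tree. It assumes the xattr is named
"trusted.overlay.origin" (the name overlayfs uses in the trusted
namespace); the helper names and paths are hypothetical, and a real
image builder would write the xattr into the image rather than call
setxattr on a mounted tree:

```python
import os
import posixpath

# Assumption: overlayfs origin xattr in the trusted.* namespace. An empty
# value is used only to flag the directory, not to point at a real origin.
OVL_ORIGIN_XATTR = "trusted.overlay.origin"


def parents_of_whiteouts(whiteout_paths):
    """Return the sorted, de-duplicated parent directories of the whiteouts."""
    return sorted({posixpath.dirname(p.rstrip("/")) or "/" for p in whiteout_paths})


def mark_merged_dirs(layer_root, whiteout_paths, setxattr=None):
    """Set an empty origin xattr on every directory that holds a whiteout.

    `setxattr` is injectable so the logic can be exercised without root;
    the default os.setxattr needs CAP_SYS_ADMIN for trusted.* attributes.
    """
    setxattr = setxattr or os.setxattr
    marked = []
    for parent in parents_of_whiteouts(whiteout_paths):
        target = os.path.join(layer_root, parent.lstrip("/"))
        setxattr(target, OVL_ORIGIN_XATTR, b"")
        marked.append(target)
    return marked
```

With the example above, mark_merged_dirs(root, ["/foo/bar"]) would mark
only "/foo" under the layer root.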

> 
> You should know that there are potential costs to marking a directory
> as a merged directory: the ovl_iterate() implementation for merged dirs,
> which needs to filter out whiteouts, is quite different from the
> ovl_iterate_real() case.
> The entire dir needs to be read into cache before any response
> can be returned. For very large dirs this may matter.

Thanks for the reminder.
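
To make the cost concrete for myself, here is a toy model of the two
iteration paths. This is not the kernel code, only a sketch of the
behavior described above; the function names and the (name,
is_whiteout) entry format are made up for illustration:

```python
def iterate_real(entries):
    """Non-merged directory: entries are streamed as they are read.

    Nothing is filtered, so a stray whiteout would be returned as-is.
    """
    for name, _is_whiteout in entries:
        yield name


def iterate_merged(layers):
    """Merged directory: `layers` is ordered top-most layer first.

    The first layer that mentions a name decides whether it is visible,
    so every layer is read into the cache before any name is returned.
    """
    cache = {}
    for entries in layers:
        for name, is_whiteout in entries:
            # Keep the top-most decision; lower layers cannot override it.
            cache.setdefault(name, is_whiteout)
    return [name for name, is_whiteout in cache.items() if not is_whiteout]
```

In this model the merged path hides the "/foo/bar" whiteout, but only
after walking every entry of every layer, which is where the cost for
very large directories comes from.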

> 
> So you may want your tool to be able to clear the unneeded whiteouts
> and unneeded origin xattr eventually.
> 
> OTOH, ovl_dir_read_impure() with xino enabled on layers
> not from the same fs has quite a similar impact.
> Not sure if this configuration is relevant for your use case.

I will also check this later.


-- 
Thanks,
Jingbo


