On Fri, May 26, 2023 at 9:27 PM Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 2023/5/26 04:36, Alexander Larsson wrote:
> > On Fri, May 26, 2023 at 7:12 AM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> >>
> >> On Thu, May 25, 2023 at 7:59 PM Giuseppe Scrivano <gscrivan@xxxxxxxxxx> wrote:
> >>>
> >>> Hi Amir,
> >>>
> >>> Amir Goldstein <amir73il@xxxxxxxxx> writes:
> >>>
> >>>> On Thu, May 25, 2023 at 6:21 PM Alexander Larsson <alexl@xxxxxxxxxx> wrote:
> >>>>>
> >>>>> Something that came up about this in a discussion recently was
> >>>>> multi-layer composefs style images. For example, this may be a useful
> >>>>> approach for multi-layer container images.
> >>>>>
> >>>>> In such a setup you would have one lowerdata layer, but two real
> >>>>> lowerdirs, like lowerdir=A:B::C. In this situation a file in B may
> >>>>> accidentally have the same name as a file on C, causing a redirect
> >>>>> from A to end up in B instead of C.
> >>>>>
> >>>>
> >>>> I was under the impression that the names of the data blobs in C
> >>>> are supposed to be content derived names (hash).
> >>>> Is this not the case or is the concern about hash conflicts?
> >>>>
> >>>>> Would it be possible to have a syntax for redirects that mean "only
> >>>>> lookup in lowerdata layers. For example a double-slash path
> >>>>> //some/file.
> >>>>>
> >>>>
> >>>> Anything is possible if we can define the problem that needs to be solved.
> >>>> In this case, I did not understand why the problem is limited to finding a file
> >>>> by mistake in layer B.
> >>>>
> >>>> If there are several data layers A:B::C:D why wouldn't we have the same
> >>>> problem with a file name collision between C and D?
> >>>
> >>> the data layer is constructed in a way that files are stored by their
> >>> hash and there is control from the container runtime on how this is
> >>> built and maintained. So a file name collision would happen only when
> >>> on a hash collision.
> >>>
> >>> Differently for the other layers we've no control on what files are in
> >>> the image, unless we limit to mount only one EROFS as the first lower
> >>> layer and then all the other lower layers are data layers.
> >>>
> >>> Given your example above A:B::C:D, if both A and B are EROFS we are
> >>> limited in the files/directories that can be in B.
> >>>
> >>> e.g. we have A/foo with the following xattrs:
> >>>
> >>> trusted.overlay.metacopy=""
> >>> trusted.overlay.redirect="/1e/de1743e73b904f16924c04fbd0b7fbfb7e45b8640241e7a08779e8f38fc20d"
> >>>
> >>> Now what would happen if /1e is present as a file in layer B? It will
> >>> just cause the lookup for `foo` to fail with EIO since the redirect
> >>> didn't find any file in the layers below.
> >>>
> >>>
> >>
> >> I understand the problem and I understand why a // redirect to data-only layers
> >> would be a simple and workable solution for composefs.
> >>
> >> Unlike the rest of the changes to overlayfs that we worked on to support
> >> composefs, this would really be a composefs only on-disk format because it
> >> could not be generated by overlayfs itself, so we need Miklos to chime in to
> >> say if this is acceptable.
>
> An alternative way might allow data-only layers (or invisible layers) in the
> middle rather than as the tail?
>

Anything is possible if you can justify its worth.

> I'm not sure in the long term if it's flexible to fix data-only layers as the
> bottom-most layers for future potential use cases.
>
> At a quick glance, I've seen the implementation of this patchset also
> strictly code that. I wonder if using non-fixed invisible layers increases
> the complexity or am I still missing something?
>

The current implementation is quite simplified due to keeping data-only layers
in the tail, and even more so because lazy lowerdata lookup is done only in
the data-only layers at the tail of the stack. The documentation is also
simpler, as are the tests.
Making all of the above more complex needs justification, and so far I have
not seen any use case that would justify it, because the /.cfs workaround is
good enough IMO.

That leaves the question: is the design/API flexible enough to be extended
in the future if we need to?

If we wanted to support data-only layers in the middle of the stack,
would this syntax make sense?

lowerdir=lower1::data1:lower2::data2

If this syntax makes sense to everyone, then we can change the syntax
of data-only layers in the tail from lower1::data1:data2
to lower1::data1::data2 and enforce that after the first ::, only :: is allowed.

Miklos, any thoughts?
I have a feeling that this was your natural interpretation when you first
saw the :: syntax.

Thanks,
Amir.
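
For illustration only, here is a minimal Python sketch of the separator rule proposed above for the tail form (lower1::data1::data2). The function name parse_lowerdir and its error messages are hypothetical. It is not the kernel's mount option parser, it assumes the rule exactly as worded in this mail, and it does not handle backslash-escaped colons.

```python
def parse_lowerdir(spec: str) -> tuple[list[str], list[str]]:
    """Split a lowerdir string into (regular layers, data-only layers).

    Assumed rule (the proposal in this thread, not confirmed kernel
    behaviour): regular layers are separated by ':', the first '::' starts
    the data-only tail, and inside the tail only '::' is a valid separator.
    """
    # Everything before the first '::' is the list of regular lower layers.
    head, sep, tail = spec.partition("::")
    lower = head.split(":")
    if "" in lower:
        raise ValueError(f"empty regular layer in {spec!r}")
    if not sep:
        # No '::' at all: there are no data-only layers.
        return lower, []

    # After the first '::', every layer must be separated by '::'.
    data = tail.split("::")
    for name in data:
        if not name:
            raise ValueError(f"empty data-only layer in {spec!r}")
        if ":" in name:
            raise ValueError(f"single ':' after the first '::' in {spec!r}")
    return lower, data
```

Under this rule, parse_lowerdir("lower1::data1::data2") yields one regular layer and two data-only layers, while both the old tail form lower1::data1:data2 and the mid-stack form lower1::data1:lower2::data2 are rejected. Accepting the mid-stack form would need a different rule, which is the open question in this mail.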