Re: [RFC PATCH 1/3] fs: dax.c: move fs hole signifier from DAX_ZERO_PAGE to XA_ZERO_ENTRY

On Mon, Nov 30, 2020 at 04:09:23PM +0100, Jan Kara wrote:
> On Mon 30-11-20 06:22:42, Amy Parker wrote:
> > > > +/*
> > > > + * A zero entry, XA_ZERO_ENTRY, is used to represent a zero page. This
> > > > + * definition helps with checking if an entry is a PMD size.
> > > > + */
> > > > +#define XA_ZERO_PMD_ENTRY DAX_PMD | (unsigned long)XA_ZERO_ENTRY
> > > > +
> > >
> > > Firstly, if you define a macro, we usually wrap it inside parentheses like:
> > >
> > > #define XA_ZERO_PMD_ENTRY (DAX_PMD | (unsigned long)XA_ZERO_ENTRY)
> > >
> > > to avoid unexpected issues when the macro expands and surrounding operators
> > > have higher priority.
> > 
> > Oops! Must've missed that - I'll make sure to get on that when
> > revising this patch.
> > 
> > > Secondly, I don't think you can combine XA_ZERO_ENTRY with DAX_PMD (or any
> > > other bits for that matter). XA_ZERO_ENTRY is defined as
> > > xa_mk_internal(257) which is ((257 << 2) | 2) - DAX bits will overlap with
> > > the bits xarray internal entries are using and things will break.
> > 
> > Could you provide an example of this overlap? I can't seem to find any.
> 
> Well XA_ZERO_ENTRY | DAX_PMD == ((257 << 2) | 2) | (1 << 1). So the way
> you've defined XA_ZERO_PMD_ENTRY the DAX_PMD will just get lost. AFAIU (but
> Matthew might correct me here), for internal entries (and XA_ZERO_ENTRY is
> one instance of such entry) low 10-bits of the entry values are
> reserved for internal xarray usage so DAX could use only higher bits. For
> classical value entries, only the lowest bit is reserved for xarray usage,
> all the rest is available for the user (and so DAX uses it).

The XArray entries are pretty inventive in how they are used ...

1. If bit 0 is set, it's a value entry.  That is, it encodes an integer
between 0 and LONG_MAX.
2. If bits 0 & 1 are clear, it's a pointer.
3. If bit 0 is clear and bit 1 is set, it's _either_ an internal entry,
_or_ it's a pointer that's only 2-byte aligned.  These can exist on m68k,
alas.
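
The three cases above can be sketched as a small userspace classifier.
This is only an illustration: the demo_* names are hypothetical and are
not the kernel's helpers, and the value encoding (shift left by one, set
bit 0) is assumed from the "integer between 0 and LONG_MAX" rule.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical demo_* names; they only mirror the three rules above. */
enum demo_kind {
	DEMO_VALUE,                /* bit 0 set */
	DEMO_POINTER,              /* bits 0 and 1 clear */
	DEMO_INTERNAL_OR_M68K_PTR  /* bit 0 clear, bit 1 set */
};

static inline enum demo_kind demo_classify(uintptr_t entry)
{
	if (entry & 1)
		return DEMO_VALUE;
	if ((entry & 3) == 0)
		return DEMO_POINTER;
	return DEMO_INTERNAL_OR_M68K_PTR;
}

/* Assumed encoding of a value entry: shift left by one, set bit 0. */
static inline uintptr_t demo_mk_value(unsigned long v)
{
	assert(v <= (unsigned long)LONG_MAX); /* top bit would be shifted out */
	return ((uintptr_t)v << 1) | 1;
}

/* Recover the integer stored in a value entry. */
static inline unsigned long demo_to_value(uintptr_t entry)
{
	return (unsigned long)(entry >> 1);
}
```

Because only bit 0 is reserved in a value entry, every other bit is free
for the user, which is the space the DAX flags live in.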

Internal entries above -MAX_ERRNO are used for returning errors.
Internal entries below 1024 (256 << 2) are used for sibling entries.
Internal entry 256 is the retry entry.
Internal entry 257 is the zero entry.
Internal entries 258-1023 are not currently used.
Internal entries between 4096 and -MAX_ERRNO are pointers to the next
level of the tree.
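
To make the ranges concrete, here is a rough userspace decoder for
internal entries. It is a sketch under assumptions, not the kernel code:
the demo_* names are hypothetical, the encoding (n << 2) | 2 is the one
quoted earlier in this thread, and the node-pointer range is assumed to
end where the error range begins. It also shows why OR-ing DAX_PMD
(1 << 1, as quoted above) into the zero entry changes nothing: bit 1 is
already set.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095UL      /* same value as the kernel's MAX_ERRNO */
#define DEMO_DAX_PMD   (1UL << 1)  /* flag bit as quoted in this thread */

enum demo_internal {
	DEMO_NOT_INTERNAL, DEMO_SIBLING, DEMO_RETRY, DEMO_ZERO,
	DEMO_UNUSED, DEMO_NODE, DEMO_ERROR
};

/* Encode internal entry number n: shift left by two, set bit 1. */
static inline uintptr_t demo_mk_internal(uintptr_t n)
{
	return (n << 2) | 2;
}

/* Recover the internal entry number from a raw internal entry. */
static inline uintptr_t demo_internal_number(uintptr_t raw)
{
	assert((raw & 3) == 2); /* only meaningful for internal entries */
	return raw >> 2;
}

static inline enum demo_internal demo_decode(uintptr_t raw)
{
	/* Smallest raw value treated as an error (unsigned wraparound). */
	uintptr_t err_min = demo_mk_internal((uintptr_t)0 - DEMO_MAX_ERRNO);

	if ((raw & 3) != 2)
		return DEMO_NOT_INTERNAL;
	if (raw >= err_min)
		return DEMO_ERROR;
	if (raw > 4096)
		return DEMO_NODE;
	if (raw < 1024)
		return DEMO_SIBLING;
	switch (demo_internal_number(raw)) {
	case 256: return DEMO_RETRY;
	case 257: return DEMO_ZERO;
	default:  return DEMO_UNUSED;
	}
}
```

With these definitions, demo_mk_internal(257) is 1030 (0x406), and
demo_mk_internal(257) | DEMO_DAX_PMD is still 1030, so the PMD flag is
lost.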

The m68k pointer problem is "solved" by only allowing them to be in a
node which is the bottom of the tree.  This means that the optimisation
of placing a single pointer at index 0 in the root of the tree has to be
disabled for these pointers.  That's unfortunate, but there's no other
way to solve it, given the need for RCU readers.  You also can't use
an m68k pointer for a multi-index entry.

There's also support for pointers tagged in their lower bits.  Those are
incompatible with value entries.  And you can't use pointer tag 2 ...
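
A minimal sketch of pointer tagging, again with hypothetical demo_*
names rather than the real helpers, and assuming pointers that are at
least 4-byte aligned. Tags 1 and 3 set bit 0, so a tagged pointer looks
like a value entry (hence the incompatibility), and tag 2 would produce
the internal-entry bit pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Store a small tag in the two low bits of an aligned pointer. */
static inline uintptr_t demo_tag_pointer(void *p, unsigned int tag)
{
	uintptr_t raw = (uintptr_t)p;

	assert((raw & 3) == 0);        /* needs a 4-byte-aligned pointer */
	assert(tag <= 3 && tag != 2);  /* tag 2 would look like an internal entry */
	return raw | tag;
}

/* Strip the tag bits and recover the original pointer. */
static inline void *demo_untag_pointer(uintptr_t entry)
{
	return (void *)(entry & ~(uintptr_t)3);
}

/* Extract the tag stored in the two low bits. */
static inline unsigned int demo_pointer_tag(uintptr_t entry)
{
	return (unsigned int)(entry & 3);
}
```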


