Re: [PATCH 1/2] lockref: speculatively spin waiting for the lock to be released


On Thu, Jun 13, 2024 at 8:43 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, 13 Jun 2024 at 11:13, Mateusz Guzik <mjguzik@xxxxxxxxx> wrote:
> >
> > I would assume the rule is pretty much well known and instead an
> > indicator where is what (as added in my comments) would be welcome.
>
> Oh, the rule is well-known, but I think what is worth pointing out is
> the different classes of fields, and the name[] field in particular.
>
> This ordering was last really updated back in 2011, by commit
> 44a7d7a878c9 ("fs: cache optimise dentry and inode for rcu-walk"). And
> it was actually somewhat intentional at the time. Quoting from that
> commit:
>
>     We also fit in 8 bytes of inline name in the first 64 bytes, so for short
>     names, only 64 bytes needs to be touched to perform the lookup. We should
>     get rid of the hash->prev pointer from the first 64 bytes, and fit 16 bytes
>     of name in there, which will take care of 81% rather than 32% of the kernel
>     tree.
>

Right. Things degrading by accident is exactly why I suggested adding BUILD_BUG_ON checks.

> but what has actually really changed - and that I didn't even realize
> until I now did a 'pahole' on it, was that this was all COMPLETELY
> broken by
>
>        seqcount_spinlock_t        d_seq;
>
> because seqcount_spinlock_t has been entirely broken and went from
> being 4 bytes back when, to now being 64 bytes.
>
> Crazy crazy.
>
> How did that happen?
>

Perhaps lockdep in your config?

This is the layout on my prod config:
struct dentry {
        unsigned int               d_flags;              /*     0     4 */
        seqcount_spinlock_t        d_seq;                /*     4     4 */
        struct hlist_bl_node       d_hash;               /*     8    16 */
        struct dentry *            d_parent;             /*    24     8 */
        struct qstr                d_name;               /*    32    16 */
        struct inode *             d_inode;              /*    48     8 */
        unsigned char              d_iname[40];          /*    56    40 */
        /* --- cacheline 1 boundary (64 bytes) was 32 bytes ago --- */
        const struct dentry_operations  * d_op;          /*    96     8 */
        struct super_block *       d_sb;                 /*   104     8 */
        long unsigned int          d_time;               /*   112     8 */
        void *                     d_fsdata;             /*   120     8 */
        /* --- cacheline 2 boundary (128 bytes) --- */
        struct lockref             d_lockref __attribute__((__aligned__(8))); /*   128     8 */
        union {
                struct list_head   d_lru;                /*   136    16 */
                wait_queue_head_t * d_wait;              /*   136     8 */
        };                                               /*   136    16 */
        struct hlist_node          d_sib;                /*   152    16 */
        struct hlist_head          d_children;           /*   168     8 */
        union {
                struct hlist_node  d_alias;              /*   176    16 */
                struct hlist_bl_node d_in_lookup_hash;   /*   176    16 */
                struct callback_head d_rcu __attribute__((__aligned__(8))); /*   176    16 */
        } d_u __attribute__((__aligned__(8)));           /*   176    16 */

        /* size: 192, cachelines: 3, members: 16 */
        /* forced alignments: 2 */
} __attribute__((__aligned__(8)));


-- 
Mateusz Guzik <mjguzik gmail.com>
