Re: [PATCH v2 2/2] ceph: quota: fix quota subdir mounts

Luis Henriques <lhenriques@xxxxxxxx> · Tue, 19 Mar 2019 16:42:24 +0000

"Yan, Zheng" <ukernel@xxxxxxxxx> writes:

> On Tue, Mar 12, 2019 at 10:22 PM Luis Henriques <lhenriques@xxxxxxxx> wrote:
...
>> +static struct inode *lookup_quotarealm_inode(struct ceph_mds_client *mdsc,
>> +                                            struct super_block *sb,
>> +                                            struct ceph_snap_realm *realm)
>> +{
>> +       struct inode *in;
>> +
>> +       in = ceph_lookup_inode(sb, realm->ino);
>> +       if (IS_ERR(in)) {
>> +               pr_warn("Can't lookup inode %llx (err: %ld)\n",
>> +                       realm->ino, PTR_ERR(in));
>> +               return in;
>> +       }
>> +
>> +       spin_lock(&mdsc->quotarealms_inodes_lock);
>> +       list_add(&ceph_inode(in)->i_quotarealms_inode_item,
>> +                &mdsc->quotarealms_inodes_list);
>> +       spin_unlock(&mdsc->quotarealms_inodes_lock);
>> +
> Multiple threads can call this function for the same inode at the same
> time; this needs to be handled. Besides, the client needs to record
> lookupino errors. Otherwise, it may repeatedly send useless requests.

Good point.  So, the only way I see to fix this is to drop the
mdsc->quotarealms_inodes_list and instead use an ordered list/tree of
structs that would either point to the corresponding ceph inode or to
NULL if there was an error in the lookup:

	struct ceph_realm_inode {
		u64 ino;
		struct ceph_inode_info *ci;	/* NULL if the lookup failed */
		spinlock_t lock;
		unsigned long timeout;		/* when to retry a failed lookup */
	};

The 'timeout' field would be used to retry the lookup if the error
occurred a long time ago.

The code would then create a new struct for the realm->ino (if one is
not found in the mdsc list), lock it and do the lookupino; if there's a
struct already on the list, it either means there's a lookupino in
progress or there was an error in the last lookup.
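
To make the idea concrete, here is a rough userspace sketch of that flow.
It is untested against the kernel and makes several assumptions: pthread
mutexes stand in for the kernel locks (a sleeping lock rather than a
spinlock, on the assumption that the lookup may block), fake_lookupino()
stands in for ceph_lookup_inode(), the list is unordered, and every name
and the RETRY_SECS value are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

#define RETRY_SECS 60	/* assumed delay before retrying a failed lookup */

struct realm_inode {
	uint64_t ino;
	void *inode;		/* NULL if not looked up yet or lookup failed */
	time_t timeout;		/* earliest time a failed lookup may be retried */
	pthread_mutex_t lock;	/* serializes lookups for this ino */
	struct realm_inode *next;
};

static struct realm_inode *realm_inodes;	/* unordered, for brevity */
static pthread_mutex_t realm_inodes_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int lookups_sent;	/* counts simulated MDS requests */

/* Stand-in for the lookupino request: odd inode numbers always fail. */
static void *fake_lookupino(uint64_t ino)
{
	static int dummy_inode;

	atomic_fetch_add(&lookups_sent, 1);
	return (ino & 1) ? NULL : &dummy_inode;
}

/* Find the entry for ino, creating it if it is not on the list yet. */
static struct realm_inode *get_realm_inode(uint64_t ino)
{
	struct realm_inode *ri;

	pthread_mutex_lock(&realm_inodes_lock);
	for (ri = realm_inodes; ri; ri = ri->next)
		if (ri->ino == ino)
			break;
	if (!ri && (ri = calloc(1, sizeof(*ri)))) {
		ri->ino = ino;
		pthread_mutex_init(&ri->lock, NULL);
		ri->next = realm_inodes;
		realm_inodes = ri;
	}
	pthread_mutex_unlock(&realm_inodes_lock);
	return ri;
}

/* Returns the cached inode, or NULL if the lookup failed recently. */
static void *lookup_quotarealm_inode_sketch(uint64_t ino, time_t now)
{
	struct realm_inode *ri = get_realm_inode(ino);
	void *in;

	if (!ri)
		return NULL;
	pthread_mutex_lock(&ri->lock);	/* waits for a lookup in progress */
	if (!ri->inode && now >= ri->timeout) {
		ri->inode = fake_lookupino(ino);
		if (!ri->inode)
			ri->timeout = now + RETRY_SECS;	/* remember the error */
	}
	in = ri->inode;
	pthread_mutex_unlock(&ri->lock);
	return in;
}
```

With this, a second caller for the same ino blocks on the per-entry lock
and then reuses the cached result, and a failed lookup is not repeated
until RETRY_SECS have passed.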

This sounds overly complicated so I may be missing the obvious simple
fix.  Any ideas?

>> +       spin_lock(&realm->inodes_with_caps_lock);
>> +       realm->inode = in;
>
> The reply to lookup_ino should already set realm->inode

Yes, of course.  This was silly.

Cheers,
-- 
Luis