Re: Huge lookup when recursively mkdir

Hi Xiaoxi,

As far as I know, with mkdir -p "/a/b/c/d/e" the client first looks up the
part of the path that already exists, and then issues real mkdir requests
only for the remaining components.
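
Roughly, a userspace sketch of that behavior (my own illustration, not the
actual coreutils / libc code) looks like this:

#include <sys/types.h>
#include <sys/stat.h>
#include <string.h>
#include <errno.h>

/* Walk 'path' one component at a time. Components that already exist may
 * still trigger lookups on the MDS (depending on the client's dentry
 * cache / lease); only the missing tail is actually created. */
static int mkdir_p(const char *path, mode_t mode)
{
    char buf[256];
    size_t i, len = strlen(path);

    if (len >= sizeof(buf))
        return -1;

    for (i = 1; i <= len; i++) {
        if (path[i] == '/' || path[i] == '\0') {
            memcpy(buf, path, i);
            buf[i] = '\0';
            if (mkdir(buf, mode) != 0 && errno != EEXIST)
                return -1;    /* a real error, not "already exists" */
        }
    }
    return 0;
}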

Thanks,
Dongdong.

> On 23 Oct 2017, at 14:01, Xiaoxi Chen <superdebuger@xxxxxxxxx> wrote:
> 
> Yes, actually lots of (50+) clients are trying to create the same
> large directory tree concurrently, so most of the mkdir calls
> return -EEXIST.
> 
> I don't quite understand how a mkdir call at the application level
> ends up as lookup requests on the MDS. Could you please explain a bit
> more?
> 
> 2017-10-23 8:44 GMT+08:00 Yan, Zheng <ukernel@xxxxxxxxx>:
>> On Sun, Oct 22, 2017 at 11:27 PM, Xiaoxi Chen <superdebuger@xxxxxxxxx> wrote:
>>> To add another data point: I switched to ceph-fuse 12.2.0 and am still
>>> seeing lots of lookups.
>>> lookup avg 1892
>>> mkdir avg  367
>>> create avg 222
>>> open avg 228
>>> 
>> 
>> But in your test, the mkdir avg was about 1.5 times the open avg. I think
>> your test created millions of directories, so the lookups came from cache
>> misses. You can try enlarging client_cache_size, but I don't think it
>> will help much when the active set of directories is so large.
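>> 
>> For example, something like this in ceph.conf on the client side (the
>> value below is only an example; the default is 16384):
>> 
>> [client]
>>     client cache size = 262144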
>> 
>> 
>>> 2017-10-21 2:09 GMT+08:00 Xiaoxi Chen <superdebuger@xxxxxxxxx>:
>>>> @Zheng, my kernel doesn't even have c3f4688a08f.   But 200fd27 ("ceph:
>>>> use lookup request to revalidate dentry") is there.
>>>> 
>>>> 2017-10-21 0:54 GMT+08:00 Xiaoxi Chen <superdebuger@xxxxxxxxx>:
>>>>> Thanks, will check.
>>>>> 
>>>>> A general question: does the cephfs kernel client drop its dentry/inode
>>>>> cache aggressively?   What I know is that when the MDS issues
>>>>> CEPH_SESSION_RECALL_STATE the client drops cache, but are there other
>>>>> cases in which the client drops its cache?
>>>>> 
>>>>> 
>>>>> 
>>>>> 2017-10-20 16:39 GMT+08:00 Yan, Zheng <ukernel@xxxxxxxxx>:
>>>>>> On Fri, Oct 20, 2017 at 3:28 PM, Xiaoxi Chen <superdebuger@xxxxxxxxx> wrote:
>>>>>>> Centos 7.3, kernel version 3.10.0-514.26.2.el7.x86_64.
>>>>>>> 
>>>>>>> I extracted the file-creation logic of our workload into the
>>>>>>> reproducer below.
>>>>>>> 
>>>>>>> Running the reproducer concurrently on 2+ nodes shows a lot of lookup OPs.
>>>>>>> I thought the lookups were for opening the directory tree, so I tried
>>>>>>> pre-creating most of the dirs and used ls -i to read the dentries and
>>>>>>> cache them, then re-ran the reproducer; it made no noticeable difference.
>>>>>>> 
>>>>>>> #include <sys/types.h>
>>>>>>> #include <sys/stat.h>
>>>>>>> #include <fcntl.h>
>>>>>>> #include <stdio.h>
>>>>>>> #include <stdlib.h>
>>>>>>> #include <unistd.h>
>>>>>>> 
>>>>>>> /* Descend 'depth' levels, creating a randomly named directory at each
>>>>>>>  * level, then create one file at the bottom. */
>>>>>>> int create_file(char *base, int count, int max, int depth)
>>>>>>> {
>>>>>>>    int i;
>>>>>>>    for(i=0; i<count; i++) {
>>>>>>>        char dir[256];
>>>>>>>        int mydir = rand() % max;
>>>>>>>        snprintf(dir, sizeof(dir), "%s/%d", base, mydir);
>>>>>>>        if (depth >= 1) {
>>>>>>>            mkdir(dir, 0777);
>>>>>>>            create_file(dir, count, max, depth - 1);
>>>>>>>        } else {
>>>>>>>            int fd = open(dir, O_CREAT | O_EXCL | O_WRONLY, 0666);
>>>>>>>            printf("opened path : %s = %d\n", dir, fd);
>>>>>>>            close(fd);
>>>>>>>        }
>>>>>>>    }
>>>>>>>    return 0;
>>>>>>> }
>>>>>>> 
>>>>>>> int main(int argc, char *argv[])
>>>>>>> {
>>>>>>>    while(1) {
>>>>>>>        create_file("/import/SQL01", 1, 4, 10);
>>>>>>>    }
>>>>>>>    return 0;
>>>>>>> }
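>>>>>>> 
>>>>>>> For reference, it can be built and run with something like the following
>>>>>>> (the file name is only an example):
>>>>>>> 
>>>>>>>    gcc -o reproducer reproducer.c
>>>>>>>    ./reproducer     # start one instance on each client node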
>>>>>>> 
>>>>>> 
>>>>>> I still don't see this behavior on a 4.13 kernel. I suspect there is
>>>>>> something wrong with the dentry lease. Please check whether your kernel
>>>>>> includes:
>>>>>> 
>>>>>> commit c3f4688a08f (ceph: don't set req->r_locked_dir in ceph_d_revalidate)
>>>>>> commit 5eb9f6040f3 (ceph: do a LOOKUP in d_revalidate instead of GETATTR)
>>>>>> 
>>>>>> The first commit can cause this issue; the second one fixes it.
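>>>>>> 
>>>>>> One way to check, assuming you have a git checkout of the kernel source
>>>>>> your build is based on, is to grep the fs/ceph/ history for those commit
>>>>>> subjects, e.g.:
>>>>>> 
>>>>>>    git log --oneline fs/ceph/ | grep -E "r_locked_dir in ceph_d_revalidate|LOOKUP in d_revalidate"
>>>>>> 
>>>>>> For a distro kernel you may need to look at the package changelog or
>>>>>> applied patch list instead.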
>>>>>> 
>>>>>> Regards
>>>>>> Yan, Zheng
>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 2017-10-20 10:55 GMT+08:00 Yan, Zheng <ukernel@xxxxxxxxx>:
>>>>>>>> On Fri, Oct 20, 2017 at 12:49 AM, Xiaoxi Chen <superdebuger@xxxxxxxxx> wrote:
>>>>>>>>> Hi,
>>>>>>>>> 
>>>>>>>>>      I am seeing a lot of lookup requests when doing recursive mkdir.
>>>>>>>>>      The workload behaves like this:
>>>>>>>>>          mkdir DIR0
>>>>>>>>>          mkdir DIR0/DIR1
>>>>>>>>>          mkdir DIR0/DIR1/DIR2
>>>>>>>>>          ....
>>>>>>>>>          mkdir DIR0/DIR1/DIR2......./DIR7
>>>>>>>>>          create DIR0/DIR1/DIR2......./DIR7/FILE1
>>>>>>>>> 
>>>>>>>>>      and it runs concurrently on 50+ clients; the dir names on different
>>>>>>>>> clients may or may not be the same.
>>>>>>>>> 
>>>>>>>>>       From the admin socket I was seeing ~50K create requests, but
>>>>>>>>> 400K lookup requests. The lookups eat up most of the MDS capacity,
>>>>>>>>> so file creation is slow.
>>>>>>>>> 
>>>>>>>>>       Where do the lookups come from, and is there any way to
>>>>>>>>> optimize them out?
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> I don't see this behavior when running the following commands with the
>>>>>>>> 4.13 kernel client and the luminous version of ceph-fuse. Which client
>>>>>>>> do you use?
>>>>>>>> 
>>>>>>>> mkdir d1
>>>>>>>> mkdir d1/d2
>>>>>>>> mkdir d1/d2/d3
>>>>>>>> mkdir d1/d2/d3/d4/
>>>>>>>> mkdir d1/d2/d3/d4/d5
>>>>>>>> touch d1/d2/d3/d4/d5/f
>>>>>>>> 
>>>>>>>>>     Xiaoxi




