Re: ceph-mds lock 10-second delay


 



 Use the top command to see ceph-mds:

   PID USER  PR  NI    VIRT    RES   SHR S  %CPU %MEM    TIME+ COMMAND
 36769 ceph  20   0 5860096 4.264g 14880 S  74.8  1.7  3152:47 ceph-mds
 10675 root  20   0  158132   4900  3776 R   1.6  0.0  0:00.88 top

2018-03-09 21:11 GMT+08:00 qi Shi <m13913886148@xxxxxxxxx>:
> 1, When I use the "ls" command:
> ---------------------------------------------------------------------
> [root@JXQ-240-27-10 mycephfs]# time ls
> gpu  testshiqi  tiger
>
> real    0m3.325s
> user    0m0.000s
> sys     0m0.008s
> ---------------------------------------------------------------------
>
>
> 2, When I use vi to open or close a file:
> ------------------------------------------------------------------------
> [root@JXQ-240-27-10 mycephfs]# time vi testshiqi
> real    0m7.529s
> user    0m0.002s
> sys     0m0.012s
> ----------------------------------------------------------------------
> Sometimes executing "vi xxxfile" is delayed by a few seconds, and
> sometimes executing ":q!" to close the file is delayed by a few seconds.
> "testshiqi" is a test file, and I am the only one using it.
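The `time ls` / `time vi` measurements above can be reproduced programmatically, which makes it easier to collect many samples and spot outliers. A minimal sketch (it builds a throwaway temporary directory as a stand-in for the CephFS mount, since the mount path from the thread is not reproducible here):

```python
import os
import time
import tempfile

def time_op(fn):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start

# Stand-in for the mycephfs mount; point this at the real mount to measure it.
with tempfile.TemporaryDirectory() as d:
    for name in ("gpu", "testshiqi", "tiger"):
        open(os.path.join(d, name), "w").close()

    # What `time ls` measures, minus output formatting: a readdir.
    entries, listing_s = time_op(lambda: sorted(os.listdir(d)))

    # An open/close round trip approximates the metadata cost vi pays
    # on open and on ":q!" (each needs MDS capabilities on CephFS).
    _, open_s = time_op(lambda: open(os.path.join(d, "testshiqi")).close())

print(entries, f"list={listing_s:.6f}s open={open_s:.6f}s")
```

On a local filesystem both timings should be well under a millisecond; multi-second numbers on the CephFS mount, as in the transcripts above, point at the metadata path rather than the client.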
>
> 2018-03-09 21:00 GMT+08:00 qi Shi <m13913886148@xxxxxxxxx>:
>> User reports:
>> 1, Opening files with the vi tool is very slow.
>> 2, The "ls" and "du" commands are very slow.
>>
>> How to solve this problem?
>>
>> The ceph-mds operation log I collected:
>> ------------------------------------------------------------------------------------------------------------
>>   "description": "client_request(client.26337:550474 open
>> #100018f693d\/zj_12345067041514269333_20171227031708_6503737997126613699_120.wav
>> 2018-03-09 19:40:01.664683)",
>>             "initiated_at": "2018-03-09 19:40:01.665946",
>>             "age": 30.520671,
>>             "duration": 10.221127,
>>             "type_data": [
>>                 "done",
>>                 "client.26337:550474",
>>                 "client_request",
>>                 {
>>                     "client": "client.26337",
>>                     "tid": 550474
>>                 },
>>                 [
>>                     {
>>                         "time": "2018-03-09 19:40:01.665946",
>>                         "event": "initiated"
>>                     },
>>                     {
>>                         "time": "2018-03-09 19:40:11.885465",
>>                         "event": "acquired locks"
>>                     },
>>                     {
>>                         "time": "2018-03-09 19:40:11.885772",
>>                         "event": "replying"
>>                     },
>>                     {
>>                         "time": "2018-03-09 19:40:11.885991",
>>                         "event": "finishing request"
>>                     },
>>                     {
>>                         "time": "2018-03-09 19:40:11.887036",
>>                         "event": "cleaned up request"
>>                     },
>>                     {
>>                         "time": "2018-03-09 19:40:11.887073",
>>                         "event": "done"
>>                     }
>>                 ]
>> ----------------------------------------------------------------------------------------------------
>> I found a 10-second delay between the "initiated" and "acquired locks" events.
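The gap can be read directly off the event timeline in the op dump. A small sketch that parses the events shown above and finds the slowest step (the timestamps are copied verbatim from the log; the JSON field names match the dump format printed earlier in this message):

```python
from datetime import datetime

# Event timeline copied from the ceph-mds op dump above.
events = [
    {"time": "2018-03-09 19:40:01.665946", "event": "initiated"},
    {"time": "2018-03-09 19:40:11.885465", "event": "acquired locks"},
    {"time": "2018-03-09 19:40:11.885772", "event": "replying"},
    {"time": "2018-03-09 19:40:11.885991", "event": "finishing request"},
    {"time": "2018-03-09 19:40:11.887036", "event": "cleaned up request"},
    {"time": "2018-03-09 19:40:11.887073", "event": "done"},
]

def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f")

# Elapsed seconds between each consecutive pair of events.
gaps = [
    (a["event"], b["event"], (parse(b["time"]) - parse(a["time"])).total_seconds())
    for a, b in zip(events, events[1:])
]
worst = max(gaps, key=lambda g: g[2])
print(worst)  # -> ('initiated', 'acquired locks', 10.219519)
```

Essentially the whole 10.22 s duration sits in the "initiated" to "acquired locks" step, i.e. the request spent ~10 s waiting for inode locks on the MDS, not doing I/O.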
>>
>>
>> My hardware configuration:
>> 1, All OSDs are backed by SSDs.
>> 2, osds: 64
>> 3, memory : 256G
>> 4, cpus : 48
>> 5, network: 20000Mb/s
>> 6, ceph version : 10.2.6
>> 7, linux kernel version: 4.13
>>
>> ceph-mds usage:
>> 1, ceph-mds memory usage: 4G
>> 2, ceph-mds cpu usage rate: 200%
>> 3, files: 30 million small files in cephfs
>> 4, objects: 100 million objects in the cephfs_data pool
>> 5, 10 users are operating in their own private folders; data is not shared
>> 6, every user runs an AI program that reads, writes, and searches
>> training files, and also uses the vi tool to edit script files.
>>
>> ceph osd perf:
>> ---------------------------------------------------------------------------------------------------
>> osd fs_commit_latency(ms) fs_apply_latency(ms)
>>   9                     0                    2
>>   8                     1                    3
>>   7                     0                    2
>>  63                     0                    2
>>  62                     0                    1
>>  61                     0                    2
>>  60                     0                    2
>>   6                     0                    2
>>  59                     1                    2
>>  58                     1                    2
>>  57                     1                    2
>>  56                     1                    2
>>  55                     1                    3
>>  54                     0                    2
>>  53                     1                    2
>> ---------------------------------------------------------------------------------------------------
>> So I think the delay is not caused by the OSDs.
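That conclusion can be checked mechanically against the `ceph osd perf` rows shown above: if every OSD's commit/apply latency is a few milliseconds, the OSDs cannot account for a 10-second stall. A sketch over the values from the table (columns: osd id, fs_commit_latency ms, fs_apply_latency ms):

```python
# Rows copied from the `ceph osd perf` output above.
perf_text = """\
  9 0 2
  8 1 3
  7 0 2
 63 0 2
 62 0 1
 61 0 2
 60 0 2
  6 0 2
 59 1 2
 58 1 2
 57 1 2
 56 1 2
 55 1 3
 54 0 2
 53 1 2
"""

rows = [tuple(map(int, line.split())) for line in perf_text.splitlines()]
max_commit = max(commit for _, commit, _ in rows)
max_apply = max(apply for _, _, apply in rows)
print(max_commit, max_apply)  # -> 1 3
```

Worst case across these OSDs is 1 ms commit / 3 ms apply, three to four orders of magnitude below the observed 10 s lock wait, which is consistent with the delay living in MDS lock acquisition rather than the data path.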
--


