Re: ceph mds dump tree - root inode is not in cache

Hi Weiwen,

please see also my previous 2 posts. There seems to be something wrong when trying to dump stray buckets. This command works:

[root@rit-tceph ~]# ceph tell mds.0 dump tree '/' | jq ".[] | .dirfrags |.[] | .path"
2022-08-07T17:25:34.430+0200 7fbcfbfff700  0 client.439291 ms_handle_reset on v2:10.41.24.14:6812/3943985176
2022-08-07T17:25:34.473+0200 7fbd017fa700  0 client.456018 ms_handle_reset on v2:10.41.24.14:6812/3943985176
"/data/blobs"
"/data"
""
However, this does not:

[root@rit-tceph ~]# ceph tell mds.0 dump tree '~mds0' | jq ".[] | .dirfrags |.[] | .path"
2022-08-07T17:27:16.623+0200 7fb294ff9700  0 client.439345 ms_handle_reset on v2:10.41.24.14:6812/3943985176
2022-08-07T17:27:16.665+0200 7fb295ffb700  0 client.456072 ms_handle_reset on v2:10.41.24.14:6812/3943985176
root inode is not in cache

The dir "~mds0" *is* in cache though (see the dump in my previous post). Is there a problem interpreting the "~" character?

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Frank Schilder
Sent: 07 August 2022 17:18:48
To: 胡 玮文
Cc: ceph-users@xxxxxxx
Subject: Re:  Re: ceph mds dump tree - root inode is not in cache

Hi Weiwen,

please see also my previous post. I managed to get a cache dump and it looks like the root inode *is* in cache:

[inode 0x1 [...3a,head] / auth v64340 snaprealm=0x557751112000 f(v0 m2022-07-28T10:01:37.819634+0200 1=0+1) n(v10574 rc2022-08-05T06:26:53.074073+0200 3=0+3)/n(v0 rc2022-07-26T14:27:33.756075+0200 1=0+1) (inest lock) (iversion lock) caps={451155=pAsLsXsFs/-@2} | dirtyscattered=0 request=0 lock=0 dirfrag=1 caps=1 openingsnapparents=0 dirty=0 waiter=0 authpin=0 0x55775110e000]

and so do all the other paths I was looking for:

[root@ceph-mds:tceph-02 /]# grep "\[dir " /var/lib/ceph/mds-cache
 [dir 0x602 ~mds0/stray2/ [2,head] auth v=4597793 cv=4597793/4597793 state=1073750017|complete|sticky f(v2 m2022-08-05T06:26:53.074073+0200)/f(v2 m2022-08-05T06:26:53.074073+0200 96085=81003+15082) n(v33 rc2022-08-05T06:26:53.074073+0200)/n(v33 rc2022-08-05T06:26:53.074073+0200 b2874298196 96085=81003+15082) hs=0+0,ss=0+0 | child=0 sticky=1 dirty=0 waiter=0 authpin=0 0x557752a35b80]
 [dir 0x10000000000 /data/blobs/ [2,head] auth v=181851 cv=181851/181851 state=1073741824 f(v0 m2022-08-05T06:26:53.074073+0200) n(v12372 rc2022-08-05T06:26:53.074073+0200) hs=0+0,ss=0+0 0x557750cdb600]
 [dir 0x1 / [2,head] auth v=145874 cv=145874/145874 dir_auth=0 state=1074003969|complete f(v0 m2022-07-28T10:01:37.819634+0200 1=0+1) n(v10574 rc2022-08-05T06:26:53.074073+0200 2=0+2) hs=1+0,ss=0+0 | child=1 subtree=1 subtreetemp=0 dirty=0 waiter=0 authpin=0 0x557750cda580]
 [dir 0x603 ~mds0/stray3/ [2,head] auth v=4597188 cv=4597188/4597188 state=1073750017|complete|sticky f(v2 m2022-08-05T06:26:26.907743+0200)/f(v2 m2022-08-05T06:26:26.907743+0200 97924=81625+16299) n(v47 rc2022-08-05T06:26:52.614067+0200)/n(v47 rc2022-08-05T06:26:52.614067+0200 b4035703251 97544=81625+15919) hs=0+0,ss=0+0 | child=0 sticky=1 dirty=0 waiter=0 authpin=0 0x557752a37180]
 [dir 0x609 ~mds0/stray9/ [2,head] auth v=4609066 cv=4413069/4413069 state=1610620929|complete|sticky f(v2 m2022-08-05T06:26:44.666967+0200)/f(v2 m2022-08-05T06:26:44.666967+0200 98543=82777+15766) n(v51 rc2022-08-05T06:26:52.863070+0200)/n(v51 rc2022-08-05T06:26:52.863070+0200 b4665727214 98095=82777+15318) hs=0+97998,ss=0+0 dirty=97998 | child=1 sticky=1 dirty=1 waiter=0 authpin=0 0x5577529fab00]
 [dir 0x604 ~mds0/stray4/ [2,head] auth v=4591327 cv=4591327/4591327 state=1073750017|complete|sticky f(v2 m2022-08-05T06:26:31.932807+0200)/f(v2 m2022-08-05T06:26:31.932807+0200 96712=80727+15985) n(v29 rc2022-08-05T06:26:52.109061+0200)/n(v29 rc2022-08-05T06:26:52.109061+0200 b2479976682 96621=80727+15894) hs=0+0,ss=0+0 | child=0 sticky=1 dirty=0 waiter=0 authpin=0 0x557752a43600]
 [dir 0x600 ~mds0/stray0/ [2,head] auth v=4599260 cv=4599260/4599260 state=1073750017|complete|sticky f(v2 m2022-08-05T06:26:49.671030+0200)/f(v2 m2022-08-05T06:26:49.671030+0200 97254=80694+16560) n(v55 rc2022-08-05T06:26:52.837070+0200)/n(v55 rc2022-08-05T06:26:52.837070+0200 118=0+118) hs=0+0,ss=0+0 | child=0 sticky=1 dirty=0 waiter=0 authpin=0 0x557752f98680]
 [dir 0x1000007021e /data/ [2,head] auth v=146646 cv=146646/146646 state=1073741825|complete f(v0 m2022-07-28T10:01:37.819634+0200 1=0+1) n(v7819 rc2022-08-05T06:26:53.074073+0200 1=0+1) hs=1+0,ss=0+0 | child=1 dirty=0 waiter=0 authpin=0 0x557750cdb080]
 [dir 0x601 ~mds0/stray1/ [2,head] auth v=4599585 cv=4599585/4599585 state=1073750017|complete|sticky f(v2 m2022-08-05T06:26:50.245037+0200)/f(v2 m2022-08-05T06:26:50.245037+0200 97137=81336+15801) n(v42 rc2022-08-05T06:26:52.457065+0200)/n(v42 rc2022-08-05T06:26:52.457065+0200 55=0+55) hs=0+0,ss=0+0 | child=0 sticky=1 dirty=0 waiter=0 authpin=0 0x557752f98c00]
 [dir 0x608 ~mds0/stray8/ [2,head] auth v=4578631 cv=4578631/4578631 state=1073750017|complete|sticky f(v2 m2022-08-05T06:26:44.333963+0200)/f(v2 m2022-08-05T06:26:44.333963+0200 91898=76187+15711) n(v32 rc2022-08-05T06:26:52.964072+0200)/n(v32 rc2022-08-05T06:26:52.964072+0200 b3085898503 91762=76187+15575) hs=0+0,ss=0+0 | child=0 sticky=1 dirty=0 waiter=0 authpin=0 0x557752a45180]
 [dir 0x605 ~mds0/stray5/ [2,head] auth v=4598617 cv=4598617/4598617 state=1073750017|complete|sticky f(v2 m2022-08-05T06:26:32.601815+0200)/f(v2 m2022-08-05T06:26:32.601815+0200 96490=80224+16266) n(v28 rc2022-08-05T06:26:51.655055+0200)/n(v28 rc2022-08-05T06:26:51.655055+0200 b2138910967 96352=80224+16128) hs=0+0,ss=0+0 | child=0 sticky=1 dirty=0 waiter=0 authpin=0 0x557752a44c00]
 [dir 0x606 ~mds0/stray6/ [2,head] auth v=4584499 cv=4584499/4584499 state=1073750017|complete|sticky f(v2 m2022-08-05T06:26:36.775868+0200)/f(v2 m2022-08-05T06:26:36.775868+0200 92769=77025+15744) n(v31 rc2022-08-05T06:26:52.897071+0200)/n(v31 rc2022-08-05T06:26:52.897071+0200 b3068679378 92769=77025+15744) hs=0+0,ss=0+0 | child=0 sticky=1 dirty=0 waiter=0 authpin=0 0x5577529fa000]
 [dir 0x100 ~mds0/ [2,head] auth v=27708976 cv=27708976/27708976 dir_auth=0 state=1073741825|complete f(v0 10=0+10) n(v17779 rc2022-08-05T06:26:53.074073+0200 b26615951175 766896=640367+126529)/n(v17779 rc2022-08-05T06:26:53.074073+0200 b29434092347 865600=721703+143897) hs=10+0,ss=0+0 | child=1 subtree=1 subtreetemp=0 dirty=0 waiter=0 authpin=0 0x557750cdab00]
 [dir 0x607 ~mds0/stray7/ [2,head] auth v=4598900 cv=4598900/4598900 state=1073750017|complete|sticky f(v2 m2022-08-05T06:26:37.998883+0200)/f(v2 m2022-08-05T06:26:37.998883+0200 97750=80799+16951) n(v27 rc2022-08-05T06:26:51.964059+0200)/n(v27 rc2022-08-05T06:26:51.964059+0200 b4266756984 97485=80799+16686) hs=0+0,ss=0+0 | child=0 sticky=1 dirty=0 waiter=0 authpin=0 0x5577529fb080]

Going back to the code fragment, is it possible that the test (!in) is actually not testing whether the root inode is in cache:

  CInode *in = mdcache->cache_traverse(filepath(root.c_str()));
  if (!in) {
    ss << "root inode is not in cache";
    return;
  }

but rather whether the path in root.c_str() resolves to a non-empty directory? Judging from the initialisation, the path can be any path:

void MDSRank::command_dump_tree(const cmdmap_t &cmdmap, std::ostream &ss, Formatter *f)
{
  std::string root;
  int64_t depth;
  cmd_getval(cmdmap, "root", root);
  if (root.empty()) {
    root = "/";
  }

What if this is simply a misleading error message and the test cannot distinguish between "a dir being empty" and "a dir not being in cache"? In my case, stray0 is empty at the moment.
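
To make my guess a bit more concrete, here is a tiny self-contained toy model (this is *not* the actual MDCache code, every name in it is invented): if the traverse only walks names reachable from "/", then it returns a null pointer both for a base like "~mds0" that it cannot resolve and for a path that really is missing from the cache, and the caller has no way to tell which of the two happened, so it prints the same "root inode is not in cache" message either way:

// Toy model of my hypothesis -- NOT the real MDCache code, all names invented.
#include <iostream>
#include <map>
#include <sstream>
#include <string>

struct FakeInode { std::string path; };

// Only entries reachable from "/" are in this toy cache; "~mds0" has no
// dentry under "/", just like in my cache dump above.
std::map<std::string, FakeInode> cache = {
  {"/", {"/"}}, {"/data", {"/data"}}, {"/data/blobs", {"/data/blobs"}},
};

FakeInode* toy_cache_traverse(const std::string& path) {
  // Walk the path component by component, always starting from "/".
  std::string cur = "/";
  if (!cache.count(cur))
    return nullptr;                                // root really not cached
  FakeInode* in = &cache[cur];
  std::istringstream comps(path);
  std::string c;
  while (std::getline(comps, c, '/')) {
    if (c.empty())
      continue;
    cur = (cur == "/") ? "/" + c : cur + "/" + c;
    auto it = cache.find(cur);
    if (it == cache.end())
      return nullptr;                              // component did not resolve
    in = &it->second;
  }
  return in;
}

int main() {
  for (const char* p : {"/", "/data", "~mds0", "~mds0/stray0"}) {
    FakeInode* in = toy_cache_traverse(p);
    // The caller cannot tell *why* it got nullptr and prints the same message.
    std::cout << p << " -> "
              << (in ? in->path : std::string("root inode is not in cache")) << "\n";
  }
}

In this toy, "~mds0" and "~mds0/stray0" produce exactly the message I am seeing, even though nothing about the real root inode is wrong.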

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Frank Schilder <frans@xxxxxx>
Sent: 07 August 2022 16:21:46
To: 胡 玮文
Cc: ceph-users@xxxxxxx
Subject:  Re: ceph mds dump tree - root inode is not in cache

Hi Weiwen,

sorry, I sent the output of ceph fs status after I unmounted the clients. I just wanted to see whether this gets rid of the handle reset messages. The tests were done with the clients mounted. Nevertheless, here is the output of a new test run in sequence:

[root@rit-tceph bench]# ceph fs status
fs - 2 clients
==
RANK  STATE     MDS        ACTIVITY     DNS    INOS
 0    active  tceph-02  Reqs:    0 /s  98.0k    15
  POOL      TYPE     USED  AVAIL
fs-meta1  metadata  2053M   784G
fs-meta2    data       0    784G
fs-data     data       0   1569G
STANDBY MDS
  tceph-01
  tceph-03
MDS version: ceph version 15.2.16 (d46a73d6d0a67a79558054a3a5a72cb561724974) octopus (stable)

[root@rit-tceph bench]# ls -la /mnt/adm/cephfs/
total 0
drwxrwsr-x. 3 root ansible  1 Jul 28 10:01 .
drwxr-xr-x. 3 root root    20 Jul 28 09:45 ..
drwxr-sr-x. 3 root ansible  1 Jul 28 10:01 data

[root@rit-tceph bench]# ceph tell mds.0 dump tree '~mds0/stray0'
2022-08-07T15:55:24.199+0200 7f96897fa700  0 client.455712 ms_handle_reset on v2:10.41.24.14:6812/3943985176
2022-08-07T15:55:24.264+0200 7f968a7fc700  0 client.438997 ms_handle_reset on v2:10.41.24.14:6812/3943985176
root inode is not in cache

[root@rit-tceph bench]# ceph tell mds.0 dump tree '~mdsdir/stray0'
2022-08-07T15:55:31.075+0200 7f5c4e7fc700  0 client.439009 ms_handle_reset on v2:10.41.24.14:6812/3943985176
2022-08-07T15:55:31.115+0200 7f5c4f7fe700  0 client.439015 ms_handle_reset on v2:10.41.24.14:6812/3943985176
root inode is not in cache

[root@rit-tceph bench]# mount | grep ceph
10.41.24.13,10.41.24.14,10.41.24.15:/ on /mnt/adm/cephfs type ceph (rw,relatime,name=fs-admin,secret=<hidden>,acl)
10.41.24.13,10.41.24.14,10.41.24.15:/data on /mnt/cephfs type ceph (rw,relatime,name=fs,secret=<hidden>,acl)

The FS root is mounted on /mnt/adm/cephfs. Apart from that, I would assume that the root inode is always in cache. If not, it should be pulled into the cache when missing, for example, here: https://github.com/ceph/ceph/blob/main/src/mds/MDSRank.cc#L3123 . In the test

  CInode *in = mdcache->cache_traverse(filepath(root.c_str()));
  if (!in) {
    ss << "root inode is not in cache";
    return;
  }

it seems it would be better to pull the inode into the cache (on mds.0).
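
Just to illustrate what I mean by "pull it into the cache" (generic toy code with invented names, not the MDS implementation or a patch against MDSRank.cc): on a miss one could fetch the entry from the backing store and fill the cache instead of returning an error:

// Toy illustration of the "fill the cache on a miss" behaviour I am suggesting.
#include <iostream>
#include <map>
#include <optional>
#include <string>

std::map<std::string, std::string> cache;             // in-memory cache
std::map<std::string, std::string> backing_store = {  // stands in for the metadata pool
  {"/", "root inode"}, {"~mds0", "mdsdir inode"},
};

// Instead of reporting "not in cache", fetch from the backing store on a miss.
std::optional<std::string> lookup_or_fetch(const std::string& key) {
  auto it = cache.find(key);
  if (it != cache.end())
    return it->second;                                 // cache hit
  auto bs = backing_store.find(key);
  if (bs == backing_store.end())
    return std::nullopt;                               // genuinely does not exist
  cache[key] = bs->second;                             // fill the cache on demand
  return bs->second;
}

int main() {
  auto v = lookup_or_fetch("~mds0");
  std::cout << (v ? *v : std::string("no such inode")) << "\n";
}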

I'm trying to dump the MDS cache in case that might help. Unfortunately, it's just hanging:

[root@ceph-mds:tceph-02 /]# ceph daemon mds.tceph-02 dump cache

The command "ceph" is at 100% CPU, but I can't see any output. There is also no disk activity. I remember this command returning much faster on a mimic cluster and a really large cache. Is the command different in octopus?

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: 胡 玮文 <huww98@xxxxxxxxxxx>
Sent: 07 August 2022 15:52:28
To: Frank Schilder
Cc: ceph-users@xxxxxxx
Subject: Re:  ceph mds dump tree - root inode is not in cache

I see you have 0 clients. Can you try just mounting a client and doing an "ls" in your cephfs root directory?

> 在 2022年8月7日,20:38,Frank Schilder <frans@xxxxxx> 写道:
>
> Hi Weiwen,
>
> I get the following results:
>
> # ceph fs status
> fs - 0 clients
> ==
> RANK  STATE     MDS        ACTIVITY     DNS    INOS
> 0    active  tceph-03  Reqs:    0 /s   997k   962k
>  POOL      TYPE     USED  AVAIL
> fs-meta1  metadata  6650M   780G
> fs-meta2    data       0    780G
> fs-data     data       0   1561G
> STANDBY MDS
>  tceph-01
>  tceph-02
> MDS version: ceph version 15.2.16 (d46a73d6d0a67a79558054a3a5a72cb561724974) octopus (stable)
>
> # ceph tell mds.0 dump tree '~mds0/stray0'
> 2022-08-07T14:28:00.735+0200 7fb6827fc700  0 client.434599 ms_handle_reset on v2:10.41.24.15:6812/2903519715
> 2022-08-07T14:28:00.776+0200 7fb6837fe700  0 client.434605 ms_handle_reset on v2:10.41.24.15:6812/2903519715
> root inode is not in cache
>
> # ceph tell mds.0 dump tree '~mdsdir/stray0'
> 2022-08-07T14:30:07.370+0200 7f364d7fa700  0 client.434623 ms_handle_reset on v2:10.41.24.15:6812/2903519715
> 2022-08-07T14:30:07.411+0200 7f364e7fc700  0 client.434629 ms_handle_reset on v2:10.41.24.15:6812/2903519715
> root inode is not in cache
>
> Whatever I try, it says the same: "root inode is not in cache". Are the ms_handle_reset messages possibly hinting at a problem with my installation? The MDS is the only daemon type for which these appear when I use ceph tell commands.

Possibly not. I also get these messages every time.
>
> This is a test cluster, I can do all sorts of experiments with it. Please let me know if I can try something and pull extra information out.
>
> With the default settings, this is all that's in today's log after trying a couple of times; the SIGHUP comes from logrotate:
>
> 2022-08-07T04:02:06.693+0200 7f7856b1c700 -1 received  signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
> 2022-08-07T14:27:01.298+0200 7f785731d700  1 mds.tceph-03 asok_command: dump tree {prefix=dump tree,root=~mdsdir/stray0} (starting...)
> 2022-08-07T14:27:07.581+0200 7f785731d700  1 mds.tceph-03 asok_command: dump tree {prefix=dump tree,root=~mds0/stray0} (starting...)
> 2022-08-07T14:27:48.976+0200 7f785731d700  1 mds.tceph-03 asok_command: dump tree {prefix=dump tree,root=~mds0/stray0} (starting...)
> 2022-08-07T14:28:00.776+0200 7f785731d700  1 mds.tceph-03 asok_command: dump tree {prefix=dump tree,root=~mds0/stray0} (starting...)
> 2022-08-07T14:30:07.410+0200 7f785731d700  1 mds.tceph-03 asok_command: dump tree {prefix=dump tree,root=~mdsdir/stray0} (starting...)
> 2022-08-07T14:31:15.839+0200 7f785731d700  1 mds.tceph-03 asok_command: dump tree {prefix=dump tree,root=~mdsdir/stray0} (starting...)
> 2022-08-07T14:31:19.900+0200 7f785731d700  1 mds.tceph-03 asok_command: dump tree {prefix=dump tree,root=~mds0/stray0} (starting...)
>
> Please let me know if/how I can provide more info.
>
> Thanks and best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: 胡 玮文 <huww98@xxxxxxxxxxx>
> Sent: 05 August 2022 03:43:05
> To: Frank Schilder
> Cc: ceph-users@xxxxxxx
> Subject: [Warning Possible spam]  Re:  ceph mds dump tree - root inode is not in cache
>
> Hi Frank,
>
> I have not experienced this before. Maybe mds.tceph-03 is in standby state? Could you show the output of “ceph fs status”?
>
> You can also try “ceph tell mds.0 …” and let ceph find the correct daemon for you.
>
> You may also try dumping “~mds0/stray0”.
>
> Weiwen Hu
>
>> 在 2022年8月4日,23:22,Frank Schilder <frans@xxxxxx> 写道:
>>
>> Hi all,
>>
>> I'm stuck with a very annoying problem with a ceph octopus test cluster (latest stable version). I need to investigate the contents of the MDS stray buckets and something like this should work:
>>
>> [root@ceph-adm:tceph-03 ~]# ceph daemon mds.tceph-03 dump tree '~mdsdir' 3
>> [root@ceph-adm:tceph-03 ~]# ceph tell mds.tceph-03 dump tree '~mdsdir/stray0'
>> 2022-08-04T16:57:54.010+0200 7f3475ffb700  0 client.371437 ms_handle_reset on v2:10.41.24.15:6812/2903519715
>> 2022-08-04T16:57:54.052+0200 7f3476ffd700  0 client.371443 ms_handle_reset on v2:10.41.24.15:6812/2903519715
>> root inode is not in cache
>>
>> However, I either get nothing or an error message. Whatever I try, I cannot figure out how to pull the root inode into the MDS cache - if this is even the problem here. I also don't understand why the annoying ms_handle_reset messages are there. I found the second command in a script:
>>
>> Code line: https://gist.github.com/huww98/91cbff0782ad4f6673dcffccce731c05#file-cephfs-reintegrate-conda-stray-py-L11
>>
>> that came up in this conversation: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/4TDASTSWF4UIURKUN2P7PGZZ3V5SCCEE/
>>
>> The only place I can find "root inode is not in cache" is https://tracker.ceph.com/issues/53597#note-14, where it says that the above commands should return the tree. I have ca. 1 million stray entries and they must be somewhere. mds.tceph-03 is the only active MDS.
>>
>> Can someone help me out here?
>>
>> Thanks and best regards,
>> =================
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx