Re: ceph-fuse crash

If the crash is easily reproducible at your end, could you set debug_client
to 20 in the client-side conf file and then retry the operation?
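
For example, a minimal sketch of the client-side ceph.conf change (the
section name and log path below are the usual defaults; adjust them to
match your deployment):

    [client]
        # verbose client-side logging for ceph-fuse
        debug client = 20
        log file = /var/log/ceph/$cluster-$name.log

If you'd rather not remount, the same setting can usually be applied at
runtime through the client's admin socket (the .asok path varies per
client instance):

    ceph daemon /var/run/ceph/ceph-client.admin.asok config set debug_client 20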

You could then send over the collected logs and we could take a look at
them.
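
If the resulting log is too large to attach, one option is ceph-post-file,
which uploads files for the Ceph developers to fetch; the path below is
illustrative, so point it at wherever your client log actually lands:

    ceph-post-file -d "ceph-fuse segfault, quincy 17.2.6" /var/log/ceph/ceph-client.admin.log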

FYI, there's an existing tracker issue for what looks like a similar
problem: https://tracker.ceph.com/issues/56288, but we don't have the
logs for it.



On Fri, Jun 30, 2023 at 11:16 AM <hakesudu@xxxxxxxxx> wrote:

> Hi,
>
> I've deployed a Ceph Quincy cluster for HPC. Recently I keep running
> into ceph-fuse crashes.
> The kernel version is 4.18.0-348.el8.0.2.x86_64.
>
> Here is part of the ceph-fuse log:
>
>    -59> 2023-06-28T09:51:00.452+0800 155546ff7700  3 client.159239
> ll_lookup 0x200017f674a.head anaconda3
>    -58> 2023-06-28T09:51:00.452+0800 15554cc49700  3 client.159239
> ll_opendir 0x10003e1408d.head
>    -57> 2023-06-28T09:51:00.452+0800 15554cc49700  3 client.159239
> may_open 0x1554e79123d0 = 0
>    -56> 2023-06-28T09:51:00.452+0800 15554cc49700  3 client.159239
> ll_opendir 0x10003e1408d.head = 0 (0x155328079380)
>    -55> 2023-06-28T09:51:00.453+0800 1555473f9700  3 client.159239
> seekdir(0x155328079380, 0)
>    -54> 2023-06-28T09:51:00.452+0800 155546bf5700  5 client.159239
> put_cap_ref dropped last FILE_CACHE ref on 0x20004a6e548.head(faked_ino=0
> nref=14 ll_ref=2 cap_refs={1024=0,2048=1} open={1=1} mode=100644
> size=5626/0 nlink=1 btime=2023-06-05T14:38:36.471178+0800
> mtime=2023-06-05T14:38:36.471178+0800 ctime=2023-06-05T14:38:36.471178+0800
> change_attr=1 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[0x20004a6e548 ts
> 0/0 objects 1 dirty_or_tx 0] 0x1554e7902300)
>    -53> 2023-06-28T09:51:00.453+0800 155546bf5700  3 client.159239 ll_read
> 0x15531806b970 0~8192 = 5626
>    -52> 2023-06-28T09:51:00.453+0800 155546ff7700  3 client.159239
> may_lookup 0x155420004840 = 0
>    -51> 2023-06-28T09:51:00.453+0800 1554a5dee700  3 client.159239
> ll_lookup 0x10003e1406f.head html
>    -50> 2023-06-28T09:51:00.453+0800 155546ff7700  3 client.159239
> ll_lookup 0x200017f674a.head anaconda3 -> 0 (100019b7a43)
>    -49> 2023-06-28T09:51:00.453+0800 1554a5dee700  3 client.159239
> may_lookup 0x1554c1a89e40 = 0
>    -48> 2023-06-28T09:51:00.453+0800 1554a7dfe700  3 client.159239
> ll_flush 0x15531806b970 0x20004a6e548
>    -47> 2023-06-28T09:51:00.453+0800 1555469f4700  3 client.159239
> ll_lookup 0x100019b7a43.head envs
>    -46> 2023-06-28T09:51:00.453+0800 1555469f4700  3 client.159239
> may_lookup 0x155420343eb0 = 0
>    -45> 2023-06-28T09:51:00.453+0800 1555469f4700  3 client.159239
> ll_lookup 0x100019b7a43.head envs -> 0 (200035de5d8)
>    -44> 2023-06-28T09:51:00.453+0800 155545fef700  3 client.159239
> ll_release (fh)0x15531806b970 0x20004a6e548
>    -43> 2023-06-28T09:51:00.453+0800 155546bf5700  3 client.159239
> seekdir(0x15544c054010, 1152360438801891331)
>    -42> 2023-06-28T09:51:00.453+0800 155546ff7700  3 client.159239
> seekdir(0x15544c029ee0, 1152690945930559491)
>    -41> 2023-06-28T09:51:00.453+0800 15554cc49700  3 client.159239
> ll_releasedir 0x15544c029ee0
>    -40> 2023-06-28T09:51:00.453+0800 1555452da700  3 client.159239
> ll_lookup 0x20004a66459.head tests
>    -39> 2023-06-28T09:51:00.453+0800 15554c244700  3 client.159239
> ll_lookup 0x100040bd04e.head att5410-w -> 0 (200047cbc02)
>    -38> 2023-06-28T09:51:00.453+0800 1554a7dfe700  3 client.159239
> ll_lookup 0x200035de5d8.head steven-colossal
>    -37> 2023-06-28T09:51:00.453+0800 1554a7dfe700  3 client.159239
> may_lookup 0x1554c20229c0 = 0
>    -36> 2023-06-28T09:51:00.453+0800 1554a7dfe700  3 client.159239
> ll_lookup 0x200035de5d8.head steven-colossal -> 0 (100040bcf3a)
>    -35> 2023-06-28T09:51:00.453+0800 1555452da700  3 client.159239
> may_lookup 0x155533c970f0 = 0
>    -34> 2023-06-28T09:51:00.453+0800 1555452da700  3 client.159239
> ll_lookup 0x20004a66459.head tests -> 0 (20004a6e3e0)
>    -33> 2023-06-28T09:51:00.453+0800 155545bed700  3 client.159239
> ll_releasedir 0x15544c054010
>    -32> 2023-06-28T09:51:00.453+0800 155546bf5700  3 client.159239
> ll_getattr 0x200047cbc02.head = 0
>    -31> 2023-06-28T09:51:00.453+0800 1555452da700  3 client.159239
> ll_lookup 0x200017f674a.head anaconda3
>    -30> 2023-06-28T09:51:00.453+0800 1555452da700  3 client.159239
> may_lookup 0x155420004840 = 0
>    -29> 2023-06-28T09:51:00.453+0800 1555452da700  3 client.159239
> ll_lookup 0x200017f674a.head anaconda3 -> 0 (100019b7a43)
>    -28> 2023-06-28T09:51:00.453+0800 155545fef700  3 client.159239
> ll_lookup 0x100040bcf3a.head lib
>    -27> 2023-06-28T09:51:00.453+0800 155545fef700  3 client.159239
> may_lookup 0x1554d7d1d240 = 0
>    -26> 2023-06-28T09:51:00.453+0800 155545fef700  3 client.159239
> ll_lookup 0x100040bcf3a.head lib -> 0 (100040bd018)
>    -25> 2023-06-28T09:51:00.453+0800 1555469f4700  3 client.159239
> ll_lookup 0x20004a6e3e0.head test_trainer
>    -24> 2023-06-28T09:51:00.453+0800 1555469f4700  3 client.159239
> may_lookup 0x155531e5a6c0 = 0
>    -23> 2023-06-28T09:51:00.453+0800 15554cc49700  3 client.159239
> ll_lookup 0x200017f674a.head anaconda3
>    -22> 2023-06-28T09:51:00.453+0800 15554cc49700  3 client.159239
> may_lookup 0x155420004840 = 0
>    -21> 2023-06-28T09:51:00.453+0800 15554cc49700  3 client.159239
> ll_lookup 0x200017f674a.head anaconda3 -> 0 (100019b7a43)
>    -20> 2023-06-28T09:51:00.453+0800 15554c244700  3 client.159239
> ll_lookup 0x100019b7a43.head envs
>    -19> 2023-06-28T09:51:00.453+0800 15554c244700  3 client.159239
> may_lookup 0x155420343eb0 = 0
>    -18> 2023-06-28T09:51:00.453+0800 15554c244700  3 client.159239
> ll_lookup 0x100019b7a43.head envs -> 0 (200035de5d8)
>    -17> 2023-06-28T09:51:00.453+0800 155546ff7700  3 client.159239
> ll_lookup 0x100019b7a43.head envs
>    -16> 2023-06-28T09:51:00.453+0800 155546ff7700  3 client.159239
> may_lookup 0x155420343eb0 = 0
>    -15> 2023-06-28T09:51:00.453+0800 155546ff7700  3 client.159239
> ll_lookup 0x100019b7a43.head envs -> 0 (200035de5d8)
>    -14> 2023-06-28T09:51:00.453+0800 155545bed700  3 client.159239
> ll_lookup 0x100040bd018.head terminfo
>    -13> 2023-06-28T09:51:00.453+0800 1555475fa700  3 client.159239
> ll_lookup 0x1000383ca0a.head train.py -> 0 (1000383caf7)
>    -12> 2023-06-28T09:51:00.453+0800 1555465f2700  3 client.159239
> ll_lookup 0x200017f674a.head anaconda3
>    -11> 2023-06-28T09:51:00.453+0800 1555465f2700  3 client.159239
> may_lookup 0x155420004840 = 0
>    -10> 2023-06-28T09:51:00.453+0800 1555465f2700  3 client.159239
> ll_lookup 0x200017f674a.head anaconda3 -> 0 (100019b7a43)
>     -9> 2023-06-28T09:51:00.453+0800 1555452da700  3 client.159239
> ll_lookup 0x200035de5d8.head steven-colossal
>     -8> 2023-06-28T09:51:00.453+0800 1555452da700  3 client.159239
> may_lookup 0x1554c20229c0 = 0
>     -7> 2023-06-28T09:51:00.453+0800 1555452da700  3 client.159239
> ll_lookup 0x200035de5d8.head steven-colossal -> 0 (100040bcf3a)
>     -6> 2023-06-28T09:51:00.453+0800 15554cc49700  3 client.159239
> ll_lookup 0x100019b7a43.head envs
>     -5> 2023-06-28T09:51:00.453+0800 1555465f2700  3 client.159239
> ll_getattr 0x1000383caf7.head = 0
>     -4> 2023-06-28T09:51:00.453+0800 155546bf5700  3 client.159239
> ll_lookup 0x200035de5d8.head llmzoo
>     -3> 2023-06-28T09:51:00.453+0800 155546bf5700  3 client.159239
> may_lookup 0x1554c20229c0 = 0
>     -2> 2023-06-28T09:51:00.453+0800 155546bf5700  3 client.159239
> ll_lookup 0x200035de5d8.head llmzoo -> 0 (10003e11a9e)
>     -1> 2023-06-28T09:51:00.453+0800 15554c646700  3 client.159239
> ll_lookup 0x10003e11a9e.head lib
>      0> 2023-06-28T09:51:00.458+0800 1554a77fb700 -1 *** Caught signal
> (Segmentation fault) **
>  in thread 1554a77fb700 thread_name:ceph-fuse
>
>  ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy
> (stable)
>  1: /lib64/libpthread.so.0(+0x12ce0) [0x1555535eece0]
>  2: (Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*,
> ceph_statx*, long, Inode*), void*, int, bool)+0x2f4) [0x555555647d64]
>  3: (Client::readdir_r_cb(dir_result_t*, int (*)(void*, dirent*,
> ceph_statx*, long, Inode*), void*, unsigned int, unsigned int, bool)+0xae7)
> [0x55555564cd37]
>  4: ceph-fuse(+0xadbf8) [0x555555601bf8]
>  5: /lib64/libfuse.so.2(+0x16706) [0x1555550fd706]
>  6: /lib64/libfuse.so.2(+0x17868) [0x1555550fe868]
>  7: /lib64/libfuse.so.2(+0x14440) [0x1555550fb440]
>  8: /lib64/libpthread.so.0(+0x81cf) [0x1555535e41cf]
>  9: clone()
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed
> to interpret this.
>
> --- logging levels ---
>    0/ 5 none
>    0/ 1 lockdep
>    0/ 1 context
>    1/ 1 crush
>    1/ 5 mds
>    1/ 5 mds_balancer
>    1/ 5 mds_locker
>    1/ 5 mds_log
>    1/ 5 mds_log_expire
>    1/ 5 mds_migrator
>    0/ 1 buffer
>    0/ 1 timer
>    0/ 1 filer
>    0/ 1 striper
>    0/ 1 objecter
>    0/ 5 rados
>    0/ 5 rbd
>    0/ 5 rbd_mirror
>    0/ 5 rbd_replay
>    0/ 5 rbd_pwl
>    0/ 5 journaler
>    0/ 5 objectcacher
>    0/ 5 immutable_obj_cache
>    1/ 5 client
>    1/ 5 osd
>    0/ 5 optracker
>    0/ 5 objclass
>    1/ 3 filestore
>    1/ 3 journal
>    0/ 0 ms
>    1/ 5 mon
>    0/10 monc
>    1/ 5 paxos
>    0/ 5 tp
>    1/ 5 auth
>    1/ 5 crypto
>    1/ 1 finisher
>    1/ 1 reserver
>    1/ 5 heartbeatmap
>    1/ 5 perfcounter
>    1/ 5 rgw
>    1/ 5 rgw_sync
>    1/ 5 rgw_datacache
>    1/10 civetweb
>    1/ 5 javaclient
>    1/ 5 asok
>    1/ 1 throttle
>    0/ 0 refs
>    1/ 5 compressor
>    1/ 5 bluestore
>    1/ 5 bluefs
>    1/ 3 bdev
>    1/ 5 kstore
>    4/ 5 rocksdb
>    4/ 5 leveldb
>    4/ 5 memdb
>    1/ 5 fuse
>    2/ 5 mgr
>    1/ 5 mgrc
>    1/ 5 dpdk
>    1/ 5 eventtrace
>    1/ 5 prioritycache
>    0/ 5 test
>    0/ 5 cephfs_mirror
>    0/ 5 cephsqlite
>    0/ 5 seastore
>    0/ 5 seastore_onode
>    0/ 5 seastore_odata
>    0/ 5 seastore_omap
>    0/ 5 seastore_tm
>    0/ 5 seastore_cleaner
>    0/ 5 seastore_lba
>    0/ 5 seastore_cache
>    0/ 5 seastore_journal
>    0/ 5 seastore_device
>    0/ 5 alienstore
>    1/ 5 mclock
>    1/ 5 ceph_exporter
>   -2/-2 (syslog threshold)
>   -1/-1 (stderr threshold)
> --- pthread ID / name mapping for recent threads ---
>   1554947e3700 / ceph-fuse
>   155494fe7700 / ceph-fuse
>   1554975fa700 / ceph-fuse
>   1554a57eb700 / ceph-fuse
>   1554a59ec700 / ceph-fuse
>   1554a5dee700 / ceph-fuse
>   1554a5fef700 / ceph-fuse
>   1554a61f0700 /
>   1554a63f1700 / ceph-fuse
>   1554a65f2700 / ceph-fuse
>   1554a67f3700 / ceph-fuse
>   1554a6bf5700 / ceph-fuse
>   1554a6df6700 / ceph-fuse
>   1554a71f8700 / ceph-fuse
>   1554a73f9700 / ceph-fuse
>   1554a75fa700 /
>   1554a77fb700 / ceph-fuse
>   1554a7dfe700 / ceph-fuse
>   1554a7fff700 / ceph-fuse
>   1555452da700 / ceph-fuse
>   1555455ea700 / ceph-fuse
>   1555459ec700 / ceph-fuse
>   155545bed700 / ceph-fuse
>   155545fef700 / ceph-fuse
>   1555465f2700 / ceph-fuse
>   1555469f4700 / ceph-fuse
>   155546bf5700 / ceph-fuse
>   155546df6700 / ceph-fuse
>   155546ff7700 / ceph-fuse
>   1555471f8700 / ceph-fuse
>   1555473f9700 / ceph-fuse
>   1555475fa700 / ceph-fuse
>   1555477fb700 /
>   155547dfe700 / ceph-fuse
>   155547fff700 / ceph-fuse
>   15554c244700 / ceph-fuse
>   15554c646700 / ceph-fuse
>   15554c847700 / ceph-fuse
>   15554cc49700 / ceph-fuse
>   15554ce4a700 / ceph-fuse
>   15554da50700 / ms_dispatch
>   max_recent     10000
>   max_new        10000
>   log_file
> /var/lib/ceph/crash/2023-06-28T01:51:00.459615Z_3ddbaa44-d8cd-437b-908b-c3772520c7a6/log
> --- end dump of recent events ---
>
>
> Has anyone encountered this kind of problem?
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>

-- 
Milind
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



