Re: Hung CephFS client

On Fri, 2019-10-11 at 15:55 -0700, Robert LeBlanc wrote:
> We had a docker container that seems to be hung in the CephFS code
> path. We were able to extract the following:
> 
>            <...>-77292 [003] 1175858.326638: function:
> _raw_spin_lock
> 
>            <...>-77292 [003] 1175858.326640: function:             __wake_up
> 
>            <...>-77292 [003] 1175858.326641: function:
> _raw_spin_lock_irqsave
> 
>            <...>-77292 [003] 1175858.326641: function:
> __wake_up_common
> 
>            <...>-77292 [003] 1175858.326641: function:
> _raw_spin_unlock_irqrestore
> 
>            <...>-77292 [003] 1175858.326641: function:             __wake_up
> 
>            <...>-77292 [003] 1175858.326641: function:
> _raw_spin_lock_irqsave
> 
>            <...>-77292 [003] 1175858.326641: function:
> __wake_up_common
> 
>            <...>-77292 [003] 1175858.326641: function:
>   autoremove_wake_function
> 
>            <...>-77292 [003] 1175858.326641: function:
>      default_wake_function
> 
>            <...>-77292 [003] 1175858.326641: function:
>         try_to_wake_up
> 
>            <...>-77292 [003] 1175858.326642: function:
>            _raw_spin_lock_irqsave
> 
>            <...>-77292 [003] 1175858.326642: function:
>            task_waking_fair
> 
>            <...>-77292 [003] 1175858.326642: function:
>            select_task_rq_fair
> 
>            <...>-77292 [003] 1175858.326642: function:
>               source_load
> 
>            <...>-77292 [003] 1175858.326643: function:
>               target_load
> 
>            <...>-77292 [003] 1175858.326643: function:
>               effective_load.isra.45
> 
>            <...>-77292 [003] 1175858.326644: function:
>               effective_load.isra.45
> 
>            <...>-77292 [003] 1175858.326644: function:
>               select_idle_sibling
> 
>            <...>-77292 [003] 1175858.326645: function:
>                  idle_cpu
> 
>            <...>-77292 [003] 1175858.326645: function:
>            set_nr_if_polling
> 
>            <...>-77292 [003] 1175858.326645: function:
>            ttwu_stat
> 
>            <...>-77292 [003] 1175858.326646: function:
>            _raw_spin_unlock_irqrestore
> 
>            <...>-77292 [003] 1175858.326646: function:
> _raw_spin_unlock_irqrestore
> 
>            <...>-77292 [003] 1175858.326646: function:             irq_exit
> 
>            <...>-77292 [003] 1175858.326646: function:
> _raw_spin_unlock
> 
>            <...>-77292 [003] 1175858.326646: function:
> try_get_cap_refs
> 
>            <...>-77292 [003] 1175858.326647: function:
> _raw_spin_lock
> 
>            <...>-77292 [003] 1175858.326647: function:
> __ceph_caps_file_wanted
> 
>            <...>-77292 [003] 1175858.326647: function:
>   ceph_caps_for_mode
> 
>            <...>-77292 [003] 1175858.326647: function:
> __ceph_caps_issued
> 
>            <...>-77292 [003] 1175858.326647: function:
> _raw_spin_unlock
> 
>            <...>-77292 [003] 1175858.326647: function:             _cond_resched
> 
>            <...>-77292 [003] 1175858.326647: function:
> try_get_cap_refs
> 
>            <...>-77292 [003] 1175858.326647: function:
> _raw_spin_lock
> 
>            <...>-77292 [003] 1175858.326647: function:
> __ceph_caps_file_wanted
> 
>            <...>-77292 [003] 1175858.326647: function:
>   ceph_caps_for_mode
> 
>            <...>-77292 [003] 1175858.326647: function:
> __ceph_caps_issued
> 
>            <...>-77292 [003] 1175858.326647: function:
> _raw_spin_unlock
> 
>            <...>-77292 [003] 1175858.326647: function:
> prepare_to_wait_event
> 
>            <...>-77292 [003] 1175858.326647: function:
> try_get_cap_refs
> 
>            <...>-77292 [003] 1175858.326647: function:
> _raw_spin_lock
> 
>            <...>-77292 [003] 1175858.326647: function:
> __ceph_caps_file_wanted
> 
>            <...>-77292 [003] 1175858.326647: function:
>   ceph_caps_for_mode
> 
>            <...>-77292 [003] 1175858.326648: function:
> __ceph_caps_issued
> 
>            <...>-77292 [003] 1175858.326648: function:
> _raw_spin_unlock
> 
>            <...>-77292 [003] 1175858.326648: function:             finish_wait
> 
>            <...>-77292 [003] 1175858.326648: function:             ceph_get_caps
> 
>            <...>-77292 [003] 1175858.326648: function:
> ceph_pool_perm_check
> 
>            <...>-77292 [003] 1175858.326648: function:
>   _raw_spin_lock
> 
>            <...>-77292 [003] 1175858.326648: function:
>   _raw_spin_unlock
> 
>            <...>-77292 [003] 1175858.326648: function:
> try_get_cap_refs
> 
>            <...>-77292 [003] 1175858.326648: function:
>   _raw_spin_lock
> 
>            <...>-77292 [003] 1175858.326648: function:
>   __ceph_caps_file_wanted
> 
>            <...>-77292 [003] 1175858.326648: function:
>      ceph_caps_for_mode
> 
>            <...>-77292 [003] 1175858.326648: function:
>   __ceph_caps_issued
> 
>            <...>-77292 [003] 1175858.326648: function:
>   _raw_spin_unlock
> 
>            <...>-77292 [003] 1175858.326648: function:
> _cond_resched
> 
>            <...>-77292 [003] 1175858.326648: function:
> try_get_cap_refs
> 
>            <...>-77292 [003] 1175858.326648: function:
>   _raw_spin_lock
> 
>            <...>-77292 [003] 1175858.326649: function:
>   __ceph_caps_file_wanted
> 
>            <...>-77292 [003] 1175858.326649: function:
>      ceph_caps_for_mode
> 
>            <...>-77292 [003] 1175858.326649: function:
>   __ceph_caps_issued
> 
>            <...>-77292 [003] 1175858.326649: function:
>   _raw_spin_unlock
> 
>            <...>-77292 [003] 1175858.326649: function:
> prepare_to_wait_event
> 
>            <...>-77292 [003] 1175858.326649: function:
> try_get_cap_refs
> 
>            <...>-77292 [003] 1175858.326649: function:
>   _raw_spin_lock
> 
>            <...>-77292 [003] 1175858.326649: function:
>   __ceph_caps_file_wanted
> 
>            <...>-77292 [003] 1175858.326649: function:
>      ceph_caps_for_mode
> 
>            <...>-77292 [003] 1175858.326649: function:
>   __ceph_caps_issued
> 
>            <...>-77292 [003] 1175858.326649: function:
>   _raw_spin_unlock
> 
>            <...>-77292 [003] 1175858.326649: function:
> finish_wait
> 
>            <...>-77292 [003] 1175858.326649: function:             ceph_get_caps
> 
>            <...>-77292 [003] 1175858.326649: function:
> ceph_pool_perm_check
> 
>            <...>-77292 [003] 1175858.326649: function:
>   _raw_spin_lock
> 
>            <...>-77292 [003] 1175858.326649: function:
>   _raw_spin_unlock
> 
>            <...>-77292 [003] 1175858.326649: function:
> try_get_cap_refs
> 
>            <...>-77292 [003] 1175858.326649: function:
>   _raw_spin_lock
> 
>            <...>-77292 [003] 1175858.326649: function:
>   __ceph_caps_file_wanted
> 
>            <...>-77292 [003] 1175858.326650: function:
>      ceph_caps_for_mode
> 
>            <...>-77292 [003] 1175858.326650: function:
>   __ceph_caps_issued
> 
>            <...>-77292 [003] 1175858.326650: function:
>   _raw_spin_unlock
> 
>            <...>-77292 [003] 1175858.326650: function:
> _cond_resched
> 
>            <...>-77292 [003] 1175858.326650: function:
> try_get_cap_refs
> 
>            <...>-77292 [003] 1175858.326650: function:
>   _raw_spin_lock
> 
>            <...>-77292 [003] 1175858.326650: function:
>   __ceph_caps_file_wanted
> 
>            <...>-77292 [003] 1175858.326650: function:
>      ceph_caps_for_mode
> 
>            <...>-77292 [003] 1175858.326650: function:
>   __ceph_caps_issued
> 
>            <...>-77292 [003] 1175858.326650: function:
>   _raw_spin_unlock
> 
>            <...>-77292 [003] 1175858.326650: function:
> prepare_to_wait_event
> 
>            <...>-77292 [003] 1175858.326650: function:
> try_get_cap_refs
> 
>            <...>-77292 [003] 1175858.326650: function:
>   _raw_spin_lock
> 
>            <...>-77292 [003] 1175858.326650: function:
>   __ceph_caps_file_wanted
> 
>            <...>-77292 [003] 1175858.326650: function:
>      ceph_caps_for_mode
> 
>            <...>-77292 [003] 1175858.326650: function:
>   __ceph_caps_issued
> 
>            <...>-77292 [003] 1175858.326651: function:
>   _raw_spin_unlock
> 
>            <...>-77292 [003] 1175858.326651: function:
> finish_wait
> 
>            <...>-77292 [003] 1175858.326651: function:             ceph_get_caps
> ... (lots of similar output)
> 
> I think it may be related to https://lkml.org/lkml/2019/5/23/172, but
> I wanted to get a second opinion.
> 
> Thank you,
> Robert LeBlanc
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1

What kernel version is this? Do you happen to have a more readable stack
trace? Did this come from a hung task warning in the kernel?

From this, it looks like it's stuck waiting on a spinlock, but it's
rather hard to tell for sure.
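
If only the wrapped function trace is available, a short script can at
least show which functions dominate it. The following is a minimal
sketch, assuming the trace has been saved as text in the quoted,
line-wrapped form shown above. The helper name summarize_trace and the
file name trace.txt are hypothetical, and call depth is ignored because
the wrapping does not preserve the indentation reliably.

```python
import re
from collections import Counter

# Start of one trace entry, e.g.
#   "<...>-77292 [003] 1175858.326638: function:"
# The function name is either on the same line or wrapped onto the next one.
ENTRY = re.compile(r"<\.\.\.>-(\d+)\s+\[(\d+)\]\s+([\d.]+):\s+function:\s*(\S*)")


def summarize_trace(text):
    """Count how often each function appears in a quoted, line-wrapped trace."""
    counts = Counter()
    pending = False  # True when "function:" ended a line and the name follows
    for raw in text.splitlines():
        # Drop the mail quote prefix ("> ") and any padding around the line.
        line = raw.lstrip("> ").strip()
        if not line:
            continue
        match = ENTRY.search(line)
        if match:
            name = match.group(4)
            if name:
                counts[name] += 1
                pending = False
            else:
                pending = True
        elif pending:
            # Continuation line: the first token is the wrapped function name.
            counts[line.split()[0]] += 1
            pending = False
    return counts
```

Something like the following would then print the most frequent
functions, which makes a tight loop such as repeated ceph_get_caps and
try_get_cap_refs calls easier to spot:

```python
with open("trace.txt") as f:  # hypothetical file holding the saved trace
    for name, n in summarize_trace(f.read()).most_common(5):
        print(n, name)
```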
-- 
Jeff Layton <jlayton@xxxxxxxxxx>



