On 6 February 2018 at 12:21, Xuehan Xu <xxhdx1985126@xxxxxxxxx> wrote:
> On 6 February 2018 at 12:19, Xuehan Xu <xxhdx1985126@xxxxxxxxx> wrote:
>> Hi, everyone.
>>
>> Recently, in our online Luminous CephFS clusters, we have occasionally
>> seen lots of slow requests:
>>
>> 2018-01-16 15:16:53.233925 7fa59f7fe700 0 log_channel(cluster) log [WRN] : 49 slow requests, 5 included below; oldest blocked for > 89.995722 secs
>> 2018-01-16 15:16:53.233934 7fa59f7fe700 0 log_channel(cluster) log [WRN] : slow request 39.966171 seconds old, received at 2018-01-16 15:16:13.267686: client_request(client.4925109:3759 getattr pAsLsXsFs #0x10000000fe0 2018-01-16 15:16:13.176343 caller_uid=33, caller_gid=33{}) currently failed to rdlock, waiting
>> 2018-01-16 15:16:53.233938 7fa59f7fe700 0 log_channel(cluster) log [WRN] : slow request 39.949899 seconds old, received at 2018-01-16 15:16:13.283958: client_request(client.4925109:3772 getattr pAsLsXsFs #0x10000000fe0 2018-01-16 15:16:13.193343 caller_uid=33, caller_gid=33{}) currently failed to rdlock, waiting
>> 2018-01-16 15:16:53.233942 7fa59f7fe700 0 log_channel(cluster) log [WRN] : slow request 39.899975 seconds old, received at 2018-01-16 15:16:13.333882: client_request(client.4925109:3785 getattr pAsLsXsFs #0x10000000fe0 2018-01-16 15:16:13.242344 caller_uid=33, caller_gid=33{}) currently failed to rdlock, waiting
>> 2018-01-16 15:16:53.233946 7fa59f7fe700 0 log_channel(cluster) log [WRN] : slow request 39.965588 seconds old, received at 2018-01-16 15:16:13.268269: client_request(client.4925109:3760 getattr pAsLsXsFs #0x10000000fe0 2018-01-16 15:16:13.177343 caller_uid=33, caller_gid=33{}) currently failed to rdlock, waiting
>> 2018-01-16 15:16:53.233949 7fa59f7fe700 0 log_channel(cluster) log [WRN] : slow request 39.964976 seconds old, received at 2018-01-16 15:16:13.268881: client_request(client.4925109:3761 getattr pAsLsXsFs #0x10000000fe0 2018-01-16 15:16:13.178343 caller_uid=33, caller_gid=33{}) currently failed to rdlock, waiting
>> 2018-01-16 15:16:58.233997 7fa59f7fe700 0 log_channel(cluster) log [WRN] : 49 slow requests, 5 included below; oldest blocked for > 94.995785 secs
>> 2018-01-16 15:16:58.234006 7fa59f7fe700 0 log_channel(cluster) log [WRN] : slow request 60.080351 seconds old, received at 2018-01-16 15:15:58.153569: client_request(client.4925109:3742 getattr pAsLsXsFs #0x10000003e7d 2018-01-16 15:15:58.062900 caller_uid=99, caller_gid=99{}) currently failed to rdlock, waiting
>> 2018-01-16 15:16:58.234010 7fa59f7fe700 0 log_channel(cluster) log [WRN] : slow request 44.929603 seconds old, received at 2018-01-16 15:16:13.304317: client_request(client.4925109:3778 getattr pAsLsXsFs #0x10000000fe0 2018-01-16 15:16:13.213344 caller_uid=33, caller_gid=33{}) currently failed to rdlock, waiting
>> 2018-01-16 15:16:58.234014 7fa59f7fe700 0 log_channel(cluster) log [WRN] : slow request 44.964894 seconds old, received at 2018-01-16 15:16:13.269026: client_request(client.4925109:3762 getattr pAsLsXsFs #0x10000000fe0 2018-01-16 15:16:13.178343 caller_uid=33, caller_gid=33{}) currently failed to rdlock, waiting
>> 2018-01-16 15:16:58.234017 7fa59f7fe700 0 log_channel(cluster) log [WRN] : slow request 44.948258 seconds old, received at 2018-01-16 15:16:13.285662: client_request(client.4925109:3773 getattr pAsLsXsFs #0x10000000fcf 2018-01-16 15:16:13.194343 caller_uid=33, caller_gid=33{}) currently failed to rdlock, waiting
>> 2018-01-16 15:16:58.234020 7fa59f7fe700 0 log_channel(cluster) log [WRN] : slow request 44.817911 seconds old, received at 2018-01-16 15:16:13.416009: client_request(client.4925109:3796 getattr pAsLsXsFs #0x10000000fe0 2018-01-16 15:16:13.325347 caller_uid=33, caller_gid=33{}) currently failed to rdlock, waiting
>>
>> Our use case is like this: there is one client issuing writes to a file
>> while tens of other clients issue reads and stats on the same file.
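
To make the workload above concrete, here is a rough sketch of a
reproducer -- not our production code; the program name, the path and the
buffer sizes below are made up for illustration. The idea is to run one
"writer" instance on one CephFS client and a "reader" instance on each of
the other (tens of) clients, all pointing at the same file on the CephFS
mount:

// cephfs_hotfile.cc -- toy reproducer: one writer, many reader/stat clients.
//   writer:  ./cephfs_hotfile writer /mnt/cephfs/hotfile
//   readers: ./cephfs_hotfile reader /mnt/cephfs/hotfile
#include <cstdio>
#include <cstring>
#include <iostream>
#include <string>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char** argv) {
  if (argc != 3) {
    std::cerr << "usage: " << argv[0] << " writer|reader <path>\n";
    return 1;
  }
  const std::string role = argv[1];
  const char* path = argv[2];

  if (role == "writer") {
    // Continuous writes keep this client wanting CEPH_CAP_GWR, as described
    // further down in this mail.
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) { perror("open"); return 1; }
    char buf[4096];
    std::memset(buf, 'x', sizeof(buf));
    for (;;) {
      if (write(fd, buf, sizeof(buf)) < 0) { perror("write"); return 1; }
    }
  } else {
    // Readers loop on stat() + read(); the stat() calls are what show up as
    // the getattr requests in the log above.
    for (;;) {
      struct stat st;
      if (stat(path, &st) < 0) { perror("stat"); return 1; }
      int fd = open(path, O_RDONLY);
      if (fd < 0) { perror("open"); return 1; }
      char buf[4096];
      while (read(fd, buf, sizeof(buf)) > 0) {}
      close(fd);
    }
  }
  return 0;
}

Each reader is its own client session, so every one of them holds caps on
the inode, which seems to matter for the waiting described below.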
>>
>> After a series of tests, we found that these slow requests are all
>> "getattr" operations, and that they are all blocked by CInode::filelock
>> being in the LOCK_SYNC_MIX state, in which rdlocks cannot be acquired.
>> Further analysis showed that, in our use case, because of the writing
>> client's continuous "wants" of CEPH_CAP_GWR, nearly every completed
>> "getattr" operation causes the inode's filelock to go back into the
>> LOCK_SYNC_MIX state, in which the MDS must wait for all clients holding
>> caps other than the allowed "rwl" to release those caps, and blocks all
>> new requests to acquire an rdlock on the inode's filelock. Because of
>> the large number of reading clients, this wait can last about 950 ms,
>> which means that only about one getattr operation can finish every
>> 950 ms (see the rough arithmetic at the end of this mail).
>>
>> Why must the acquisition of an rdlock on an inode's filelock be blocked
>> while the filelock is in the LOCK_SYNC_MIX state? And why must the MDS
>> wait for all clients to release their unallowed caps?
>>
>> Thanks :-)

> On the other hand, I have found the source code related to capabilities
> very hard to understand. Is there any documentation I could refer to?
> Thanks :-)
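
Coming back to the numbers above: the following is only a back-of-the-
envelope check, not Ceph code. It takes the ~950 ms cap-release wait and
the "49 slow requests" from the log, plus our (possibly wrong) assumption
that every completed getattr drags the filelock through another
SYNC->MIX transition, and works out the resulting getattr rate and
queueing delay:

// toy_estimate.cc -- back-of-the-envelope check of the observed numbers.
#include <cstdio>

int main() {
  const double gather_secs = 0.95;  // observed wait for clients to release caps
  const int    queued      = 49;    // "49 slow requests" in the MDS log above

  // If each completed getattr triggers another SYNC->MIX transition (because
  // the writer keeps wanting CEPH_CAP_GWR), each getattr effectively costs
  // one full gather, so getattrs are served at roughly one per gather.
  printf("effective getattr rate: %.2f req/s\n", 1.0 / gather_secs);

  // A request at the back of that queue then waits roughly queued * gather
  // seconds before it is served.
  printf("wait at back of queue : %.1f s\n", queued * gather_secs);
  return 0;
}

This gives about 1.05 getattrs per second and a wait of roughly 46.5
seconds for the last queued request, which is in the same ballpark as the
40-95 second request ages in the log, so the per-getattr SYNC->MIX
transition does look like what is serializing these requests.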