Ceph FS outage after blocked_op + mk_snap

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi all,

today we were victim of a coincidence of what looks like 2 race conditions combined. We observed it on mimic and plan to upgrade to octopus soon, but if any of the devs reading this realises that "dang, we never looked at this case", please consider fixing it before we upgrade :)

Here the short story:

1) A client process managed to dead-lock itself with a CAPs request to itself. We got a blocked OP and a client not responding to CAPS release. The client processes hang in D-state.
2) A nightly cron job executes an "mkdir .snap/XYZ" on the FS root.

As far as our surveillance shows, the "mkdir .snap/XYZ" got stuck and at that moment all FS IO froze. Restarts of the MDS daemon did not help. Killing all user-processes from (1) immediately resolved the issue.

Hypothesis:

The snapshot creation is not moved to the front of the MDS OPS queue and waits for the blocked client OP to complete. However, this is clearly not what a snapshot should do. It should freeze the FS state *as is on disk* at the moment the mkdir is executed, bypassing pending operations that might get cancelled any ways and have not been committed yet. If this hypothesis holds, the behaviour should be changed. A snapshot creation should never hang.

About the client self-dead-lock: we have no idea how this is even possible. If a single client requests a CAP change for CAPs its already holding, and no *other* client is holding related CAPs, the change should happen *without* a CAPs recall first. In particular, there should not be a CAPs recall to the client making the request in the first place. We will investigate the code and try to find out what system call screws things up. However, if anyone knows there has been a patch for this problem since mimic or in the kernel client, please let me know.

The kernel client is quite old: 3.10.0-957.12.2.el7.x86_64

Below some info about the blocked OPS when the code in (1) goes into D-state. It is the "setattr mtime" that gets blocked first and then the "client does not respond to CAPS release" warning pops up. It seems to request itself to release a CAP, but doesn't do it. Like it goes into D-state before the MDS request arrives. I believe I have seen similar dead-locks when people were running an ls on a dir on one node that was busy on another node at the same time. On both ends, processes sometimes went into D-state waiting for a response from each other.

            "description": "client_request(client.44538512:8357463 setattr mtime=2022-06-28 23:15:43.129859 #0x100382facc4 2022-06-28 23:15:43.129859 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44549536:13295780 getattr pAsLsXsFs #0x100377c6fbe 2022-06-28 23:15:43.388080 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.73035920:6172480 setattr size=0 #0x100375d694c 2022-06-28 23:15:43.212946 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44538341:9135650 setattr size=0 #0x10037d213b2 2022-06-28 23:15:43.244596 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44549536:13295669 setattr size=0 #0x10037def6c5 2022-06-28 23:15:43.248080 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44537171:14217015 setattr mtime=2022-06-28 23:15:43.263703 #0x100377c6fb8 2022-06-28 23:15:43.263703 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44549536:13295782 getattr pAsLsXsFs #0x100377c6fbe 2022-06-28 23:15:43.389080 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.65342555:3250051 getattr pAsLsXsFs #0x10037fae5c8 2022-06-28 23:18:08.188763 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.57239622:3351967 setattr mtime=2022-06-28 23:15:43.311708 #0x100383d4e12 2022-06-28 23:15:43.311708 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.65342555:3250052 setattr size=0 #0x10037fae5c9 2022-06-28 23:18:08.188763 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.73035920:6172486 setattr size=0 #0x100375d694d 2022-06-28 23:15:43.236946 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.57239597:7861594 getattr pAsLsXsFs #0x100382facc4 2022-06-28 23:15:43.349760 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.73035920:6172569 getattr pAsLsXsFs #0x100383d4e12 2022-06-28 23:15:43.372946 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44549542:10386607 setattr mtime=2022-06-28 23:15:43.256180 #0x10037d4a790 2022-06-28 23:15:43.256180 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44549536:13295670 setattr size=0 #0x10037def6c6 2022-06-28 23:15:43.249080 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44537171:14217056 setattr mtime=2022-06-28 23:15:43.314704 #0x100377c6fbe 2022-06-28 23:15:43.314704 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44550703:13632027 setattr size=0 #0x1003789a194 2022-06-28 23:15:43.241280 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.57239597:7861595 getattr pAsLsXsFs #0x100382facc4 2022-06-28 23:15:43.352760 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44560938:18675673 setattr size=0 #0x100341c7f5e 2022-06-28 23:15:43.252913 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.57413311:4768768 setattr size=0 #0x10037fc9bce 2022-06-28 23:16:21.547747 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44538512:8357469 getattr pAsLsXsFs #0x100382facc4 2022-06-28 23:15:43.150859 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44537237:15327155 setattr size=0 #0x100373b1481 2022-06-28 23:15:43.246964 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44537237:15327140 setattr size=0 #0x100373b147e 2022-06-28 23:15:43.213964 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44538341:9135649 setattr size=0 #0x10037d213b1 2022-06-28 23:15:43.242596 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44549536:13295759 getattr pAsLsXsFs #0x100383d4e12 2022-06-28 23:15:43.363080 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44538341:9135625 setattr size=0 #0x10037d213ae 2022-06-28 23:15:43.202595 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44560938:18675667 setattr size=0 #0x100341c7f5c 2022-06-28 23:15:43.220913 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.57239622:3351918 setattr mtime=2022-06-28 23:15:43.249708 #0x100383d4e09 2022-06-28 23:15:43.249708 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44537171:14217013 setattr mtime=2022-06-28 23:15:43.260703 #0x100377c6fb6 2022-06-28 23:15:43.260703 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44550703:13632030 setattr mtime=2022-06-28 23:15:43.250280 #0x1003789a193 2022-06-28 23:15:43.250280 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44549542:10386623 setattr mtime=2022-06-28 23:15:43.308180 #0x10037d4a78f 2022-06-28 23:15:43.308180 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.73035920:6172487 setattr size=0 #0x100375d694e 2022-06-28 23:15:43.237946 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44549536:13295773 getattr pAsLsXsFs #0x100377c6fbe 2022-06-28 23:15:43.385080 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.65342555:3250038 setattr size=0 #0x10037fae5c8 2022-06-28 23:18:08.185763 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44550703:13632028 setattr mtime=2022-06-28 23:15:43.244280 #0x1003789a192 2022-06-28 23:15:43.244280 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.57239597:7861607 getattr pAsLsXsFs #0x100382facc4 2022-06-28 23:15:43.384760 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.65342555:3250039 setattr mtime=2022-06-28 23:18:08.186763 #0x10037fae5b8 2022-06-28 23:18:08.186763 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.57239622:3351914 setattr size=0 #0x100383d4e0a 2022-06-28 23:15:43.232708 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.57413311:4768868 lookup #0x1003789a17a/farm.yaml 2022-06-28 23:16:21.964748 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44560920:17672115 setattr mtime=2022-06-28 23:15:43.205528 #0x10037e10602 2022-06-28 23:15:43.205528 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44537171:14217014 setattr mtime=2022-06-28 23:15:43.263703 #0x100377c6fb7 2022-06-28 23:15:43.263703 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44549536:13295672 setattr size=0 #0x10037def6c7 2022-06-28 23:15:43.251080 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44560920:17672104 setattr mtime=2022-06-28 23:15:43.185528 #0x10037e10601 2022-06-28 23:15:43.185528 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44560938:18675678 setattr mtime=2022-06-28 23:15:43.261913 #0x100341c7f5d 2022-06-28 23:15:43.261913 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.73035920:6172565 getattr pAsLsXsFs #0x100383d4e12 2022-06-28 23:15:43.368946 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44537237:15327154 setattr size=0 #0x100373b1480 2022-06-28 23:15:43.243964 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.57413311:4768806 setattr size=0 #0x10037fc9bcf 2022-06-28 23:16:21.639747 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.73035920:6172567 getattr pAsLsXsFs #0x100383d4e12 2022-06-28 23:15:43.369946 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.44549536:13295778 getattr pAsLsXsFs #0x100377c6fbe 2022-06-28 23:15:43.387080 caller_uid=280689, caller_gid=280689{})",
            "description": "client_request(client.57239622:3351919 setattr mtime=2022-06-28 23:15:43.253708 #0x100383d4e0b 2022-06-28 23:15:43.253708 caller_uid=280689, caller_gid=280689{})",

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx




[Index of Archives]     [Information on CEPH]     [Linux Filesystem Development]     [Ceph Development]     [Ceph Large]     [Ceph Dev]     [Linux USB Development]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [xfs]


  Powered by Linux