Re: MDS Behind on Trimming...

Hi Xiubo,

Is there any way to get a development build of the PR that we could upgrade to, in order to test whether the lock order bug from Bug #62123 could be the answer? Although I'm not sure that bug has been fixed yet?
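In case it helps, I assume the upgrade itself would look something like this on a cephadm cluster, if a ceph-ci container build of the PR branch exists (just a sketch; the image tag is a placeholder, not a known build):

    # Sketch only: point the orchestrator at a ceph-ci development image.
    # The tag below is a placeholder for whatever the PR branch build is called.
    ceph orch upgrade start --image quay.ceph.io/ceph-ci/ceph:<branch-or-sha>
    ceph orch upgrade status    # watch progress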

-erich

On 4/21/24 9:39 PM, Xiubo Li wrote:
Hi Erich,

I raised one tracker for this: https://tracker.ceph.com/issues/65607.

Currently I haven't figured out what was holding the 'dn->lock' in the 'lookup' request, or somewhere else, since there is no debug log.

Hopefully we can get the debug logs, with which we can push this further.

Thanks

- Xiubo

On 4/19/24 23:55, Erich Weiler wrote:
Hi Xiubo,

Never mind, I was wrong; most of the blocked ops were 12 hours old. Ugh.

I restarted the MDS daemon to clear them.
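(For the record, the restart was essentially this; the daemon id is a placeholder, and on a cephadm deployment it would be "ceph orch daemon restart mds.<name>" instead:)

    # Placeholder daemon id; adjust to the actual MDS instance name.
    systemctl restart ceph-mds@<id>.service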

I just reset to having one active MDS instead of two, let's see if that makes a difference.

I am beginning to think it may be impossible to catch the logs that matter here.  I feel like sometimes the blocked ops are just waiting because of load, and sometimes they are waiting because they are stuck.  But it's really hard to tell which without waiting a while.  And I can't wait with debug turned on, because my root disks (which are 150 GB) fill up with debug logs in 20 minutes.  So it almost seems that unless I could somehow store many TB of debug logs we won't be able to catch this.
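(One workaround I may try is to continuously compress rotated logs onto a bigger volume.  A rough, untested sketch; the log path and the /mnt/bigdisk mount point are assumptions:)

    # Compress and move rotated MDS debug logs off the small root disk.
    # Paths are assumptions; adjust for the actual log location and big volume.
    while true; do
        for f in /var/log/ceph/ceph-mds.*.log.*; do
            [ -e "$f" ] || continue          # skip when the glob matches nothing
            gzip -f "$f"                     # compress the rotated log in place
            mv -f "$f.gz" /mnt/bigdisk/      # stash it on the larger volume
        done
        sleep 60
    done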

Let's see how having one MDS helps.  Or maybe I actually need like 4 MDSs because the load is too high for only one or two.  I don't know. Or maybe it's the lock issue you've been working on.  I guess I can test the lock order fix when it's available.

-erich

On 4/19/24 7:26 AM, Erich Weiler wrote:
So I woke up this morning and checked the blocked_ops again; there were 150 of them.  But the age of each ranged from 500 to 4300 seconds.  So it seems as if they are eventually being processed.
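(For reference, this is roughly how I'm checking them; the daemon name "mds.a" is a placeholder, and the jq filters assume the usual JSON layout of the dump:)

    # Count the blocked ops and find the age of the oldest one, in seconds.
    ceph daemon mds.a dump_blocked_ops | jq '.ops | length'
    ceph daemon mds.a dump_blocked_ops | jq '[.ops[].age] | max'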

I wonder if we are thinking about this in the wrong way?  Maybe I should be *adding* MDS daemons because my current ones are overloaded?

Can a single server hold multiple MDS daemons?  Right now I have three physical servers each with one MDS daemon on it.
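(If adding daemons is the right direction, I assume it would look something like this with cephadm; the filesystem name "cephfs" and the host names are placeholders, and I gather cephadm's count-per-host placement option is what would allow more than one MDS daemon per server:)

    # Sketch, assuming cephadm; names are placeholders.
    ceph orch apply mds cephfs --placement="3 host1 host2 host3"
    ceph fs set cephfs max_mds 2    # raise the number of *active* ranks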

I can still try reducing to one.  And I'll keep an eye on blocked ops to see if any get to a very old age (and are thus wedged).

-erich

On 4/18/24 8:55 PM, Xiubo Li wrote:
Okay, please try setting only one active MDS.
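(Something like this, assuming the filesystem is named "cephfs":)

    # Shrink to a single active rank; the extra daemons become standbys.
    ceph fs set cephfs max_mds 1
    ceph fs status cephfs    # confirm only rank 0 stays active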


On 4/19/24 11:54, Erich Weiler wrote:
We have 2 active MDS daemons and one standby.

On 4/18/24 8:52 PM, Xiubo Li wrote:
BTW, how many active MDS daemons are you using?


On 4/19/24 10:55, Erich Weiler wrote:
OK, I'm sure I caught it in the right order this time; the logs should definitely show when the blocked/slow requests start. Check out these logs and dumps:

http://hgwdev.gi.ucsc.edu/~weiler/

It's a 762 MB tarball but it uncompresses to 16 GB.

-erich


On 4/18/24 6:57 PM, Xiubo Li wrote:
Okay, could you try this with 18.2.0?

I suspect it was introduced by:

commit e610179a6a59c463eb3d85e87152ed3268c808ff
Author: Patrick Donnelly <pdonnell@xxxxxxxxxx>
Date:   Mon Jul 17 16:10:59 2023 -0400

     mds: drop locks and retry when lock set changes

     An optimization was added to avoid an unnecessary gather on the inode
     filelock when the client can safely get the file size without also
     getting issued the requested caps. However, if a retry of getattr
     is necessary, this conditional inclusion of the inode filelock
     can cause lock-order violations resulting in deadlock.

     So, if we've already acquired some of the inode's locks then we must
     drop locks and retry.

     Fixes: https://tracker.ceph.com/issues/62052
     Fixes: c822b3e2573578c288d170d1031672b74e02dced
     Signed-off-by: Patrick Donnelly <pdonnell@xxxxxxxxxx>
     (cherry picked from commit b5719ac32fe6431131842d62ffaf7101c03e9bac)


On 4/19/24 09:54, Erich Weiler wrote:
I'm on 18.2.1.  I think I may have gotten the timing off on the logs and dumps, so I'll try again.  It's just really hard to capture because I need to be looking at it in real time. Hang on, lemme see if I can get another capture...

-erich

On 4/18/24 6:35 PM, Xiubo Li wrote:

BTW, which Ceph version are you using?



On 4/12/24 04:22, Erich Weiler wrote:
BTW - it just happened again. I upped the debugging settings as you instructed and got more dumps (then returned the debug settings to normal).

Attached are the new dumps.

Thanks again,
erich

On 4/9/24 9:00 PM, Xiubo Li wrote:

On 4/10/24 11:48, Erich Weiler wrote:
Does that mean it could be the lock order bug (https://tracker.ceph.com/issues/62123), as Xiubo suggested?

I have raised one PR to fix the lock order issue; if possible, please give it a try and see whether it resolves this issue.

Thank you!  Yeah, this issue is happening every couple of days now. It just happened again today and I got more MDS dumps. If it would help, let me know and I can send them!

Once this happens, it would be better if you could enable the MDS debug logs:

debug mds = 20

debug ms = 1

And then provide the debug logs together with the MDS dumps.
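(If you set them via the config database, it could look like this; just a sketch:)

    # Raise MDS debug levels while reproducing, then revert to the defaults.
    ceph config set mds debug_mds 20
    ceph config set mds debug_ms 1
    # ... reproduce and capture the event ...
    ceph config rm mds debug_mds
    ceph config rm mds debug_ms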


I assume if this fix is approved and backported it will then appear in like 18.2.3 or something?

Yeah, it will be backported after being well tested.

- Xiubo

Thanks again,
erich

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



