Re: stuck MDS warning: Client HOST failing to respond to cache pressure



Hi Loïc,

thanks for the pointer. It's kind of the opposite extreme to dropping everything: I need to know the names of the files that are in the cache. I'm looking for a middle way, say, "drop_caches -u USER", which would drop the caches of all files owned by user USER. This way I could try dropping caches for a bunch of users who are *not* running a job.

I guess I have to wait for the jobs to end.
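
To make the "drop_caches -u USER" idea concrete, here is a minimal Python sketch. No such command exists as far as this thread shows; the function name drop_caches_for_owner and the use of posix_fadvise(POSIX_FADV_DONTNEED) are assumptions for illustration. It only asks the kernel to discard cached data pages, and it is not established here that this releases the CephFS capabilities behind the MDS warning. Walking the tree also stats every file, which may pull additional metadata into the client cache.

```python
import os
import stat


def drop_caches_for_owner(root, uid):
    """Hypothetical helper: advise the kernel to drop cached pages of all
    regular files below `root` that are owned by `uid`.

    Returns the number of files for which the advice was issued.
    """
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
                # Skip symlinks, devices, etc. and files of other users.
                if not stat.S_ISREG(st.st_mode) or st.st_uid != uid:
                    continue
                fd = os.open(path, os.O_RDONLY)
            except OSError:
                continue  # vanished or unreadable file: ignore
            try:
                # offset=0, len=0 means "the whole file".
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
                count += 1
            except OSError:
                pass
            finally:
                os.close(fd)
    return count
```

It would have to run on the client, as root or as the user in question, once per user who is not running a job.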

Best regards,
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

From: Loïc Tortay <tortay@xxxxxxxxxxx>
Sent: Tuesday, October 17, 2023 3:40 PM
To: Frank Schilder; ceph-users@xxxxxxx
Subject: Re:  Re: stuck MDS warning: Client HOST failing to respond to cache pressure

On 17/10/2023 11:27, Frank Schilder wrote:
> Hi Stefan,
> probably. It's 2 compute nodes and there are jobs running. Our epilogue script will drop the caches, at which point I indeed expect the warning to disappear. We have no time limit on these nodes though, so this can be a while. I was hoping there was an alternative to that, say, a user-level command that I could execute on the client without possibly affecting other users' jobs.
If you know the names of the files to flush from the cache (from
/proc/$PID/fd, lsof, batch job script, ...), you can use something like
on the client.

See the comments on lines 16 to 22 of the source code for caveats/limitations.
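
For readers without access to that tool, the general per-file approach can be sketched in Python as follows. This is not the program referred to above; it is a minimal illustration that assumes the cache is dropped with posix_fadvise(POSIX_FADV_DONTNEED), and the names drop_file_cache and drop_caches_for are made up for this example. The advice is a hint to the kernel and only concerns cached data pages of the named files.

```python
import os


def drop_file_cache(path):
    """Ask the kernel to discard the cached pages of one file."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Dirty pages are not discarded by DONTNEED, so flush them first.
        os.fsync(fd)
        # offset=0, len=0 covers the whole file.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)


def drop_caches_for(paths):
    """Apply drop_file_cache to each path; return (dropped, failed) lists."""
    dropped, failed = [], []
    for path in paths:
        try:
            drop_file_cache(path)
            dropped.append(path)
        except OSError:
            failed.append(path)
    return dropped, failed
```

The list of paths would come from /proc/$PID/fd, lsof, or the batch job script, as suggested above.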

|       Loïc Tortay <tortay@xxxxxxxxxxx> - IN2P3 Computing Centre      |
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
