On 4/27/24 16:38, Erich Weiler wrote:
Actually, should I be excluding my whole cephfs filesystem? Like, if I mount it as /cephfs, should my stanza look something like:

{
  "files.watcherExclude": {
    "**/.git/objects/**": true,
    "**/.git/subtree-cache/**": true,
    "**/node_modules/*/**": true,
    "**/.cache/**": true,
    "**/.conda/**": true,
    "**/.local/**": true,
    "**/.nextflow/**": true,
    "**/work/**": true,
    "**/cephfs/**": true
  }
}

On 4/27/24 12:24 AM, Dietmar Rieder wrote:

Hi Erich,

hope it helps. Let us know.

Dietmar

On April 26, 2024 at 15:52 CEST, Erich Weiler <weiler@xxxxxxxxxxxx> wrote:

Hi Dietmar,

We do in fact have a bunch of users running vscode on our HPC head node as well (in addition to a few of our general-purpose interactive compute servers). I'll suggest they make the mods you referenced! Thanks for the tip.

cheers,
erich

On 4/24/24 12:58 PM, Dietmar Rieder wrote:

Hi Erich,

in our case the "client failing to respond to cache pressure" situation is/was often caused by users who have vscode connecting via ssh to our HPC head node. vscode makes heavy use of file watchers, and we have seen users with > 400k watchers. All these watched files must be held in the MDS cache, and if multiple users are running vscode at the same time it gets problematic.

Unfortunately there is no global setting - at least none that we are aware of - for vscode to exclude certain files or directories from being watched. We asked the users to configure their vscode (Remote Settings -> Watcher Exclude) as follows, in ~/.vscode-server/data/Machine/settings.json:

{
  "files.watcherExclude": {
    "**/.git/objects/**": true,
    "**/.git/subtree-cache/**": true,
    "**/node_modules/*/**": true,
    "**/.cache/**": true,
    "**/.conda/**": true,
    "**/.local/**": true,
    "**/.nextflow/**": true,
    "**/work/**": true
  }
}

To monitor and find processes with watchers you may use inotify-info <https://github.com/mikesart/inotify-info>.

HTH
Dietmar

On 4/23/24 15:47, Erich Weiler wrote:

So I'm trying to figure out ways to reduce the number of warnings I'm getting, and I'm thinking about the one "client failing to respond to cache pressure". Is there maybe a way to tell a client (or all clients) to reduce the amount of cache it uses, or to release caches quickly? Like, all the time? I know the linux kernel (and maybe ceph) likes to cache everything for a while, and rightfully so, but I suspect in my use case it may be more efficient to purge the cache more quickly or in general just cache way less overall...? We have many thousands of threads all doing different things hitting our filesystem, so I suspect the caching isn't really doing me much good anyway due to the churn, and is probably causing more problems than it's helping...

-erich
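Since vscode has no global setting for this, an admin who wants to roll the exclude list out to many accounts could merge the keys into each user's settings file with a small script. The following is only a minimal sketch, not something from the thread: it assumes the ~/.vscode-server/data/Machine/settings.json path mentioned above and plain-JSON settings, and it bails out rather than overwrite a file it cannot parse (e.g. one containing JSONC comments).

#!/usr/bin/env python3
# Hypothetical helper (sketch): merge the thread's files.watcherExclude
# entries into a user's vscode remote settings, keeping existing excludes.
import json
import os

SETTINGS = os.path.expanduser("~/.vscode-server/data/Machine/settings.json")

EXCLUDES = {
    "**/.git/objects/**": True,
    "**/.git/subtree-cache/**": True,
    "**/node_modules/*/**": True,
    "**/.cache/**": True,
    "**/.conda/**": True,
    "**/.local/**": True,
    "**/.nextflow/**": True,
    "**/work/**": True,
    "**/cephfs/**": True,
}

def main():
    try:
        with open(SETTINGS) as f:
            settings = json.load(f)
    except FileNotFoundError:
        settings = {}
    except json.JSONDecodeError as e:
        # settings.json may contain JSONC comments; don't clobber it
        raise SystemExit(f"refusing to rewrite unparsable {SETTINGS}: {e}")
    settings.setdefault("files.watcherExclude", {}).update(EXCLUDES)
    os.makedirs(os.path.dirname(SETTINGS), exist_ok=True)
    with open(SETTINGS, "w") as f:
        json.dump(settings, f, indent=2)
        f.write("\n")

if __name__ == "__main__":
    main()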
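If inotify-info is not installed on a node, per-process watch counts can also be pulled straight from /proc. Below is a rough, unofficial stand-in (my sketch, not from the thread): it counts the "inotify wd:" lines in each process's fdinfo entries, one line per watch descriptor; run it as root to see other users' processes.

#!/usr/bin/env python3
# Sketch: count inotify watches per process by scanning /proc/<pid>/fdinfo.
# Each "inotify wd:" line in an fdinfo file is one watch descriptor.
import glob

counts = {}
for fdinfo in glob.glob("/proc/[0-9]*/fdinfo/*"):
    pid = fdinfo.split("/")[2]
    try:
        with open(fdinfo) as f:
            watches = sum(1 for line in f if line.startswith("inotify wd:"))
    except OSError:
        continue  # process or fd disappeared while scanning
    if watches:
        counts[pid] = counts.get(pid, 0) + watches

# print the top 20 watch consumers with their command names
for pid, n in sorted(counts.items(), key=lambda kv: -kv[1])[:20]:
    try:
        with open(f"/proc/{pid}/comm") as f:
            comm = f.read().strip()
    except OSError:
        comm = "?"
    print(f"{n:8d}  {pid:>7}  {comm}")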
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx