Re: Direct ceph mount on desktops

Followup.

Desktop system went to sleep overnight. I woke up to this:


HEALTH_WARN 1 client(s) laggy due to laggy OSDs; 1 clients failing to respond to capability release; 1 MDSs report slow requests
[WRN] MDS_CLIENTS_LAGGY: 1 client(s) laggy due to laggy OSDs
    mds.ceefs.www2.lzjqgd(mds.0): Client 7513643 is laggy; not evicted because some OSD(s) is/are laggy
[WRN] MDS_CLIENT_LATE_RELEASE: 1 clients failing to respond to capability release
    mds.ceefs.www2.lzjqgd(mds.0): Client a64.mousetech.com: failing to respond to capability release client_id: 7513643
[WRN] MDS_SLOW_REQUEST: 1 MDSs report slow requests
    mds.ceefs.www2.lzjqgd(mds.0): 1 slow requests are blocked > 30 secs

a64 is the sleeping desktop.
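
For matching the client ID to a hostname without eyeballing the text, a
minimal Python sketch follows. It assumes the human-readable format shown
above, which is not a stable interface (`ceph health detail --format json`
would be sturdier), and the function name is made up for illustration.

```python
import re


def late_release_clients(health_text):
    """Return (hostname, client_id) pairs from MDS_CLIENT_LATE_RELEASE detail lines.

    Assumes the plain-text 'ceph health detail' layout quoted in this thread.
    """
    # Mail clients hard-wrap the output, so collapse all whitespace first.
    flat = " ".join(health_text.split())
    pattern = re.compile(
        r"Client (\S+): failing to respond to capability release "
        r"client_id: (\d+)"
    )
    return [(host, int(cid)) for host, cid in pattern.findall(flat)]
```

Feeding it the health text above should pair a64.mousetech.com with
client 7513643, the same ID the MDS_CLIENTS_LAGGY line reports.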

I restarted the ceph target and a whole bunch of stuck PGs popped up,
but the whole w2 machine is apparently scrambled, and even a reboot
hasn't made its processes happy, so I'm going to have to go in and fix
them one by one. This may be unrelated to the original laggy OSD
problem, though. I just wanted to mention it for completeness.

Also, the desktop has weirdly managed to mount both native ceph and the
ceph NFS at the same mount point, most likely because I'd added a
pre/post sleep script to unmount/remount ceph when the system went into
or out of suspension, then lost track of it and never fixed it.
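
A hook of that kind can avoid stacking mounts by checking /proc/mounts
before it remounts. The following is a rough Python sketch, not the
script described above. It assumes a systemd system-sleep hook (an
executable in /usr/lib/systemd/system-sleep/, called with "pre" or
"post" as its first argument), a hypothetical mount point /mnt/ceph, and
an fstab entry so that a bare `mount /mnt/ceph` works.

```python
#!/usr/bin/env python3
import subprocess
import sys

MOUNT_POINT = "/mnt/ceph"  # hypothetical; adjust to the real mount point


def mounts_at(mount_point, mounts_text):
    """Return the filesystem type of every mount stacked at mount_point."""
    types = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fstype, options, dump, pass
        if len(fields) >= 3 and fields[1] == mount_point:
            types.append(fields[2])
    return types


def plan(phase, mount_point, mounts_text):
    """Decide which commands to run for a sleep phase ("pre" or "post")."""
    mounted = mounts_at(mount_point, mounts_text)
    if phase == "pre":
        # One umount per stacked mount, so nothing is left behind.
        return [["umount", mount_point] for _ in mounted]
    if phase == "post" and not mounted:
        # Remount only when the mount point is actually empty.
        return [["mount", mount_point]]
    return []


def main(argv):
    phase = argv[1] if len(argv) > 1 else ""
    if phase not in ("pre", "post"):
        return
    with open("/proc/mounts") as f:
        mounts_text = f.read()
    for cmd in plan(phase, MOUNT_POINT, mounts_text):
        # check=False: a failed umount should not abort the suspend itself.
        subprocess.run(cmd, check=False)


if __name__ == "__main__":
    main(sys.argv)
```

Keeping the decision in plan() separate from the subprocess calls makes
it possible to test the logic against a saved copy of /proc/mounts
without touching any real mounts.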

Again, this is mostly just background, not related to ceph's own
internals, although I had wondered whether the ceph mount was happening
slowly and the suspend wasn't waiting properly for it to complete.

On Tue, 2024-02-06 at 13:00 -0500, Patrick Donnelly wrote:
> On Tue, Feb 6, 2024 at 12:09 PM Tim Holloway <timh@xxxxxxxxxxxxx>
> wrote:
> > 
> > Back when I was battling Octopus, I had problems getting ganesha's
> > NFS to work reliably. I resolved this by doing a direct (ceph) mount
> > on my desktop machine instead of an NFS mount.
> > 
> > I've since been plagued by ceph "laggy OSD" complaints that appear to
> > be due to a non-responsive client and I'm suspecting that the client
> > in question is the desktop machine when it's suspended while the ceph
> > mount is in effect.
> 
> You should not see "laggy OSD" messages due to a client becoming
> unresponsive.
> 
> > So the question is: Should ceph native mounts be used on general
> > client machines which may hibernate or otherwise go offline?
> 
> The mounts will eventually be evicted (generally) by the MDS if the
> machine hibernates/suspends. There are mechanisms for the mount to
> recover (see "recover_session" in the mount.ceph man page). Any dirty
> data would be lost.
> 
> As for whether you should have clients that hibernate, it's not ideal.
> It could conceivably create problems if client machines hibernate
> longer than the blocklist duration (after eviction by the MDS).
> 
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



