Re: SWAP or not to swap

Hi Götz,

I have swap configured but disabled on all servers. Looks like this in the fstab:

PARTLABEL=swap swap swap defaults,noauto 0 0

The noauto flag prevents swap from being enabled at boot.

The reason to have swap at all is that there have been a number of cases on ceph-users where adding swap for a short while was the fastest way forward with a rescue operation, most commonly for an MDS with a runaway log or cache. There were also OSD upgrade problems that needed a substantial amount of extra memory beyond what is required during normal operations (upgrades to Octopus can run into OOM, for example).

In such situations I can execute a "swapon -a", do the repair work, followed by a "swapoff -a". It's a pure emergency measure.
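
The emergency procedure above could be wrapped in a small script along these lines. This is only a sketch: the device path /dev/disk/by-partlabel/swap (derived from the PARTLABEL=swap fstab entry), the DRY_RUN switch, and the function names are assumptions for illustration. It names the device explicitly because swapon(8) documents that -a skips fstab entries marked noauto.

```shell
#!/bin/sh
# Emergency swap toggle (sketch).
# SWAP_DEV and DRY_RUN are assumptions of this example, not part of the
# setup described in the mail.
SWAP_DEV="${SWAP_DEV:-/dev/disk/by-partlabel/swap}"
DRY_RUN="${DRY_RUN:-1}"

run() {
    # In dry-run mode only print the command; otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

rescue_swap_on() {
    # Activate the rescue swap device before starting the repair work.
    run swapon "$SWAP_DEV"
}

rescue_swap_off() {
    # Deactivate it again once the repair is done.
    run swapoff "$SWAP_DEV"
}
```

With DRY_RUN left at its default of 1, calling rescue_swap_on only prints the command it would run; setting DRY_RUN=0 as root performs the actual swapon/swapoff.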

There is also a claim that having swap enabled confuses Ceph's memory allocators: they seem to start accounting incorrectly for system RAM. The reported anecdotal observation is that "memory leaks" disappear when swap is disabled.

Ergo: RAM is cheap enough, just buy enough of it. If you want to configure swap for rescue, pick something that allows rescue faster than the waiting time for an extra delivery of DIMMs. In the future I will go with a bit of space on PCIe-attached NVMe cards.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Anthony D'Atri <anthony.datri@xxxxxxxxx>
Sent: 02 July 2022 09:51:59
To: Götz Reinicke
Cc: ceph-users
Subject:  Re: SWAP or not to swap

There no doubt are those with differing views, but here’s my take.

Back in the second half of the 1980s, it was not unusual for a *ix workstation to have all of 4MB of RAM.  Yes, that is the right unit.  Modules were limited in capacity compared to today, and were relatively much more expensive.  Diskless workstations were not uncommon, swapping over 10Mb/s ethernet via NFS or ND onto an SMD drive with a transfer rate of 9.6 Mb/s and loooooong seeks.

That was an era where swap was often a necessity.

But that was a long time ago, and our hardware landscape is dramatically different today.  My view is that if in general a server-role system needs swap, what it really needs is more RAM.  There may be exceptions where, say, it might make sense to swap onto an NVMe or PMEM device.  With Ceph, I definitely am not a believer in swap.  Conventional swap, say to SATA/SAS HDD/SSD, will be much slower than RAM, and OSD performance will tank, resulting in dreaded slow requests, etc.  One of my mantras (*) is that Ceph is usually happier with a component down hard than with it up and crippled or flapping.

So my sense is that in almost all cases, one is better off having an OSD crash and restart if it fails to alloc memory than for it to slow to a crawl when swapping.  With the osd_memory_target in recent releases, OSD processes in my experience are much less prone to ballooning as well.
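
For reference, osd_memory_target takes a value in bytes and can be set centrally with "ceph config set". The sketch below only prints the command for a hypothetical 4 GiB target; the helper name gib_to_bytes and the chosen size are illustrative assumptions, not a recommendation.

```shell
#!/bin/sh
# Convert GiB to bytes, since osd_memory_target is given in bytes.
# gib_to_bytes is a hypothetical helper for this example.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Illustrative value only: 4 GiB per OSD daemon.
target_bytes=$(gib_to_bytes 4)

# Print rather than execute, so the value can be reviewed first.
echo "ceph config set osd osd_memory_target $target_bytes"
```

The printed command can then be reviewed and run on a node with admin access to the cluster.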




* Along with “There is no such thing as too much garlic”


>
> Dear ceph community,
>
> over the last years I read pros and cons regarding swap for different workloads or setups.
>
> Recently I came across that question again for ceph OSD nodes. The folks at croit disable it entirely on their distribution and suggest doing so too ;) …. and from the v14.2.22 Nautilus release notes I got:
>
> … the most consistent performing combination is to enable bluefs_buffered_io and disable system level swap …
>
> Out of curiosity: How do you configure your systems? Do you enable swap or do you disable it entirely?
>
> Assume that regarding the OSD configuration and expected workload there should be enough RAM.
>
>       Thanks for your thoughts and regards . Götz
>
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx