On Sat, 23 Jan 2021, Chris Murphy wrote:
On Sat, Jan 23, 2021 at 4:29 AM Zbigniew Jędrzejewski-Szmek
<zbyszek@xxxxxxxxx> wrote:
Hi,
the proposal for Fedora 34 is to use zram-size == 1.0 * ram.
(Which I think is OK for the reasons listed in the Change page [0].)
But the original motivation for this change was boosting the size on
machines with little ram [1]. I wrote an exploratory patch [2] to specify
the size as a formula. From the docs:
> An alternative way to set the zram device size as a mathematical expression
> that can be used instead of 'zram-fraction' and 'max-zram-size'. Basic arithmetic
> operators like '*', '+', '-', '/', are supported, as well as 'min()' and 'max()'
> and the variable 'ram' which specifies size of RAM in megabytes.
>
> Examples:
>
> # this is the same as the default config
> zram-size = min(0.5 * ram, 4096)
>
> # fraction 1.0 for first 4GB, and then fraction 0.5 above that
> zram-size = 1.0 * min(ram, 4096) + 0.5 * max(ram - 4096, 0)
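To make the two example formulas concrete, here is a minimal sketch that evaluates them in plain Python for a few RAM sizes. It is not the generator's expression parser; the function names default_size and tiered_size and the list of RAM sizes are illustrative assumptions, and only the formulas themselves come from the docs quoted above.

```python
# Illustrative only: evaluates the two documented example expressions.
# 'ram_mb' plays the role of the 'ram' variable (RAM size in megabytes).
# The function names are hypothetical, not part of zram-generator.

def default_size(ram_mb: float) -> float:
    """zram-size = min(0.5 * ram, 4096)"""
    return min(0.5 * ram_mb, 4096)


def tiered_size(ram_mb: float) -> float:
    """zram-size = 1.0 * min(ram, 4096) + 0.5 * max(ram - 4096, 0)"""
    return 1.0 * min(ram_mb, 4096) + 0.5 * max(ram_mb - 4096, 0)


if __name__ == "__main__":
    # RAM sizes chosen for illustration (assumption, not from the thread).
    for ram_mb in (1024, 2048, 4096, 8192, 16384):
        print(
            f"{ram_mb:>6} MB RAM: "
            f"default={default_size(ram_mb):.0f} MB, "
            f"tiered={tiered_size(ram_mb):.0f} MB"
        )
```

With these formulas a 2048 MB machine gets 1024 MB under the default expression and 2048 MB under the tiered one, while an 8192 MB machine gets 4096 MB and 6144 MB respectively, so the tiered form favours small machines without scaling linearly on large ones.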
Now I'm a bit torn: the code is nice enough, but it seems to be a solution
in search of a problem. So I thought I'd try a little crowd-sourcing:
Would we have a real use for something like this?
(One possible direction: one thing I want to explore next is using zram
or zswap based on whether the machine has a physical swap device. Maybe
such a language would be useful then — with additional variables
specifying e.g. the physical swap size…)
I think everything discussed so far is neutral to good for all use
cases (all editions and spins) being discussed.
But I also think it's a good idea for zram-generator to be a bit more
biased toward setups with one or more of:
- low memory (4G or less, somewhat subjective)
- limited life storage (eMMC, SD Card, USB stick)
- slow drive (primarily rotational, but also all of the above)
- no swap (zram-based swap is better than no swap)
The above categories pretty much mean that the improvements for
disk-based swap that are in-progress upstream, aren't likely to be
used. That means zram-based swap usage will continue to provide
significant benefit.
With today's OpenQA tests I can point out that using zram on 2048MB RAM
VMs actually breaks FreeIPA deployment:
https://openqa.fedoraproject.org/tests/763006#step/role_deploy_domain_controller/35
OpenQA uses 2048MB RAM for QEMU VMs and this was typically OK for
FreeIPA deployment with integrated CA and DNS server. Not anymore with
zram activated:
Jan 27 21:17:47 fedora zram_generator::generator[25243]: Creating unit dev-zram0.swap (/dev/zram0 with 1384MB)
which ends up eating two-thirds of the whole memory budget, and the FreeIPA
installer fails:
2021-01-28T02:18:31Z DEBUG ipa-server-install was invoked with arguments [] and options: {'unattended': True, 'ip_addresses': None, 'domain_name': 'test.openqa.fedoraproject.org', 'realm_name': 'TEST.OPENQA.FEDORAPROJECT.ORG', 'host_name': None, 'ca_cert
2021-01-28T02:18:31Z DEBUG IPA version 4.9.1-1.fc34
2021-01-28T02:18:31Z DEBUG IPA platform fedora
2021-01-28T02:18:31Z DEBUG IPA os-release Fedora 34 (Server Edition Prerelease)
2021-01-28T02:18:31Z DEBUG Available memory is 823529472B
...
2021-01-28T02:18:31Z DEBUG The ipa-server-install command failed, exception: ScriptError: Less than the minimum 1.2GB of RAM is available, 0.77GB available
2021-01-28T02:18:31Z ERROR Less than the minimum 1.2GB of RAM is available, 0.77GB available
2021-01-28T02:18:31Z ERROR The ipa-server-install command failed. See /var/log/ipaserver-install.log for more information
While we can ask Adam to increase memory in those VMs, 2GB RAM was our
(FreeIPA) recommended minimum target for home deployments on
Celeron or RPi4 systems. With zram enabled, those systems become
unusable out of the box.
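The numbers in the logs above can be cross-checked with a few lines of Python. This is only a sketch of the arithmetic, not the installer's actual check: the function names are hypothetical, and dividing by 1024**3 is an assumption, chosen because it reproduces the 0.77GB in the log (dividing by 10**9 would give about 0.82).

```python
# Arithmetic sketch using only values quoted in the logs above.
# Function names are hypothetical; the 1024**3 divisor is an assumption
# that happens to match the "0.77GB available" message.

GIB = 1024 ** 3

VM_RAM_MB = 2048              # OpenQA VM size stated in the message
ZRAM_MB = 1384                # from the zram_generator log line
AVAILABLE_BYTES = 823529472   # "Available memory is 823529472B"
MINIMUM_GB = 1.2              # threshold quoted in the error message


def zram_share(zram_mb: float, ram_mb: float) -> float:
    """Fraction of RAM that the zram device is sized at."""
    return zram_mb / ram_mb


def available_gb(available_bytes: int) -> float:
    """Available memory in binary gigabytes (assumed unit)."""
    return available_bytes / GIB


def passes_minimum(available_bytes: int, minimum_gb: float) -> bool:
    """True if the available memory meets the quoted minimum."""
    return available_gb(available_bytes) >= minimum_gb


if __name__ == "__main__":
    print(f"zram share of RAM: {zram_share(ZRAM_MB, VM_RAM_MB):.2f}")
    print(f"available: {available_gb(AVAILABLE_BYTES):.2f} GB")
    print(f"meets minimum: {passes_minimum(AVAILABLE_BYTES, MINIMUM_GB)}")
```

The zram device comes out at roughly 0.68 of the VM's 2048MB, and the reported available memory at about 0.77GB, which is below the 1.2GB threshold.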
That's the tl;dr and now it's giant text wall time...
The remaining category is "everyone else", i.e. >= 8G RAM, and a
reasonably performant SATA SSD or NVMe. This category benefits overall
with the swap on zram approach, mainly because swap thrashing is just
so terrible. However, I expect the future is a return to disk based
swap for two reasons: (1) given highly variable workloads, having 100%
eviction efficacy simplifies memory management and resource control;
(2) there are upstream improvements happening incrementally that are
improving swap performance. For example, the anonymous memory balancing
logic has been totally reworked.
Neither zram nor zswap supports cgroups v2. There's work happening on
getting zswap cgroups compatible as well as integrating it into memory
management rather than having all these different buffet-style add-ons
that distros and users have to evaluate and integrate. The swap
improvements started happening in kernel 5.8, and I'd say they're
opt-in testable [1] for folks using kernel 5.10+ - whereby they can
switch back and forth between exclusively zram-based and disk-based
swap, to help evaluate what's working better and what isn't.
This is not a case of us moving back to disk-based swap soon. There
still is no cgroups support in device-mapper, and right now the only
way to secure swap is to put it on dm-crypt. One of the benefits of
zram-based swap is that it's volatile, so any leaks of personal
information can be ignored (at least if the system is powered off).
Another issue is dynamically creating/removing swapfiles. We kinda
want to avoid partitions because that preallocation siphons away a
possibly limited resource that may not get used at all. Related to
this is the still pending work (not yet happening) on Secure Boot and
hibernation images, which would necessarily need disk-based swap.
Anyway there's quite a lot of work happening, and even though it isn't
ready to be used by default in Fedora, it is a good time for early
adopters to do performance testing as this work continues. I
anticipate the server and desktop will eventually move away from
zram-based swap, but I can't give a time frame for it.
[1]
For the early adopters who want to experiment with their swap
dependent workloads and different configurations:
https://github.com/facebookexperimental/resctl-demo
[2]
One thing not discussed much is where to put the swapfile on Btrfs.
This is my current suggestion:
btrfs subvolume create /var/swap
chattr +C /var/swap
fallocate -l 4G /var/swap/swapfile1
chmod 600 /var/swap/swapfile1
mkswap /var/swap/swapfile1
swapon /var/swap/swapfile1
Be sure to read the limitations in 'man 5 btrfs' - the above takes
care of most concerns, the other one is it needs to be a single device
Btrfs. Other arrangements are possible. Ping me on irc (cmurf) if you
have questions about alternatives.
--
Chris Murphy
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx
--
/ Alexander Bokovoy
Sr. Principal Software Engineer
Security / Identity Management Engineering
Red Hat Limited, Finland