On Wed, Jan 22, 2020 at 01:01:42PM +0100, Michal Privoznik wrote:
> On 1/22/20 11:11 AM, Michal Privoznik wrote:
> > On 1/22/20 10:03 AM, R. Diez wrote:
> > > Hi all:
> > >
> > > I am using the libvirt version that comes with Ubuntu 18.04.3 LTS.
> >
> > I'm sorry, I don't have Ubuntu installed anywhere to look the version
> > up. Can you run 'virsh version' to find it out for me please?
>
> Nevermind, I've managed to reproduce with the latest libvirt anyway.
>
> >
> > >
> > > I have written a script that backs up my virtual machines every
> > > night. I want to limit the amount of memory that this backup
> > > operation consumes, mainly to prevent page cache thrashing. I have
> > > described the Linux page cache thrashing issue in detail here:
> > >
> > > http://rdiez.shoutwiki.com/wiki/Today%27s_Operating_Systems_are_still_incredibly_brittle#The_Linux_Filesystem_Cache_is_Braindead
> > >
> > >
> > > The VM virtual disk weighs 140 GB at the moment. I thought 500 MiB
> > > of RAM should be more than enough to back it up, so I added the
> > > following options to the systemd service file associated to the
> > > systemd timer I am using:
> > >
> > > MemoryLimit=500M
> > >
> > > However, the OOM is killing "virsh vol-download":
> > >
> > > Jan 21 23:40:00 GS-CEL-L kernel: [55535.913525] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
> > > Jan 21 23:40:00 GS-CEL-L kernel: [55535.913527] [ 13232] 1000 13232 5030 786 77824 103 0 BackupWindows10
> > > Jan 21 23:40:00 GS-CEL-L kernel: [55535.913528] [ 13267] 1000 13267 5063 567 73728 132 0 BackupWindows10
> > > Jan 21 23:40:00 GS-CEL-L kernel: [55535.913529] [ 13421] 1000 13421 5063 458 73728 132 0 BackupWindows10
> > > Jan 21 23:40:00 GS-CEL-L kernel: [55535.913530] [ 13428] 1000 13428 712847 124686 5586944 523997 0 virsh
> > > Jan 21 23:40:00 GS-CEL-L kernel: [55535.913532] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0,oom_memcg=/system.slice/VmBackup.service,task_memcg=/system.slice/VmBackup.service,task=virsh,pid=13428,uid=1000
> > >
> > > Jan 21 23:40:00 GS-CEL-L kernel: [55535.913538] Memory cgroup out of memory: Killed process 13428 (virsh) total-vm:2851388kB, anon-rss:486180kB, file-rss:12564kB, shmem-rss:0kB
> > >
> > > I wonder why "virsh vol-download" needs so much RAM. It does not get
> > > killed straight away, it takes a few minutes to get killed. It
> > > starts using a VMSIZE of around 295 MiB, which is not really frugal
> > > for a file download operation, but then it grows and grows.
> >
> > This is very likely a memory leak somewhere.
>
> Actually, it is not. It's caused by our design of the client event loop. If
> there are any incoming data, read as much as possible placing them at the
> end of linked list of incoming stream data (stream is a way that libvirt
> uses to transfer binary data). Problem is that instead of returning NULL to
> our malloc()-s once the limit is reached, kernel decides to kill us.
>
> For anybody with libvirt insight: virNetClientIOHandleInput() ->
> virNetClientCallDispatch() -> virNetClientCallDispatchStream() ->
> virNetClientStreamQueuePacket().
>
>
> The obvious fix would be to stop processing incoming packets if stream has
> "too much" data cached (define "too much"). But this may lead to
> unresponsive client event loop - if the client doesn't pull data from
> incoming stream fast enough they won't be able to make any other RPC.

IMHO if they're not pulling stream data and still expecting to make
other RPC calls in a timely manner, then their code is broken.

Having said that, in retrospect I rather regret ever implementing our
stream APIs as we did. We really should have just exposed an API which
lets you spawn an NBD server associated with a storage volume, or
tunnelled NBD over libvirtd. The former is probably our best strategy
these days, now that NBD has native TLS support.

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
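
For readers following the thread, the "stop processing incoming packets if
stream has too much data cached" idea quoted above can be sketched roughly
as below. This is only a toy Python model, not libvirt code: the class
BoundedStreamQueue, the simulate() helper, the byte cap and the packet sizes
are all hypothetical names and values chosen for illustration. It compares
the peak amount of cached stream data when the reader is slower than the
sender, with and without a cap.

```python
from collections import deque


class BoundedStreamQueue:
    """Toy per-stream receive queue with a byte cap (hypothetical, not libvirt code)."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.packets = deque()
        self.cached_bytes = 0

    def can_accept(self):
        # An event loop would stop reading from the socket while this is False.
        return self.cached_bytes < self.max_bytes

    def queue_packet(self, data):
        # Called for each incoming stream packet; appends to the end of the queue.
        self.packets.append(data)
        self.cached_bytes += len(data)

    def recv(self, nbytes):
        # Called by the consumer; draining the queue makes can_accept() true again.
        out = bytearray()
        while self.packets and len(out) < nbytes:
            pkt = self.packets.popleft()
            take = nbytes - len(out)
            out += pkt[:take]
            if len(pkt) > take:
                # Put the unread remainder back at the front.
                self.packets.appendleft(pkt[take:])
        self.cached_bytes -= len(out)
        return bytes(out)


def simulate(packet_size, total_packets, max_bytes, consume_every):
    """Return peak cached bytes when one packet is consumed every `consume_every` ticks."""
    q = BoundedStreamQueue(max_bytes)
    peak = sent = tick = 0
    while sent < total_packets or q.cached_bytes:
        tick += 1
        if sent < total_packets and q.can_accept():
            q.queue_packet(b"x" * packet_size)
            sent += 1
            peak = max(peak, q.cached_bytes)
        if tick % consume_every == 0:
            q.recv(packet_size)
    return peak


if __name__ == "__main__":
    print("capped peak:  ", simulate(1024, 1000, 8192, 4))
    print("uncapped peak:", simulate(1024, 1000, 10**12, 4))
```

With the cap, the cached data never exceeds max_bytes; without it, the queue
grows for as long as the sender outpaces the reader, which matches the
behaviour described above. The trade-off raised in the thread still applies:
while reading is paused, nothing else arriving on the same connection gets
processed either.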