On Fri, Feb 2, 2024 at 12:18 AM Steve Traylen <steve.traylen@xxxxxxx> wrote:
>
>
> On 01/02/2024 14:48, Steve Traylen wrote:
> > On 01/02/2024 13:45, Andrei Borzenkov wrote:
> >
> >> On Thu, Feb 1, 2024 at 3:25 PM Steve Traylen <steve.traylen@xxxxxxx>
> >> wrote:
> >>> Hi,
> >>>
> >>> I'm trying to understand why I am retaining only a couple of days
> >>> of logs when I would like to have more.
> >>>
> >>> The head of the system journal is only from today:
> >>> Feb 01 10:47:14 nodeX.example.ch systemd-journald[722]: Data hash table
> >>> of /var/log/journal/c33ef6d0ada04ec4abc79c567a7d94b0/system.journal has
> >>> a fill level at 75.0 (174765 of 233016 items, 58720256 file size, 335
> >>> bytes per hash table item), suggesting rotation.
> >>> Feb 01 10:47:14 nodeX.example.ch systemd-journald[722]:
> >>> /var/log/journal/c33ef6d0ada04ec4abc79c567a7d94b0/system.journal:
> >>> Journal header limits reached or header out-of-date, rotating.
> >>>
> >>>
> >>> # journalctl --disk-usage
> >>> Archived and active journals take up 8.1G in the file system.
> >>>
> >>> In reality the system journal is tiny:
> >>>
> >>> # du -sh system.journal
> >>> 17M system.journal
> >>>
> >>> However, we do have many user journals:
> >>>
> >>> # ls -l user-*journal | wc -l
> >>> 1044
> >>>
> >>> and indeed
> >>>
> >>> # du -sh /var/log/journal/c33ef6d0ada04ec4abc79c567a7d94b0
> >>> 8.2G /var/log/journal/c33ef6d0ada04ec4abc79c567a7d94b0
> >>>
> >>> The vast majority of these user journals are empty and offline:
> >>>
> >>> # file user-*journal | awk '{print $4, $5}' | sort | uniq -c
> >>> 940 empty, offline
> >>> 102 offline
> >>> 2 online
> >>>
> >>>
> >>> These user journals are all 8.0M in size.
> >>>
> >>> So I think I have two questions:
> >>>
> >>> 1) Why am I losing old logs sooner than I would like - what limit is
> >>> "fill level at 75.0 (174765 of 233016 items"?
> >> You did not provide any evidence that logs are lost.
> >> Archived (offline) logs are processed and searched by journalctl, so
> >> the oldest available log is the oldest archive file, not the current
> >> online file.
> >>
> >> The limit is the fill grade of the hash table in the individual log
> >> file. It is hard coded and unrelated to the limits configured in
> >> journald.conf. It may affect how long logs are kept if you configured
> >> retention by the number of log files.
> > Thanks for the reply.
> >
> > There are no archive files I believe:
> >
> > # ls /var/log/journal/514fed82c54d4a89b9f7f8f33eca1c8e/*system*
> > /var/log/journal/514fed82c54d4a89b9f7f8f33eca1c8e/system.journal
> >
> > The archive files would be alongside the live file I believe.
> >
> > Just tried an explicit "journalctl --rotate" which logs:
> >
> > Feb 01 14:36:33 nodeX.example.ch systemd-journald[658]: System Journal
> > (/var/log/journal/514fed82c54d4a89b9f7f8f33eca1c8e) is 8.0G, max 3.0G,
> > 0B free.
> > Feb 01 14:36:40 nodeX.example.ch systemd-journald[658]: Received
> > client request to rotate journal, rotating.
> > Feb 01 14:36:40 nodeX.example.ch systemd-journald[658]: Deleted empty
> > archived journal
> > /var/log/journal/514fed82c54d4a89b9f7f8f33eca1c8e/user-1234@537a18390e124dd6b4cf41a69ef5780d-0000000000000000-0000000000000000.journal
> > (3.5M).
> > Feb 01 14:36:40 lxplus978.cern.ch systemd-journald[658]: Deleted empty
> > archived journal
> > /var/log/journal/514fed82c54d4a89b9f7f8f33eca1c8e/user-1235@d7d23966c1454001a714ee5aef039c60-0000000000000000-0000000000000000.journal
> > (3.5M).
> >
> > So now maybe I understand: at rotation I am over the configured max of
> > 3GB, so perhaps no archive is generated. Looking at another node where
> > fewer users have ever logged in, I have the archive of the system log
> > and a longer history. Those 940 "empty, offline" user journals consume
> > the space while providing no particular value.
> >
> > No other indication that rotation may not have worked.
> >
> >
> >>> 2) Is there a safe mechanism to delete those empty offline user
> >>> journals?
> >>>
> >> Just delete them.
>
> Wrote a tiny script to delete them:
>
> # Remove user journals that file(1) reports as empty and offline.
> for FILE in /var/log/journal/"$(cat /etc/machine-id)"/user-[0-9]*.journal ; do
>     if [ "$(file --brief "$FILE")" = 'Journal file empty, offline' ] ; then
>         rm -f "$FILE"
>         echo "$(basename "$FILE") was empty and offline so removed"
>     fi
> done
>
> works perfectly - unfortunately about 20 seconds later journald (I
> presume) re-creates them all despite the vast majority
> of users having no current processes on the nodes.
>

Try enabling debug logs for journald. Empty files should be removed by
journald anyway, so maybe they are not considered really empty?

> >>
> >>> Thanks.
> >>>
> >>> Steve.
> >>>
> >>> Version and configuration:
> >>>
> >>> systemd-252-18.el9 - RHEL9 with a configuration of:
> >>>
> >>> [Journal]
> >>> Storage = persistent
> >>> SplitMode = uid
> >>> SystemMaxUse = 3G
> >>> SystemKeepFree = 10G
> >>> MaxRetentionSec = 1year
> >>>
> >>> # df -h /
> >>> Filesystem      Size  Used Avail Use% Mounted on
> >>> /dev/vda1        80G   65G   16G  81% /
> >>>
> >>>
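
A minimal sketch, in the same shell style as the script quoted above, for
checking how much of a journal directory is taken by per-user journals as
opposed to system journals. The function name journal_usage is hypothetical,
and it sums apparent file sizes with wc -c, so the totals may differ from
what du or journalctl --disk-usage report.

```bash
#!/usr/bin/env bash
# journal_usage DIR
# Prints: <user_count> <user_bytes> <system_count> <system_bytes>
# Assumption: files are classified purely by name (user-* vs system*),
# and sizes are apparent sizes in bytes, not allocated disk blocks.
journal_usage() {
    local dir=$1 f size
    local ucount=0 ubytes=0 scount=0 sbytes=0
    for f in "$dir"/*.journal; do
        [ -f "$f" ] || continue      # skip the literal pattern if nothing matched
        size=$(wc -c < "$f")
        size=$((size))               # normalise any padding in wc output
        case ${f##*/} in
            user-*)  ucount=$((ucount + 1)); ubytes=$((ubytes + size)) ;;
            system*) scount=$((scount + 1)); sbytes=$((sbytes + size)) ;;
        esac
    done
    echo "$ucount $ubytes $scount $sbytes"
}
```

It could be called as journal_usage "/var/log/journal/$(cat /etc/machine-id)".
With figures like those in the thread (roughly a thousand user journals of
8.0M each against a 17M system journal), the user total alone would be
expected to exceed a SystemMaxUse of 3G.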