On 2024-05-27 10:28, Lennart Poettering wrote:
> It stores structured logs for each of these entries, see "journalctl
> -o verbose", i.e. a *lot* more data than you see in the simple output.
>
> It also maintains an index for each field, so that "systemctl status"
> can reasonably quickly show only the data for a specific unit. And so
> on.
>
> Which systemd version are you using?
>
> In v252 many of the 64bit fields of the original journal format were
> optionally reduced to 32bit, which makes the format a lot smaller.

Thanks for these pointers. Here is what I made of them, with some
detours. I may be wrong in some of my conclusions; please let me know
if so.

TL;DR: In my use case, where each journal file is less than ~6MiB in
size, it's not the compact mode (on 255.4-1 on Debian testing) that
makes a difference, but the data hash table size.

In my previous test case I had the following numbers for an almost
empty journal file (mrajfn = most-recent-archived-journal-file):

    [~]$ journalctl --file "$mrajfn" --disk-usage
    Archived and active journals take up 3.4M in the file system.

After setting SystemMaxFileSize=4M I get the following with the exact
same test sequence:

    [~]$ journalctl --file "$mrajfn" --disk-usage
    Archived and active journals take up 128.0K in the file system.

Since my journal files never get bigger than 6MiB - 3.4MiB anyway,
that's a reasonable optimization for me. (In case somebody feels like
accusing me of parsimony: this saves me ~50% on journal file size, not
just "a few megabytes".)

Detailed findings below. Just a couple of follow-up questions:

1. Do you think this is worth mentioning in the section on
   SystemMaxFileSize= in journald.conf(5)?

2. A completely different approach would be to have journald reuse
   journal files across reboots, which does not seem to happen in my
   (default) journal configuration. Is that possible at all?
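For reference, this is how I applied the setting; a drop-in is an
alternative to editing /etc/systemd/journald.conf directly, and the
drop-in file name below is my own choice:

```ini
# /etc/systemd/journald.conf.d/50-max-file-size.conf (file name arbitrary)
[Journal]
SystemMaxFileSize=4M
```

followed by "systemctl restart systemd-journald" to make journald pick
it up.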
Detailed Findings:

Using systemd/docs/JOURNAL_FILE_FORMAT.md I checked my journal file
structure (there might be tools available to get that information
faster, but I was too lazy to look anything up).

Before setting SystemMaxFileSize=4M:

    Header size:           0x00000110
    Field hash table size: 0x000014e0
    Data hash table size:  0x003770f0 (== 3.5MiB)
    Tail object offset:    0x0037a668 (type 6 == OBJECT_ENTRY_ARRAY)
    Tail object size:      0x00000028

So for the "non-data part" we have the following size:

    0x00000110 + 0x000014e0 + 0x003770f0 = 0x003786e0 (== 3.5MiB)

And for the "data part" (== everything after the data hash table) the
following:

    0x0037a668 + 0x00000028 - 0x003786e0 = 0x00001fb0 (== 7.9KiB)

After setting SystemMaxFileSize=4M:

    Header size:           0x00000110
    Field hash table size: 0x000014e0
    Data hash table size:  0x0001c720 (== 113.8KiB)
    Tail object offset:    0x0001fe38 (type 3 == OBJECT_ENTRY)
    Tail object size:      0x0000009c

So for the "non-data part" we have the following size:

    0x00000110 + 0x000014e0 + 0x0001c720 = 0x0001dd10 (== 119.2KiB)

And for the "data part" the following:

    0x0001fe38 + 0x0000009c - 0x0001dd10 = 0x000021c4 (== 8.4KiB)

The data hash table size is calculated in the function
journal_file_setup_data_hash_table() as

    /* We estimate that we need 1 hash table entry per 768 bytes
       of journal file and we want to make sure we never get
       beyond 75% fill level. Calculate the hash table size for
       the maximum file size based on these metrics. */
    s = (f->metrics.max_size * 4 / 768 / 3) * sizeof(HashItem);
    if (s < DEFAULT_DATA_HASH_TABLE_SIZE)
        s = DEFAULT_DATA_HASH_TABLE_SIZE;

So it all seems to fit ...
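To double-check the arithmetic, here is a small Python mirror of that C
expression. The 16-byte sizes for HashItem (two 64-bit hash offsets) and
for the ObjectHeader, as well as the 2047-item default table size, are my
reading of JOURNAL_FILE_FORMAT.md and journal-file.h, so treat them as
assumptions:

```python
# Assumption: HashItem is two 64-bit offsets (16 bytes), and the default
# data hash table holds 2047 such items, as in systemd's journal-file.h.
SIZEOF_HASH_ITEM = 16
DEFAULT_DATA_HASH_TABLE_SIZE = 2047 * SIZEOF_HASH_ITEM

def data_hash_table_size(max_file_size: int) -> int:
    """One hash table entry per 768 bytes of file, kept below 75% fill."""
    s = (max_file_size * 4 // 768 // 3) * SIZEOF_HASH_ITEM
    return max(s, DEFAULT_DATA_HASH_TABLE_SIZE)

# With SystemMaxFileSize=4M, this payload plus a 16-byte ObjectHeader
# reproduces the 0x1c720 I measured above:
payload = data_hash_table_size(4 * 1024 * 1024)
print(hex(payload + 16))  # 0x1c720
```

The 16-byte difference between the computed payload and the object size I
read from the file is exactly one ObjectHeader, which is why the numbers
line up.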