On Wed 21-02-24 13:30:51, Carlos Galo wrote:
> On Tue, Feb 20, 2024 at 11:55 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> >
> > Hi,
> > sorry I have missed this before.
> >
> > On Thu 11-01-24 21:05:30, Carlos Galo wrote:
> > > The current implementation of the mark_victim tracepoint provides only
> > > the process ID (pid) of the victim process. This limitation poses
> > > challenges for userspace tools that need additional information
> > > about the OOM victim. The association between pid and the additional
> > > data may be lost after the kill, making it difficult for userspace to
> > > correlate the OOM event with the specific process.
> >
> > You are correct that post OOM all per-process information is lost. On
> > the other hand we do dump all this information to the kernel log. Could
> > you explain why that is not suitable for your purpose?
>
> Userspace tools often need real-time visibility into OOM situations
> for userspace intervention. Our use case involves utilizing BPF
> programs, along with BPF ring buffers, to provide OOM notification to
> userspace. Parsing kernel logs would be significant overhead as
> opposed to the event based BPF approach.

Please put that into the changelog.

> > > In order to mitigate this limitation, add the following fields:
> > >
> > > - UID
> > >    In Android each installed application has a unique UID. Including
> > >    the `uid` assists in correlating OOM events with specific apps.
> > >
> > > - Process Name (comm)
> > >    Enables identification of the affected process.
> > >
> > > - OOM Score
> > >    Allows userspace to get additional insights of the relative kill
> > >    priority of the OOM victim.
> >
> > What is the oom score useful for?
> >
> The OOM score provides us a measure of the victim's importance. On the
> android side, it allows us to identify if top or foreground apps are
> killed, which have user perceptible impact.

But the value on its own (without knowing scores of other tasks) doesn't
really tell you anything, does it?

> > Is there any reason to provide a different information from the one
> > reported to the kernel log?
> > __oom_kill_process:
> > pr_err("%s: Killed process %d (%s) total-vm:%lukB, anon-rss:%lukB, file-rss:%lukB, shmem-rss:%lukB, UID:%u pgtables:%lukB oom_score_adj:%hd\n",
> >         message, task_pid_nr(victim), victim->comm, K(mm->total_vm),
> >         K(get_mm_counter(mm, MM_ANONPAGES)),
> >         K(get_mm_counter(mm, MM_FILEPAGES)),
> >         K(get_mm_counter(mm, MM_SHMEMPAGES)),
> >         from_kuid(&init_user_ns, task_uid(victim)),
> >         mm_pgtables_bytes(mm) >> 10, victim->signal->oom_score_adj);
> >
>
> We added these fields we need (UID, process name, and OOM score), but
> we're open to adding the others if you prefer that for consistency
> with the kernel log.

Yes, I think the consistency would be better here. For one it reports
numbers which can tell quite a lot about the killed victim. It is a
superset of what you are already asking for. With a notable exception of
the oom_score which is really dubious without a wider context.
-- 
Michal Hocko
SUSE Labs