On Aug 4, 2024, at 10:47 AM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Sun, 4 Aug 2024 at 08:23, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
>>
>> What do you think?
>
> Eww. I really don't like giving the dumper ptrace rights.
>
> I think the real limitations of the "dump to pipe" is that it's just
> being very stupid. Which is fine in the sense that core dumps aren't
> likely to be a huge priority. But if or when they _are_ a priority,
> it's not a great model.
>
> So I prefer the original patch because it's also small, but it's
> conceptually much smaller.
>
> That said, even that simplified v2 looks a bit excessive to me.
>
> Does it really help so much to create a new array of core_vma_metadata
> pointers - could we not just sort those things in place?

Hi Linus,

Thanks for taking the time to reply.

Yep, I don't see any immediate reason why we can't sort this in place
to begin with. Thanks, Eric, for originally bringing this up. Will send
out a v3 with these edits.

> Also, honestly, if the issue is core dump truncation, at some point we
> should just support truncating individual mappings rather than the
> whole core file anyway. No?

Do you mean support truncating VMAs in addition to sorting or as a
replacement for sorting? If you mean in addition, then I agree: there
may be some VMAs that are known not to contain information critical to
debugging, but that may aid it, and therefore have lower priority.

If you mean as a replacement for sorting, then we'd need to know exactly
which VMAs to keep/discard, which is a non-trivial task, as discussed in
v1 of my patch, and so it doesn't seem like a viable alternative.

> Depending on what the major issue is, we might also tweak the
> heuristics for which vmas get written out.
>
> For example, I wouldn't be surprised if there's a fair number of "this
> read-only private file mapping gets written out because it has been
> written to" due to runtime linking. And I kind of suspect that in many
> cases that's not all that interesting.
>
> Anyway, I assume that Brian had some specific problem case that
> triggered this all, and I'd like to know a bit more.

Yes, there were a couple of problem cases that triggered the need for
this patch. I'll repeat what I said in v1 about this:

At Juniper, we have some daemons that can consume a lot of memory and
that, upon crash, can produce core dumps of several GBs. While dumping,
we've encountered these two scenarios resulting in an unusable core:

1. Disk space is low at the time of core dump, resulting in a truncated
core once the disk is full.

2. A daemon has a TimeoutStopSec option configured in its systemd unit
file and, upon core dumping, could time out (triggering a SIGKILL) if
the core dump is too large and is taking too long to dump.

In both scenarios, we see that the core file is already several GB, and
still does not contain the information necessary to form a backtrace,
thus creating the need for this change. In the second scenario, we are
unable to increase the timeout option due to our recovery time objective
requirements.

Best,
Brian Mak

> Linus