Brian Mak <makb@xxxxxxxxxxx> writes:

> On Jul 31, 2024, at 7:52 PM, Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>
>> Brian Mak <makb@xxxxxxxxxxx> writes:
>>
>>> Large cores may be truncated in some scenarios, such as daemons with
>>> stop timeouts that are not large enough, or lack of disk space. This
>>> impacts debuggability with large core dumps, since critical information
>>> necessary to form a usable backtrace, such as stacks and shared library
>>> information, is omitted. We can mitigate the impact of core dump
>>> truncation by dumping smaller VMAs first, which may be more likely to
>>> contain memory for stacks and shared library information, thus allowing
>>> a usable backtrace to be formed.
>>
>> This sounds theoretical. Do you happen to have a description of a
>> motivating case? A situation that bit someone and resulted in a core
>> file that wasn't usable?
>>
>> A concrete situation would help us imagine what possible caveats there
>> are with sorting VMAs this way.
>>
>> The most common case I am aware of is distributions setting the core
>> file size to 0 (ulimit -c 0).
>
> Hi Eric,
>
> Thanks for taking the time to reply. We have hit these scenarios before
> in practice, where large cores are truncated, resulting in an unusable
> core.
>
> At Juniper, we have some daemons that can consume a lot of memory,
> which, upon crash, can produce core dumps of several GBs. While dumping,
> we've encountered these two scenarios resulting in an unusable core:
>
> 1. Disk space is low at the time of core dump, resulting in a truncated
> core once the disk is full.
>
> 2. A daemon has a TimeoutStopSec option configured in its systemd unit
> file, where core dumping could time out (triggering a SIGKILL) if the
> core dump is too large and is taking too long to write.
>
> In both scenarios, we see that the core file is already several GB and
> still does not contain the information necessary to form a backtrace,
> thus creating the need for this change.
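[For readers unfamiliar with the second scenario: a stop timeout of this kind is configured in the service's unit file roughly as below. The unit name, binary path, and timeout value are illustrative, not taken from the thread.]

```ini
# exampled.service -- illustrative fragment only
[Service]
ExecStart=/usr/sbin/exampled
# If the service has not stopped within 30 s of SIGTERM -- including time
# spent by the kernel writing a core dump -- systemd escalates to SIGKILL,
# which truncates the dump mid-write.
TimeoutStopSec=30
```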
> In the second scenario, we are unable to increase the timeout option
> due to our recovery time objective requirements.
>
>> One practical concern with this approach is that I think the ELF
>> specification says that program headers should be written in memory
>> order. So a comment on your testing to see if gdb or rr or any of
>> the other debuggers that read core dumps cares would be appreciated.
>
> I've already tested readelf and gdb on core dumps (truncated and whole)
> with this patch, and both are able to read/use these core dumps in
> these scenarios with a proper backtrace.
>
>>> We implement this by sorting the VMAs by dump size and dumping in
>>> that order.
>>
>> Since your concern is about stacks, and the kernel has information
>> about stacks, it might be worth using that information explicitly when
>> sorting VMAs, instead of just assuming stacks will be small.
>
> This was originally the approach that we explored, but ultimately moved
> away from. We need more than just stacks to form a proper backtrace. I
> didn't narrow down exactly what it was that we needed, because the
> sorting solution seemed cleaner than trying to pin down each of the
> pieces we'd need. At the very least, we need information about shared
> libraries (.dynamic, etc.) and stacks, but my testing showed that we
> need a third piece sitting in an anonymous R/W VMA, which is the point
> at which I stopped exploring this path. I was having a difficult time
> narrowing down what this last piece was.
>
>> I expect the priorities would look something like JIT-generated
>> executable code segments, stacks, and then heap data.
>>
>> I don't have enough information about what is causing your truncated
>> core dumps, so I can't guess what the actual problem is you are
>> fighting, so I could be wrong on priorities.
>>
>> Though I do wonder if this might be a buggy interaction between
>> core dumps and something like signals, or io_uring.
>> If it is something other than a shortage of storage space causing your
>> truncated core dumps, I expect we should first debug why the core
>> dumps are being truncated rather than proceed directly to working
>> around truncation.
>
> I don't really see any feasible workarounds for preventing truncation
> of these core dumps. Our truncated cores are also not the result of any
> bugs, but rather a limitation.

Thanks, that clarifies things.