Dave Hansen <dave@xxxxxxxxxxxxxxxxxx> writes:

> On Wed, 2009-03-18 at 13:03 -0700, Mike Waychison wrote:
>> Polluting the dmesg buffer with messages from common failures (consider
>> a multi-user cluster where checkpoints may or may not succeed) isn't
>> very useful.
>
> Yeah, I've already gotten an earful from Serge and Dan S. about this. :)
>
> Serge suggested that, perhaps, the audit framework could be used.  We
> might also use an ftrace buffer if we want to keep a whole ton of
> messages around, too.
>
> dmesg is definitely not workable long-term at all.

How about having placeholder objects in the generated checkpoint?  Then
instead of a failure you have a non-restorable checkpoint.  But you
know which fd, or which mmapped region, or which other thing is causing
the problem, and if you want more information you can look at that
resource.

That gives user space the freedom to scrub out the non-checkpointable
bits and replace them with something like /dev/null, so that we can
continue on and restore the checkpoint anyway, if we think our app can
cope with some things going away.

Eric
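
The placeholder idea above could be sketched from the user-space side
roughly as follows.  This is only an illustration in Python, not an
existing checkpoint format or API: the Record type, its fields, and the
problems/is_restorable/scrub helpers are all hypothetical names made up
for the example.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Record:
    """One entry in a (hypothetical) checkpoint image."""
    kind: str                     # assumed labels, e.g. "fd" or "mmap"
    ident: str                    # e.g. an fd number or an address range
    payload: Optional[str]        # saved state; None marks a placeholder
    reason: Optional[str] = None  # why the resource could not be saved

    @property
    def is_placeholder(self) -> bool:
        return self.payload is None


def problems(image: List[Record]) -> List[Record]:
    """Return the placeholders, i.e. the resources blocking a restore."""
    return [r for r in image if r.is_placeholder]


def is_restorable(image: List[Record]) -> bool:
    """An image is restorable only when it contains no placeholders."""
    return not problems(image)


def scrub(image: List[Record], substitute: str = "/dev/null") -> List[Record]:
    """Return a copy of the image with every placeholder replaced by a
    stand-in, leaving the successfully saved records untouched."""
    return [
        replace(r, payload=substitute, reason=None) if r.is_placeholder else r
        for r in image
    ]
```

With something like this, the checkpoint itself always completes;
whether to refuse the restore or to scrub and carry on becomes a
user-space policy decision, and problems() says exactly which resource
to go look at.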