On Thu, May 15, 2014 at 03:46:10PM +0200, Jan Kara wrote:
> > Saving it in the superblock would require changing a bunch of file
> > systems.  What if we store this information in memory, and print it
> > out under certain conditions (i.e., after a soft lockup detection, or
> > upon request of some magic sysrq request)?
>
> By 'superblock' I meant 'struct super_block' ;) So we are in agreement I
> believe.

Ah, yes, we're in agreement.  I thought you were talking about the
on-disk superblock.

> > Or we could create a tunable threshold and print a message after a
> > file system has been frozen more than a particular specified duration,
> > with that duration set conservatively to something like 60 or 120
> > seconds by default.
>
> I was thinking about this as well but all these "warn after X seconds"
> warnings tend to have quite a few false positives in practice so dumping
> this in emergency-thaw sysrq handler or exposing the information somewhere
> in proc (e.g. mountinfo) would look like a better option to me.

Well, we already have the soft lockup warning, which sometimes has
some false positives, but in practice, if a process is runnable but
doesn't get to run in 2 minutes (the default is 20 seconds, but we've
used 2 minutes to avoid the false positive problem on a super busy
system), something is probably wrong.

Similarly, if a process is trying to write to a frozen file system,
and can't after two minutes, something is almost certainly wrong, or
at least, it's something a system administrator should know about.

We can argue over whether the default threshold should be 20 seconds,
or 120 seconds, or 2 hours, but I think there would be agreement that
for pretty much any configuration, there is some delay after which
printing a message is actually the right thing to do.

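To make the threshold idea concrete, here is a rough userspace sketch of
the simpler variant (time the file system has been frozen, not time a
writer has been waiting).  This is not kernel code: struct frozen_fs,
freeze_warn_secs, fs_freeze(), fs_thaw() and fs_check_frozen() are all
made-up names for illustration, and the 120 second default is just the
conservative value floated above.

```c
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical in-memory state; a stand-in, not the real struct super_block. */
struct frozen_fs {
	const char *name;
	bool frozen;
	time_t frozen_since;	/* recorded when the freeze happens */
	bool warned;		/* log at most once per freeze */
};

/* Assumed tunable; 0 disables the warning entirely. */
static unsigned int freeze_warn_secs = 120;

static void fs_freeze(struct frozen_fs *fs, time_t now)
{
	fs->frozen = true;
	fs->frozen_since = now;
	fs->warned = false;
}

static void fs_thaw(struct frozen_fs *fs)
{
	fs->frozen = false;
}

/* Called periodically; returns true if a warning was printed this call. */
static bool fs_check_frozen(struct frozen_fs *fs, time_t now)
{
	if (!fs->frozen || fs->warned || freeze_warn_secs == 0)
		return false;
	if (difftime(now, fs->frozen_since) < freeze_warn_secs)
		return false;

	fprintf(stderr, "%s: file system frozen for more than %u seconds\n",
		fs->name, freeze_warn_secs);
	fs->warned = true;
	return true;
}
```

The warned flag keeps a long freeze from flooding the log, and setting
the tunable to 0 gives anyone worried about false positives a way to
turn the message off.
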
(Yes, "time that a process is waiting to write to a frozen file system
!= time the file system is frozen"; the latter is easier to
implement, but if people feel strongly about it, the former isn't that
much more difficult.)

The problem with using a sysrq handler is the user has to know how to
use it.  If the user files a bug saying the system has mysteriously
hung, the fact that the system log contains a hint as to what might be
going on would be very useful for an enterprise distribution's help
desk.

(Yes, this won't help if the root file system is the one that's been
frozen, unless the customer has configured remote syslog.  But for
many cases, it might provide a vital clue that could save a lot of
time and support costs.)

						- Ted