On 2012/10/13 10:06, Dave Chinner wrote:
> On Fri, Oct 12, 2012 at 06:47:32PM +0900, Fernando Luis Vázquez Cao wrote:
> > Any process attempting to write to a frozen filesystem uninterruptibly and
> > unkillably waits for the filesystem to be thawed. This wait is of unbounded
> > length. Ignore such waits in the hung_task detector.
>
> Filesystems should not be frozen for long enough to trigger the hung
> task detector under normal usage. IMO, if you are freezing a
> filesystem for that long, then you're either doing something wrong
> or something has gone wrong, and in either case I think we should be
> emitting warnings...

The problem is that, in production systems, situations where
a filesystem remains frozen for long periods are not uncommon.
A typical example: the daemon or script that drives the freeze/thaw
through the fsfreeze ioctls dies. The next day the system
administrator finds the system log flooded with kernel stack dumps
(and, since fsfreeze lacks check ioctls, has no easy way to find out
what is going on) or, if hung_task_panic happened to be set, is
greeted by a panic message. IMHO, this behaviour is not appropriate
(nothing has gone wrong with the kernel, after all) and my patch
fixes it.
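
To make the failure mode concrete, here is a minimal userspace sketch
of what such a control script does. FIFREEZE and FITHAW are the
ioctls from <linux/fs.h>; the function name freeze_work_thaw and the
work callback are hypothetical, used only for illustration, and this
is not taken from any particular daemon.

```c
#include <fcntl.h>
#include <linux/fs.h>   /* FIFREEZE, FITHAW */
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Freeze the filesystem mounted at `mountpoint`, run `work` (a
 * hypothetical callback, e.g. taking a snapshot), then thaw.
 * Returns 0 on success and -1 on error. Freezing requires
 * CAP_SYS_ADMIN.
 */
int freeze_work_thaw(const char *mountpoint, int (*work)(void))
{
    int fd = open(mountpoint, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    if (ioctl(fd, FIFREEZE, 0) < 0) {
        perror("FIFREEZE");
        close(fd);
        return -1;
    }

    /*
     * If this process dies here, nothing issues FITHAW. Every writer
     * to the filesystem then sleeps uninterruptibly until somebody
     * else thaws it, which is what trips the hung_task detector.
     */
    int rc = work ? work() : 0;

    if (ioctl(fd, FITHAW, 0) < 0) {
        perror("FITHAW");
        rc = -1;
    }

    close(fd);
    return rc;
}
```

A caller would pass the mount point and the work to perform while
frozen, for example freeze_work_thaw("/mnt/data", take_snapshot),
where both the path and take_snapshot are placeholders.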
If we were to emit a warning in such cases, it certainly should not
be through hung_task (panics and stack dumps from seemingly
arbitrary tasks are not what a system administrator needs). We
would need to add some kind of per-superblock timer for fsfreeze
(this could arguably be useful for thaw_bdev initiated freezes,
where a failure to thaw the filesystem reasonably fast can be
indicative of a kernel problem), which I think is overkill and
which I have no plans to implement.
Ingo, who is maintaining hung_task? If accepted, would this patch
go through your tree?
Thanks,
Fernando