Re: [PATCH] fsfreeze: tell hung_task about processes put to sleep

On 2012/10/16 06:02, Dave Chinner wrote:
> On Mon, Oct 15, 2012 at 03:51:34PM +0900, Fernando Luis Vazquez Cao wrote:
>> As I mentioned in my previous email, if you want to emit a
>> warning do it in the right place and make sure that it is
>> something informative. hung_check certainly isn't the
>> right place to do it.
>
> So, how do we now know when a freeze fails to complete, as opposed
> to a thaw that hasn't occurred? We won't get any reports from
> threads that are stuck waiting for the freeze to complete, and so
> we'll end up with a silent hang.

Are you referring to a situation where freeze_super() fails
to complete? If so hung_check will detect that the task
that called the freeze ioctl is stuck and will dump its
stack's contents, which is *precisely* the information we
want.

If freeze_super() completes we know that the filesystem
is in either the frozen state (SB_FREEZE_COMPLETE) or
the thawed state (SB_UNFROZEN) with no tasks waiting
(we take care of things properly even when
->freeze_fs(sb) fails; we print an error message and wake
up any tasks that may be waiting for the freeze to
complete). Once the ioctl returns we know
that there is nothing wrong with the kernel, so spewing
random stack dumps or causing a kernel panic is not
called for.


> Indeed, if you have a daemon that freezes the filesystem, and you
> haven't architected it with a watchdog to handle restarts due to
> failures, then you don't have a resilient system at all, regardless
> of these warnings. If it's a HA daemon/agent that doesn't get
> restarted and clean up its mess automatically, then IMO it is
> fundamentally broken and that's the problem that needs fixing.

Absolutely. By the way, to handle restarts properly
we need check ioctls or a sysfs/procfs equivalent for
fsfreeze, which my previous patch set implements.


> Removing kernel warnings doesn't change the fact that the
> application doing freeze/thaw is broken by design...

It is precisely because we want to handle things
in user space that we need to get hung_task
related panics and unneeded warnings out of
the way. As I mentioned above, if the freeze
failed to complete we already got the warnings
we need from the kernel. If it completed but
->freeze_fs() failed we get a warning and the
ioctl returns an error code so the application
will know that something went wrong. If
the ioctl returns without errors the application
can take care of things by itself (especially once
we get the check API, in whatever form, merged).
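
To make the user-space side concrete, here is a minimal sketch of an
application that checks the ioctl return codes itself. It uses only the
existing FIFREEZE and FITHAW ioctls from <linux/fs.h>; the helper name
with_frozen_fs() and its callback are hypothetical and are not part of
any patch in this thread. The proposed check API is deliberately not
used, since its final form is not settled.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>   /* FIFREEZE, FITHAW */

/*
 * Hypothetical helper: freeze the filesystem mounted at `mountpoint`,
 * run `work` (if any) while it is frozen, then thaw it again.
 * `work` is assumed to return 0 or a negative errno value.
 * Returns 0 on success or a negative errno value on failure.
 */
static int with_frozen_fs(const char *mountpoint,
                          int (*work)(void *), void *arg)
{
    int fd, ret = 0;

    fd = open(mountpoint, O_RDONLY);
    if (fd < 0)
        return -errno;

    /* A failed freeze is reported through the ioctl's error code. */
    if (ioctl(fd, FIFREEZE, 0) < 0) {
        ret = -errno;           /* save errno before close() can clobber it */
        close(fd);
        return ret;
    }

    if (work)
        ret = work(arg);

    /* Always attempt the thaw so the filesystem is not left frozen. */
    if (ioctl(fd, FITHAW, 0) < 0 && ret == 0)
        ret = -errno;

    close(fd);
    return ret;
}
```

A caller with CAP_SYS_ADMIN would invoke it as, for example,
with_frozen_fs("/mnt/data", take_snapshot, NULL), where both the mount
point and take_snapshot() are placeholders. A negative return value
tells the application that the freeze, the work or the thaw failed,
without relying on hung_task output.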

Thanks,
Fernando