Re: Please stop apps going into state D uninterrupted sleep !!

On 08May2012 16:02, Patrick O'Callaghan <pocallaghan@xxxxxxxxx> wrote:
| On Tue, 2012-05-08 at 13:16 -0700, Joe Zeff wrote:
| > On 05/08/2012 12:39 PM, Patrick O'Callaghan wrote:
| > > On Tue, 2012-05-08 at 19:42 +0100, Andrew Gray wrote:
| > >> >  Hi
| > >> >
| > >> >  Either give us a way to kill a hung cp or rsync when the VPN goes down
| > >> >  and they end up in state D uninterruptible sleep, or stop apps being able
| > >> >  to go into uninterruptible sleep !!
| > > It is *not possible* to kill a process in D state. D state can be
| > > defined as "the state which cannot be interrupted".
| > 
| > I think it's fairly clear that Mr. O'Callaghan knows that.
| 
| I think you mean Mr. Gray.
| 
| >   He's 
| > complaining about the consequences of there being an uninterruptible
| > sleep.  If I read him right, he's saying that it should always be 
| > possible for the user to force a hung app to die when it's clear to the 
| > user that something has happened that makes it impossible for the app to 
| > continue, such as rsync completing when the remote server's known to 
| > have crashed.

Frankly, I think a SIGKILL should have this semantic: cancel the program
_now_, and queue whatever is needed in the OS to clean up.

| > At this point, probably the best way to proceed is to 
| > request that whoever maintains the programs in question modify them so 
| > that they don't enter this state when accessing a remote file system or 
| > that there's some way to get the app's attention and force it to abort. 

No, very wrong. This state is outside the control of user space (i.e.
outside the program's control).

| As I tried to explain, rewriting a couple of apps is not going to hack
| it. The apps don't *know* they're using a networked filesystem, they're
| just accessing files. They could find out and try to take measures, but
| then what about all the other apps that also write files? Rewrite tar,
| cpio, dd, cat, ...?
| 
| The price of treating a networked fs as equivalent to a local one is
| that you get screwed when it doesn't behave like a local one. Dealing
| with this in a coherent and consistent way is hard. See the literature
| on distributed filesystems. The semantics of an NFS system are *not* the
| same as a local system. We brush this under the carpet most of the time
| because it usually works, but sometimes the differences bite.

It's not that hard for the kernel to rescue userspace here. Make SIGKILL abort
the OS call and terminate the process. Have the kernel mark the I/O as
cancelled in whatever form is necessary for the subsystem in use.

This would:

  - allow process cleanup, which allows higher level things like shell
    scripts to quit in a timely fashion when the things they call abort

  - allow the kernel flexibility to cancel filesystem mounts more freely,
    because no processes are lying around claiming use of the FS

  - _then_ you can give umount some kind of "force" mode to tell the
    kernel that we no longer care about any outstanding I/Os on this
    network filesystem; the existing umount "-f" can be more effective

"D" state is all very well for a stalled process, but there should be a
way to say "enough" and abort the process and all its entanglements.
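
For anyone who wants to see which processes are stuck this way, here is a
minimal sketch that scans /proc for processes in state D. It assumes a Linux
/proc where the state letter is the first field after the parenthesised
command name in /proc/<pid>/stat. The names parse_state and d_state_pids are
made up for illustration. It only reports the stuck processes; it cannot
unstick them.

```python
import os

def parse_state(stat_line):
    """Return the one-letter state from the text of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or ")", so split on the last ")" rather than on whitespace.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]

def d_state_pids(proc="/proc"):
    """Yield (pid, command) for each process currently in state D."""
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc, entry, "stat"), errors="replace") as f:
                line = f.read()
        except OSError:
            continue  # process exited while we were scanning
        if parse_state(line) == "D":
            comm = line[line.index("(") + 1:line.rindex(")")]
            yield int(entry), comm

if __name__ == "__main__":
    for pid, comm in d_state_pids():
        print(pid, comm)
```

Run it while the VPN is down and the hung cp or rsync should show up in the
list, assuming it really is in D state rather than an ordinary sleep.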

Cheers,
-- 
Cameron Simpson <cs@xxxxxxxxxx> DoD#743
http://www.cskk.ezoshosting.com/cs/

[...] every time you touch something, if your security systems rely
on biometric ID, then you're essentially leaving your pin number on a
post-it note.  - Ben Goldacre, http://www.badscience.net//?p=585