Re: [PATCH 00/33] [RFC] Non disruptive application core dump infrastructure

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 03/20/2014 09:39 AM, Janani Venkataraman wrote:
> Hi all,
> 
> The following series implements an infrastructure for capturing the core of an
> application without disrupting its process.
> 
> Kernel Space Approach:
> 
> 1) Posted an RFD to LKML explaining the various kernel-methods being analysed.
> 
> https://lkml.org/lkml/2013/9/3/122
> 
> 2) Went ahead to implement the same using the task_work_add approach and posted an
> RFC to LKML.
> 
> http://lwn.net/Articles/569534/
> 
> Based on the responses, the present approach implements the same in User-Space.
> 
> User Space Approach:
> 
> We didn't adopt the CRIU approach because our method would give us a head
> start, as all that the distro would need is the PTRACE_functionality and nothing
> more which is available from kernel versions 3.4 and above.
> 
> Basic Idea of User Space:
> 
> 1) The threads are held using PTRACE_SEIZE and PTRACE_INTERRUPT.
> 
> 2) The dump is then taken using the following:
>     1) The register sets namely general purpose, floating point and the arch
>     specific register sets are collected through PTRACE_GETREGSET calls by
>     passing the appropriate register type as parameter.
>     2) The virtual memory maps are collected from /proc/pid/maps.
>     3) The auxiliary vector is collected from /proc/pid/auxv.
>     4) Process state information for filling the notes such as PRSTATUS and
>     PRPSINFO are collected from /proc/pid/stat and /proc/pid/status.
>     5) The actual memory is read through process_vm_readv syscall as suggested
>     by Andi Kleen.
>     6) Command line arguments are collected from /proc/pid/cmdline
> 
> 3) The threads are then released using PTRACE_DETACH.
> 
> Self Dump:
> 
> A self dump is implemented with the following approach which was adapted
> from CRIU:
> 
> Gencore Daemon
> 
> The programs can request a dump using gencore() API, provided through
> libgencore. This is implemented through a daemon which listens on a UNIX File
> socket. The daemon is started immediately post installation.
> 
> We have provided service scripts for integration with systemd.
> 
> NOTE:
> 
> On systems with systemd, we could make use of socket option, which will avoid
> the need for running the gencore daemon always. The systemd can wait on the
> socket for requests and trigger the daemon as and when required. However, since
> the systemd socket APIs are not exported yet, we have disabled the supporting
> code for this feature.
> 
> libgencore:
> 
> 1) The client interface is a standard library call. All that the dump requester
> does is open the library and call the gencore() API and the dump will be
> generated in the path specified(relative/absolute).
> 
> To Do:
> 
> 1) Presently we wait indefinitely for the all the threads to seize. We can add
> a time-out to decide how much time we need to wait for the threads to be
> seized. This can be passed as command line argument in the case of a third
> party dump and in the case of the self-dump through the library call. We need
> to work on how much time to wait.
> 
> 2) Like mentioned before, the systemd socket APIs are not exported yet and
> hence this option is disabled now. Once these API's are available we can enable
> the socket option.
> 
> We would like to push this to one of the following packages:
> a) util-linux
> b) coreutils
> c) procps-ng
> 
> We are not sure which one would suit this application the best.
> Please let us know your views on the same.

Well from coreutils persepective, they're generally
non Linux specific _commands_, and so wouldn't be
a natural home for this (despite the _core_ in the name :)).

thanks,
Pádraig.
--
To unsubscribe from this list: send the line "unsubscribe util-linux" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Netdev]     [Ethernet Bridging]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]

  Powered by Linux