From: Avidan Borisov <avidanborisov@xxxxxxxxx> This patch series adds support for daemonizing trace-cmd record after tracing has started. The motivation is to be able to use trace-cmd more easily in automated setups. The main reason to do this in trace-cmd itself, is that it allows us to do the daemonization at the latest point possible. This has few advantages over daemonizing trace-cmd as a whole: 1. We can cleanly fail if something failed during initialization, and signal it up to the caller normally with a status code. This is not possible if we daemonize trace-cmd as a whole, since the the error could happen after the daemonization, but at that point trace-cmd is already a background process, and it's cumbersome to check if it failed or not. 2. The next thing that executes after trace-cmd exits will definitely be recorded in the trace, since we exit to the caller only after tracing has started. This is especially important when we trace guests as well, since the KVM/PTP time synchronization can take a while to complete, and we can't know for sure when the actual tracing starts. With --daemonize we know that once trace-cmd exits, the tracing has started. Here's an example how naive daemonization fails: $ sudo setsid trace-cmd record -p nop -e 'net:sched_process_exec' -A guest -p nop -e net & \ > ping -c 1 10.20.1.2 && > sudo kill -SIGINT $! && > sudo trace-cmd report -i trace.dat -i trace-guest.dat | head [1] 3087529 PING 10.20.1.2 (10.20.1.2) 56(84) bytes of data. 64 bytes from 10.20.1.2: icmp_seq=1 ttl=64 time=0.233 ms --- 10.20.1.2 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.233/0.233/0.233/0.000 ms [1]+ Interrupt sudo setsid trace-cmd record -p nop -e 'net:sched_process_exec' -A guest -p nop -e net trace-cmd: No such file or directory opening 'trace.dat' Here we have stopped trace-cmd too early, and no trace was generated. Now with --daemonize: $ sudo trace-cmd record --daemonize -p nop -e 'sched:sched_process_exec' -A guest -p nop -e net && > ping -c 1 10.20.1.2 && > sudo start-stop-daemon --stop --signal INT --retry 20 --pidfile /var/run/trace-cmd-record.pid && > sudo trace-cmd report -i trace.dat -i trace-guest.dat | head Negotiated kvm time sync protocol with guest guest Send SIGINT to pid 3071371 to stop recording PING 10.20.1.2 (10.20.1.2) 56(84) bytes of data. 64 bytes from 10.20.1.2: icmp_seq=1 ttl=64 time=0.134 ms --- 10.20.1.2 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.134/0.134/0.134/0.000 ms CPU0 data recorded at offset=0x14f000 229 bytes in size (4096 uncompressed) .... trace.dat: cpus=28 trace-guest.dat: cpus=1 trace.dat: ping-3071450 [013] 1196830.834258: sched_process_exec: filename=/bin/ping pid=3071450 old_pid=3071450 trace-guest.dat: <idle>-0 [000] 1196830.835990: napi_gro_receive_entry: dev=eth1 napi_id=0x2002 queue_mapping=1 skbaddr=0xffff95d051a5c400 vlan_tagged=0 vlan_proto=0x0000 vlan_tci=0x0000 protocol=0x0800 ip_summed=0 hash=0x00000000 l4_hash=0 len=84 data_len=0 truesize=768 mac_header_valid=1 mac_header=-14 nr_frags=0 gso_size=0 gso_type=0 trace-guest.dat: <idle>-0 [000] 1196830.835997: napi_gro_receive_exit: ret=3 trace-guest.dat: <idle>-0 [000] 1196830.835998: netif_receive_skb: dev=eth1 skbaddr=0xffff95d051a5c400x len=84 trace-guest.dat: <idle>-0 [000] 1196830.836021: net_dev_queue: dev=eth1 skbaddr=0xffff95d051a5c700x len=98 trace-guest.dat: <idle>-0 [000] 1196830.836024: net_dev_start_xmit: dev=eth1 queue_mapping=0 skbaddr=0xffff95d051a5c700 vlan_tagged=0 vlan_proto=0x0000 vlan_tci=0x0000 protocol=0x0800 ip_summed=0 len=98 data_len=0 network_offset=14 transport_offset_valid=1 transport_offset=34 tx_flags=0 gso_size=0 gso_segs=0 gso_type=0 trace-guest.dat: <idle>-0 [000] 1196830.836069: net_dev_xmit: dev=eth1 skbaddr=0xffff95d051a5c700 len=98 rc=0 trace.dat: sudo-3071451 [015] 1196830.838262: sched_process_exec: filename=/usr/bin/sudo pid=3071451 old_pid=3071451 As for the implementation itself - unfortunately since daemon(3) calls fork(), we can't call it after creating other processes/threads (if we call it after creating the recorder processes it'll mess up the process hierarchy, and if after the tsync thread creation, it will mess up memory sharing). However the point in time where we want to exit to the caller is after those processes/threads have started, since that's when tracing actually starts. Therefore, we implement the daemonization ourselves in the following way: 1. We fork() early, and make the parent waitpid on the child. 2. When the child finishes initializing everything, it detaches from the session and sends a signal to the parent. 3. The parent can now exit, causing the entire process hierarchy of the child to be inherited by init. 4. The daemonization is now complete, and the caller to trace-cmd now executes the next thing, which will be traced. Avidan Borisov (3): trace-cmd record: Add --daemonize trace-cmd: export pidfile functions from trace-listen.c trace-cmd record: Create a pidfile when using --daemonize .../trace-cmd/trace-cmd-record.1.txt | 4 + tracecmd/include/trace-local.h | 4 + tracecmd/trace-listen.c | 32 ++-- tracecmd/trace-record.c | 141 +++++++++++++++++- tracecmd/trace-usage.c | 3 + 5 files changed, 164 insertions(+), 20 deletions(-) -- 2.25.1