Re: Can pid be reused ?

I just realized what it is. When a vstart cluster is stopped, killall is used to kill all processes by name! That means vstart-based tests can't safely run in parallel: stopping one cluster also kills the daemons of the others.

David Zafman
Senior Developer
http://www.inktank.com




> On Oct 21, 2014, at 7:55 PM, Loic Dachary <loic@xxxxxxxxxxx> wrote:
> 
> Hi,
> 
> Something strange happens on Fedora 20 with Linux 3.11.10-301.fc20.x86_64. When running make -j8 check on https://github.com/ceph/ceph/pull/2750, a process gets killed from time to time. For instance, it shows up as
> 
> TEST_erasure_crush_stripe_width: 124: stripe_width=4096
> TEST_erasure_crush_stripe_width: 125: ./ceph osd pool create pool_erasure 12 12 erasure
> *** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
> ./test/mon/osd-pool-create.sh: line 120: 27557 Killed                  ./ceph osd pool create pool_erasure 12 12 erasure
> TEST_erasure_crush_stripe_width: 126: ./ceph --format json osd dump
> TEST_erasure_crush_stripe_width: 126: tee osd-pool-create/osd.json
> 
> in the test logs. Note the "27557 Killed". I originally thought it was because some ulimit had been exceeded, so I set them to very generous or unlimited hard and soft thresholds:
> 
> core file size          (blocks, -c) 0                                                                                     
> data seg size           (kbytes, -d) unlimited                                                                             
> scheduling priority             (-e) 0                                                                                     
> file size               (blocks, -f) unlimited                                                                             
> pending signals                 (-i) 515069                                                                                
> max locked memory       (kbytes, -l) unlimited                                                                             
> max memory size         (kbytes, -m) unlimited                                                                             
> open files                      (-n) 400000                                                                                
> pipe size            (512 bytes, -p) 8                                                                                     
> POSIX message queues     (bytes, -q) 819200                                                                                
> real-time priority              (-r) 0                                                                                     
> stack size              (kbytes, -s) unlimited                                                                             
> cpu time               (seconds, -t) unlimited                                                                             
> max user processes              (-u) unlimited                                                                             
> virtual memory          (kbytes, -v) unlimited                                                                             
> file locks                      (-x) unlimited    
> 
> Benoit Canet suggested that I install systemtap ( https://www.sourceware.org/systemtap/wiki/SystemtapOnFedora ) and run https://sourceware.org/systemtap/examples/process/sigkill.stp to watch what was sending the kill signal. It showed the following:
> 
> ...
> SIGKILL was sent to ceph-osd (pid:27557) by vstart_wrapper. uid:1001
> SIGKILL was sent to python (pid:27557) by vstart_wrapper. uid:1001
> ....
> 
> which suggests that pid 27557, previously used by ceph-osd, was reused for the python script that was killed above. Because the script that kills daemons is very aggressive and runs kill -9 on the pid to check whether it really is dead
> 
> https://github.com/ceph/ceph/blob/giant/src/test/mon/mon-test-helpers.sh#L64
> 
> it explains the problem.
> 
> However, as Dan Mick suggests, reusing a pid that quickly could break a number of things, and it is surprising behavior. Maybe something else is going on. A loop creating processes sees their pids increasing and not being reused.
> 
> Any idea about what is going on would be much appreciated :-)
> 
> Cheers
> 
> -- 
> Loïc Dachary, Artisan Logiciel Libre
> 
> 
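
Both failure modes raised in this thread (killall matching by name, and kill -9 sent to a pid that may by then belong to another process) come from signalling a process without confirming which process it is. The following is a minimal sketch, not the code used by vstart or mon-test-helpers.sh, of one way to guard against that on Linux: record the process start time from /proc/<pid>/stat when the daemon is launched, and only send a signal if the start time still matches. The function names read_start_time and kill_if_same are hypothetical and chosen for illustration.

```python
import os
import signal
import subprocess
import sys


def read_start_time(pid):
    """Return the start time of pid in clock ticks since boot, or None if
    the process does not exist. Linux-only: reads field 22 of /proc/<pid>/stat.
    """
    try:
        with open(f"/proc/{pid}/stat") as f:
            stat = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # Field 2 (the command name) is parenthesised and may contain spaces,
    # so split after the last ')' before splitting on whitespace.
    fields = stat.rsplit(")", 1)[1].split()
    return int(fields[19])  # field 22 overall (field 3 is fields[0])


def kill_if_same(pid, start_time, sig=signal.SIGTERM):
    """Send sig to pid only if it is still the process recorded earlier.

    Returns True if the signal was sent, False if the process is gone or
    the pid now belongs to a different process.
    """
    current = read_start_time(pid)
    if current is None or current != start_time:
        return False
    try:
        os.kill(pid, sig)
    except ProcessLookupError:
        return False
    return True


if __name__ == "__main__":
    # Stand-in for a daemon started by a test: a child that just sleeps.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    started = read_start_time(child.pid)

    # A mismatched start time simulates a reused pid: no signal is sent.
    print(kill_if_same(child.pid, started + 1))
    # The recorded identity matches, so the signal is delivered.
    print(kill_if_same(child.pid, started, signal.SIGKILL))
    child.wait()
```

A small window remains between the check and the call to os.kill, so this narrows the race rather than removing it. It also addresses the killall problem only if each test keeps its own list of (pid, start time) pairs instead of matching on process names.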
