On Thu, Mar 1, 2018 at 10:59 AM, Sage Weil <sweil@xxxxxxxxxx> wrote: > > Hi Haomai, > > I'm having trouble fixing an issue with AsyncMessenger vs fork with the > new wip-config branch. I'm wondering if you can help? > > The thing that has changed is that global_init now sets up a messenger > instance to talk to the monitor, fetch the config and monmap, and then > tears it down again. A side-effect of this is that the singletons for the > EventCenter and Stack in AsyncMessenger are present at the time of the > fork(). And it's not quite behaving as it should. The symptom I'm seeing > is that rbd-nbd is hanging strangely (and somewhat non-deterministically) > when trying to map an image. > > wip-sage3-testing-2018-02-28-1744 is the branch I'm working against > currently. It includes wip-config, and fix rbd-nbd to adjust the timing > of the prefork calls, and a patch on top that allows the cct->_log::stop() > to be called if it hasn't been started. > > The way things work in this branch, global_init does the temp connection > and tears everything down, except that the singleton Stack and EventCenter > are still there. At fork time, the fork hooks get called for > AsyncMessenger so it does the stop() before and start() after to recreate > the threads. That seems to work.. except I think something is not > quite right because it doesn't reliably talk to the cluster afterwards. I > see exchanges like the adjust with global_init_postfork_start isn't right, the daemonize will terminate all existing threads, we must respawn all threads after fork. so the later thing happen as expected > > gnit:build (wip-nbd-hell) 05:03 PM $ sudo CEPH_ARGS="--debug-ms 1 --debug-rbd 10 --log-to-stderr" bin/rbd-nbd map barr > [sudo] password for sage: > 2018-02-28 17:03:59.293 7f06a6b7c880 -1 WARNING: all dangerous and experimental features are enabled. > 2018-02-28 17:03:59.296 7f06a6b7c880 1 Processor -- start > 2018-02-28 17:03:59.296 7f06a6b7c880 1 -- - start start > 2018-02-28 17:03:59.297 7f06a6b7c880 1 -- - --> 127.0.0.1:40023/0 -- auth(proto 0 34 bytes epoch 0) v1 -- 0x56089092b550 con 0 > 2018-02-28 17:03:59.297 7f0693fff700 1 -- 127.0.0.1:0/1758722310 learned_addr learned my addr 127.0.0.1:0/1758722310 > 2018-02-28 17:03:59.298 7f0692ffd700 1 -- 127.0.0.1:0/1758722310 <== mon.0 127.0.0.1:40023/0 1 ==== mon_map magic: 0 v1 ==== 175+0+0 (698871189 0 0) 0x7f0684002630 con 0x56089093e730 > 2018-02-28 17:03:59.298 7f0692ffd700 1 -- 127.0.0.1:0/1758722310 <== mon.0 127.0.0.1:40023/0 2 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 33+0+0 (3411293466 0 0) 0x7f06840029b0 con 0x56089093e730 > 2018-02-28 17:03:59.298 7f0692ffd700 1 -- 127.0.0.1:0/1758722310 --> 127.0.0.1:40023/0 -- auth(proto 2 32 bytes epoch 0) v1 -- 0x7f06800017c0 con 0 > 2018-02-28 17:03:59.299 7f0692ffd700 1 -- 127.0.0.1:0/1758722310 <== mon.0 127.0.0.1:40023/0 3 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 206+0+0 (711149518 0 0) 0x7f0684002fb0 con 0x56089093e730 > 2018-02-28 17:03:59.299 7f0692ffd700 1 -- 127.0.0.1:0/1758722310 --> 127.0.0.1:40023/0 -- mon_subscribe({config=0+,monmap=0+}) v3 -- 0x56089079d2f0 con 0 > 2018-02-28 17:03:59.300 7f0692ffd700 1 -- 127.0.0.1:0/1758722310 <== mon.0 127.0.0.1:40023/0 4 ==== config(3 keys) v1 ==== 100+0+0 (3814196429 0 0) 0x5608908cd1e0 con 0x56089093e730 > 2018-02-28 17:03:59.300 7f0692ffd700 1 -- 127.0.0.1:0/1758722310 <== mon.0 127.0.0.1:40023/0 5 ==== mon_map magic: 0 v1 ==== 175+0+0 (698871189 0 0) 0x7f0684001390 con 0x56089093e730 > 2018-02-28 17:03:59.300 7f06a6b7c880 1 -- 127.0.0.1:0/1758722310 >> 127.0.0.1:40023/0 conn(0x56089093e730 :-1 s=STATE_OPEN pgs=62 cs=1 l=1).mark_down > 2018-02-28 17:03:59.300 7f06a6b7c880 1 -- 127.0.0.1:0/1758722310 shutdown_connections > 2018-02-28 17:03:59.301 7f06a6b7c880 1 -- 127.0.0.1:0/1758722310 shutdown_connections > 2018-02-28 17:03:59.301 7f06a6b7c880 1 -- 127.0.0.1:0/1758722310 wait complete. > 2018-02-28 17:03:59.301 7f06a6b7c880 1 -- 127.0.0.1:0/1758722310 >> 127.0.0.1:0/1758722310 conn(0x560890927300 :-1 s=STATE_NONE pgs=0 cs=0 l=0).mark_down > > (^ the initial pre-fork exchange to get the config) > > 2018-02-28 17:03:59.302 7f06a6b7c880 -1 WARNING: all dangerous and experimental features are enabled. > 2018-02-28 17:03:59.302 7f06a6b7c880 0 ceph version 13.0.1-2356-g3f9027014b (3f9027014ba1fa805ab596155d6f4b599ab00806) mimic (dev), process (unknown), pid 15464 > 2018-02-28 17:03:59.303 7f06a6b7c880 0 pidfile_write: ignore empty --pid-file > 2018-02-28 17:03:59.303 7f06a6b7c880 0 pidfile_write: ignore empty --pid-file > 2018-02-28 17:03:59.307 7f06a6b7c880 -1 WARNING: all dangerous and experimental features are enabled. > 2018-02-28 17:03:59.309 7f06a6b7c880 1 Processor -- start > 2018-02-28 17:03:59.309 7f06a6b7c880 1 -- - start start > 2018-02-28 17:03:59.310 7f06a6b7c880 1 -- - --> 127.0.0.1:40023/0 -- auth(proto 0 34 bytes epoch 0) v1 -- 0x560890868100 con 0 > 2018-02-28 17:03:59.310 7f0693fff700 1 -- 127.0.0.1:0/4125566264 learned_addr learned my addr 127.0.0.1:0/4125566264 > 2018-02-28 17:03:59.311 7f0691ffb700 1 -- 127.0.0.1:0/4125566264 <== mon.0 127.0.0.1:40023/0 1 ==== mon_map magic: 0 v1 ==== 175+0+0 (698871189 0 0) 0x7f0678002490 con 0x560890868670 > 2018-02-28 17:03:59.311 7f0691ffb700 1 -- 127.0.0.1:0/4125566264 <== mon.0 127.0.0.1:40023/0 2 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 33+0+0 (2983701339 0 0) 0x7f0678002810 con 0x560890868670 > 2018-02-28 17:03:59.311 7f0691ffb700 1 -- 127.0.0.1:0/4125566264 --> 127.0.0.1:40023/0 -- auth(proto 2 32 bytes epoch 0) v1 -- 0x7f068c001980 con 0 > 2018-02-28 17:03:59.312 7f0691ffb700 1 -- 127.0.0.1:0/4125566264 <== mon.0 127.0.0.1:40023/0 3 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 206+0+0 (3905746370 0 0) 0x7f0678002d40 con 0x560890868670 > 2018-02-28 17:03:59.312 7f0691ffb700 1 -- 127.0.0.1:0/4125566264 --> 127.0.0.1:40023/0 -- mon_subscribe({config=0+,monmap=0+}) v3 -- 0x5608907f6750 con 0 > 2018-02-28 17:04:02.313 7f06a6b7c880 -1 monclient: get_monmap_and_config failed to get config > 2018-02-28 17:04:02.313 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 >> 127.0.0.1:40023/0 conn(0x560890868670 :-1 s=STATE_OPEN pgs=63 cs=1 l=1).mark_down > 2018-02-28 17:04:02.314 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 --> 127.0.0.1:40023/0 -- auth(proto 0 34 bytes epoch 1) v1 -- 0x56089086c1a0 con 0 > 2018-02-28 17:09:02.314 7f06a6b7c880 0 monclient(hunting): authenticate timed out after 300 > 2018-02-28 17:09:02.314 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 >> 127.0.0.1:40023/0 conn(0x56089086c400 :-1 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0 l=0).mark_down > 2018-02-28 17:09:02.315 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 --> 127.0.0.1:40023/0 -- auth(proto 0 34 bytes epoch 1) v1 -- 0x56089086c1a0 con 0 > 2018-02-28 17:09:02.315 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 >> 127.0.0.1:40023/0 conn(0x56089086c400 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).mark_down > 2018-02-28 17:09:02.316 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 --> 127.0.0.1:40023/0 -- auth(proto 0 34 bytes epoch 1) v1 -- 0x56089086c1a0 con 0 > 2018-02-28 17:09:02.316 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 >> 127.0.0.1:40023/0 conn(0x56089086c400 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).mark_down > 2018-02-28 17:09:02.317 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 --> 127.0.0.1:40023/0 -- auth(proto 0 34 bytes epoch 1) v1 -- 0x56089086c1a0 con 0 > 2018-02-28 17:09:02.317 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 >> 127.0.0.1:40023/0 conn(0x56089086c400 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).mark_down > 2018-02-28 17:09:02.318 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 --> 127.0.0.1:40023/0 -- auth(proto 0 34 bytes epoch 1) v1 -- 0x56089086c1a0 con 0 > 2018-02-28 17:09:02.318 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 >> 127.0.0.1:40023/0 conn(0x56089086c400 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).mark_down > 2018-02-28 17:09:02.319 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 --> 127.0.0.1:40023/0 -- auth(proto 0 34 bytes epoch 1) v1 -- 0x56089086c1a0 con 0 > 2018-02-28 17:09:02.319 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 >> 127.0.0.1:40023/0 conn(0x56089086c400 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).mark_down > 2018-02-28 17:09:02.320 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 --> 127.0.0.1:40023/0 -- auth(proto 0 34 bytes epoch 1) v1 -- 0x56089086c1a0 con 0 > 2018-02-28 17:09:02.320 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 >> 127.0.0.1:40023/0 conn(0x56089086c400 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).mark_down > 2018-02-28 17:09:02.321 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 --> 127.0.0.1:40023/0 -- auth(proto 0 34 bytes epoch 1) v1 -- 0x56089086c1a0 con 0 > 2018-02-28 17:09:02.321 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 >> 127.0.0.1:40023/0 conn(0x56089086c400 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).mark_down > 2018-02-28 17:09:02.322 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 --> 127.0.0.1:40023/0 -- auth(proto 0 34 bytes epoch 1) v1 -- 0x56089086c1a0 con 0 > 2018-02-28 17:09:02.322 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 >> 127.0.0.1:40023/0 conn(0x56089086c400 :-1 s=STATE_CONNECTING pgs=0 cs=0 l=0).mark_down > 2018-02-28 17:09:02.322 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 shutdown_connections > 2018-02-28 17:09:29.499 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 shutdown_connections > 2018-02-28 17:09:29.499 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 wait complete. > 2018-02-28 17:09:29.499 7f06a6b7c880 1 -- 127.0.0.1:0/4125566264 >> 127.0.0.1:0/4125566264 conn(0x560890863d80 :-1 s=STATE_NONE pgs=0 cs=0 l=0).mark_down > rbd-nbd: failed to map, status: (110) Connection timed out > > Notice we get a few messages back but the all of them; the mon > log indicates it never got the subscribe message. Other times I run it I > get *no* successful interaction with the mon. > > My guess would be some stop/start issue with the singletons. What > confuses me is that this behavior similarish to what the OSD is doing > (creating a messenger before the fork, forking, and then using it) but it > doesn't seem to have any problems. > > Any ideas? It's easy to reproduce with that branch and > > bin/init-ceph stop ; MON=1 OSD=1 MDS=0 ../src/vstart.sh -d -n -x -l > bin/ceph osd pool create rbd 1 > sudo CEPH_ARGS="--debug-ms 1 --log-to-stderr" bin/rbd-nbd map doesnotexist > > sage > -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html