On 13-1-2017 12:45, Willem Jan Withagen wrote:
> On 13-1-2017 09:07, Christian Balzer wrote:
>>
>> Hello,
>>
>> Something I came across a while ago, but the recent discussion here
>> jolted my memory.
>>
>> If you have a cluster configured with just a "public network" and that
>> network being in RFC 1918 space like 10.0.0.0/8, you'd think you'd be
>> "safe", wouldn't you?
>>
>> Alas, you're not:
>> ---
>> root@ceph-01:~# netstat -atn |grep LIST |grep 68
>> tcp        0      0 0.0.0.0:6813      0.0.0.0:*      LISTEN
>> tcp        0      0 0.0.0.0:6814      0.0.0.0:*      LISTEN
>> tcp        0      0 10.0.0.11:6815    0.0.0.0:*      LISTEN
>> tcp        0      0 10.0.0.11:6800    0.0.0.0:*      LISTEN
>> tcp        0      0 0.0.0.0:6801      0.0.0.0:*      LISTEN
>> tcp        0      0 0.0.0.0:6802      0.0.0.0:*      LISTEN
>> etc.
>> ---
>>
>> Something that people most certainly would NOT expect to be the default
>> behavior.
>>
>> Solution: define a completely redundant "cluster network" that's
>> identical to the public one and voila:
>> ---
>> root@ceph-02:~# netstat -atn |grep LIST |grep 68
>> tcp        0      0 10.0.0.12:6816    0.0.0.0:*      LISTEN
>> tcp        0      0 10.0.0.12:6817    0.0.0.0:*      LISTEN
>> tcp        0      0 10.0.0.12:6818    0.0.0.0:*      LISTEN
>> etc.
>> ---
>>
>> I'd call that a security bug, simply because any other daemon on the
>> planet will bloody bind to the IP it's been told to in its respective
>> configuration.
>
> I do agree that this would not be the expected result if one specifies
> specific addresses. But it could be that this is how it was designed.
>
> I have been hacking a bit in the networking code, and my more verbose
> code (HEAD) tells me:
> 1: starting osd.0 at - osd_data td/ceph-helpers/0 td/ceph-helpers/0/journal
> 1: 2017-01-13 12:24:02.045275 b7dc000 -1 Processor -- bind:119 trying to bind to 0.0.0.0:6800/0
> 1: 2017-01-13 12:24:02.045429 b7dc000 -1 Processor -- bind:119 trying to bind to 0.0.0.0:6800/0
> 1: 2017-01-13 12:24:02.045603 b7dc000 -1 Processor -- bind:119 trying to bind to 0.0.0.0:6801/0
> 1: 2017-01-13 12:24:02.045669 b7dc000 -1 Processor -- bind:119 trying to bind to 0.0.0.0:6800/0
> 1: 2017-01-13 12:24:02.045715 b7dc000 -1 Processor -- bind:119 trying to bind to 0.0.0.0:6801/0
> 1: 2017-01-13 12:24:02.045758 b7dc000 -1 Processor -- bind:119 trying to bind to 0.0.0.0:6802/0
> 1: 2017-01-13 12:24:02.045810 b7dc000 -1 Processor -- bind:119 trying to bind to 0.0.0.0:6800/0
> 1: 2017-01-13 12:24:02.045857 b7dc000 -1 Processor -- bind:119 trying to bind to 0.0.0.0:6801/0
> 1: 2017-01-13 12:24:02.045903 b7dc000 -1 Processor -- bind:119 trying to bind to 0.0.0.0:6802/0
> 1: 2017-01-13 12:24:02.045997 b7dc000 -1 Processor -- bind:119 trying to bind to 0.0.0.0:6803/0
>
> So the binding does in fact happen on 0.0.0.0.
>
> These messengers are bound, in this sequence:
> Messenger *ms_public = Messenger::create(g_ceph_context,
> Messenger *ms_cluster = Messenger::create(g_ceph_context,
> Messenger *ms_hbclient = Messenger::create(g_ceph_context,
> Messenger *ms_hb_back_server = Messenger::create(g_ceph_context,
> Messenger *ms_hb_front_server = Messenger::create(g_ceph_context,
> Messenger *ms_objecter = Messenger::create(g_ceph_context,
>
> But no specific address is passed along.
>
> I have asked on the dev-list if this is the desired behaviour.
> And if not, I'll see if I can come up with a fix.

A fix for this has been merged into HEAD:

https://github.com/ceph/ceph/pull/12929

With that fix, if you define public_network but do not define
cluster_network, the public network is used for the cluster network as
well.

Not sure if this will get back-ported to earlier releases.
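In the meantime, Christian's workaround of declaring a redundant cluster
network can be applied in ceph.conf. A minimal sketch, reusing the
10.0.0.0/8 range from the examples above (adjust to your own subnet):

---
[global]
    # Pre-fix releases bind to the 0.0.0.0 wildcard when only the public
    # network is configured.
    public network  = 10.0.0.0/8
    # Workaround: declare an identical cluster network so the daemons bind
    # to their 10.0.0.x address instead of the wildcard.
    cluster network = 10.0.0.0/8
---

After restarting the daemons, netstat should show them listening on the
10.0.0.x addresses only, as in the ceph-02 output above.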
--WjW