On Mon, Jun 26, 2017 at 1:55 PM, John Spray <jspray@xxxxxxxxxx> wrote:
> On Mon, Jun 26, 2017 at 6:21 PM, Alfredo Deza <adeza@xxxxxxxxxx> wrote:
>> On Mon, Jun 26, 2017 at 12:56 PM, John Spray <jspray@xxxxxxxxxx> wrote:
>>> On Mon, Jun 26, 2017 at 5:34 PM, Alfredo Deza <adeza@xxxxxxxxxx> wrote:
>>>> On Mon, Jun 26, 2017 at 11:52 AM, John Spray <jspray@xxxxxxxxxx> wrote:
>>>>> Hi guys,
>>>>>
>>>>> I was pondering this and wondered if you had any existing plans...
>>>>>
>>>>> For doing network testing between two remote nodes, we'll need to be
>>>>> able to spin up some sort of listener on one end, presumably via SSH
>>>>> from a third-party node.
>>>>
>>>> What kind of testing warrants this type of setup?
>>>
>>> I'm talking about opening a TCP connection between two remote nodes to
>>> verify that the network connectivity is working, and probably doing
>>> this across a large set of pairs, e.g. doing an all-to-all ping-pong
>>> between OSD nodes. Obviously, there is just the standard `ping`, but
>>> I'm expecting that we'll want to test using actual TCP traffic in the
>>> port ranges that the OSDs would use.
>>
>> If we are checking connectivity between OSD nodes, wouldn't it be
>> sufficient to test whether the port where the OSD is listening can be
>> reached? Again, we can use plain Python to send actual TCP traffic here.
>
> I don't think we should assume a running OSD process for this --
> partly to enable someone to fully test their pre-Ceph configuration
> before they install Ceph, but also because we would like to
> distinguish between network issues and "OSD isn't
> listening/responding" issues.

Aha, important distinction: "pre-Ceph" vs. "post-Ceph". ceph-medic is
currently a "post-Ceph" tool. That doesn't mean it can't (or shouldn't)
have some kind of pre-flight checks. Those types of checks have to be
lenient and approached fairly differently, and I believe it was discussed
as something that will get implemented.
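For reference, the plain-Python reachability test discussed above can be sketched in a few lines. This is only an illustration (the helper name `check_tcp` is mine, not part of ceph-medic), but it shows that checking whether a given OSD port on a remote node accepts TCP connections needs nothing beyond the standard library:

```python
import socket

def check_tcp(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Uses a real TCP handshake (not ICMP ping), so it exercises the same
    path that OSD traffic would use on that port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

An all-to-all sweep would then just loop `check_tcp(node, port)` over every (node, port) pair in the OSD port range, which is cheap compared to managing listener processes.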
> John
>
>> I think that what you are proposing was going to be checked as part of
>> the 'network' collection. Managing processes that listen to each
>> other and report whether they do/don't sounds like it can be avoided.
>>
>> I have created a ticket to make sure that we collect the inter-node
>> connectivity:
>>
>> https://github.com/ceph/ceph-medic/issues/18
>>
>>> John
>>>
>>>>> I guess the choice here is whether to depend on having ceph-medic
>>>>> already installed on all the nodes (and invoke it with a special
>>>>> --receiver type argument) or whether the tool should inject its code
>>>>> over SSH (e.g. run a big fat Python command line with a script in it
>>>>> over SSH).
>>>>
>>>> That is kind of how this works already, borrowing from ceph-deploy: it
>>>> uses SSH to connect to remote nodes and execute either system commands
>>>> or Python code.
>>>>
>>>> In what scenario would a system call or Python code not gather as much
>>>> information as a server/client setup would?
>>>>
>>>>> I lean towards the latter in the interest of keeping deployment
>>>>> simple, but I'm not sure what the story is with e.g. SELinux in
>>>>> situations like this -- whether a server is going to get unhappy about
>>>>> an SSH session that tries to open ports.
>>>>
>>>> Having two processes running to check connectivity sounds a bit
>>>> complicated to handle. One of the things the tool does is to
>>>> cross-check against other nodes in the system, so this would
>>>> potentially mean running a quadratic number of processes: one for every
>>>> node against each other node in the cluster.
>>>>
>>>> It will be cheaper to perform those checks with either plain Python or
>>>> a system call.
>>>>
>>>> Or maybe you mean some other type of check? What are your ideas on
>>>> "network testing"?
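To make the "inject code over SSH" option concrete: the receiver could be a script small enough to ship as `python -c "<script>"` over an SSH session, so no pre-installed agent is needed on the remote node. The sketch below is an assumption about how such a receiver might look (the names `start_echo_listener` and `serve_once` are hypothetical, not existing ceph-medic or ceph-deploy APIs); it binds a port, accepts a single connection, and echoes one message so the client side can verify the round trip:

```python
import socket
import threading  # only needed when driving the listener in-process, as in a test

def start_echo_listener(port=0, timeout=30.0):
    """Bind a TCP socket and return (sock, bound_port).

    port=0 asks the OS for any free port; when run remotely, the script
    would print the bound port on stdout so the SSH caller can read it.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", port))
    srv.listen(1)
    srv.settimeout(timeout)  # give up if no peer ever connects
    return srv, srv.getsockname()[1]

def serve_once(srv):
    """Accept one connection, echo the first message back, then exit."""
    conn, _addr = srv.accept()
    try:
        conn.sendall(conn.recv(1024))
    finally:
        conn.close()
        srv.close()
```

This also illustrates Alfredo's scaling concern: each pairwise check needs a listener on one node and a client on the other, which is exactly the per-pair process management that a simple connect-only check avoids.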
>>>>>
>>>>> John
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html