On Mon, 25 Feb 2013 22:43:09 -0500
Sam Varshavchik <mrsam@xxxxxxxxxxxxxxx> wrote:

> Marko Vojinovic writes:
> > On Mon, 25 Feb 2013 21:33:04 -0500
> > Sam Varshavchik <mrsam@xxxxxxxxxxxxxxx> wrote:
> > > Marko Vojinovic writes:
> >
> > Also, have you ever built a cluster? Typically, you install and
> > configure everything on one system, and then push the hard drive
> > image to all other (headless) nodes. Once the whole thing boots,
> > you find out that all config files for all NICs are wrong (since
> > MAC addresses will be different), and you have no network access
> > to any of the nodes... On a 100-node cluster this can be a very
> > big pain.
>
> In that kind of a situation you're not going to have those nodes
> multihomed. They'll typically have one LAN port, and that's about
> it, so you don't care about MAC addresses.

The performance of a cluster built from single-NIC nodes hanging off a
switch is limited by the switch itself, since every message between
nodes has to cross it, which is a Bad Idea. Rather, nodes in a cluster
usually have at least four NICs, though this varies greatly with the
intended purpose of the cluster. Typically, the first NIC is connected
to the switch, the second and third are crossover links directly to
the left and right nearest-neighbor nodes, and the fourth is kept as a
backup. Depending on the purpose of the cluster, the connection
topology can be far more complicated. Now imagine the kernel randomly
assigning eth* names to those four ports at boot, differently on every
single node. That's a very big mess to fix, and a big headache to
maintain.

> This is more important when you're building a router with a LAN and
> a WAN port. I can see a situation like this when you're provisioning
> a bunch of nodes in a web farm.
>
> But I could probably think up of a scripted approach to hack up an
> image that gets pushed to all the nodes, then have them on their
> first boot figure out their NICs and their MAC addresses, and fix up
> the ifcfg* rules and udev rules to bind them the way I have it set
> up on my piddly router.

Yes, that's exactly what was being done prior to biosdevname: a script
would be launched on every node of the cluster to "align its virtual
position", i.e. to figure out how the node is connected to the rest of
the system. But this is more complicated than you might think. The
hardware technicians would wire up each node so that the leftmost NIC
port goes to the left neighbor, the rightmost goes to the right one,
and the "middle" ports go to the primary and backup switches. Once
your script reads off the eth*-to-MAC mapping, how does it figure out
which MAC is connected to the left neighbor and which to the right
one? That requires sending test signals back and forth among the
nodes, and your script gets very ugly, very fast, especially in
circular topologies, where the leftmost node sits to the right of the
rightmost one.

Besides, these solutions just don't scale well. Even if the script
gets it right initially, what happens when a node dies and gets
replaced by a spare? Run the script again over the whole cluster? No
way: the other nodes may be live and doing real work. Or when new
nodes are attached to the cluster? Or when your problem changes and
you need to rewire the virtual geometry? I agree that solutions to
these problems exist and can be implemented, but compare all of that
to the biosdevname situation, where you only need to plug the nodes in
correctly --- something the technicians have typically already done
--- and be done with it.
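
To make the comparison a bit more concrete, the first-boot half of the
scripted approach would look roughly like this. This is only a sketch,
untested, and it assumes a Fedora-style image of that era which ships
ifcfg-eth0 ... ifcfg-eth3 templates containing HWADDR= placeholder
lines:

#!/bin/bash
# Sketch: pin whatever NICs this node has to stable eth* names by MAC,
# then stamp each MAC into the matching ifcfg template in the image.

RULES=/etc/udev/rules.d/70-persistent-net.rules
: > "$RULES"

i=0
for dev in $(ls /sys/class/net | grep -v '^lo$' | sort); do
    mac=$(cat /sys/class/net/$dev/address)

    # Bind this MAC to a fixed eth* name for all future boots.
    printf 'SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="%s", NAME="eth%d"\n' \
        "$mac" "$i" >> "$RULES"

    # Fill the MAC into the matching ifcfg file so the initscripts
    # bring the right port up with the right config.
    cfg=/etc/sysconfig/network-scripts/ifcfg-eth$i
    [ -f "$cfg" ] && sed -i "s/^HWADDR=.*/HWADDR=$mac/" "$cfg"

    i=$((i + 1))
done

Note that this only stops the names from jumping around between boots;
it still has no idea which physical port faces which neighbor, which is
where the real trouble starts.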
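
The "figure out who is on the other end" half is where it gets ugly.
Something along these lines, where the node numbering and the
169.254.<node>.<port> addressing scheme are purely my own invention
for illustration:

#!/bin/bash
# Sketch: probe the neighbors to learn which port points where.
NODE=$1              # this node's number, e.g. 42
LEFT=$((NODE - 1))   # left/right neighbors; ring wrap-around not handled
RIGHT=$((NODE + 1))

# Give every port a throwaway link-local address so neighbors can answer ARP.
p=1
for dev in $(ls /sys/class/net | grep -v '^lo$' | sort); do
    ip link set "$dev" up
    ip addr add "169.254.$NODE.$p/16" dev "$dev"
    p=$((p + 1))
done

sleep 5   # give every other node in the cluster time to do the same

# Probe each neighbor out of each port and see where an answer comes back.
for dev in $(ls /sys/class/net | grep -v '^lo$' | sort); do
    arping -c 2 -w 2 -I "$dev" "169.254.$LEFT.1"  > /dev/null 2>&1 \
        && echo "$dev reaches node $LEFT"
    arping -c 2 -w 2 -I "$dev" "169.254.$RIGHT.1" > /dev/null 2>&1 \
        && echo "$dev reaches node $RIGHT"
done

Even this toy version can't tell the crossover apart from the
switch-facing port (both reach the neighbor), doesn't handle the
wrap-around of a ring, and only works if every node runs it at the
same time. Hence: very ugly, very fast.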
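
Compare that with the biosdevname world, where the names follow the
physical slot and port instead of the MAC, so one set of ifcfg files
baked into the image works unchanged on every node. Again just a
sketch, and the addressing is made up:

# p1p1 is "port 1 of the card in PCI slot 1" under biosdevname, so it
# names the same physical port on every node; only the per-node address
# still needs templating.
NODE_ID=42   # hypothetical per-node number, filled in however you like

cat > /etc/sysconfig/network-scripts/ifcfg-p1p1 <<EOF
DEVICE=p1p1
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.0.$NODE_ID.2
NETMASK=255.255.255.0
NM_CONTROLLED=no
EOF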
Best, :-)
Marko