Hello Harald,

Thanks for taking an interest. Answers are in line.

On 01/31/2012 02:12 AM, Harald Stürzebecher wrote:

> Are there some network admins or someone with networking experience
> you could talk to?

Only in the university's overstretched central IT Services department. I
daren't distract them from doing various other things I'm waiting for them
to do for me.

> Do you use the management functions that your current switches support?

Not at the moment. I bought managed switches in case I had to set up a
VLAN, but that turned out not to be necessary.

> How many servers and clients do you have now?

The 8 compute nodes in my Rocks cluster are the important ones.

> Do you plan to increase the numbers in the near future?

I might be buying another 4 compute nodes some time this year.

> In that case I'd suggest to get stackable switches for easier expansion.

I agree that stackable switches are a good idea even if there is no
immediate need for extra switches.

> How are the servers and clients distributed to the switches now?

The clients are all connected to one switch, but some of the servers are
connected to switches in adjacent racks.

> How are the switches connected to each other?

Single cables. There aren't enough spare ports to connect the other rack
switches via a LAG, so I am planning to buy a 48-port switch and connect
all the clients and servers to that.

> Can you tell where your bottleneck is?
> Is it the connection between your switches or is it something else?

I'm pretty sure it's the connection between the switches. I tested the
application that has been causing the most concern on a conventional NFS
server, which was connected to the main Rocks compute cluster switch. I
then connected the NFS server to a different switch and the application
ran 5-6 times more slowly, even more slowly than when its data is on a
GlusterFS volume, in fact (when it only runs 3 times more slowly than
conventional NFS...). I don't think all the applications that run on the
cluster are affected so severely, but this is the one my boss has heard
about...

> Could you plug all the servers and some of the clients into one switch
> and the rest of the clients into the other switch(es) for a short
> period of time? There should be a big difference in speed between the
> two groups of clients. How does that compare to the speed you have
> now?

Good idea; I'll give that a try.

> What happens if one switch fails? Can the remaining equipment do some
> useful work until a replacement arrives?
> Depending on the answers it might be better to have two or more
> smaller stackable switches instead of one big switch, even if the big
> one might be cheaper.

I hadn't thought about redundancy, I must admit, but buying two stackable
24-port switches instead of one 48-port switch is an interesting idea. I
would have to connect one set of servers to one switch and the GlusterFS
replica servers to the other. The clients would be distributed across both
switches, and half of them would be able to connect to half the servers in
the event of a switch failure.

One thing that worries me about this scenario is the time it would take to
self-heal all the volumes after running with the replica servers missing
for a day or two. In theory GlusterFS should be able to cope with this, but
there is a possibility of a server failing during the mammoth self-heal,
and if it were one of the up-to-date servers that was connected to the live
switch when the other switch failed, then the users would find themselves
looking at old data. The only way to avoid this would be to have two
stackable 48-port switches. I think I'll have to put this on my "nice to
have" list.
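(As an aside, below is a rough sketch of the sort of thing I might run to
keep an eye on the self-heal backlog after a failed switch comes back. It
assumes a GlusterFS release whose CLI provides "gluster volume heal
<volume> info" (the exact command and output format vary between versions,
and ours may not have it), and "gv0" is just a placeholder volume name, not
one of my real volumes.)

    #!/usr/bin/env python
    # watch_heal.py - poll the self-heal backlog of a GlusterFS volume and
    # print how many entries are still waiting to be healed.
    import subprocess
    import time

    VOLUME = "gv0"    # placeholder volume name
    INTERVAL = 60     # seconds between polls

    def pending_entries(volume):
        """Count the 'heal info' output lines that look like pending entries."""
        out = subprocess.check_output(
            ["gluster", "volume", "heal", volume, "info"]
        ).decode("utf-8", "replace")
        count = 0
        for line in out.splitlines():
            line = line.strip()
            # Skip blank lines and per-brick header/status lines such as
            # "Brick server1:/export/brick1", "Number of entries: 3" or
            # "Status: Connected"; everything else is treated as a pending file.
            if (not line or line.startswith("Brick")
                    or line.startswith("Number") or line.startswith("Status")):
                continue
            count += 1
        return count

    if __name__ == "__main__":
        while True:
            print("%s  pending heal entries: %d"
                  % (time.strftime("%H:%M:%S"), pending_entries(VOLUME)))
            time.sleep(INTERVAL)

I would run that from one of the servers and only consider taking anything
else down once the count has stayed at zero for a while.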
> I don't have much experience with network administration so I cannot
> recommend a brand or type of switch.
>
> I just looked at the Netgear website and googled for prices:
> Two Netgear GS724TS stackable switches seem to cost nearly the same as
> one GS748TS, both are supposed to have "a 20 Gbps, dual-ring, highly
> redundant stacking bus".

I'll have a look at those.

> Regards,
>
> Harald

Thanks again.

Regards,

Dan.