isplist@xxxxxxxxxxxx wrote:
It's not virtualization. It is equivalent to mounting an NFS share, and
then exporting it again from the machine that mounted it.
OK, so a single machine that all the storage is attached to. Won't that bog it
down big time pretty quickly?
Not if it can handle the I/O. You just need enough CPU and enough bonded
gigabit Ethernet NICs in it. At the end of the day, a SAN appliance is
just a PC with a few NICs and a bunch of disks in it, and that can
handle quite a few machines using it simultaneously.
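For example, bonding a couple of GbE NICs on a RHEL-era box looks roughly
like this (a sketch only; the interface names and the 802.3ad mode are
assumptions, and your switch has to support link aggregation):

    # /etc/modprobe.conf -- load the bonding driver for bond0
    alias bond0 bonding
    options bond0 mode=802.3ad miimon=100

    # /etc/sysconfig/network-scripts/ifcfg-bond0
    DEVICE=bond0
    IPADDR=10.0.0.1
    NETMASK=255.255.255.0
    ONBOOT=yes
    BOOTPROTO=none

    # /etc/sysconfig/network-scripts/ifcfg-eth0 (same again for eth1)
    DEVICE=eth0
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes
    BOOTPROTO=none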
So, if this is correct, I can see how I could export everything from that one
machine, but wouldn't the overall I/O be unreal?
How much I/O do you actually need? If you have 10 disk nodes, each with
a 1Gb NIC, then you could just have a couple of 10Gb NICs in the
aggregator (one on the client side, one on the disk node side), and you'll
get no bottleneck. In reality, you can overbook it quite a lot unless
all the machines are going flat out all the time. Caching on the
aggregator and the client nodes will also help reduce the I/O on the
disk node side.
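To put rough numbers on it (assuming roughly 110MB/s of usable payload per
gigabit link once protocol overhead is taken off): 10 disk nodes going flat
out is about 10Gb/s, i.e. just over 1GB/s, which is what one 10Gb NIC on
each side of the aggregator can carry, so neither side becomes the
bottleneck until every disk node is saturated at once.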
How would this machine be turned into an aggregator? Would it handle knowing
where everything is or would servers still need to know which share to connect
to in order to get the needed data?
The disk nodes export their space via iSCSI as volumes. The aggregator
connects to each of those iSCSI volumes as normal SCSI device nodes, and
creates a virtual software RAID stripe over them. It then exports this
back out via iSCSI. All the client nodes then connect to the single big
iSCSI node that is the aggregator.
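As a rough sketch of the plumbing (the IPs, IQN and device names below are
made up; open-iscsi as the initiator and the iSCSI Enterprise Target on the
aggregator are assumptions):

    # On the aggregator: log in to each disk node's iSCSI export
    iscsiadm -m discovery -t sendtargets -p 10.0.1.1
    iscsiadm -m node -p 10.0.1.1 --login
    # ...repeat for each disk node; the volumes show up as /dev/sdb, /dev/sdc, ...

    # Stripe the imported volumes into one big software RAID device
    mdadm --create /dev/md0 --level=0 --raid-devices=10 /dev/sd[b-k]

    # Re-export the stripe via iSCSI (/etc/ietd.conf for IET)
    Target iqn.2007-01.com.example:aggregate
        Lun 0 Path=/dev/md0,Type=blockio

    # On each client node: connect to the one big target on the aggregator
    iscsiadm -m discovery -t sendtargets -p 10.0.2.1
    iscsiadm -m node -T iqn.2007-01.com.example:aggregate -p 10.0.2.1 --login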
I also happen to have a BlueArc i7500 machine which can offer up NFS shares. I
didn't want to use anything like that because I've read too many messages about
NFS not being a good protocol to grow on. Do you disagree?
NFS can give considerably better performance than GFS under some
circumstances. If you don't need POSIX-compliant file locking, you may
find that NFS works better for your application. You'll just have to try
it and see. There is no reason the aggregator box couldn't export the
aggregated space as an NFS share (i.e. be a NAS rather than a SAN).
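That would look something like this on the aggregator (the filesystem
choice, mount point and client subnet here are all assumptions):

    # Put a filesystem on the aggregated volume and mount it
    mkfs.ext3 /dev/md0
    mkdir -p /export/big
    mount /dev/md0 /export/big

    # /etc/exports
    /export/big  10.0.2.0/24(rw,sync,no_root_squash)

    exportfs -ra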
Exactly. You have a machine that pretends to be a SAN when it in fact
has no space on it. Instead, it connects to all the individual storage
nodes, mounts their volumes, merges them into one big volume, and then
presents that one big volume via iSCSI.
OK, I like it :). I don't quite get how I aggregate it all into a single
volume; I guess I've not played with software RAID that spans separate
storage devices and volumes. I get the idea, though.
For hardware, would this aggregator need massive resources in terms of CPU or
memory? I have IBMs which have 8-way CPUs and can take up to 64GB of memory.
I suspect that's possibly overkill. It's NIC I/O you'll need more than
anything. Jumbo frames, as big as your hardware can handle, will also help.
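For instance (assuming eth2 is the storage-facing NIC; every NIC and switch
port on that network has to support the larger frames too):

    # One-off:
    ifconfig eth2 mtu 9000
    # Persistent on RHEL: add to /etc/sysconfig/network-scripts/ifcfg-eth2
    MTU=9000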
Would the aggregator be a potential cluster candidate perhaps? Might it be
possible to run a cluster of them to be safe and to offload?
There is no reason why the aggregator couldn't mount its own exports
and run as one of the client cluster nodes.
This is interesting. I can see that if I could get to VM/shareroot and
something like this, I would have something quite nice going.
It's a central connection point AND a router, only it isn't just
straight routing, because the data is RAID-striped across the disk
nodes, with parity if you want redundancy.
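A plain stripe (RAID 0) has no redundancy, so for the redundancy part you'd
use a parity level over the same imported volumes instead (a sketch, with
the device names from the earlier example assumed):

    # RAID 5 over the iSCSI volumes: survives the loss of one disk node
    mdadm --create /dev/md0 --level=5 --raid-devices=10 /dev/sd[b-k]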
Right, I just don't yet get how the aggregator handles all of that I/O. Or
perhaps it just tells the servers which storage device to connect to so that
it doesn't actually have to take on all of the I/O?
No, it handles all of the I/O through itself. The client nodes don't
connect to the disk nodes directly, ever.
Gordan