On Wed, Jul 6, 2016 at 12:24 AM, Shyam <srangana@xxxxxxxxxx> wrote:
On 07/01/2016 01:45 AM, B.K.Raghuram wrote:
I have not gone through this implementation, nor the new iSCSI implementation being worked on for 3.9, but I thought I'd share the design behind a distributed iSCSI implementation that we worked on some time back, based on the istgt code with a libgfapi hook.
The implementation used the idea of one file representing one block (of a chosen size), allowing us to use gluster as the backend to store these files while presenting a single block device of effectively unlimited size. We used a fixed file naming convention based on the block number, which allowed the system to determine which file(s) needed to be operated on for the requested byte offset. This gave us the advantage of automatically reusing all of gluster's file-based functionality underneath to provide a fully distributed iSCSI implementation.
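To make the offset-to-file mapping concrete, here is a minimal sketch in C of how a byte offset on the exported device could be translated into a backing file name plus an offset within that file. The 4 MiB block size and the "block.<number>" naming convention are only illustrative; the actual convention and block size in the implementation may differ.

    #include <stdio.h>
    #include <stdint.h>

    /* Illustrative block size: one gluster file backs each block. */
    #define BLOCK_SIZE (4UL * 1024 * 1024)   /* 4 MiB */

    /* Map a byte offset on the exported device to the backing file that
     * holds it, plus the offset within that file.  The file name follows
     * an assumed "block.<number>" convention. */
    static void map_offset(uint64_t dev_offset, char *name, size_t name_len,
                           uint64_t *file_offset)
    {
        uint64_t block_no = dev_offset / BLOCK_SIZE;
        *file_offset      = dev_offset % BLOCK_SIZE;
        snprintf(name, name_len, "block.%012llu",
                 (unsigned long long)block_no);
    }

    int main(void)
    {
        char     name[64];
        uint64_t off;

        /* Offset 10 GiB falls in block 2560 when blocks are 4 MiB. */
        map_offset(10ULL * 1024 * 1024 * 1024, name, sizeof(name), &off);
        printf("%s @ %llu\n", name, (unsigned long long)off);
        return 0;
    }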
Would this be similar to the new iSCSI implementation that's being worked on for 3.9?
<will let others correct me here, but...>
Ultimately the idea would be to use sharding, as part of the gluster volume graph, to distribute (or rather shard) the blocks, instead of having the whole disk image on one distribute subvolume, and hence scale disk sizes to the size of the cluster. Further, sharding should work well here, as this is a single-client access case (or are we past that hurdle already?).
Not yet; we need a common transaction framework in place to reduce the latency of synchronization.
What this achieves is similar to the iSCSI implementation that you talk about, but with gluster doing the block splitting, and hence the distribution, rather than the iSCSI implementation (istgt) doing it.
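For reference, enabling sharding on an existing volume is just a couple of volume options (the volume name "blockvol" and the 64MB shard size below are only example values):

    # gluster volume set blockvol features.shard on
    # gluster volume set blockvol features.shard-block-size 64MB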
< I did a cursory check on the blog post, but did not find a shard reference, so maybe others could pitch in here, if they know about the direction>
There are two directions which will eventually converge.
1) A granular data self-heal implementation, so that taking a snapshot becomes as simple as a reflink (see the small reflink sketch after this list).
2) Bring in snapshots of files with shards - this is a bit more involved than the approach above.
Once 2) is also complete, we will have 1) + 2) combined, so that data self-heal heals the exact blocks inside each shard.
If users are not worried about snapshots, 2) is the best option.
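For anyone not familiar with the reflink semantics that 1) refers to: on a local filesystem with reflink support (e.g. XFS or Btrfs on a reasonably recent kernel), cloning a whole file is a single ioctl that shares extents copy-on-write, which is the kind of cheapness we would like snapshots to approach. A minimal sketch:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <linux/fs.h>          /* FICLONE */

    int main(void)
    {
        int src = open("disk.img", O_RDONLY);
        int dst = open("disk.img.snap", O_WRONLY | O_CREAT | O_TRUNC, 0644);

        if (src < 0 || dst < 0) {
            perror("open");
            return 1;
        }
        /* Share the source file's extents with the destination
         * (copy-on-write); no data is copied, so the "snapshot"
         * completes almost instantly. */
        if (ioctl(dst, FICLONE, src) < 0) {
            perror("ioctl(FICLONE)");
            return 1;
        }
        return 0;
    }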
Further, in your original proposal, how do you maintain device properties, such as the size of the device and used/free blocks? I ask about used and free because that is an overhead to compute if each block is maintained as a separate file, and because it is difficult to keep the size and block updates consistent (as they are separate operations). Just curious.
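To make that consistency concern concrete, here is a rough sketch of the two separate steps involved; the paths, the xattr name and the idea of keeping a counter on a metadata file are purely hypothetical, and in the real target these operations would of course go through gfapi rather than the local filesystem:

    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <string.h>
    #include <sys/xattr.h>

    int main(void)
    {
        char zeros[4096] = {0};

        /* Step 1: allocate/write a new block file. */
        int fd = open("block.000000002560", O_WRONLY | O_CREAT, 0644);
        if (fd < 0 || pwrite(fd, zeros, sizeof(zeros), 0) < 0) {
            perror("block write");
            return 1;
        }
        close(fd);

        /* Step 2: bump a hypothetical used-block counter kept as an
         * xattr on a metadata file.  A crash between step 1 and step 2
         * leaves this counter stale, which is the consistency problem
         * raised above. */
        const char *used = "2561";
        if (setxattr("lun.meta", "user.used_blocks",
                     used, strlen(used), 0) < 0) {
            perror("setxattr");
            return 1;
        }
        return 0;
    }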
--
Pranith