On 03/06/15 04:41, Niels de Vos wrote:
On Tue, Jun 02, 2015 at 06:18:54PM -0400, Rick Macklem wrote:
Jiffin Tony Thottan wrote:
Hi Rick,
There is already support for pNFS in gluster volumes using
nfs-ganesha:
http://gluster.readthedocs.org/en/latest/Features/mount_gluster_volume_using_pnfs/
It supports the normal FILE_LAYOUT architecture.
Yes, I am aware of this (although I'll admit I noticed it in the docs after I
posted the email).
Just fyi, if I wanted to set up a (near) production NFSv4.1/pNFS server, this would be
fine, but that's not me;-)
I'm interested in extending the NFSv4.1 server I've already written to do
pNFS. Why? Well, mostly because it interests me. (I've never been paid any $$
to do any of the FreeBSD NFS work I've done, so I pretty much do it as a hobby.)
If the result never works or never performs well enough to be useful for
production environments then...oh well, it was an interesting experiment.
Definitely sounds interesting! I don't have much to do with FreeBSD, but
I'm certainly happy to help on the Gluster side if you have any
questions.
+1. I can also help you with pNFS-related queries.
If it ever is useful for (near) production environments, I suspect it would be
users that have set up a FreeBSD NFS server and it is outgrowing what a single
server can handle. In other words, they would come from the FreeBSD NFS server
side and not the GlusterFS side.
Other comments are inline
On 02/06/15 05:18, Rick Macklem wrote:
Hi,
Btw, I do most of the FreeBSD NFSv4 work.
I am interested in trying to use GlusterFS
to build a FreeBSD NFSv4.1 pNFS server.
My hope is that, by directing the NFSv4.1 client
to the host where the file resides, the client will
be able to do I/O on it efficiently via the NFSv3
server. (The new layout type called flex files allows
an NFSv3 server to be a storage/data server for pNFS.)
It would be good to use gluster-nfs as the data server (it is more
tightly coupled with the bricks).
CCing Anand, who has a better idea of the flex file layout architecture.
Flex file is pretty straightforward. It simply allows the NFSv3 server
to be what they call a storage server. All that it does is use a "fake"
uid/gid that is allowed rw/ro access to the file. (This implies that
the client is responsible for deciding if a user is allowed access to
the file. Not a big deal for AUTH_SYS, since the server "trusts" the
client's choice of uid/gid anyhow.)
--> As such, the NFSv3 server needs to have a small change applied to
it to allow access via this "fake" uid/gid.
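A minimal sketch of the kind of check such a change implies, in Python for brevity. The `SyntheticCreds` type, its field names, and the idea of one uid/gid pair for read-write and one for read-only are assumptions made for illustration; they are not taken from any existing NFSv3 server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyntheticCreds:
    """Hypothetical per-file "fake" credentials handed out with a layout."""
    rw_uid: int
    rw_gid: int
    ro_uid: int
    ro_gid: int


def access_allowed(creds: SyntheticCreds, uid: int, gid: int, want_write: bool) -> bool:
    """Decide whether an AUTH_SYS uid/gid may do the requested I/O.

    The storage server does not evaluate the real user's permissions here;
    it only recognises the synthetic identities (the client is trusted to
    have made the per-user decision already).
    """
    if (uid, gid) == (creds.rw_uid, creds.rw_gid):
        return True                # read-write identity: reads and writes
    if (uid, gid) == (creds.ro_uid, creds.ro_gid):
        return not want_write      # read-only identity: reads only
    return False                   # any other uid/gid is refused
```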
This sounds simple enough to do. File a feature request and describe how
you can use this. Patches are welcome too, of course, but we can likely
code something up quickly.
https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS&component=nfs
Basically, the NFSv4.1 server needs to know what the NFSv3 server's
host IP address is and what FH to use for the file on it. (I do see
the code in the NFS xlator for generating an FH, but haven't looked
much yet.) As noted below in the original post.
The FH in Gluster/NFS is based on the volume-id and the GFID. Both are
UUIDs. The volume-id is a unique identifier for the volume, and the GFID
is like a volume-wide inode number (volumes consist of multiple bricks
with their own filesystems; a storage server can host multiple bricks).
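To illustrate the idea of a handle built from two UUIDs, here is a minimal Python sketch. The 32-byte layout (volume-id followed by GFID) and the names `make_fh` and `parse_fh` are assumptions for illustration only; the real Gluster/NFS handle format should be taken from the NFS xlator sources.

```python
import uuid
from typing import Tuple


def make_fh(volume_id: uuid.UUID, gfid: uuid.UUID) -> bytes:
    """Concatenate the two 16-byte UUIDs into an opaque 32-byte handle.

    Hypothetical layout: bytes 0-15 = volume-id, bytes 16-31 = GFID.
    """
    return volume_id.bytes + gfid.bytes


def parse_fh(fh: bytes) -> Tuple[uuid.UUID, uuid.UUID]:
    """Split a handle made by make_fh() back into (volume_id, gfid)."""
    if len(fh) != 32:
        raise ValueError("expected a 32-byte handle")
    return uuid.UUID(bytes=fh[:16]), uuid.UUID(bytes=fh[16:])
```

Because both parts are UUIDs, the handle identifies a file within a volume but says nothing about which brick holds it.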
It is not required to create the FH in the MDS (it might not be consistent
with the one on another gluster-nfs server).
Instead, create a ds_wire (for me it was a combination of the GFID and the IP of
the server); a handle will then be created at each
data server, based on the ds_wire, for the I/O.
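A minimal Python sketch of a ds_wire that combines a GFID with a data server address. The 20-byte encoding, the IPv4-only assumption, and the names `pack_ds_wire` and `unpack_ds_wire` are hypothetical and chosen only to make the idea concrete.

```python
import ipaddress
import uuid
from typing import Tuple


def pack_ds_wire(gfid: uuid.UUID, ds_ip: str) -> bytes:
    """Encode GFID (16 bytes) followed by an IPv4 address (4 bytes)."""
    return gfid.bytes + ipaddress.IPv4Address(ds_ip).packed


def unpack_ds_wire(wire: bytes) -> Tuple[uuid.UUID, str]:
    """Decode a ds_wire made by pack_ds_wire() into (gfid, ip string)."""
    if len(wire) != 20:
        raise ValueError("expected a 20-byte ds_wire")
    return uuid.UUID(bytes=wire[:16]), str(ipaddress.IPv4Address(wire[16:]))
```

Each data server would then derive its own local handle from the GFID carried in the ds_wire, rather than relying on a handle minted by the MDS.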
There is no way to know which brick should handle a FH. Looking for the
GFID on all the bricks that participate in the volume is a rather
expensive operation (many LOOKUPs). You will always need to find the
location of the file with a request through FUSE.
To do this, I need to be able to "poke" the
glusterfs server and get the following information:
- The NFSv3 file handle and the IP address for
the host(s) the file lives on.
--> Using this, I am planning on creating a layout
that tells the NFSv4.1 client to use NFSv3 to
do I/O on the file. (What NFSv4.1 calls a storage
server, although some RFCs might call it a data
server.)
- I hope to use the fuse interface for the NFSv4.1 metadata
server.
I don't know how feasible it is to implement a metadata server using
a FUSE interface.
I guess I'll find out;-). The FreeBSD NFSv4.1 server is kernel based
and exports any local file system that has a VFS/VOP interface. So,
hopefully FUSE won't provide too many surprises.
I am curious to see how well it performs.
I have no idea how FreeBSD handles FUSE, but I'm sure you won't have an
issue with figuring that out. You should be able to get the details
about the location of the file through GETXATTR calls. In NFS-Ganesha,
these two functions parse the output:
- get_pathinfo_host
- glfs_get_ds_addr
These can be found here:
https://github.com/nfs-ganesha/nfs-ganesha/blob/next/src/FSAL/FSAL_GLUSTER/mds.c#L482
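As a rough illustration of what parsing the file location out of the GETXATTR output can look like, here is a Python sketch. It assumes the value comes from the `trusted.glusterfs.pathinfo` xattr and has the shape shown in `sample`, with brick entries of the form `<POSIX(brick):host:path>`. The sample string and the helper name `brick_locations` are assumptions for illustration; this is not the NFS-Ganesha code.

```python
import re
from typing import List, Tuple

# One brick entry: <POSIX(<brick dir>):<host>:<path on brick>>
# Assumes the host part contains no ':' (so no bare IPv6 addresses).
_POSIX_ENTRY = re.compile(r"<POSIX\([^)]*\):([^:>]+):([^>]+)>")


def brick_locations(pathinfo: str) -> List[Tuple[str, str]]:
    """Return (host, path-on-brick) for every POSIX entry in the string."""
    return _POSIX_ENTRY.findall(pathinfo)


# Assumed example of a pathinfo value for a file on a replicated volume.
sample = (
    "(<DISTRIBUTE:vol-dht> (<REPLICATE:vol-replicate-0> "
    "<POSIX(/bricks/b1):server1:/bricks/b1/dir/file> "
    "<POSIX(/bricks/b2):server2:/bricks/b2/dir/file>))"
)
```

For a replicated file this yields one entry per replica, so the metadata server could offer any of those hosts as a storage server in the layout.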
If anyone can point me to the area in the GlusterFS sources
that I should look at to do this and/or suggest a mechanism
for getting the above information out of the GlusterFS server,
please let me know.
Also, any comments w.r.t. the above plan are welcome.
In my opinion, a hybrid approach would be better. Use the current metadata
server implemented in ganesha (support for flex files has already been added
in ganesha); this might need some tweaks to the write, read, and commit APIs of
gluster-nfs. In this implementation, we should keep the metadata server
outside the trusted storage pool (T.S.P.), i.e. a dedicated server is required for
the MDS.
I think I answered this above. Also, I doubt ganesha-nfs is ported to
FreeBSD.
I am not sure about this; maybe folks from the nfs-ganesha community can
help you with that.
You can either send a mail to the nfs-ganesha-devel list or ping them on IRC
at #ganesha on freenode.
Thanks for your comments, rick
ps: Given ganesha-nfs etc, I'll understand if GlusterFS isn't interested
in this. Any patches that I'll generate are a long way off anyhow.
Our path forward for a more recent version and current feature set for
NFS is based on NFS-Ganesha. But, there are many users of Gluster/NFS
(NFSv3 only) that would not like to see our NFS-server disappear. If
Gluster/NFS can help you with providing a FreeBSD pNFS server, we would
surely have some interest. It will not be on the top of our planning,
but we should try to assist you where we can.
Thanks for sharing your ideas, please keep us informed and let us know
where you hit issues related to Gluster.
Niels
Thanks in advance for any information, rick
ps: I haven't written any code yet, but I think the above
might be feasible.
You are most welcome to do the coding part :).
If you face any issues with the current pNFS server for gluster
volumes, please feel free to ask.
Regards,
Jiffin
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel