Re: using GlusterFS to build an NFSv4.1 pNFS server

On 03/06/15 04:41, Niels de Vos wrote:
On Tue, Jun 02, 2015 at 06:18:54PM -0400, Rick Macklem wrote:
Jiffin Tony Thottan wrote:
Hi Rick,

There is already support for pNFS in gluster volumes using
nfs-ganesha:
http://gluster.readthedocs.org/en/latest/Features/mount_gluster_volume_using_pnfs/
It supports the normal FILE_LAYOUT architecture.
Yes, I am aware of this (although I'll admit I noticed it in the docs after I
posted the email).

Just fyi, if I wanted to set up a (near) production NFSv4.1/pNFS server, this would be
fine, but that's not me;-)
I'm interested in extending the NFSv4.1 server I've already written to do
pNFS. Why? Well, mostly because it interests me. (I've never been paid any $$
to do any of the FreeBSD NFS work I've done, so I pretty much do it as a hobby.)

If the result never works or never performs well enough to be useful for
production environments then...oh well, it was an interesting experiment.
Definitely sounds interesting! I don't have much to do with FreeBSD, but
I'm certainly happy to help on the Gluster side if you have any
questions.

+1. I can also help you with pNFS-related queries.

If it ever is useful for (near) production environments, I suspect it would be
users that have set up a FreeBSD NFS server and it is outgrowing what a single
server can handle. In other words, they would come from the FreeBSD NFS server
side and not the GlusterFS side.
Other comments are inline

On 02/06/15 05:18, Rick Macklem wrote:
Hi,

Btw, I do most of the FreeBSD NFSv4 work.
I am interested in trying to use GlusterFS
to build a FreeBSD NFSv4.1 pNFS server.
My hope is that, by directing the NFSv4.1 client
to the host where the file resides, the client will
be able to do I/O on it efficiently via the NFSv3
server. (The new layout type called flex files allows
an NFSv3 server to be a storage/data server for pNFS.)
It would be good to use gluster-nfs as a data server (it is more
tightly coupled with the bricks).
CCing Anand, who has a better idea about the flex file layout architecture.

Flex file is pretty straightforward. It simply allows the NFSv3 server
to be what they call a storage server. All that it does is use a "fake"
uid/gid that is allowed rw/ro access to the file. (This implies that
the client is responsible for deciding if a user is allowed access to
the file. Not a big deal for AUTH_SYS, since the server "trusts" the
client's choice of uid/gid anyhow.)
--> As such, the NFSv3 server needs to have a small change applied to
     it to allow access via this "fake" uid/gid.
This sounds simple enough to do. File a feature request and describe how
you can use this. Patches are welcome too, of course, but we can likely
code something up quickly.

     https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS&component=nfs

Basically, the NFSv4.1 server needs to know what the NFSv3 server's
host IP address is and what FH to use for the file on it. (I do see
the code in the NFS xlator for generating an FH, but haven't looked
much yet.) As noted below in the original post.
The FH in Gluster/NFS is based on the volume-id and the GFID. Both are
UUIDs. The volume-id is a unique identifier for the volume, and the GFID
is like a volume-wide inode-nr (volumes consist of multiple bricks
with their own filesystems; a storage server can host multiple bricks).
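
As a rough illustration of a handle built from two UUIDs, here is a minimal
Python sketch. The 32-byte layout and the names make_fh and split_fh are
assumptions made up for this example; the real Gluster/NFS file handle has
its own format, which is not reproduced here.

```python
import uuid
from typing import Tuple


def make_fh(volume_id: uuid.UUID, gfid: uuid.UUID) -> bytes:
    """Pack a volume-id and a GFID into a 32-byte opaque handle.

    Hypothetical layout: 16 bytes of volume-id followed by 16 bytes of GFID.
    """
    return volume_id.bytes + gfid.bytes


def split_fh(fh: bytes) -> Tuple[uuid.UUID, uuid.UUID]:
    """Recover (volume_id, gfid) from a handle produced by make_fh."""
    if len(fh) != 32:
        raise ValueError("expected a 32-byte handle")
    return uuid.UUID(bytes=fh[:16]), uuid.UUID(bytes=fh[16:])
```

Because both halves are fixed-size UUIDs, the handle can be decoded without
any lookup, but it says nothing about which brick holds the file.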

It is not required to create the FH on the MDS (it might not be consistent
across the other gluster-nfs servers). Instead, create a ds_wire (for me it
was a combination of the GFID and the IP of the server); the handle will then
be created at each data server, based on the ds_wire, for the I/Os.
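
A minimal Python sketch of the "GFID plus server IP" idea follows. The
20-byte encoding, the IPv4-only restriction, and the names pack_ds_wire and
unpack_ds_wire are assumptions for illustration; this is not the ds_wire
structure used by NFS-Ganesha.

```python
import ipaddress
import uuid
from typing import Tuple


def pack_ds_wire(gfid: uuid.UUID, server_ip: str) -> bytes:
    """Encode a GFID (16 bytes) followed by an IPv4 address (4 bytes)."""
    return gfid.bytes + ipaddress.IPv4Address(server_ip).packed


def unpack_ds_wire(wire: bytes) -> Tuple[uuid.UUID, str]:
    """Decode the bytes produced by pack_ds_wire into (gfid, server_ip)."""
    if len(wire) != 20:
        raise ValueError("expected 16 bytes of GFID plus 4 bytes of IPv4")
    gfid = uuid.UUID(bytes=wire[:16])
    server_ip = str(ipaddress.IPv4Address(wire[16:]))
    return gfid, server_ip
```

In this sketch the metadata server only hands out the opaque bytes; each data
server would decode them and build its own local handle from the GFID.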

There is no way to know which brick should handle a FH. Looking for the
GFID on all the bricks that participate in the volume is a rather
expensive operation (many LOOKUPs). You will always need to find the
location of the file with a request through FUSE.

To do this, I need to be able to "poke" the
glusterfs server and get the following information:
- The NFSv3 file handle and the IP address for
    the host(s) the file lives on.
    --> Using this, I am planning on creating a layout
        that tells the NFSv4.1 client to use NFSv3 to
        do I/O on the file. (What NFSv4.1 calls a storage
        server, although some RFCs might call it a data
        server.)
- I hope to use the fuse interface for the NFSv4.1 metadata
    server.
I don't know how feasible it is to implement a metadata server
using a FUSE interface.

I guess I'll find out;-). The FreeBSD NFSv4.1 server is kernel based
and exports any local file system that has a VFS/VOP interface. So,
hopefully FUSE won't provide too many surprises.
I am curious to see how well it performs.
I have no idea how FreeBSD handles FUSE, but I'm sure you won't have an
issue with figuring that out. You should be able to get the details
about the location of the file through GETXATTR calls. In NFS-Ganesha,
these two functions parse the output:
  - get_pathinfo_host
  - glfs_get_ds_addr

     These can be found here:
     https://github.com/nfs-ganesha/nfs-ganesha/blob/next/src/FSAL/FSAL_GLUSTER/mds.c#L482
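
For readers who want to experiment before reading the Ganesha code, here is a
small Python sketch of the same idea: read the trusted.glusterfs.pathinfo
virtual xattr from a file on a Gluster FUSE mount and pull out the host and
brick path. The regular expression is an assumption based on pathinfo strings
of the form "<POSIX(brick):host:path>"; it is not a port of the two Ganesha
functions and it does not handle IPv6 literals. os.getxattr is Linux-only, so
read_pathinfo would need a different call on FreeBSD.

```python
import os
import re
from typing import List, Tuple

PATHINFO_XATTR = "trusted.glusterfs.pathinfo"

# Assumed entry shape: <POSIX(/brick/dir):hostname:/brick/dir/path/to/file>
_POSIX_ENTRY = re.compile(r"<POSIX\(([^)]*)\):([^:>]+):([^>]*)>")


def read_pathinfo(path: str) -> str:
    """Fetch the pathinfo xattr for a file on a Gluster FUSE mount (Linux only)."""
    return os.getxattr(path, PATHINFO_XATTR).decode()


def brick_locations(pathinfo: str) -> List[Tuple[str, str]]:
    """Return (host, path-on-brick) for every POSIX entry in a pathinfo string."""
    return [(m.group(2), m.group(3)) for m in _POSIX_ENTRY.finditer(pathinfo)]
```

A replicated file would yield one (host, path) pair per replica, so the caller
has to choose which host to name in the layout.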


If anyone can point me to the area in the GlusterFS sources
that I should look at to do this and/or suggest a mechanism
for getting the above information out of the GlusterFS server,
please let me know.

Also, any comments w.r.t. the above plan are welcome.
In my opinion, a hybrid approach would be better. Use the current metadata
server implemented in ganesha (support for flex files has already been added
to ganesha); the write, read, and commit APIs of gluster-nfs might need some
tweaks. In this implementation, we should keep the metadata server out of the
trusted storage pool (T.S.P.), i.e. a dedicated server is required for the
M.D.S.

I think I answered this above. Also, I doubt ganesha-nfs is ported to
FreeBSD.

I am not sure about this; maybe folks from the nfs-ganesha community can help
you with that. You can either send a mail to the nfs-ganesha-devel list or
ping them on IRC at #ganesha on freenode.

Thanks for your comments, rick
ps: Given ganesha-nfs etc, I'll understand if GlusterFS isn't interested
     in this. Any patches that I'll generate are a long way off anyhow.
Our path forward for a more recent version and current feature set for
NFS is based on NFS-Ganesha. But, there are many users of Gluster/NFS
(NFSv3 only) that would not like to see our NFS-server disappear. If
Gluster/NFS can help you with providing a FreeBSD pNFS server, we would
surely have some interest. It will not be on the top of our planning,
but we should try to assist you where we can.

Thanks for sharing your ideas, please keep us informed and let us know
where you hit issues related to Gluster.

Niels


Thanks in advance for any information, rick
ps: I haven't written any code yet, but I think the above
      might be feasible.
You are most welcome to help with the coding part :).

If you run into any issues implementing the current pNFS server for gluster
volumes, please feel free to ask.

Regards,
Jiffin
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel