Re: Question about file copy through libgfapi

Hello Niels,

Thanks for your explanation. I'm happy you consider my proposal doable and that it sparked so many ideas. I would like to contribute, but I really don't know anything about the GlusterFS code and APIs, so I wouldn't know where to start; under these conditions I don't think it would be an easy job for me. I had a quick look at the functions already present in the APIs, hoping that a low-level copy function already existed that I could use (I saw that, for example, the rename function takes this approach), but I could not find anything.
Therefore I hope someone has the time and feels like implementing this idea; I agree it would be a good improvement, not just for me.

Giacomo

Giacomo Fazio
IT Engineer

Tel. +41 91 910 7690
E-mail: giacomo.fazio@xxxxxxxxxxxxxxxxxxxx  |  Web: www.wcpmediaservices.com

Europe Office: Via Zurigo 35, 6900 Lugano, Switzerland
USA Office: 7083 Hollywood Boulevard Los Angeles, CA 90028


On Fri, Aug 22, 2014 at 3:27 PM, Niels de Vos <ndevos@xxxxxxxxxx> wrote:
On Fri, Aug 22, 2014 at 01:26:02PM +0200, Giacomo Fazio wrote:
> Hi there,
>
> Thanks to both Soumya and Prashanth. Actually you are both right. With the
> approach proposed by Soumya I would avoid the FUSE overhead but, as
> Prashanth says, the network transfer overhead would always be present. This
> is particularly important for me because I deal with very big files
> (usually around 100 GB and even more), so the network transfer has a big
> impact, while I don't think the impact of the FUSE overhead is that big.
> That's why I would like to get a "brick to brick" copy (server side
> only): I would like to use the APIs to instruct the server to make the
> copy, so that the network transfer can be avoided.
>
> As far as I understood, it is not currently possible with libgfapi. Do you
> think it would be difficult to implement? Are there any other ways?
> Thank you and best regards,

"Difficult" is always relative; it depends on many factors :) But
I think implementing server-side copy is quite doable. You should start
by thinking about, and proposing, a design. Some ideas that would work:

a.
    Have a server-side daemon (maybe glusterd) handle the copy. A new
    libgfapi or gluster-cli function can then connect to the daemon and
    pass the instruction on (src brick + dst volume + src and dst
    filenames). This daemon can then connect to its instance on the
    server hosting the source brick, and initiate the copy.

b.
    Add a new file operation to the GlusterFS protocol, something like
    copy-to-brick. This operation would receive the request from the
    client (the client talks to the src-brick hosting the src-file as
    usual), and the brick process needs to learn how to connect to
    another brick (from a different volume) and create/write the file
    there. The client application should be smart enough to pass the
    path to the dst-brick that should contain the dst-file.

While writing this, I have convinced myself that (a) would surely be
easier to do.  GlusterD could spawn a special copy process (like
a libgfapi client) that connects to the source and destination volumes,
does the copy, and exits.

This also makes it much easier to start contributing!

1.
    A relatively simple libgfapi binary that implements "cp" with
    volume:/path/to/file as parameters should not be too difficult. Of
    course, you may need to mkdir the (parent) structure on the
    destination too, possibly adding a "-r" option for recursive
    copying.

2.
    A second step could then integrate this cp/libgfapi implementation
    in some gluster-cli/glusterd procedures.

3.
    Making it smarter, so that the copy is initiated from one of the
    source bricks, can then be another step.


For (1), it could be easier to extend an existing copy tool.
Something like rsync already supports different protocols. Maybe it is
possible to teach rsync how and which functions to call from libgfapi.
rsync already supports many useful options; writing a new cp/libgfapi
tool from scratch that matches even a subset of rsync's features would
be a major project.

The above are just some ideas, thinking out loud... But, starting with
integrating libgfapi in rsync or similar sounds like a major usability
improvement for many Gluster users.
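As a starting point for (1), a minimal copy loop over libgfapi could look something like the sketch below. This is untested and written from memory here: the server/volume arguments, buffer size, port, and error handling are simplified, and it assumes the glusterfs-api development headers are installed.

```c
/* glfs-cp.c - rough sketch of "cp" over libgfapi. Build (assuming
 * glusterfs-api-devel is installed) with something like:
 *   gcc glfs-cp.c -o glfs-cp $(pkg-config --cflags --libs glusterfs-api)
 */
#include <fcntl.h>
#include <stdio.h>
#include <glusterfs/api/glfs.h>

int main(int argc, char *argv[])
{
    if (argc != 5) {
        fprintf(stderr, "usage: %s <server> <volume> <src> <dst>\n", argv[0]);
        return 1;
    }

    /* connect to the volume, like a normal gluster client would */
    glfs_t *fs = glfs_new(argv[2]);
    glfs_set_volfile_server(fs, "tcp", argv[1], 24007);
    if (glfs_init(fs) != 0) {
        perror("glfs_init");
        return 1;
    }

    glfs_fd_t *src = glfs_open(fs, argv[3], O_RDONLY);
    glfs_fd_t *dst = glfs_creat(fs, argv[4], O_WRONLY | O_TRUNC, 0644);
    if (!src || !dst) {
        perror("glfs_open/glfs_creat");
        return 1;
    }

    /* plain read/write copy loop */
    char buf[128 * 1024];
    ssize_t n;
    while ((n = glfs_read(src, buf, sizeof(buf), 0)) > 0)
        glfs_write(dst, buf, n, 0);

    glfs_close(src);
    glfs_close(dst);
    glfs_fini(fs);
    return 0;
}
```

Note that the data still flows through whatever node runs this binary; the saving in (a) comes from glusterd running it on one of the servers instead of on the client.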

Niels


>
>
>
> On Fri, Aug 22, 2014 at 9:36 AM, Prashanth Pai <ppai@xxxxxxxxxx> wrote:
>
> > Hi,
> >
> > Even with that approach, data would still be read (over the network) at
> > the client (the app using libgfapi). I think what he is looking for is a
> > server-side copy (brick to brick, or within the same brick) _without_ the
> > need for the data to go through the client.
> >
> > Swift has this feature [1], and it would be really cool for glusterfs to
> > have it too (maybe as an external tool or as an API in libgfapi) :)
> >
> > # gluster-copy <src> <dest>
> > or
> > glfs_copy(src,dest)
> >
> > [1]
> > http://programmerthoughts.com/openstack/server-side-object-copy-in-openstack-storage/
> >
> >
> >
> > Regards,
> >  -Prashanth Pai
> >
> > ----- Original Message -----
> > From: "Soumya Koduri" <skoduri@xxxxxxxxxx>
> > To: "Giacomo Fazio" <giacomo.fazio@xxxxxxxxxxxxxxxxxxxx>, "John Mark
> > Walker" <johnmark@xxxxxxxxxxx>
> > Cc: gluster-devel@xxxxxxxxxxx, "Giovanni Contri" <
> > giovanni.contri@xxxxxxxxxxxxxxxxxxxx>, forge-admin@xxxxxxxxxxx
> > Sent: Friday, August 22, 2014 12:40:01 PM
> > Subject: Re: Question about file copy through libgfapi
> >
> > Hi Giacomo,
> >
> > If your requirement is to do away with fuse/protocol clients and do
> > server-side operations, I think it's doable by writing a simple libgfapi
> > application. But since there is no libgfapi API equivalent to the "cp"
> > command, you may need to implement that functionality using the
> > glfs_open, glfs_read and glfs_write APIs.
> >
> > Here are a few links where Humble has documented how to use
> > libgfapi and the different APIs it supports:
> >
> > http://humblec.com/libgfapi-interface-glusterfs/
> > https://github.com/gluster/glusterfs/blob/master/doc/features/libgfapi.md
> >
> >
> > A few sample programs (written in C and Python) can be found at:
> > https://github.com/gluster/glusterfs/tree/master/api/examples
> >
> >
> > Thanks,
> > Soumya
> >
> >
> >
> > On 08/21/2014 08:45 PM, Giacomo Fazio wrote:
> > > Hi John,
> > >
> > > Thanks for your quick answer. Do you mean that my question can be
> > > summarized as "can we do server-only operations?"? Yes, I think so.
> > > Please let me know as soon as you receive any answer, or give me a
> > > link where I can follow this case directly.
> > > Thanks in advance and best regards,
> > >
> > >
> > >
> > > On Thu, Aug 21, 2014 at 5:04 PM, John Mark Walker <johnmark@xxxxxxxxxxx
> > > <mailto:johnmark@xxxxxxxxxxx>> wrote:
> > >
> > >     Thanks, Giacomo. I'm sending this to the gluster-devel list - it's
> > >     an interesting question. Basically, can we do server-only operations?
> > >
> > >     -JM
> > >
> > >
> > >
> >  ------------------------------------------------------------------------
> > >
> > >         Hello,
> > >
> > >         I am currently using GlusterFS version 3.5 with two bricks. What
> > >         I currently do is mount the whole storage in some Linux
> > >         clients (RedHat) through fuse.glusterfs, which (I think) uses NFS
> > >         in the background.
> > >         What I would like to do is copy a file from one directory in the
> > >         storage to another in the quickest way. Using "cp
> > >         file1 file2" from my RedHat client is not the best option,
> > >         because the data flows from the storage to my RedHat client
> > >         over the network and then back to the storage. I would like
> > >         to avoid this waste of time and copy the file directly
> > >         from the 1st directory to the 2nd one. So, in a nutshell, I
> > >         would like to have file1 -> file2, instead of file1 ->
> > >         RedHatclient -> file2.
> > >         Do you think it is possible, for example using libgfapi? Any
> > >         example to show me?
> > >         Thank you in advance and best regards,
> > >
> > >
> > >
> > >
> > >
> > >
> > > _______________________________________________
> > > Gluster-devel mailing list
> > > Gluster-devel@xxxxxxxxxxx
> > > http://supercolony.gluster.org/mailman/listinfo/gluster-devel
> > >



