The initial copy has to happen via gluster, as I'm also using distribution as well as replication....

> -----Original Message-----
> From: Stephan von Krawczynski [mailto:skraw at ithnet.com]
> Sent: 06 October 2009 16:39
> To: Hiren Joshi
> Cc: Pavan Vilas Sondur; gluster-users at gluster.org
> Subject: Re: Rsync
>
> Remember, the gluster team does not like my way of data-feeding. If your
> setup blows up, don't blame them (or me :-)
> I can only tell you what I am doing: simply move (or copy) the initial
> data to the primary server of the replication setup and then start
> glusterfsd for exporting.
> You will notice that the data gets replicated as soon as stat activity
> starts (the first ls or the like). If you already exported the data via
> nfs before, you probably only need to set up glusterfs on the very same
> box and use it as the primary server. Then there is no data copying at all.
>
> After months of experiments I can say that glusterfs runs pretty stable
> on _low_ performance setups. But you have to do one thing: lengthen the
> ping-timeout (something like "option ping-timeout 120").
> If you do not do that you will lose some of your server(s) at some point,
> and that will turn your glusterfs setup into a mess.
> If your environment is ok, it works. If your environment fails, it will
> fail too, sooner or later. In other words: it exports data, but it does
> not fulfill the promise of keeping your setup alive during failures - at
> this stage.
> My advice for the team is to stop whatever they may be working on, take
> four physical boxes (2 servers, 2 clients), run a lot of bonnies, and
> unplug/re-plug the servers non-deterministically. You can find all kinds
> of weirdos this way.
>
> Regards,
> Stephan
>
>
> On Mon, 5 Oct 2009 16:49:53 +0100
> "Hiren Joshi" <josh at moonfruit.com> wrote:
>
> > My users are more pitch fork, less shooting.....
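For anyone trying Stephan's ping-timeout suggestion: it goes in the protocol/client volume of the client volfile. A minimal sketch - the volume name, remote host and subvolume below are placeholders of mine, not taken from this thread:

```
volume remote1
  type protocol/client
  option transport-type tcp
  option remote-host server1.example.com
  option remote-subvolume brick1
  # lengthen the ping timeout as suggested above
  option ping-timeout 120
end-volume
```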
> > I don't understand what you're saying; should I have locally copied all
> > the files over, not using gluster, before attempting an rsync?
> >
> > > -----Original Message-----
> > > From: Stephan von Krawczynski [mailto:skraw at ithnet.com]
> > > Sent: 05 October 2009 14:13
> > > To: Hiren Joshi
> > > Cc: Pavan Vilas Sondur; gluster-users at gluster.org
> > > Subject: Re: Rsync
> > >
> > > It would be nice to remember my thread about _not_ copying data
> > > initially to gluster via the mountpoint. And one major reason for
> > > _local_ feed was: speed. Obviously a lot of cases are merely
> > > impossible because of the pure waiting time. If you had a live setup,
> > > people would have already shot you...
> > > This is why I talked about a feature and not an accepted bug behaviour.
> > >
> > > Regards,
> > > Stephan
> > >
> > >
> > > On Mon, 5 Oct 2009 11:00:36 +0100
> > > "Hiren Joshi" <josh at moonfruit.com> wrote:
> > >
> > > > Just a quick update: The rsync is *still* not finished.
> > > >
> > > > > -----Original Message-----
> > > > > From: gluster-users-bounces at gluster.org
> > > > > [mailto:gluster-users-bounces at gluster.org] On Behalf Of Hiren Joshi
> > > > > Sent: 01 October 2009 16:50
> > > > > To: Pavan Vilas Sondur
> > > > > Cc: gluster-users at gluster.org
> > > > > Subject: Re: Rsync
> > > > >
> > > > > Thanks!
> > > > >
> > > > > I'm keeping a close eye on the "is glusterfs DHT really
> > > > > distributed?" thread =)
> > > > >
> > > > > I tried nodelay on and unhashd no. I tarred about 400G to the
> > > > > share in about 17 hours (~6MB/s?) and am running an rsync now.
> > > > > Will post the results when it's done.
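As a quick sanity check on the quoted rate: 400G in 17 hours is indeed in the ~6MB/s ballpark:

```shell
# back-of-envelope throughput check: 400 GiB moved in 17 hours, in MiB/s
awk 'BEGIN { printf "%.1f\n", 400 * 1024 / (17 * 3600) }'
# prints 6.7
```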
> > > > > > -----Original Message-----
> > > > > > From: Pavan Vilas Sondur [mailto:pavan at gluster.com]
> > > > > > Sent: 01 October 2009 09:00
> > > > > > To: Hiren Joshi
> > > > > > Cc: gluster-users at gluster.org
> > > > > > Subject: Re: Rsync
> > > > > >
> > > > > > Hi,
> > > > > > We're looking into the problem on similar setups and working on
> > > > > > it. Meanwhile, can you let us know if performance increases if
> > > > > > you use this option:
> > > > > >
> > > > > > 'option transport.socket.nodelay on' in each of your
> > > > > > protocol/client and protocol/server volumes.
> > > > > >
> > > > > > Pavan
> > > > > >
> > > > > > On 28/09/09 11:25 +0100, Hiren Joshi wrote:
> > > > > > > Another update:
> > > > > > > It took 1240 minutes (over 20 hours) to complete on the
> > > > > > > simplified system (without mirroring). What else can I do to
> > > > > > > debug?
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: gluster-users-bounces at gluster.org
> > > > > > > > [mailto:gluster-users-bounces at gluster.org] On Behalf Of
> > > > > > > > Hiren Joshi
> > > > > > > > Sent: 24 September 2009 13:05
> > > > > > > > To: Pavan Vilas Sondur
> > > > > > > > Cc: gluster-users at gluster.org
> > > > > > > > Subject: Re: Rsync
> > > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Pavan Vilas Sondur [mailto:pavan at gluster.com]
> > > > > > > > > Sent: 24 September 2009 12:42
> > > > > > > > > To: Hiren Joshi
> > > > > > > > > Cc: gluster-users at gluster.org
> > > > > > > > > Subject: Re: Rsync
> > > > > > > > >
> > > > > > > > > Can you let us know the following:
> > > > > > > > >
> > > > > > > > > * What is the exact directory structure?
> > > > > > > > /abc/def/ghi/jkl/[1-4]
> > > > > > > > Now abc, def, ghi and jkl are each one of a thousand dirs.
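To spell out Pavan's suggestion: the option is set in both the protocol/client and protocol/server volume definitions of the volfiles. A sketch with placeholder volume and host names (mine, not from this thread):

```
# client volfile
volume remote1
  type protocol/client
  option transport-type tcp
  option remote-host server1.example.com
  option remote-subvolume brick1
  option transport.socket.nodelay on
end-volume

# server volfile
volume server
  type protocol/server
  option transport-type tcp
  option transport.socket.nodelay on
  subvolumes brick1
end-volume
```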
> > > > > > > > > * How many files are there in each individual directory,
> > > > > > > > > and of what size?
> > > > > > > > Each of the [1-4] dirs has about 100 files in it, all
> > > > > > > > under 1MB.
> > > > > > > >
> > > > > > > > > * It looks like each server process has 6 export
> > > > > > > > > directories. Can you run one server process each for a
> > > > > > > > > single export directory and check if the rsync speeds up?
> > > > > > > > I had no idea you could do that. How? Would I need to
> > > > > > > > create 6 config files and start gluster:
> > > > > > > >
> > > > > > > > /usr/sbin/glusterfsd -f /etc/glusterfs/export1.vol or similar?
> > > > > > > >
> > > > > > > > I'll give this a go....
> > > > > > > >
> > > > > > > > > * Also, do you have any benchmarks with a similar setup
> > > > > > > > > on, say, NFS?
> > > > > > > > NFS will create the dir tree in about 20 minutes, then
> > > > > > > > start copying the files over; it takes about 2-3 hours.
> > > > > > > >
> > > > > > > > > Pavan
> > > > > > > > >
> > > > > > > > > On 24/09/09 12:13 +0100, Hiren Joshi wrote:
> > > > > > > > > > It's been running for over 24 hours now.
> > > > > > > > > > Network traffic is nominal, top shows about 200-400%
> > > > > > > > > > cpu (7 cores, so it's not too bad).
> > > > > > > > > > About 14G of memory used (the rest is being used as
> > > > > > > > > > disk cache).
> > > > > > > > > >
> > > > > > > > > > Thoughts?
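For what it's worth, the one-process-per-export idea is exactly as guessed above: split the exports into one volfile each and start one glusterfsd per volfile. A sketch - the export*.vol names are assumed, not confirmed by the thread, and echo only prints the commands instead of starting daemons (drop it to actually run them):

```shell
# one glusterfsd per export directory; each assumed volfile would contain
# a single storage/posix + features/locks + protocol/server stack
for i in 1 2 3 4 5 6; do
    echo /usr/sbin/glusterfsd -f "/etc/glusterfs/export${i}.vol"
done
```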
> > > > > > > > > > > <snip>
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > An update: after running the rsync for a day, I
> > > > > > > > > > > > > > killed it and remounted all the disks (the
> > > > > > > > > > > > > > underlying filesystem, not the gluster) with
> > > > > > > > > > > > > > noatime, and the rsync completed in about 600
> > > > > > > > > > > > > > minutes. I'm now going to try one level up
> > > > > > > > > > > > > > (about 1,000,000,000 dirs).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > > > > > From: Pavan Vilas Sondur [mailto:pavan at gluster.com]
> > > > > > > > > > > > > > > Sent: 23 September 2009 07:55
> > > > > > > > > > > > > > > To: Hiren Joshi
> > > > > > > > > > > > > > > Cc: gluster-users at gluster.org
> > > > > > > > > > > > > > > Subject: Re: Rsync
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi Hiren,
> > > > > > > > > > > > > > > What glusterfs version are you using? Can you
> > > > > > > > > > > > > > > send us the volfiles and the log files.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Pavan
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On 22/09/09 16:01 +0100, Hiren Joshi wrote:
> > > > > > > > > > > > > > > > I forgot to mention, the mount is mounted
> > > > > > > > > > > > > > > > with direct-io; would this make a difference?
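For anyone reproducing the noatime result above: the change goes on the backend (brick) filesystem, not on the glusterfs mount. A sketch of the corresponding fstab line - the device, mount point and filesystem type are placeholders, not taken from this thread:

```
# /etc/fstab - backend brick filesystem with atime updates disabled
/dev/sdb1   /export/brick1   ext3   defaults,noatime   0 2
```

For a live change without a reboot, `mount -o remount,noatime /export/brick1` has the same effect until the next remount.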
> > > > > > > > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > > > > > > > From: gluster-users-bounces at gluster.org
> > > > > > > > > > > > > > > > > [mailto:gluster-users-bounces at gluster.org]
> > > > > > > > > > > > > > > > > On Behalf Of Hiren Joshi
> > > > > > > > > > > > > > > > > Sent: 22 September 2009 11:40
> > > > > > > > > > > > > > > > > To: gluster-users at gluster.org
> > > > > > > > > > > > > > > > > Subject: Rsync
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hello all,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I'm getting what I think is bizarre
> > > > > > > > > > > > > > > > > behaviour.... I have about 400G to rsync
> > > > > > > > > > > > > > > > > (rsync -av) onto a gluster share. The data
> > > > > > > > > > > > > > > > > is in a directory structure which has
> > > > > > > > > > > > > > > > > about 1000 directories per parent and
> > > > > > > > > > > > > > > > > about 1000 directories in each of them.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > When I try to rsync an end leaf directory
> > > > > > > > > > > > > > > > > (this has about 4 dirs and 100 files in
> > > > > > > > > > > > > > > > > each) the operation takes about 10
> > > > > > > > > > > > > > > > > seconds. When I go one level above (1000
> > > > > > > > > > > > > > > > > dirs with about 4 dirs in each, with
> > > > > > > > > > > > > > > > > about 100 files in each) the operation
> > > > > > > > > > > > > > > > > takes about 10 minutes.
> > > > > > > > > > > > > > > > > Now, if I then go one level above that
> > > > > > > > > > > > > > > > > (that's 1000 dirs with 1000 dirs in each,
> > > > > > > > > > > > > > > > > with about 4 dirs in each, with about 100
> > > > > > > > > > > > > > > > > files in each) the operation takes days!
> > > > > > > > > > > > > > > > > Top shows glusterfsd taking 300-600% cpu
> > > > > > > > > > > > > > > > > usage (2x4 cores), and I have about 48G
> > > > > > > > > > > > > > > > > of memory (usage is 0% as expected).
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Has anyone seen anything like this? How
> > > > > > > > > > > > > > > > > can I speed it up?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Josh.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > _______________________________________________
> > > > > > > > > > > > > > > > Gluster-users mailing list
> > > > > > > > > > > > > > > > Gluster-users at gluster.org
> > > > > > > > > > > > > > > > http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
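One suggestion not raised in the thread: since the destination is a network filesystem, rsync's delta-transfer algorithm and its write-to-temp-then-rename behaviour both add round-trips; --whole-file skips the delta algorithm (pointless when the destination is not local) and --inplace avoids the extra create/rename metadata traffic per file. A sketch with placeholder paths - echo only prints the command rather than touching real data:

```shell
# candidate rsync invocation for a glusterfs destination;
# /data/source/ and /mnt/gluster/dest/ are placeholder paths
echo rsync -av --whole-file --inplace /data/source/ /mnt/gluster/dest/
```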