CHANGELOGs and new geo-replica sync taking forever

tl;dr -- geo-replication of ~200,000 CHANGELOG files is killing me... Help!

I have about 125G spread over just shy of 5000 files that I'm replicating with
geo-replication to nodes around the world.  The content is fairly stable and
probably hasn't changed at all since I initially established the GlusterFS
nodes/network, which looks as follows:
x -> xx -> [xxx, xxy] (x geo-replicates to xx, xx geo-replicates to xxx/xxy)

Latency & throughput differ markedly between routes: x -> xx is the fastest and
xx -> xxx the slowest (at about 1G/hour). That said, all nodes were synced
within 5 days of setting up the network.

I have since added another node, xxz, which is also geo-replicated from xx (xx -> xxz). Its latency/throughput is clearly better than xx -> xxx's, but over 5
days later, I'm still replicating CHANGELOGs and haven't gotten to any real
content (the replicated volumes' mounted filesystems are empty).

Starting with x, you can see I have a "reasonable" number of CHANGELOGs:
x # find /bricks/*/.glusterfs/changelogs -name CHANGELOG\* | wc -l
186

However, xxz's source is xx, and I've got a real problem with xx:
xx # find /bricks/*/.glusterfs/changelogs -name CHANGELOG\* | wc -l
193450

5+ days into this, and I've hardly managed to dent this on xxz:
xxz # find /bricks/*/.glusterfs/changelogs -name CHANGELOG\* | wc -l
43211

On top of that, xx is generating new CHANGELOGs at a rate of ~6/minute (two
volumes at ~3/minute each), so chasing CHANGELOGs is a (quickly) moving target.
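
One way to put a number on that generation rate is to count the CHANGELOG files modified in a recent window. This is only a sketch: the function name and the brick path in the usage comment are made up for illustration, it assumes a changelog's mtime is close to when it was written, and it relies on `find -mmin`, which GNU and BSD find provide but POSIX does not.

```shell
#!/bin/sh
# Count CHANGELOG files under a directory modified in the last N minutes.
# Assumption: mtime is a fair proxy for when each changelog was written.
recent_changelogs() {
    dir=$1
    minutes=$2
    # tr strips the padding some wc implementations add around the count
    find "$dir" -name 'CHANGELOG*' -mmin "-$minutes" | wc -l | tr -d ' '
}

# Hypothetical usage on xx, sampling the last 10 minutes of one brick
# (brick1 is a placeholder name, not taken from the setup above):
#   n=$(recent_changelogs /bricks/brick1/.glusterfs/changelogs 10)
#   echo "$(( n / 10 )) per minute"
```

Dividing the count by the window length gives files per minute for that brick.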

And these files are small! The "I'm alive" file is 92 bytes long, and the others
I've seen average about 4k. The latency/throughput tests below show that small
files (for me) are a real killer:
### x -> xx (fastest route)
# for i in 1 10 100 1000; do file="$( dd if=/dev/urandom bs=1024 count=$((4000/i)) 2> /dev/null )"; echo "$i ($(( $( echo -n "$file" | wc -c )/1024 ))k): $( ( time for i in $( seq 1 $i ); do echo -n "$file" | ssh xx 'cat > /dev/null'; done ) |& awk '/^real/{ print $2 }' )"; done
1 (3984k): 0m4.777s
10 (398k): 0m10.737s
100 (39k): 0m53.286s
1000 (3k): 7m21.493s

### xx -> xxx (slowest route)
# for i in 1 10 100 1000; do file="$( dd if=/dev/urandom bs=1024 count=$((4000/i)) 2> /dev/null )"; echo "$i ($(( $( echo -n "$file" | wc -c )/1024 ))k): $( ( time for i in $( seq 1 $i ); do echo -n "$file" | ssh xxx 'cat > /dev/null'; done ) |& awk '/^real/{ print $2 }' )"; done
1 (3984k): 0m11.065s
10 (398k): 0m41.007s
100 (39k): 4m52.814s
1000 (3k): 39m23.009s

### xx -> xxz (the route I've added and am trying to sync)
# for i in 1 10 100 1000; do file="$( dd if=/dev/urandom bs=1024 count=$((4000/i)) 2> /dev/null )"; echo "$i ($(( $( echo -n "$file" | wc -c )/1024 ))k): $( ( time for i in $( seq 1 $i ); do echo -n "$file" | ssh xxz 'cat > /dev/null'; done ) |& awk '/^real/{ print $2 }' )"; done
1 (3984k): 0m2.673s
10 (398k): 0m16.333s
100 (39k): 2m0.676s
1000 (3k): 17m28.265s

What you're looking at is the cost of transferring a total of 4000k: 1 transfer
at 4000k, 10@400k, 100@40k, and 1000@4k. On xx -> xxz, the same total size takes
under 3s as 1 transfer but nearly 17 1/2 minutes as 1000 transfers, so moving
CHANGELOGs is a real killer, especially with almost 200,000 of them.
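
To turn those timings into a per-file cost, here is a rough sketch. It assumes each CHANGELOG costs about as much as one serial ssh transfer in the test above, which may not be how geo-replication actually moves them. The function names are invented for the example.

```shell
#!/bin/sh
# Convert a `time`-style duration such as 17m28.265s into seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Average seconds per transfer: $1 = total seconds, $2 = number of transfers.
per_transfer() {
    awk -v t="$1" -v n="$2" 'BEGIN { printf "%.3f\n", t / n }'
}

# Hours needed for $1 files at $2 seconds each, one after another.
estimate_hours() {
    awk -v f="$1" -v s="$2" 'BEGIN { printf "%.1f\n", f * s / 3600 }'
}

# xx -> xxz: 1000 transfers took 17m28.265s; xx holds 193450 CHANGELOGs.
cost=$(per_transfer "$(to_seconds 17m28.265s)" 1000)
echo "per transfer: ${cost}s"
echo "193450 files: $(estimate_hours 193450 "$cost") hours"
```

On these figures that is roughly 1 second per transfer and about 56 hours for the whole backlog. That is well under the 5+ days already spent, so per-connection ssh overhead alone may not account for the observed rate.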

And 92-byte files don't improve this:
### x -> xx (fastest route)
# file="$( dd if=/dev/urandom bs=92 count=1 2> /dev/null )"; i=100; echo "$i ($(( $( echo -n "$file" | wc -c ) ))): $( ( time for i in $( seq 1 $i ); do echo -n "$file" | ssh xx 'cat > /dev/null'; done ) |& awk '/^real/{ print $2 }' )"
100 (92): 0m34.164s

### xx -> xxx (slowest route)
# file="$( dd if=/dev/urandom bs=92 count=1 2> /dev/null )"; i=100; echo "$i ($(( $( echo -n "$file" | wc -c ) ))): $( ( time for i in $( seq 1 $i ); do echo -n "$file" | ssh xxx 'cat > /dev/null'; done ) |& awk '/^real/{ print $2 }' )"
100 (92): 3m53.388s

### xx -> xxz (the route I've added and am trying to sync)
# file="$( dd if=/dev/urandom bs=92 count=1 2> /dev/null )"; i=100; echo "$i ($(( $( echo -n "$file" | wc -c ) ))): $( ( time for i in $( seq 1 $i ); do echo -n "$file" | ssh xxz 'cat > /dev/null'; done ) |& awk '/^real/{ print $2 }' )"
100 (92): 1m43.389s

Questions...:
o Why so many CHANGELOGs?

o Why so slow (in 5 days, I've transferred 43211 CHANGELOGs, so 43211/5/24/60=6
  implies a real transfer rate of about 6 CHANGELOG files per minute, which
  brings me back to xx's generating new ones at about that rate...)?

o What can I do to "fix" this?
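
For reference, the rate arithmetic in the second question can be checked with a short sketch. The function names are made up for the example, and the generation rate of 6/minute is the rough estimate given earlier.

```shell
#!/bin/sh
# CHANGELOGs transferred per minute: $1 = files transferred, $2 = elapsed days.
transfer_rate() {
    awk -v n="$1" -v d="$2" 'BEGIN { printf "%.2f\n", n / (d * 24 * 60) }'
}

# Net backlog reduction per minute: $1 = transfer rate, $2 = generation rate.
net_rate() {
    awk -v t="$1" -v g="$2" 'BEGIN { printf "%.2f\n", t - g }'
}

rate=$(transfer_rate 43211 5)
echo "transferred: ${rate}/min"
echo "net progress against ~6/min generated: $(net_rate "$rate" 6)/min"
```

With a transfer rate that matches the generation rate this closely, the net progress against xx's backlog is effectively zero.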

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users


