On 02/28/2013 05:58 AM, Torbjørn Thorsen wrote:
> On Wed, Feb 27, 2013 at 9:46 PM, Brian Foster <bfoster at redhat.com> wrote:
>> On 02/27/2013 10:14 AM, Torbjørn Thorsen wrote:
>>> I'm seeing less-than-stellar performance on my Gluster deployment when
>>> hosting VM images on the FUSE mount.
...
>
> I'm not familiar with the profiling feature, but I think I'm seeing
> the same thing, requests being fractured into smaller ones.
>

gluster profiling is pretty straightforward. Just run the commands as
described (a rough sketch is at the end of this mail) and you can dump
some stats on the workload the volume is seeing:

http://www.gluster.org/community/documentation/index.php/Gluster_3.2:_Running_Gluster_Volume_Profile_Command

The 'info' command prints the stats accumulated since the last 'info'
invocation, so you can easily compare results between different
workloads, provided the volume is otherwise idle.

> However, by chance I found something which seems to impact the
> performance even more.
> I wanted to retry the dd-to-loop-device-with-sync today, the same one
> I pasted yesterday.
> However, today it was quite different.
>
> torbjorn at xen01:/srv/ganeti/shared-file-storage/tmp$ sudo dd
> if=/dev/zero of=/dev/loop1 bs=1024k count=2000 oflag=sync
> 303038464 bytes (303 MB) copied, 123.95 s, 2.4 MB/s
> ^C
>

I started testing on a slightly more up-to-date VM. I'm seeing fairly
consistent 10MB/s with sync I/O. This is with a loop device over a file
on a locally mounted gluster volume.

> So I unmounted the loop device and mounted it again, and re-ran the test.
>
> torbjorn at xen01:/srv/ganeti/shared-file-storage/tmp$ sudo losetup -d /dev/loop1
> torbjorn at xen01:/srv/ganeti/shared-file-storage/tmp$ sudo losetup -f loopback.img
> torbjorn at xen01:/srv/ganeti/shared-file-storage/tmp$ sudo dd
> if=/dev/zero of=/dev/loop1 bs=1024k count=2000 oflag=sync
> 2097152000 bytes (2.1 GB) copied, 55.9117 s, 37.5 MB/s
>

I can reproduce something like this when dealing with non-sync I/O.
Smaller overall writes (relative to available cache) run much faster and
larger writes tend to normalize to a lower value. Using xfs_io instead
of dd shows that writes are in fact hitting cache (e.g., smaller writes
complete at 1.5GB/s, larger writes normalize to 35MB/s when we've
dirtied enough memory and flushing/reclaim kicks in); a rough example of
that comparison is below.

It also appears that a close() on the loop device results in
aggressively flushing whatever data hasn't been flushed yet (something
fuse also does on open()). My non-sync dd results tend to jump around,
so perhaps that is a reason why.

> The situation inside the Xen instance was similar, although with
> different numbers.
>
> After being on, but mostly idle, for ~5 days:
> torbjorn at hennec:~$ sudo dd if=/dev/zero of=bigfile bs=1024k count=2000
> oflag=direct
> 28311552 bytes (28 MB) copied, 35.1314 s, 806 kB/s
> ^C
>
> After reboot and a fresh loop device:
> torbjorn at hennec:~$ sudo dd if=/dev/zero of=bigfile bs=1024k count=2000
> oflag=direct
> 814743552 bytes (815 MB) copied, 34.7441 s, 23.4 MB/s
> ^C
>
> These numbers might indicate that loop device performance degrades over time.
> However, I haven't seen this on local filesystems, so is this possibly
> only with files on Gluster or FUSE?

I would expect this kind of behavior when caching is involved, as
described above, but I'm not quite sure what would cause it with sync I/O.
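For reference, the xfs_io comparison I mentioned was roughly along these
lines; the file path and sizes are just placeholders, so adjust them to
your gluster mount and available memory:

# buffered (cached) write -- small runs should complete near memory speed
xfs_io -f -c "pwrite -b 1m 0 512m" /mnt/gluster/tmp/testfile

# same write with the file opened O_SYNC, roughly comparable to dd oflag=sync
xfs_io -f -s -c "pwrite -b 1m 0 512m" /mnt/gluster/tmp/testfile

pwrite reports throughput when it completes, so the buffered and O_SYNC
numbers can be compared directly.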
> I'm on Debian stable, so things aren't exactly box fresh.
>
> torbjorn at xen01:~$ dpkg -l | grep "^ii linux-image-$(uname -r)"
> ii linux-image-2.6.32-5-xen-amd64 2.6.32-46
> Linux 2.6.32 for 64-bit PCs, Xen dom0 support
>
> I'm not sure how to debug the Gluster -> FUSE -> loop device interaction,
> but I might try a newer kernel on the client.
>
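FWIW, the profile commands from the link above boil down to something
like the following; 'myvol' is just a placeholder for your volume name:

gluster volume profile myvol start
# ... run the dd/loop workload you want to measure ...
gluster volume profile myvol info
gluster volume profile myvol stop

Each 'info' shows per-brick fop latency and the block-size distribution
of reads/writes seen since the previous 'info', which should tell you
whether the 1MB writes from dd are being broken up into smaller requests
before they hit the bricks.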