We did a tla install of 906 yesterday and the problem seems to have been resolved by that build. Thanks and keep up the great work!
-Mic

Anand Avati wrote:
Mickey, Can you check if the latest tla code has resolved the issues you faced? Thanks, Avati

On Sun, Feb 8, 2009 at 11:19 PM, Mickey Mazarick <mic@xxxxxxxxxxxxxxxxxx> wrote:

Heh, our tests are kind of an unholy mess... but here's the part I think is useful:

We use a startup script that iterates through vol files and mounts the first available file on the list. We have a bunch of vol files that test a few different server configurations. After the mountpoints are prepared, we have other scripts that start virtual machines on the various mounts.

In other words, I have a directory called "/glustermounts/" and in that directory I have the files:

    main.vol
    main.vol.ib
    main.vol.tcp
    stripe.vol.ha
    stripe.vol.tcp

After running "/etc/init.d/glustersystem start" I will have the following mount points:

    /system (our default mount; we actually store the vol files here)
    /mnt/main
    /mnt/stripe

The output shows me if any vol file failed to mount, and it automatically attempts the next one (e.g. "mounting main.vol failed, trying main.vol.ib"). We simply arrange the vol files from most features to least.

We have a separate script which starts up a virtual machine on each test mount. This is the actual "test" we use, as it creates symbolic links, uses mmaps, etc., but it's pretty specific to us. It closely mirrors how we use GlusterFS in production.

I've included our startup script, and I would suggest you simply run something similar to your production setup on a few mounts in the same way we have. I may share this with the entire group, although there are probably better init scripts out there. This one does kill all processes attached to a mount point, which is useful.

Let me know if you have any questions!
Thanks!
-Mickey Mazarick

Geoff Kassel wrote:
Hi, As a fellow GlusterFS user, I was just wondering if you could point me to the regression tests you're using for GlusterFS?
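[Editor's note: the fallback-mounting approach Mickey describes above (try each vol file in order, keep the first that mounts) can be sketched roughly as below. The `try_mount` helper and its failure rule are illustrative stand-ins, not part of Mickey's actual script, which invokes /usr/local/sbin/glusterfs and then health-checks the mount.]

```shell
#!/bin/sh
# Sketch of the fallback logic: try each vol file in order and keep the
# first one that mounts. try_mount is a hypothetical placeholder; the
# real script runs glusterfs and then verifies the mountpoint.

try_mount() {
    spec=$1
    mountpt=$2
    # For illustration only: pretend any ".ib" (InfiniBand) spec fails,
    # so the loop falls back to the next vol file on the list.
    case "$spec" in
        *.ib) return 1 ;;
        *)    return 0 ;;
    esac
}

mount_first_available() {
    mountpt=$1
    shift
    for spec in "$@"; do
        if try_mount "$spec" "$mountpt"; then
            echo "mounted $spec at $mountpt"
            return 0
        fi
        echo "mounting $spec failed, trying next"
    done
    echo "all vol files failed for $mountpt"
    return 1
}

# Vol files are ordered from most features to least, as described above.
mount_first_available /mnt/main main.vol.ib main.vol.tcp
```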
I've looked high and low for the unit tests that the GlusterFS devs are meant to be using (a la http://www.gluster.org/docs/index.php/GlusterFS_QA) so that I can do my own testing, but I've not been able to find them. If it's tests you've developed in-house, would you be interested in releasing them to the wider community?
Kind regards, Geoff Kassel.

On Thu, 5 Feb 2009, Mickey Mazarick wrote:
I haven't done any full regression testing to see where the problem is, but the later TLA versions are causing our storage servers to spike to 100% CPU usage and the clients never see any files. Our initial tests are with ibverbs/HA but no performance translators.
Thanks!
-Mickey Mazarick

--

#!/bin/sh
# Startup script for the gluster mount system
volFiles="/glustermounts/"
defaultcheckFile="customers"
speclist="/etc/glusterfs-system.vol.ibverbs /etc/glusterfs-system.vol.ha /etc/glusterfs-system.vol.ibverbs /etc/glusterfs-system.vol.tcp"

start() {
    specfile=${1}
    if [ "$#" -gt 1 ]; then
        mountpt=${2}
    else
        mountpt=`echo ${specfile} |sed "s#\.vol.*\\\$##" |sed "s#/.*/##"`
        mountpt="/mnt/${mountpt}"
    fi
    logfile=`echo ${specfile} |sed "s#\.vol.*\\\$##" |sed "s#/.*/##"`
    logfile="/var/${logfile}.log"
    pidfile=`echo ${specfile} |sed "s#\.vol.*\\\$##" |sed "s#/.*/##"`
    pidfile="/var/run/${pidfile}.pid"
    echo "mounting specfile:${specfile} at:${mountpt} with pid at:${pidfile}"
    currentpids=`pidof glusterfs`
    currentpids="0 ${currentpids}"
    mountct=`mount |grep ${mountpt} |grep -c glusterfs`
    if [ -f $pidfile ]; then
        currentpid=`cat ${pidfile}`
        pidct=`echo "${currentpids}" |grep -c ${currentpid}`
        if [ "${pidct}" -eq 0 ]; then
            rm -rf ${pidfile}
            echo "removing pid file: ${pidfile}"
        fi
        if [ "${mountct}" -lt 1 ]; then
            echo "Gluster System mount:${mountpt} died. Remounting."
            stop ${mountpt} ${pidfile}
        fi
    else
        rm -rf ${pidfile}
        if [ "${mountct}" -gt 0 ]; then
            myupid=`ps -ef |grep /system |grep gluster |sed "s#root\s*##" |sed "s#\s.*##"`
            if [ "${myupid}" -gt 0 ]; then
                echo "${myupid}" > ${pidfile}
            else
                echo "Gluster System mounted at:${mountpt} but with no pid. Remounting."
                stop ${mountpt} ${pidfile}
            fi
        fi
    fi
    if [ -e $pidfile ]; then
        echo "Gluster System Mount:${mountpt} is running with spec: ${specfile}"
        #echo "Gluster System Mount:${mountpt} is running."
        return 0
    else
        #rm -rf /var/glustersystemclient.log
        modprobe fuse
        sleep 1.5
        #rm -rf /var/glustersystemclient.log
        mkdir ${mountpt}
        rm -rf $pidfile
        cmd="/usr/local/sbin/glusterfs -p $pidfile -l ${logfile} -L ERROR -f ${specfile} --disable-direct-io-mode ${mountpt}"
        echo "${cmd}"
        ${cmd}
        #/usr/local/sbin/glusterfs -p $pidfile -l ${logfile} --volume-specfile=${specfile} --disable-direct-io-mode ${mountpt}
        #/usr/local/sbin/glusterfs -p $pidfile -l /var/glustersystemclient.log -f $specfile --direct-io-mode=DISABLE /system
    fi
    return 1
}

checkStart() {
    mountdir=$1
    checkfile="total"
    if [ "$#" -gt 1 ]; then
        checkfile=$2
    fi
    lspid=0
    sleep 1
    counter=0
    countermax=15
    # List the mountpoint in the background; if the ls hangs, the mount is dead.
    ls -l ${mountdir} &
    while [ "${lspid}" != "" ]
    do
        echo "waiting for gluster to come up... ${counter}"
        sleep 1
        lspid=`/sbin/pidof ls`
        let counter++
        if [ "${counter}" -eq "${countermax}" ]; then
            lspid=""
        fi
    done
    if [ "${counter}" -lt "${countermax}" ]; then
        errorct=`ls ${mountdir} 2>&1 |grep -c "not connected"`
        if [ "${errorct}" -eq 1 ]; then
            counter=`echo ${countermax}`
        else
            glcount=`ls -l ${mountdir} |grep ${checkfile} -c`
            if [ "${glcount}" -lt 1 ]; then
                counter=`echo ${countermax}`
            fi
        fi
    fi
    if [ "${counter}" -eq "${countermax}" ]; then
        echo "gluster FAILED to mount:${mountdir} with spec: ${specfile}"
        lspid=`/sbin/pidof ls`
        kill $lspid
        lspid=10
        return 0
    else
        echo "Gluster System Mount:${mountdir} is running with spec: ${specfile}"
        #echo "gluster successfully mounted:${mountdir} with spec: ${specfile}"
        return 1
    fi
}

StartSpeclist() {
    specfilelist="${1}"
    echo "Attempting to mount first of: (${specfilelist})"
    for file in $specfilelist
    do
        specfile="${file}"
        if [ "$#" -gt 1 ]; then
            checkfile=${2}
        else
            checkfile="total"
        fi
        if [ "$#" -gt 2 ]; then
            mountpt=${3}
        else
            mountpt=`echo ${specfile} |sed "s#\.vol.*\\\$##" |sed "s#/.*/##"`
            mountpt="/mnt/${mountpt}"
        fi
        start ${specfile} ${mountpt}
        if [ "$?" -eq "1" ]; then
            checkStart ${mountpt} ${checkfile}
            if [ "$?" -eq "0" ]; then
                stop ${specfile} ${mountpt}
            else
                return 1
            fi
        else
            return 1
        fi
    done
    return 0
}

stop() {
    specfile1=${1}
    if [ "$#" -gt 1 ]; then
        mountpt=${2}
    else
        mountpt=`echo ${specfile1} |sed "s#\.vol.*\\\$##" |sed "s#/.*/##"`
        mountpt="/mnt/${mountpt}"
    fi
    pidfile=`echo ${specfile1} |sed "s#\.vol.*\\\$##" |sed "s#/.*/##"`
    pidfile="/var/run/${pidfile}.pid"
    # Kill every process still holding the mountpoint open.
    #runningpids=`lsof |grep ${mountpt} |sed "s#..........##" |sed "s# .*##"`
    #for pid in `lsof |grep ${mountpt} |sed "s#\w*\s*##" |sed "s# .*##"`
    for pid in `lsof |grep ${mountpt} |sed "s#..........\(......\).*#\1#"`
    do
        kill -9 $pid
    done
    #fuser -km /system
    echo "Stopping mount:${mountpt} spec:${specfile1}"
    umount -f ${mountpt}
    currentpid=`cat ${pidfile}`
    kill $currentpid
    rm -rf $pidfile
}

stopmp() {
    mountpt=${1}
    spec=`ps -ef |grep gluster |grep ${mountpt} |grep specfile |sed "s#.*specfile=\(.*\)/s*.*#\1#" |sed "s# .*##"`
    stop "${spec}" "${mountpt}"
}

startAll() {
    StartSpeclist "${speclist}" "glustermounts" /system
    if [ "$?" -eq "0" ]; then
        echo "ERROR STARTING"
    else
        #for i in `ls -b ${volFiles}*.vol |sed s/.glustermounts.//`;
        for i in `ls -b ${volFiles}*.vol |sed s#${volFiles}##`;
        do
            list=`ls -C ${volFiles}${i}*`
            mountpt=`echo $i |sed s/\.vol//`
            StartSpeclist "${list}" "${defaultcheckFile}" /mnt/${mountpt}
        done
    fi
}

stopAll() {
    mountlist=`mount |grep glusterfs |sed "s#glusterfs on \(.*\) type.*#\1#"`
    for mountpt in ${mountlist}; do
        stopmp ${mountpt}
    done
    kill `pidof glusterfs`
}

case "$1" in
    start)
        if [ "$#" -gt 1 ]; then
            StartSpeclist "${2}" ${3} ${4}
        else
            startAll
        fi
        ;;
    stop)
        if [ "$#" -gt 1 ]; then
            stopmp ${2}
        else
            stopAll
            stopAll
        fi
        ;;
    status)
        status
        ;;
    restart)
        stopAll
        startAll
        ;;
    condrestart)
        stopAll
        startAll
        ;;
    *)
        echo $"Usage: $0 {start|stop|restart|condrestart|status}"
        exit 1
esac
exit $RETVAL

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
http://lists.nongnu.org/mailman/listinfo/gluster-devel