Re: Problem with TLA ver > 887

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



We did a tla install of 906 yesterday and the problem seems to have been resolved by that build.

Thanks and keep up the great work!
-Mic

Anand Avati wrote:
Mickey,
 Can you check if the latest tla code has resolved the issues you faced?

Thanks,
Avati

On Sun, Feb 8, 2009 at 11:19 PM, Mickey Mazarick <mic@xxxxxxxxxxxxxxxxxx> wrote:
  
Heh our tests are kind of an unholy mess... but here's the part I think is
useful:
We use a startup script that will iterate through vol files and mount the
first available file on the list. We have a bunch of vol files that test a
few different server configurations. After mountpoints are prepared we have
other scripts that start virtual machine on the various mounts.

 In other words I have a directory called "/glustermounts/" and in that
directory I have the files:
main.vol  main.vol.ib  main.vol.tcp stripe.vol.ha stripe.vol.tcp

after running "/etc/init.d/glustersystem start"  I will have the following
mount points:
/system     (our default mount, we actually store the vol files here)
/mnt/main
/mnt/stripe

The output shows me if any vol file failed to mount and it automatically
attempts the next one (ex" "mounting main.vol failed, trying main.vol.ib").
We simply arrange vol files from most features to least. We have a separate
script which starts up a virtual machine on each test mount. This is the
actual "test" we use as it creates symbolic links, uses mmaps etc but it's
pretty specific to us. This closely mirrors how we use it in production.

I've included out startup script and I would suggest you simply run
something similar to your production on a few mounts in the same way we
have. I may share this with the entire group although there are probably
better init scripts out there. This one does kill all processes attached to
a mount point which is useful. Let me know if you have any questions!

Thanks!

-Mickey Mazarick



Geoff Kassel wrote:

Hi,
   As a fellow GlusterFS user, I was just wondering if you could point me to
the regression tests you're using for GlusterFS?

   I've looked high and low for the unit tests that the GlusterFS devs are
meants to be using (ala http://www.gluster.org/docs/index.php/GlusterFS_QA)
so that I can do my own testing, but I've not been able to find them.

   If it's tests you've developed in-house, would you be interested in
releasing them to the wider community?

Kind regards,

Geoff Kassel.

On Thu, 5 Feb 2009, Mickey Mazarick wrote:


I haven't done any full regression testing to see where the problem is
but the later TLA versions are causeing out storage servers to spike to
100% cpu usage and the clients never see any files. Our initial tests
are with ibverbs/HA but no performance translators.

Thanks!
-Mickey Mazarick


--

#!/bin/sh
# Startup script for gluster Mount system
volFiles="/glustermounts/"
defaultcheckFile="customers"
speclist="/etc/glusterfs-system.vol.ibverbs /etc/glusterfs-system.vol.ha
/etc/glusterfs-system.vol.ibverbs /etc/glusterfs-system.vol.tcp"
start() {
       specfile=${1}
       if [ "$#" -gt 1 ]; then
               mountpt=${2}
       else
               mountpt=`echo ${specfile} |sed "s#\.vol.*\\\$##" |sed
"s#/.*/##"`
               mountpt="/mnt/${mountpt}"
       fi
       logfile=`echo ${specfile} |sed "s#\.vol.*\\\$##" |sed "s#/.*/##"`
       logfile="/var/${logfile}.log"
       pidfile=`echo ${specfile} |sed "s#\.vol.*\\\$##" |sed "s#/.*/##"`
       pidfile="/var/run/${pidfile}.pid"
       echo "mounting specfile:${specfile} at:${mountpt} with pid
at:${pidfile}"
       currentpids=`pidof glusterfs`
       currentpids="0 ${currentpids}"
       mountct=`mount |grep ${mountpt} |grep -c glusterfs`
       if [ -f $pidfile ]; then
               currentpid=`cat ${pidfile}`
               pidct=`echo "${currentpids}" |grep -c ${currentpid}`
               if [ "${pidct}" -eq 0 ]; then
                       rm -rf ${pidfile}
                       echo "removing pid file: ${pidfile}"
               fi
               if [ "${mountct}" -lt 1 ]; then
                       echo "Gluster System mount:${mountpt} died.
Remounting."
                       stop ${mountpt} ${pidfile}
               fi
       else
               rm -rf ${pidfile}
               if [ "${mountct}" -gt 0 ]; then
                       myupid=`ps -ef |grep /system |grep gluster |sed
"s#root\s*##" |sed "s#\s.*##"`
                       if [ "${myupid}" -gt 0 ]; then
                          echo "${myupid}" > ${pidfile}
                       else
                          echo "Gluster System mounted at:${mountpt} but
with no pid. Remounting."
                          stop ${mountpt} ${pidfile}
                       fi
               fi
       fi

       if [ -e $pidfile ]; then
               echo "Gluster System Mount:${mountpt} is running with spec:
${specfile}"
               #echo "Gluster System Mount:${mountpt} is running."
               return 0
       else
       #rm -rf /var/glustersystemclient.log
       modprobe fuse
       sleep 1.5
       #rm -rf /var/glustersystemclient.log
       mkdir ${mountpt}
       rm -rf $pidfile
       cmd="/usr/local/sbin/glusterfs -p $pidfile -l ${logfile} -L ERROR -f
${specfile} --disable-direct-io-mode ${mountpt}"
echo "${cmd}"
       ${cmd}
#/usr/local/sbin/glusterfs -p $pidfile -l ${logfile}
--volume-specfile=${specfile} --disable-direct-io-mode ${mountpt}
       #/usr/local/sbin/glusterfs -p $pidfile -l
/var/glustersystemclient.log -f $specfile --direct-io-mode=DISABLE /system
       fi
       return 1
}

checkStart() {
       mountdir=$1
       checkfile="total"
       if [ "$#" -gt 1 ]; then
          checkfile=$2
       fi
       lspid=0
       sleep 1
       counter=0
       countermax=15

       ls -l ${mountdir} &
       while [ "${lspid}" != "" ]
       do
         echo "waiting for gluster to come up... ${counter}"
         sleep 1
         lspid=`/sbin/pidof ls`
         let counter++
         if [ "${counter}" -eq "${countermax}" ]
         then
          lspid=""
         fi
       done
       if [ "${counter}" -lt "${countermax}" ]; then
         errorct=`ls ${mountdir} 2>&1 |grep -c "not connected"`
         if [ "${errorct}" -eq 1  ]; then
               counter=`echo ${countermax}`
         else
           glcount=`ls -l ${mountdir} |grep ${checkfile} -c`
           if [ "${glcount}" -lt 1 ]; then
               counter=`echo ${countermax}`
           fi
         fi
       fi

       if [ "${counter}" -eq "${countermax}" ]
       then
         echo "gluster FAILED to mount:${mountdir} with spec: ${specfile}"
         lspid=`/sbin/pidof ls`
         kill $lspid
         lspid=10
         return 0
       else
         echo "Gluster System Mount:${mountdir} is running with spec:
${specfile}"
         #echo "gluster sucessfully mounted:${mountdir} with spec:
${specfile}"
         return 1
       fi
}

StartSpeclist() {
       specfilelist="${1}"
       echo "Attempting to mount first of: (${specfilelist})"
       for file in $specfilelist
       do
               specfile="${file}"
               if [ "$#" -gt 1 ]; then
                       checkfile=${2}
               else
                       checkfile="total"
               fi
               if [ "$#" -gt 2 ]; then
                       mountpt=${3}
               else
                       mountpt=`echo ${specfile} |sed "s#\.vol.*\\\$##" |sed
"s#/.*/##"`
                       mountpt="/mnt/${mountpt}"
               fi

               start ${specfile} ${mountpt}
               if [ "$?" -eq "1" ]; then
                 checkStart ${mountpt} ${checkfile}
                 if [ "$?" -eq "0" ]; then
                       stop ${specfile} ${mountpt}
                 else
                       return 1
                 fi
               else
                       return 1
               fi
         done
       return 0
}

stop() {
       specfile1=${1}
       if [ "$#" -gt 1 ]; then
               mountpt=${2}
       else
               mountpt=`echo ${specfile1} |sed "s#\.vol.*\\\$##" |sed
"s#/.*/##"`
               mountpt="/mnt/${mountpt}"
       fi
       pidfile=`echo ${specfile1} |sed "s#\.vol.*\\\$##" |sed "s#/.*/##"`
       pidfile="/var/run/${pidfile}.pid"
       #runningpids=`lsof |grep ${mountpt} |sed "s#..........##" |sed "s#
.*##"`
#       for pid in `lsof |grep ${mountpt} |sed "s#\w*\s*##" |sed "s# .*##"`
       for pid in `lsof |grep ${mountpt} |sed
"s#..........\(......\).*#\1#"`
       do
               kill -9 $pid
       done
       #fuser -km /system
       echo "Stopping mount:${mountpt} spec:${specfile1}"
       umount -f ${mountpt}
       currentpid=`cat ${pidfile}`
       kill $currentpid
       rm -rf $pidfile
}

stopmp(){
       mountpt=${1}
       spec=`ps -ef |grep gluster |grep ${mountpt} |grep specfile |sed
"s#.*specfile=\(.*\)/s*.*#\1#"| sed "s# .*##"`
       stop "${spec}" "${mountpt}"
}

startAll() {
         StartSpeclist "${speclist}" "glustermounts" /system
         if [ "$?" -eq "0" ]; then
               echo "ERROR STARTING"
         else
               #for i in `ls -b ${volFiles}*.vol |sed s/.glustermounts.//`;
               for i in `ls -b ${volFiles}*.vol |sed s#${volFiles}##`;
               do
               list=`ls -C ${volFiles}${i}*`
               mountpt=`echo $i |sed s/\.vol//`
               StartSpeclist "${list}" "${defaultcheckFile}" /mnt/${mountpt}
               done

         fi

}

stopAll() {
       mountlist=`mount |grep glusterfs |sed "s#glusterfs on \(.*\)
type.*#\1#"`
       for mountpt in ${mountlist};
       do
         stopmp ${mountpt}
       done
       kill `pidof glusterfs`
}

case "$1" in
       start)
        if [ "$#" -gt 1 ]; then

           StartSpeclist "${2}" ${3} ${4}

        else
          startAll
        fi
           ;;

       stop)
        if [ "$#" -gt 1 ]; then
          stopmp ${2}
        else
          stopAll
          stopAll
        fi

           ;;

       status)
           status
           ;;
       restart)
           stop
           start
           ;;
       condrestart)
               stop
               start
           ;;

       *)
           echo $"Usage: $0 {start|stop|restart|condrestart|status}"
           exit 1

esac

exit $RETVAL

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
http://lists.nongnu.org/mailman/listinfo/gluster-devel


    


--

[Index of Archives]     [Gluster Users]     [Ceph Users]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux