Dear Sir,

one known issue with YS drives is that they may fail easily in array applications. Updating the drive firmware should fix it.

Best Regards,

Kevin Wang

Areca Technology Tech-support Division
Tel : 886-2-87974060 Ext. 223
Fax : 886-2-87975970
Http://www.areca.com.tw
Ftp://ftp.areca.com.tw

----- Original Message -----
From: "Johannes Kullberg" <triak@xxxxxxxx>
To: "Kevin Wang" <kevin34@xxxxxxxxxxxx>
Cc: <sparclinux@xxxxxxxxxxxxxxx>; <eki@xxxxxx>; <tuomas.leikola@xxxxxxxxxxxxx>; <joel.suovaniemi@xxxxxx>
Sent: Sunday, April 22, 2007 4:58 AM
Subject: Re: Bug with ext3 journaling using Sparc hardware.

>
> Hi Kevin,
> I suspect it's not a raidset problem. The stress script runs for weeks without
> errors, processing many terabytes of data. There are no changes in any SMART
> attributes. Have you heard of similar problems with WD disks?
>
> -Johannes-
>
> > Dear Sir,
> >
> > after the problem happened, did you reboot the system?
> > If not, please check the status of the raidset member drives in the
> > controller management console.
> > It shows some SMART values and two error counts.
> > Does any drive report errors?
> >
> > If your system was rebooted after the problem happened, these two error
> > counts will have been reset.
> > You may need to reproduce the problem and then check the status again.
> >
> > You can also run a volume check without fix to verify the data inside.
> > The YS drives also have a firmware update for array applications - have
> > you updated the firmware already?
> >
> >
> > Best Regards,
> >
> >
> > Kevin Wang
> >
> > Areca Technology Tech-support Division
> > Tel : 886-2-87974060 Ext. 223
> > Fax : 886-2-87975970
> > Http://www.areca.com.tw
> > Ftp://ftp.areca.com.tw
> >
> > ----- Original Message -----
> > From: "Johannes Kullberg" <triak@xxxxxxxx>
> > To: <sparclinux@xxxxxxxxxxxxxxx>; <ext3-users@xxxxxxxxxxxxxxx>;
> > <eki@xxxxxx>; <kevin34@xxxxxxxxxxxx>; <tuomas.leikola@xxxxxxxxxxxxx>
> > Sent: Thursday, April 19, 2007 4:52 AM
> > Subject: Bug with ext3 journaling using Sparc hardware.
> >
> >
> >> Hi guys,
> >> My fileserver project is giving me a headache.
> >> I have been struggling with this one for a long time now, and help is
> >> needed.
> >> My goal is to have a stable Sparc-based fileserver with hardware RAID
> >> and the possibility to use SATA disks.
> >> I'm experiencing severe filesystem breakage in certain situations.
> >>
> >> Setup:
> >> Sun Microsystems E450 2x300 MHz / 512MB / OpenBoot 3.30
> >> Areca ARC-1160 fw. V1.42
> >> 8x Western Digital WD2500YS (RAID6, 1506 GB)
> >> Seagate SX336704LC root disk
> >> Intel Pro1000T
> >> Debian etch 2.6.20.1 SMP
> >>
> >> Filesystems were created with these commands:
> >>
> >> mkfs.ext3 -m 0 -L home /dev/sdb1
> >> mkfs.ext3 -m 0 -L srv /dev/sdb2
> >> mkfs.ext3 -m 0 -L store /dev/sdb3
> >>
> >> The root filesystem was formatted with Debian defaults.
> >>
> >> Filesystem   1K-blocks       Used  Available Use% Mounted on
> >> /dev/sda4      34416328   11627128   21040928  36% /
> >> tmpfs            256552          0     256552   0% /lib/init/rw
> >> tmpfs            256552          0     256552   0% /dev/shm
> >> /dev/sda1         90329      21922      63588  26% /boot
> >> /dev/sdb1      96122620     192312   95930308   1% /home
> >> /dev/sdb2      96122636     192312   95930324   1% /srv
> >> /dev/sdb3    1255375040  102169936 1153205104   9% /store
> >>
> >> I have been using the following script to stress-test the filesystem:
> >>
> >> #!/bin/sh
> >> dir=/store
> >>
> >> iter=0
> >> while :; do
> >>     test -d $dir/iter-$iter && rm -rf $dir/iter-$iter
> >>     mkdir $dir/iter-$iter
> >>     cd $dir/iter-$iter
> >>     for i in 0 1 2 3 4 5 6 7 8 9; do
> >>         (mkdir d$i && cd d$i && tar xf /root/root.tar) &
> >>     done
> >>     wait
> >>     du -s $dir/iter-$iter
> >>     # POSIX sh: use "=" rather than the bash-only "==" inside [ ]
> >>     if [ $iter = 7 ]; then
> >>         echo "pass $iter, disk almost full, removing test data.."
> >>         rm -rf $dir/iter-*
> >>         iter=0
> >>     else
> >>         echo "pass $iter, untarring the next round.."
> >>     fi
> >>     iter=$(expr $iter + 1)
> >> done
> >>
> >> root.tar is the whole Red Hat root directory tarred into a ~9 GB package.
> >> The test runs without problems for weeks, processing many terabytes.
> >> But then...!!
> >> Copying files from another partition, or running fsfuzzer for a short
> >> period, breaks the filesystem beyond repair.
> >> The following errors appear right after the copy process is done:
> >>
> >> EXT3-fs error (device sdb1): htree_dirblock_to_tree: bad entry in
> >> directory #2: rec_len is smaller than minimal - offset=0, inode=0,
> >> rec_len=0, name_len=0
> >> EXT3-fs error (device sdb2): htree_dirblock_to_tree: bad entry in
> >> directory #2: rec_len is smaller than minimal - offset=0, inode=0,
> >> rec_len=0, name_len=0
> >> EXT3-fs error (device sdb3): htree_dirblock_to_tree: bad entry in
> >> directory #2: rec_len is smaller than minimal - offset=0, inode=0,
> >> rec_len=0, name_len=0
> >> journal_bmap: journal block not found at offset 5883 on sdb2
> >> Aborting journal on device sdb2.
> >>
> >> How is this possible? The stress script does not break anything, and it
> >> handles billions of bytes.
> >>
> >> I ran fsck.ext3 on sdb1, sdb2, and sdb3 with lots of errors (output
> >> included as an attachment). I can mount the partitions as ext2, but
> >> trying to mount ext3 gives an error:
> >>
> >> ext3: No journal on filesystem on sdb
> >>
> >> Any suggestions?
> >>
> >> TIA: Johannes
> >
> > -
> > To unsubscribe from this list: send the line "unsubscribe sparclinux" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
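[Editor's note] The htree_dirblock_to_tree message in the report above comes from ext3's directory-entry sanity check: every on-disk entry begins with an inode number (32 bits), a record length (16 bits), a name length, and a file type, followed by the name, and rec_len may never be shorter than the smallest legal entry (a one-character name, which pads to 12 bytes after 4-byte alignment). An all-zero directory block - for example, a write the controller acknowledged but that never reached the platters - fails that check on the very first entry with exactly the fields Johannes saw. The sketch below is illustrative Python, not kernel code; `check_dir_block` and `min_rec_len` are made-up names standing in for ext3_check_dir_entry's logic:

```python
import struct

# ext3 directory entry header: inode (u32 LE), rec_len (u16 LE),
# name_len (u8), file_type (u8) -- 8 bytes, then the name bytes.
DIRENT_HEADER = struct.Struct("<IHBB")

def min_rec_len(name_len):
    # Smallest legal record for a name of this length:
    # 8-byte header + name, rounded up to a 4-byte boundary.
    return (8 + name_len + 3) & ~3

def check_dir_block(block):
    """Walk a directory block entry by entry, returning a diagnostic
    string for the first corrupt entry, or "ok" if all entries pass."""
    offset = 0
    while offset < len(block):
        inode, rec_len, name_len, _ftype = DIRENT_HEADER.unpack_from(block, offset)
        # The check that fires in Johannes's log: a one-char name needs
        # at least 12 bytes, so rec_len below that is always corruption.
        if rec_len < min_rec_len(1):
            return ("rec_len is smaller than minimal - "
                    f"offset={offset}, inode={inode}, "
                    f"rec_len={rec_len}, name_len={name_len}")
        offset += rec_len
    return "ok"

# A zeroed 4 KB directory block fails immediately, reproducing the
# reported fields: offset=0, inode=0, rec_len=0, name_len=0.
print(check_dir_block(bytes(4096)))
```

That every field in the error is zero is consistent with a whole block of zeros being read back where a directory block was expected, which points at lost writes somewhere below the filesystem rather than at ext3's own logic.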