I can't speak to the SuSe issues, but I believe there is some confusion about the packages and command syntax. So hang on, we are going for a ride, step by step...

Check and repair are not packages per se; the command you are actually running is echo, and you will have that on any system. If you run this;

echo 1

you should get a 1 echoed back at you. For example;

[root@gateway]# echo 1
1

Or anything else you want;

[root@gateway]# echo check
check

Now all we are doing with this is redirecting that output with the ">>" to another location, /sys/block/md0/md/sync_action

The difference between a double >> and a single > is that >> will append to the end of the file and the single > will replace the contents of the file with the value. For example, I will create a file called foo;

[root@gateway tmp]# vi foo

In this file I add two lines of text, foo, then I write and quit with :wq

Now I will take a look at the file I just made with my vi editor...

[root@gateway tmp]# cat foo
foo
foo

Great, now I run my echo command to send another value to it. First I use the double >> to just append;

[root@gateway tmp]# echo foo2 >> foo

Now I take another look at the file;

[root@gateway tmp]# cat foo
foo
foo
foo2

So I have my first two text lines with the third line "foo2" appended. Now I do this again, but use just the single > to replace the file with a value.

[root@gateway tmp]# echo foo3 > foo

Then I look at it again;

[root@gateway tmp]# cat foo
foo3

Ahh, all the other lines are gone and now I just have foo3. So, > replaces and >> appends.

How does this affect your /sys/block/md0/md/sync_action file? As it turns out, it does not matter. Think of proc and sys (/proc and /sys) as pseudo file systems: real-time, memory-resident file systems that track the processes running on your machine and the state of your system.

So first let's go to /sys/block/ and list its contents;

[root@gateway ~]# cd /sys/block/
[root@gateway block]# ls
dm-0  dm-3  hda  md1  ram0   ram11  ram14  ram3  ram6  ram9  sdc  sdf  sdi
dm-1  dm-4  hdc  md2  ram1   ram12  ram15  ram4  ram7  sda   sdd  sdg
dm-2  dm-5  md0  md3  ram10  ram13  ram2   ram5  ram8  sdb   sde  sdh

This will be different for you, since your system will have different hardware and settings; again, it is a pseudo file system. The dm entries are my logical volumes, and you might have more or fewer sata drives (sda, sdb, ...); these were created when I booted the system. If I add another sata drive, another sdj will be created automatically for me. Depending on how many raid devices you have, they are listed here too (I have four: /boot, swap, /, and my RAID6 data, which are md0, md1, md2, md3).

So let's go into one. My swap RAID, md1, is small, so let's go to that one and test this out;

[root@gateway md1]# ls
dev  holders  md  range  removable  size  slaves  stat  uevent

Let's go deeper,

[root@gateway md1]# cd /sys/block/md1/md/
[root@gateway md]# ls
chunk_size      dev-hdc1          mismatch_cnt  rd0         suspend_lo      sync_speed
component_size  level             new_dev       rd1         sync_action     sync_speed_max
dev-hda1        metadata_version  raid_disks    suspend_hi  sync_completed  sync_speed_min

Now let's look at sync_action;

[root@gateway md]# cat sync_action
idle

That is the pseudo file that represents the current state of my RAID md1. So let's run that echo command and then check the state of the RAID;

[root@gateway md]# echo check > sync_action
[root@gateway md]# cat /proc/mdstat
Personalities : [raid1] [raid6]
md1 : active raid1 hdc1[1] hda1[0]
      104320 blocks [2/2] [UU]
      [============>........]  resync = 62.7% (65664/104320) finish=0.0min speed=65664K/sec

So it is in resync state, and if there are bad blocks they will be corrected from parity.
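(As an aside, the same md directory also has a mismatch_cnt file; you can see it in the ls listing above. After a check finishes, it reports how many blocks were found to disagree between the mirrors or parity. Assuming your kernel exposes it the same way mine does, you can read it like any other pseudo file, and a clean array reads back zero:

[root@gateway md]# cat mismatch_cnt
0

)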
Now, once it is done, let's check that sync_action file again.

[root@gateway md]# cat sync_action
idle

Remember, we used the single redirect, so we replaced the value with the text "check" with our echo command. Once it was done with the resync, my system changed the value back to "idle". What about the double ">>"? Well, it appends to the file, but it has the same overall effect...

[root@gateway md]# echo check >> sync_action
[root@gateway md]# cat /proc/mdstat
Personalities : [raid1] [raid6]
md1 : active raid1 hdc1[1] hda1[0]
      104320 blocks [2/2] [UU]
      [=========>...........]  resync = 49.0% (52096/104320) finish=0.0min speed=52096K/sec

When it is done, the value goes back to idle;

[root@gateway md]# cat sync_action
idle

So, > or >> does not matter here. And the command you need is echo.

Manipulating the pseudo files in /proc is similar. Say, for example, that for security I don't want my box to respond to pings (1 is for true and 0 is for false);

echo 1 > /proc/sys/net/ipv4/icmp_echo_ignore_all

In this case you want the single >, because you want to replace the current value with 1, not the >> for append. Another pseudo file, for turning your linux box into a router;

echo 1 > /proc/sys/net/ipv4/ip_forward

As for SuSe updating your kernel, removing your original one, and breaking your box by dropping you to a limited shell on boot up... I can't help you much there. I don't have SuSe, but as I understand it, they are a good distro. In my current distro, Fedora, you can tell the update manager not to update the kernel. Fedora also keeps your old kernel by default, so if there is an issue you can select the old one from the grub boot menu and go back to it. I believe Ubuntu is similar. I bet you could configure SuSe to do the same.

I hope that clears up some confusion. Good luck.

Dan.

-----Original Message-----
From: Michael [mailto:big_green_jelly_bean@xxxxxxxxx]
Sent: Friday, July 13, 2007 11:48 AM
To: Daniel Korstad
Cc: davidsen; linux-raid
Subject: Re: Software based SATA RAID-5 expandable arrays?

RESPONSE

I had everything working, but it is evident that when I installed SuSe the first time, check and repair were not included in the package :( I did not use the ">>", I used ">", as was incorrectly stated in much of the documentation I followed. The thing that made me suspect check and repair weren't part of SuSe was that typing "check" or "repair" at the command prompt got nothing but a response stating there was no such command. In addition, man check and man repair were also missing.

BROKEN!

I did an auto update of the SuSe machine, which ended up replacing the kernel. It added the new entries to the boot choices, but the mount information was not transferred. SuSe also deleted the original kernel boot setup. When SuSe looked at the drives individually, it found that none of them was recognizable. Therefore, when I woke up this morning and rebooted the machine after the update, I received the errors and was dumped to a basic prompt with limited ability to do anything. I know I need to manually remount the drives, but it's going to be a challenge since I have not done this before.

The answer to this question is that I either have to change distros (which I am tempted to do) or fix the current distro. Please do not bother providing any solutions, for I simply have to RTFM (which I haven't had time to do).

I think I am going to set my machines up again: the first two drives with identical boot partitions, yet not mirrored.
I can then manually run a "tree" copy that would update my second drive as I grow the system, and after successful and needed updates. This would then give me a fallback after any update, by simply swapping the SATA drive cables from the first boot drive to the second. I am assuming this will work. I can then add the RAID-6 (or 5) to the setup and recopy my files (yes, I haven't deleted them, because I am not confident in my ability with Linux yet). Hopefully I can just remount these 4 drives, because they are a simple RAID 5 array.

SUSE's COMPLETE FAILURES

This frustration with SuSe, the lack of a simple reliable update utility, and the failures I experienced have discouraged me from using SuSe at all. It's got some amazing tools that keep me from constantly looking up documentation, posting to forums, or going to IRC, but the unreliable upgrade process is a deal breaker for me. It's simply too much work to manually update everything. This project had a simple goal, which was to provide an easy and cheap solution for an unlimited NAS service.

SUPPORT

In addition, SuSe's IRC help channel is among the worst I have encountered. The level of support is often very good, but the level of harassment, flames, and simple childish behavior overcomes almost any attempt at providing any level of support. I have no problem giving back to the community when I learn enough to do so, but I will not be mocked for my inability to understand a new and very in-depth system. In fact, I tend to go to the wonderful gentoo IRC for my answers. That IRC is amazing, the people patient and encouraging, and the level of knowledge is the best I have experienced.

This list, the original incident aside, has been an amazing resource. I feel highly confident asking questions about RAID here, because I know you guys are actually RUNNING the kind of systems I am attempting to build.

----- Original Message ----
From: Daniel Korstad <dan@xxxxxxxxxxx>
To: big.green.jelly.bean <big_green_jelly_bean@xxxxxxxxx>
Cc: davidsen <davidsen@xxxxxxx>; linux-raid <linux-raid@xxxxxxxxxxxxxxx>
Sent: Friday, July 13, 2007 11:22:45 AM
Subject: RE: Software based SATA RAID-5 expandable arrays?

To run it manually;

echo check >> /sys/block/md0/md/sync_action

Then you can check the status with;

cat /proc/mdstat

Or you can continually watch it if you want (kind of boring though :) );

watch cat /proc/mdstat

This will refresh every 2 sec.

In my original email I suggested using a crontab so you don't need to remember to do this every once in a while. Run (I did this as root);

crontab -e

This will allow you to edit your crontab.
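(For reference, the five time fields at the front of a crontab line are minute, hour, day of month, month, and day of week, so a schedule line reads like this;

# min  hour  day-of-month  month  day-of-week   command
  30   2     *             *      Mon           <command to run here>

which means 2:30am every Monday, the same schedule used below. The <command to run here> part is just a placeholder.)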
Now paste this command in there;

30 2 * * Mon echo check >> /sys/block/md0/md/sync_action

If you want, you can add comments. I like to comment my stuff since I have lots of entries in mine; just make sure you have '#' at the front of those lines so your system knows they are just comments and not commands it should run;

#check for bad blocks once a week (every Mon at 2:30am)
#if bad blocks are found, they are corrected from parity information

After you have put this in your crontab, write and quit with this command;

:wq

It should come back with this;

[root@gateway ~]# crontab -e
crontab: installing new crontab

Now you can look at your cron table (without editing it) with this;

crontab -l

It should return something like this, depending on whether you added comments and how you scheduled your command;

#check for bad blocks once a week (every Mon at 2:30am)
#if bad blocks are found, they are corrected from parity information
30 2 * * Mon echo check >> /sys/block/md0/md/sync_action

For more info on crontab and the syntax for times (I just did a google and grabbed the first couple of links...);

http://www.tech-geeks.org/contrib/mdrone/cron&crontab-howto.htm
http://ubuntuforums.org/showthread.php?t=102626&highlight=cron

Cheers,
Dan.

-----Original Message-----
From: Michael [mailto:big_green_jelly_bean@xxxxxxxxx]
Sent: Thursday, July 12, 2007 5:43 PM
To: Bill Davidsen; Daniel Korstad
Cc: linux-raid@xxxxxxxxxxxxxxx
Subject: Re: Software based SATA RAID-5 expandable arrays?

SuSe uses its own version of cron, which is different from everything else I have seen, and the documentation is horrible. However, they provide a wonderful XWindows utility that helps set them up... the problem I'm having is figuring out what to run. When I try to run "/sys/block/md0/md/sync_action" at a prompt, it gives permission denied even though I am su'd or logged in as root. Very annoying.

You mention check vs. repair... which brings me to my last issue in setting up this machine. How do you send an email when check or SMART fails, or when a RAID drive fails? How do you auto repair if the check fails? These are the last things I need to do for my Linux server to work right... after I get all of this done, I will change the boot to go to the command prompt and not XWindows, and I will leave it in the corner of my room, hopefully not to be touched for as long as possible.

----- Original Message ----
From: Bill Davidsen <davidsen@xxxxxxx>
To: Daniel Korstad <dan@xxxxxxxxxxx>
Cc: Michael <big_green_jelly_bean@xxxxxxxxx>; linux-raid@xxxxxxxxxxxxxxx
Sent: Wednesday, July 11, 2007 10:21:42 AM
Subject: Re: Software based SATA RAID-5 expandable arrays?

Daniel Korstad wrote:
> You have lots of options. This will be a lengthy response and will give just some ideas for just some of the options...
>
> Just a few thoughts below interspersed with your comments.
>
> For my server, I had started out with a single drive. I later migrated to a RAID 1 mirror (after having to deal with reinstalls after drive failures, I wised up). Since I already had an OS that I wanted to keep, my RAID-1 setup was a bit more involved. I followed this migration guide to get me there;
> http://wiki.clug.org.za/wiki/RAID-1_in_a_hurry_with_grub_and_mdadm
>
> Since you are starting from scratch, it should be easier for you. Most distros will have an installer that will guide you through the process.
> When you get to hard drive partitioning, look for an advanced option or a "review and modify partition layout" option or something similar, otherwise it might just make a guess at what you want, and that would not be RAID. In this advanced partition setup you will be able to create your RAID. First you make equal-size partitions on both physical drives. For example, first carve out a 100M partition on each of the two physical OS drives, then make a RAID 1 md0 from that pair of partitions and make it your /boot. Do this again for the other partitions you want to have RAIDed. You can do this for /boot, /var, /home, /tmp, /usr. It can be nice to have that separation: if a user fills /home/foo with crap, it will not affect other parts of the OS, or if the mail spool fills up, it will not hang the OS. The only problem is determining how big to make them during the install. At a minimum, I would do three partitions: /boot, swap, and /. This means all the others (/var, /home, /tmp, /usr) live in the / partition, but this way you don't have to worry about sizing them all correctly.
>
> For the simplest setup, I would do RAID 1 for /boot (md0), swap (md1), and / (md2). (Alternatively, you could make a swap file in / and not have a swap partition; tons of options...) Do you need to RAID your swap? Well, I would RAID it or make a swap file within a RAID partition. If you don't, and your system is using swap and you lose a drive that has swap information/partitions on it, you might have issues depending on how important the information on the failed drive was. Your system might hang.

Note that RAID-10 generally performs better than mirroring, particularly when more than a few drives are involved. This can have performance implications for swap, when large i/o pushes program pages out of memory. The other side of that coin is that "recovery CDs" don't seem to know how to use RAID-10 swap, which might be an issue on some systems.

> After you go through the install and have a bootable OS that is running on mdadm RAID, I would test it to make sure grub was installed correctly to both physical drives. If grub is not installed to both drives, and you lose one drive down the road, and that one was the one with grub, you will have a system that will not boot even though it has a second drive with a copy of all the files. If this were to happen, you can recover by booting with a bootable linux CD or rescue disk and manually installing grub. For example, say you only had grub installed to hda and it failed; boot with a live linux CD and type (assuming /dev/hdd is the surviving second drive);
> grub
> device (hd0) /dev/hdd
> root (hd0,0)
> setup (hd0)
> quit
>
> You say you are using two 500G drives for the OS. You don't necessarily have to use all the space for the OS. You can make your partitions, take the left over space, and throw it into a logical volume. This logical volume would not be fault tolerant, but it would be the sum of the left over capacity from both drives. For example, say you use 100M for /boot, 200G for /, and 2G for swap. Take the rest, make a standard ext3 partition from the remaining space on both drives, and put them in a logical volume, giving you over 500G to play with for non-critical crap.
>
> Why do I use RAID6? For the extra redundancy, and because I have 10 drives in my array.
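(For scale, a back-of-the-envelope sketch, assuming equal-size drives: with N drives of capacity C, RAID 5 leaves roughly (N-1)*C usable and RAID 6 leaves (N-2)*C. With ten 500G drives that is 4.5T vs. 4.0T, a small relative cost; with only four drives it is 1.5T vs. 1.0T, which is why the RAID 6 overhead matters more on a small array, as noted below.)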
> I have been an advocate for RAID 6, especially with ever-increasing drive capacities and when the number of drives in the array is above, say, six;
> http://www.intel.com/technology/magazine/computing/RAID-6-0505.htm
>
> Other configurations will perform better for writes; know your i/o performance requirements.
> http://storageadvisors.adaptec.com/2005/10/13/raid-5-pining-for-the-fjords/
> "...for using RAID-6, the single biggest reason is based on the chance of drive errors during an array rebuild after just a single drive failure. Rebuilding the data on a failed drive requires that all the other data on the other drives be pristine and error free. If there is a single error in a single sector, then the data for the corresponding sector on the replacement drive cannot be reconstructed. Data is lost. In the drive industry, the measurement of how often this occurs is called the Bit Error Rate (BER). Simple calculations will show that the chance of data loss due to BER is much greater than all the other reasons combined. Also, PATA and SATA drives have historically had much greater BERs, i.e., more bit errors per drive, than SCSI and SAS drives, causing some vendors to recommend RAID-6 for SATA drives if they're used for mission critical data."
>
> Since you are using only four drives for your data array, the overhead for RAID6 (two drives for parity) might not be worth it. With four drives you would be just fine with a RAID5.
>
> However, I would make a cron entry for the check command to run every once in a while. Add this to your crontab...
>
> #check for bad blocks once a week (every Mon at 2:30am)
> #if bad blocks are found, they are corrected from parity information
> 30 2 * * Mon echo check >> /sys/block/md0/md/sync_action
>
> With this, you will keep hidden bad blocks to a minimum, and when a drive fails you won't likely be bitten by hidden bad block(s) during a rebuild.

I think a comment on "check" vs. "repair" is appropriate here. At the least, "see the man page" is appropriate.

> For your data array, I would make one partition of type Linux raid autodetect (FD) covering the whole drive on each physical drive. Then create your raid;
>
> mdadm --create /dev/md3 -l 5 -n 4 /dev/<your data drive1-partition> /dev/<your data drive2-partition> /dev/<your data drive3-partition> /dev/<your data drive4-partition>
>
> (The /dev/md3 can be whatever you want and will depend on how many previous raid arrays you have, so long as you use a number not currently in use.)
>
> My filesystem of choice is XFS, but you get to pick your own poison:
> mkfs.xfs -f /dev/md3
>
> Mount the device:
> mount /dev/md3 /foo
>
> I would edit your /etc/fstab to have it automounted at each startup (an example line is sketched below, after the misc comments).
>
> Dan.

Other misc comments: mirroring your boot partition on drives which the BIOS won't use is a waste of bytes. If you have more than, say, four drives fail to function, you probably have a system problem other than disk. And some BIOS versions will boot a secondary drive if the primary fails hard, but not if it has a parity or other error, which can enter a retry loop (I *must* keep trying to boot). This behavior can be seen on at least one major server platform from a big-name vendor; it's not just cheap desktops. The solution, ugly as it is, is to use the firmware "RAID" on the motherboard controller for boot, and I have several systems with low-cost small PATA drives in a mirror just for boot (after which they are spun down with hdparm settings) for this reason.

Really good notes, people should hang onto them!
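(Since the quoted text mentions editing /etc/fstab for the data array but does not show a line, here is a minimal sketch, assuming the XFS array is /dev/md3 and the mount point is /foo as in the quoted example; the mount point and options are only placeholders, adjust for your own layout:

/dev/md3    /foo    xfs    defaults    0 0

The last two fields are the dump and fsck-order flags; leaving them at 0 is fine for an XFS data volume.)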
--
bill davidsen <davidsen@xxxxxxx>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979