mailadministrator@canada.com (Anthony) writes:
> Our Database is having errors. We are currently using PostgreSQL to
> store 2.5 Million records per day. The average addition to our primary
> table is 4.5 Gigs of data.
>
> We are doing this on a dual Opteron 244 system with 1 TeraByte of HDD
> space. The drives are 250 Gig Western Digital. The Raid Controller is
> LSI Logic MegaRaid 150-6.
>
> We are getting an error after about 4-5 days worth of data being put
> into the system.
>
> *******************************************************
> ERROR: invalid page header in block 59305 of relation
> "item_info_2004_04_leaf_category_1"
> *******************************************************
>
> Our Base Server Configuration is as follows.
> PostgreSQL Version= 7.4.2
> x86_64-PC-Linux-GNU
> Compiled with GCC 3.3.3
> XFS File System
> Running on Gentoo Linux 3.3.3 Propolice-3.3-7
>
> Any help on how to solve this problem would be extremely appreciated.
>
> Even the potential that Tom Lane might respond to this is worth it.

May I point you to the pg_filedump utility?
<http://sources.redhat.com/rhdb/utilities.html>

It can give you a fair idea of just where the system is blowing up.

I experienced what sounds like the same problem on a system with fairly
similar hardware, albeit with a few conspicuous differences...

 1.  PostgreSQL 7.4.1
 2.  FreeBSD 4.9
 3.  Berkeley FFS with soft updates
 4.  Quad-Xeon, 8GB RAM (only using 4GB of it :-()
 5.  AMI MegaRaid controller...
 6.  Slightly less disk; 12x74GB SCSI drives

[root@hathi scsi]# cat /proc/scsi/megaraid/1
LSI Logic MegaRAID 1.74 254 commands 16 targs 7 chans 7 luns

When I looked at the page with the "invalid page header," I found that
it was full of ASCII NUL values.

We had previously had quite a bit of trouble with a different box with
the same hardware configuration running RHAT 7.3, although when I
replaced a 2.4.18 Linux kernel with 2.6.2, those problems evaporated.
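For anyone who wants to make the same "is the page all NULs?" check by
hand, here is a minimal Python sketch. It is not a substitute for
pg_filedump, and it rests on assumptions that are not stated anywhere
in this thread: that the server was built with the default 8192-byte
block size, that relations are split into 1 GB segment files (131072
blocks each, with segments after the first named "<file>.1",
"<file>.2", ...), and that you have already located the relation's file
under the data directory. The function names are made up for
illustration.

```python
BLOCK_SIZE = 8192            # assumption: default PostgreSQL block size
BLOCKS_PER_SEGMENT = 131072  # assumption: 1 GB segments of 8 kB blocks


def locate_block(block_number, block_size=BLOCK_SIZE,
                 blocks_per_segment=BLOCKS_PER_SEGMENT):
    """Return (segment_index, byte_offset) for a relation block number."""
    segment, block_in_segment = divmod(block_number, blocks_per_segment)
    return segment, block_in_segment * block_size


def read_block(relation_path, block_number, block_size=BLOCK_SIZE,
               blocks_per_segment=BLOCKS_PER_SEGMENT):
    """Read one block from the relation's file, or from a later segment."""
    segment, offset = locate_block(block_number, block_size,
                                   blocks_per_segment)
    # Segment 0 is the bare file name; later segments get a ".N" suffix.
    path = relation_path if segment == 0 else f"{relation_path}.{segment}"
    with open(path, "rb") as f:
        f.seek(offset)
        page = f.read(block_size)
    if len(page) != block_size:
        raise ValueError(f"short read: block {block_number} is not in {path}")
    return page


def is_all_nul(page):
    """True if every byte in the page is zero."""
    return not any(page)
```

Called as is_all_nul(read_block(path_to_relation_file, 59305)), it
reports whether the block named in the error message is entirely
zeroed. Run it against a copy of the file, or with the postmaster
stopped, so that you are not reading a page in the middle of a write.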
The only thing that we have been able to point to on the box in
question is a hardware problem. Since the disks are RAIDed, three
culprits seem most likely:

 1.  Perhaps the controller is "glitched;"
 2.  Perhaps the controller driver is "glitched;"
 3.  Perhaps there is a RAM problem.

Notice that the list of suspects doesn't include anything that actually
relates to the database software. Your best bet is to look for hardware
problems.
-- 
(reverse (concatenate 'string "gro.gultn" "@" "enworbbc"))
http://cbbrowne.com/info/linuxxian.html
Never take life seriously. Nobody gets out alive anyway.