-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

If you have a backup, the easiest way would be to restore it.
There is also a way to replay the database's write-ahead log (WAL)
files on top of the last backup (point-in-time recovery), so that you
can get back the data written since that backup. It only works if WAL
archiving was set up beforehand. I've never actually seen it work though.

Peter Petrov wrote:
> Hi,
>
> Today one of the disks was marked as failed .... and now some files
> are corrupted.
> I've decided to copy the pgsqldata directory and try to fix the
> PG_VERSION files (see below for what PostgreSQL doesn't like) ... and
> see if the database will come up.
> While copying the files I'm open to any other ideas on how to
> deal with the problem ;)
>
> PostgreSQL's log suggests running initdb (HINT message from the LOG
> file) - what will happen if I then try to copy the rest of the
> structure into the newly created database cluster?
>
> Linux (Slackware 12.0.0), software RAID5 (partition based) + PostgreSQL
> 8.3.0:
>
> Here's what happened (from dmesg):
>
> ---------------------------------------
> # uname -a
> Linux xeonito 2.6.21.5 #3 SMP Tue Oct 2 16:20:48 EEST 2007 i686 Intel(R)
> Xeon(R) CPU E5335 @ 2.00GHz GenuineIntel GNU/Linux
>
> ---------------------------------------
> # dmesg
> sd 0:0:3:0: SCSI error: return code = 0x08000002
> sdd: Current: sense key=0x4
> ASC=0x44 ASCQ=0x0
> Info fld=0x0
> end_request: I/O error, dev sdd, sector 159620863
> sd 0:0:3:0: SCSI error: return code = 0x08000002
> sdd: Current: sense key=0x4
> ASC=0x44 ASCQ=0x0
> Info fld=0x0
> end_request: I/O error, dev sdd, sector 159617119
> raid5: Disk failure on sdd1, disabling device. Operation continuing on 4
> devices
> ......
>
> RAID5 conf printout:
> --- rd:5 wd:4
> disk 0, o:1, dev:sdb1
> disk 1, o:1, dev:sdc1
> disk 2, o:0, dev:sdd1
> disk 3, o:1, dev:sde1
> disk 4, o:1, dev:sdf1
> RAID5 conf printout:
> --- rd:5 wd:4
> disk 0, o:1, dev:sdb1
> disk 1, o:1, dev:sdc1
> disk 3, o:1, dev:sde1
> disk 4, o:1, dev:sdf1
>
> ---------------------------------------
>
> # cat /proc/mdstat
> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
> [raid4] [multipath] [faulty]
> md1 : active raid5 sdb1[0] sdf1[4] sde1[3] sdd1[5](F) sdc1[1]
> 585924608 blocks level 5, 8192k chunk, algorithm 2 [5/4] [UU_UU]
>
> md0 : active raid5 sdb2[0] sdf2[4] sde2[3] sdd2[5](F) sdc2[1]
> 390053888 blocks level 5, 1024k chunk, algorithm 2 [5/4] [UU_UU]
>
> unused devices: <none>
>
> ---------------------------------------
>
> And here's what the partitions look like:
>
> # fdisk -l /dev/sdb
>
> Disk /dev/sdb: 249.8 GB, 249865175040 bytes
> 255 heads, 63 sectors/track, 30377 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdb1               1       18237   146488671   83  Linux
> /dev/sdb2           18238       30377    97514550   83  Linux
>
> ---------------------------------------
> Kernel parameters:
>
> echo 4200000000 > /proc/sys/kernel/shmmax
> echo 4200000000 > /proc/sys/kernel/shmall
> sysctl -w vm.overcommit_memory=2
>
> echo 8192 > /sys/block/md0/md/stripe_cache_size
> echo 8192 > /sys/block/md1/md/stripe_cache_size
>
> ---------------------------------------
>
>
> Both md0 and md1 are used by PostgreSQL - initially it was not designed
> to use the whole of disks sdb-sdf, but due to size requirements I also
> added the other unused space for PostgreSQL to use.
>
>
> And here's the PostgreSQL log (the FATAL message appears when I try to
> connect to the database; of course this is the case for the most
> interesting database ...
> some other small databases are working fine):
>
> LOG: received smart shutdown request
> LOG: autovacuum launcher shutting down
> LOG: shutting down
> LOG: database system is shut down
> LOG: could not create IPv6 socket: Address family not supported by
> protocol
> LOG: database system was shut down at 2008-05-20 17:54:17 EEST
> LOG: autovacuum launcher started
> LOG: database system is ready to accept connections
> FATAL: "base/16399" is not a valid data directory
> DETAIL: File "base/16399/PG_VERSION" does not contain valid data.
> HINT: You might need to initdb.
>
> Of course base/16399/PG_VERSION contains something strange not the
> version information:
>
> # cat base/16399/PG_VERSION
> X
>
>
> ---------------------------------------
>
>
>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkg0F0YACgkQjDX6szCBa+r5wwCg5Dzms7G3ipmVaoBbCZd+jPp8
TmIAnRrehvG1m+wvERsZ8J8Xw8v9scO5
=5AgU
-----END PGP SIGNATURE-----
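
P.S. On the PG_VERSION files mentioned in the quoted message: before editing any of them by hand, it may help to see how many are actually damaged. Below is a minimal sketch, not a tested recovery procedure. It assumes you run it against a copy of the data directory, and that every PG_VERSION file (the top-level one and one per database under base/) should contain just the major version, 8.3 in this case. The function name check_pg_version and the example path are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report which PG_VERSION files in a data directory
# contain the expected major version string and which do not.
# Usage: check_pg_version DATADIR EXPECTED
# Run it on a COPY of the data directory, never on the live cluster.
# Note: a database directory whose PG_VERSION is missing entirely is
# not reported; only files that exist are checked.
check_pg_version() {
    datadir=$1
    expected=$2
    for f in "$datadir"/PG_VERSION "$datadir"/base/*/PG_VERSION; do
        [ -f "$f" ] || continue          # skip patterns that matched nothing
        actual=$(cat "$f" 2>/dev/null)   # $(...) drops the trailing newline
        if [ "$actual" = "$expected" ]; then
            echo "ok  $f"
        else
            echo "BAD $f"
        fi
    done
}
```

For example, `check_pg_version /path/to/pgsqldata-copy 8.3` prints one line per file, with the damaged ones marked BAD. A bad PG_VERSION suggests that other files in the same directory may have been hit too, so rewriting the version file might only get you past this first check.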