Re: ext3 + fs > 2Tbyte

Vincent.McIntyre@xxxxxxxx · Fri, 4 Nov 2005 16:19:00 +1100 (EST)

No files were written to the filesystem during the test sequence.

Hmm, I would expect at least the need to write something to the filesystem,
unless you are unlucky enough that the last group(s) aliases exactly over
the first superblock on disk, but is kept in the cache enough to remount
it before you reboot.

ok, I can add that to the scripts in my next round of tests.

Do you only use the parted "mkfs" or do you actually use the mke2fs
from e2fsprogs?

The script does this:
  parted -s /dev/sdb1 print
  parted -s /dev/sdb1 mklabel gpt
  parted -s /dev/sdb1 print
  parted -s /dev/sdb1 mkpart primary 0 10
  parted -s /dev/sdb1 print
  parted -s /dev/sdb1 mke2fs 1 ext2
  parted -s /dev/sdb1 print

I did not try mke2fs before now because I don't think it worked when
I was trying to make FS larger than 2Tb. Can't recall now.

If you just do the mke2fs + reboot + mount, does that fail?

Yes. While you were typing,
 * I made a teeny 10 Mbyte filesystem (using parted, as above)
 * mounted
 * umounted
 * ran findsuper and od
 * reboot
 * ran parted /dev/sdb1 print
   (repeated, using strace)
 * ran an straced e2fsck /dev/sdb1
and got the same error.

I couldn't quite believe this so I tried it again. Same result.
Post reboot, I did things in slightly different order:

 * strace e2fsck -n /dev/sdb1
 e2fsck 1.38 (30-Jun-2005)
 Couldn't find ext2 superblock, trying backup blocks...
 /local/sbin/e2fsck: Bad magic number in super-block while trying to open /dev/sdb1

 The superblock could not be read or does not describe a correct ext2
 filesystem.  If the device is valid and it really contains an ext2
 filesystem (and not swap or ufs or something else), then the superblock
 is corrupt, and you might try running e2fsck with an alternate
 superblock:
    e2fsck -b 8193 <device>

 * /local/sbin/parted /dev/sdb print
 Disk geometry for /dev/sdb: 0.000-2289288.000 megabytes
 Disk label type: gpt
 Minor    Start       End     Filesystem  Name                  Flags
 1          0.017     10.000  ext2
 Information: Don't forget to update /etc/fstab, if necessary.

Same with just the tune2fs -j + reboot + remount?

I switched to using mke2fs to create the filesystem, ie
 * I made a teeny 10 Mbyte partition (using parted)
 * mke2fs /dev/sdb1
 * mounted
 * umounted
 * ran findsuper and od
 * reboot
 * strace -o strace.e2fsck.postboot /local/sbin/e2fsck -n /dev/sdb1
 e2fsck 1.38 (30-Jun-2005)
 Couldn't find ext2 superblock, trying backup blocks...
 /local/sbin/e2fsck: Bad magic number in super-block while trying to open /dev/sdb1

 The superblock could not be read or does not describe a correct ext2
 filesystem.  If the device is valid and it really contains an ext2
 filesystem (and not swap or ufs or something else), then the superblock
 is corrupt, and you might try running e2fsck with an alternate
 superblock:
    e2fsck -b 8193 <device>

So it is starting to look like the GPT disklabel is causing a problem.

I switched to having parted make a msdos disklabel but kept everything
else the same - it worked fine.
 # strace -o strace.e2fsck.postboot /local/sbin/e2fsck -n /dev/sdb1
 e2fsck 1.38 (30-Jun-2005)
 /dev/sdb1: clean, 11/2000 files, 268/8000 blocks
 #

findsuper tells me there are superblocks, but fs_blk_sz changes (!?)

These are remnants of previous filesystems on the device, each with
slightly different offsets (maybe with and without a partition table,
or with different partition types).  In one case there was a small
1kB block filesystem on the disk in the past.

Ah, of course. I thought findsuper would respect the partition boundaries
and stop at the "end" of the filesystem. It did that pre-reboot, e.g. in my
10 Mbyte test above:
  starting at 0, with 512 byte increments
       thisoff     block fs_blk_sz  blksz grp last_mount
          1024         1     10223  1024    0 Thu Jan  1 10:00:00 1970
       8389632      8193     10223  1024    1 Thu Jan  1 10:00:00 1970

      10468864: finished with errno 0

Post-reboot, I get this:
  starting at 0, with 512 byte increments
       thisoff     block fs_blk_sz  blksz grp last_mount
         17920        17     10223  1024    0 Thu Jan  1 10:00:00 1970
       8406528      8209     10223  1024    1 Thu Jan  1 10:00:00 1970
     134235648    131089 511999995  4096    1 Thu Jan  1 10:00:00 1970
     209733120    204817   1023983  1024   25 Thu Jan  1 10:00:00 1970
     226510336    221201   1023983  1024   27 Thu Jan  1 10:00:00 1970
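Comparing the two listings for the 10 Mbyte test filesystem, the primary and the group 1 backup superblock both move by the same amount across the reboot. A small sketch of that arithmetic, using only the offsets printed above (the function name is made up for illustration, and the 512-byte sector size is an assumption):

```python
SECTOR = 512  # assumed sector size in bytes

def common_shift(pairs):
    """Return the single shift shared by all (before, after) offset pairs,
    or None if the pairs do not all move by the same amount."""
    shifts = {after - before for before, after in pairs}
    return shifts.pop() if len(shifts) == 1 else None

# (pre-reboot, post-reboot) byte offsets of the same two superblock copies,
# copied from the findsuper listings above.
pairs = [(1024, 17920), (8389632, 8406528)]

shift = common_shift(pairs)
print(shift, hex(shift), shift // SECTOR)  # 16896 0x4200 33
```

So after the reboot the same filesystem shows up 16896 bytes (33 sectors of 512 bytes) further into whatever /dev/sdb1 refers to at that point.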

To clean things up, I suppose I could dd /dev/zero into /dev/sdb?
It'll only take about 10 hours..

# /root/e2fsprogs-1.38/misc/findsuper /dev/sdb1
starting at 0, with 512 byte increments
       thisoff     block fs_blk_sz  blksz grp last_mount
         17920        17 586057719  4096    0 Thu Jan  1 10:00:00 1970

What is missing is the superblock at offset "1024".  What this tool
_should_ also print out is part of the superblock UUID so it is possible
to say which superblocks belong to a single filesystem.
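A rough sketch of what printing the UUID alongside the other fields could look like. This is not findsuper's code. The field offsets (s_blocks_count at 4, s_log_block_size at 24, s_magic at 56, s_block_group_nr at 90, s_uuid at 104) are quoted from memory of the ext2 on-disk superblock layout and should be checked against the e2fsprogs headers before relying on them. The function name and the sample buffer are invented for illustration.

```python
import struct
import uuid

EXT2_MAGIC = 0xEF53  # value of s_magic in a valid ext2/ext3 superblock

def parse_superblock(buf):
    """Decode a few fields from a candidate superblock buffer.
    Returns None if the buffer is too short or the magic does not match."""
    if len(buf) < 120:
        return None
    (magic,) = struct.unpack_from("<H", buf, 56)
    if magic != EXT2_MAGIC:
        return None
    (blocks_count,) = struct.unpack_from("<I", buf, 4)
    (log_block_size,) = struct.unpack_from("<I", buf, 24)
    (group_nr,) = struct.unpack_from("<H", buf, 90)
    return {
        "fs_blk_sz": blocks_count,
        "blksz": 1024 << log_block_size,
        "grp": group_nr,
        "uuid": str(uuid.UUID(bytes=bytes(buf[104:120]))),
    }

# Synthetic buffer standing in for 1024 bytes read from the device.
sample = bytearray(1024)
struct.pack_into("<I", sample, 4, 10223)      # blocks in the filesystem
struct.pack_into("<I", sample, 24, 0)         # log2(block size) - 10
struct.pack_into("<H", sample, 56, EXT2_MAGIC)
struct.pack_into("<H", sample, 90, 1)         # backup copy in group 1
sample[104:120] = bytes(range(16))            # made-up UUID bytes

print(parse_superblock(sample))
```

Superblock copies that report the same UUID would then belong to the same filesystem, which is the grouping described above.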

With an ext3 filesystem you will also find copies of the superblock in
the journal, they will all be marked "grp 0" and are not valid backups.

ok, thanks for explaining this.

There appear to be 2 filesystems of interest.  One has offset 0x4200 = 16896,
but is missing the primary superblock.  The other has offset 0x4600 = 17920.
Neither of these would allow you to mount the filesystem as-is, because the
superblock is not aligned at 1024 bytes from the start of the device.

I would suspect something wacky with the partitioning and/or the way that
parted is making the filesystem.

Most of this is just the history of the fs creation tests I did, I guess.
Remember, all of these are just test filesystems on separate hardware.
I have not dared to run findsuper on the filesystem of interest yet,
I want to make sure I can actually recover a test FS first.

So I tried a few e2fsck runs. I know I'm probably being dense but none
of these worked:
e2fsck -n -b 16        -B 4096 /dev/sdb1
e2fsck -n -b 17        -B 4096 /dev/sdb1
....

No, I'd expect you need to do something with the device partitioning
to get the filesystem aligned properly.  They aren't even aligned on
a block boundary, there is a 512-byte offset.
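To make the alignment point concrete, here is a small sketch. It assumes that "e2fsck -b N -B size" looks for a superblock at byte offset N * size, which is how I read the man page. The helper name is made up. The 17920 offset comes from the findsuper output earlier in this thread.

```python
def block_number_for(offset, block_size):
    """Return the block number that starts exactly at `offset`,
    or None if `offset` is not on a block boundary."""
    number, remainder = divmod(offset, block_size)
    return number if remainder == 0 else None

SB_OFFSET = 17920  # superblock offset reported by findsuper

for block_size in (1024, 2048, 4096):
    print(block_size, block_number_for(SB_OFFSET, block_size),
          SB_OFFSET % block_size)
# 1024 None 512
# 2048 None 1536
# 4096 None 1536

# For comparison, the pre-reboot backup copy was cleanly addressable:
print(block_number_for(8389632, 1024))  # 8193
```

Under that assumption no value of -b can land on 17920 for any of these block sizes, which would explain why the -b 16 and -b 17 attempts did not help.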

I noticed that when computing thisoff/blksz, but didn't make much of it.
Thanks for clearing that up.
I'll take a look at the manuals to see if I can force things to be
on a block boundary.

I would recommend to do the following:
- make a partition
- reboot the system
- use mke2fs -j to make the filesystem
- test mount, unmount, reboot at this point

This reboot-after-partition thing is foreign to me (coming from Solaris);
needing it seems like quite a poor design. But let's run with it.

  parted -s /dev/sdb1 print
  parted -s /dev/sdb1 mklabel gpt
  parted -s /dev/sdb1 print
  parted -s /dev/sdb1 mkpart primary 0 10
  parted -s /dev/sdb1 print
  sleep 60
  reboot
  parted -s /dev/sdb1 print
  mke2fs -n -v /dev/sdb1
  mke2fs -q /dev/sdb1
    mke2fs gets stuck...
    I have to ^C it.

  # fdisk -l /dev/sdb
  You must set cylinders.
  You can do this from the extra functions menu.

  Disk /dev/sdb: 0 MB, 0 bytes
  255 heads, 63 sectors/track, 0 cylinders
  Units = cylinders of 16065 * 512 = 8225280 bytes

     Device Boot      Start         End      Blocks   Id  System
  /dev/sdb1               1      267350  2147483647+  ee  EFI GPT
  Partition 1 has different physical/logical beginnings (non-Linux?):
     phys=(0, 0, 1) logical=(0, 0, 2)
  Partition 1 has different physical/logical endings:
     phys=(1023, 254, 63) logical=(267349, 89, 4)

  # /local/sbin/parted /dev/sdb print
  Error: The primary GPT table is corrupt, but the backup appears ok, so
  that will be used.
  OK/Cancel? C
  Information: Don't forget to update /etc/fstab, if necessary.

  # /local/sbin/parted /dev/sdb print
  Error: The primary GPT table is corrupt, but the backup appears ok, so
  that will be used.
  OK/Cancel? OK
  Disk geometry for /dev/sdb: 0.000-2289288.000 megabytes
  Disk label type: gpt
  Minor    Start       End     Filesystem  Name                  Flags
  1          0.017     10.000  ext2
  Information: Don't forget to update /etc/fstab, if necessary.

  # strace -o strace.e2fsck.post-parted /local/sbin/e2fsck -n /dev/sdb1
  e2fsck 1.38 (30-Jun-2005)
  Couldn't find ext2 superblock, trying backup blocks...
  /local/sbin/e2fsck: Bad magic number in super-block while trying to open
  /dev/sdb1

  The superblock could not be read or does not describe a correct ext2
  filesystem.  If the device is valid and it really contains an ext2
  filesystem (and not swap or ufs or something else), then the superblock
  is corrupt, and you might try running e2fsck with an alternate
  superblock:
    e2fsck -b 8193 <device>

So it appears that support is lacking for GPT disklabels in e2fsprogs
and possibly the kernel as well.
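A side note on the fdisk output further up: the "2147483647+" block count on the EFI GPT entry is what a maxed-out 32-bit sector count looks like, and the disk is larger than that. A quick check of the arithmetic, assuming 512-byte sectors, fdisk's 1 Kbyte "Blocks" unit, and that parted's "megabytes" are 2^20 bytes (the conclusion also holds for 10^6):

```python
SECTOR = 512                 # assumed sector size in bytes
MAX_SECTORS = 2**32 - 1      # largest value of a 32-bit sector count

# fdisk prints sizes in 1 Kbyte blocks and appends "+" for an odd sector count.
blocks_1k, odd_sector = divmod(MAX_SECTORS, 2)
print(blocks_1k, "+" if odd_sector else "")   # 2147483647 +

max_msdos_bytes = MAX_SECTORS * SECTOR
disk_bytes = 2289288 * 2**20  # "0.000-2289288.000 megabytes" from parted

print(disk_bytes > max_msdos_bytes)           # True
```

So an msdos-style partition entry cannot describe the whole of this disk, which is presumably why the GPT label was wanted in the first place.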

I ran one more time,
  partition with parted, gpt label.
  reboot
  make 10Mbyte ext2 fs with parted
  mount, umount, findsuper, od - all this seems to work ok.
  reboot
  attempt to mount
   mount -text2 /dev/sdb1 /tmp/a
   mount: wrong fs type, bad option, bad superblock on /dev/sdb1,
       or too many mounted file systems
       (aren't you trying to mount an extended partition,
       instead of some logical partition inside?)

I think this says there is something funky with the GPT disklabelling.

Thanks for your help,
Vince
