Re: wrong final bzImage build (regarding #14270)

OK, some more on this.

It turns out dash's built-in echo command interprets \nnn octal
sequences by default, and there's no way to turn that off.  So,
for example, the sed-zoffset command from arch/x86/boot/Makefile
(which includes \1, \2, etc. substitutions for sed), when echoed
in verbose mode (V=1), produces control characters (ASCII codes
1 and 2) instead of the literal backslash sequences.
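
As an illustration only (the sample string is made up and is much
shorter than the real sed-zoffset expression), printf with a %s
conversion prints such a string byte for byte, whatever the shell's
built-in echo does with backslashes:

```shell
# Hypothetical sample; stands in for a sed expression with a \1 reference.
sample='s/\(foo\)/\1/'

# %s copies the operand verbatim, so the backslash-digit pair survives.
printed=$(printf '%s\n' "$sample")
printf '%s\n' "$printed"

# By contrast, the message above reports that dash's built-in
#   echo "$sample"
# emits the single byte 0x01 in place of \1.
```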

I don't think it's practical to replace V=1's echo with /bin/echo.

So I'd say it's not a bug in the build system after all, but
a bug in dash.  Well, at least this expand-by-default behaviour
hasn't triggered another very-difficult-to-find bug (hopefully),
but it has good potential to.

I'll file a bug report against dash.

/mjt

[Michael Tokarev - Fri, Oct 09, 2009 at 06:17:50PM +0400]
OK, the mystery is finally solved, after a week of
digging.

The original problem was titled "Cannot boot on
a PIII Celeron", and Rafael filed a bug #14270
for this.

In short, what I observed was that a new kernel
(2.6.31) fails to boot on a PIII Celeron machine.
But change just the CPU to a plain PIII and voila,
it works.  I don't know why it behaved this
way, but I finally found where the problem was.

And the problem is in the last stage of the build, when
building the bzImage.

make -f scripts/Makefile.build obj=arch/x86/boot/compressed arch/x86/boot/compressed/vmlinux
...
  (cat arch/x86/boot/compressed/vmlinux.bin | lzma -9 && echo -ne \\x38\\xd6\\x37\\x00) > arch/x86/boot/compressed/vmlinux.bin.lzma
...

Note the echo command.

Now, Debian switched to dash as /bin/sh.  And dash's
built-in echo does not understand the -e option, so it
prints "-ne" and the escape sequences literally:

$ dash -c 'echo -ne \\x38\\xd6\\x37\\x00' | od -x
0000000 6e2d 2065 785c 3833 785c 3664 785c 3733
0000020 785c 3030 000a

$ bash -c 'echo -ne \\x38\\xd6\\x37\\x00' | od -x
0000000 d638 0037

So the size appended at the end (it's the size of the
uncompressed file) becomes incorrect.  Here's what mkpiggy
outputs for this (in arch/x86/boot/compressed/piggy.S):

 z_output_len = 170930296

while it should be

 z_output_len = 3659320

And with the former (wrong, larger) size, the whole
thing just reboots on a PIII Celeron.  I've no idea
why, but the original problem is here.
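
The wrong value can be reproduced by hand. This is a sketch that
assumes the size is read from the last four bytes of the compressed
file as a little-endian 32-bit integer (an assumption, though the
numbers agree with it). With dash, those four bytes are the tail of
the literal text: 'x', '0', '0' and the trailing newline.

```shell
# Tail of dash's output: 'x' '0' '0' '\n' = 0x78 0x30 0x30 0x0a
wrong=$(( 0x78 | (0x30 << 8) | (0x30 << 16) | (0x0a << 24) ))

# Bytes that bash's echo -ne wrote: 0x38 0xd6 0x37 0x00
right=$(( 0x38 | (0xd6 << 8) | (0x37 << 16) | (0x00 << 24) ))

echo "wrong=$wrong right=$right"
# prints: wrong=170930296 right=3659320
```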

The same thing happens with the bzip2 algorithm, which is
not new; it is not limited to lzma.

The whole thing looks quite hackish to me: mkpiggy
can get the size from the original image just fine,
instead of reading it from the end of the already
compressed file.

For now, the quick fix is to change echo to printf in there.
The correct fix is to rewrite mkpiggy to look at the
original file for the size (IMHO anyway).
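
A minimal sketch of the printf approach, assuming the size is
appended as four little-endian bytes (which matches the bash dump
above). The helper name le32 is made up for illustration and is not
what the kernel Makefile uses. It sticks to octal escapes, since
POSIX printf specifies \ddd but not \x, so a shell's built-in printf
may not accept the hex form.

```shell
# Hypothetical helper: write $1 as four little-endian bytes.
le32() {
    n=$1
    for i in 1 2 3 4; do
        # Inner printf builds an octal escape such as \070;
        # outer printf turns that escape into the actual byte.
        printf "$(printf '\\%03o' $(( n & 255 )))"
        n=$(( n >> 8 ))
    done
}

# Dump the four bytes for the size from the example (3659320).
le32 3659320 | od -An -tx1
```

For 3659320 this should give the bytes 38 d6 37 00, the same as the
bash output quoted above.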

And this is a very good candidate for -stable as well.
The bug is very difficult to find.  And now that more
and more Debian users are switching to dash,
it will become more common.

Thanks!
