On 9/2/21 10:41 PM, Arnd Bergmann wrote:
On Thu, Sep 2, 2021 at 9:48 PM Helge Deller <deller@xxxxxx> wrote:
On 9/2/21 8:35 PM, Arnd Bergmann wrote:
On Thu, Sep 2, 2021 at 2:06 PM Helge Deller <deller@xxxxxx> wrote:
Kernel v5.14 has various changes to optimize unaligned memory accesses,
e.g. commit 0652035a5794 ("asm-generic: unaligned: remove byteshift helpers").
Those changes break the bootloader and other places in kernel for parisc
which needs byte-wise accesses to unaligned memory.
Here is an updated patch/hack which fixes those boot problems by adding
a compiler optimization barrier. More info and background can be found in BZ:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162
Signed-off-by: Helge Deller <deller@xxxxxx>
Right, this should fix it, but I tend to agree with what Andrew Pinski
said: the existing version is actually correct and allows valid
optimizations on static variables as long as those are correctly
annotated in C.
Let's look at generic kernel code, e.g. in fs/btrfs/inode.c.
You will find many similiar cases all around the kernel.
------------
struct dir_entry {
u64 ino;
u64 offset;
unsigned type;
int name_len;
};
static int btrfs_filldir(void *addr, int entries, struct dir_context *ctx)
{
while (entries--) {
struct dir_entry *entry = addr;
char *name = (char *)(entry + 1);
ctx->pos = get_unaligned(&entry->offset);
if (!dir_emit(ctx, name, get_unaligned(&entry->name_len),
get_unaligned(&entry->ino),
get_unaligned(&entry->type)))
return 1;
addr += sizeof(struct dir_entry) +
get_unaligned(&entry->name_len);
ctx->pos++;
}
return 0;
}
-----------
According to Andrew Pinski's statement, the compiler will assume here that all of
those get_unaligned() calls will access naturally aligned memory and I'm pretty
sure the compiler will generate native 4/8 byte accesses on all platforms.
Most likely you will not notice on most platforms because it will get fixed by
exception handlers or natively in hardware.
But anyway, it's not what the developers intended by adding get_unaligned().
No, this case is completely different: 'entry' points to dynamically allocated
memory that gets passed in via a void pointer, so gcc has no knowledge of
the alignment of the underlying storage, and it will do the access according to
the __packed constrains in the get_unaligned() helper. When you look at the
assembler output for this function on a 5.14 parisc kernel, I'm sure you will
see the correct byte accesses, just like the trivial example I posted
in bugzilla.
The reason that the "output_len" access breaks is that gcc explicitly optimizes
the bytewise access into word accesses because it assumes that global variables
are correctly declared, and that they are aligned according to the requirements
of the ABI.
This may be surprising and even unfortunate, but I can see why they did
this optimization, and that it helps in other cases as well.
Arnd, you were absolutely correct and I was wrong.
It seems to work nicely now after I changed the output_len variable to
become an "extern char".
Thanks!
Helge
I see no chance to change all those places in the kernel.
No, that would mean changing all get_unaligned() accesses to pointer
dereferences on types that are declared as __packed themselves.
The get_unaligned()/put_unaligned() helpers generally do what they
are designed for, it just breaks when you have misaligned global
variables that are created by a linker script.
Yes.