On Fri, Mar 23, 2012 at 1:54 PM, Yinghai Lu <yinghai@xxxxxxxxxx> wrote: > On Fri, Mar 23, 2012 at 7:29 AM, Thomas Renninger <trenn@xxxxxxx> wrote: >> Details can be found in: >> Documentation/acpi/initrd_table_override.txt >> >> Additional dmesg output of a booted system with >> FACP (FADT), DSDT and SSDT (the 9th dynamically loaded one) >> tables overridden (with ### marked comments): >> >> ### ACPI tables found glued to initrd >> DSDT ACPI table found in initrd - size: 16234 >> FACP ACPI table found in initrd - size: 116 >> SSDT ACPI table found in initrd - size: 334 >> ### Re-printed e820 map via e820_update() with additionally created >> ### ACPI data section at 0xcff55000 where the ACPI tables passed via >> ### initrd where copied to >> modified physical RAM map: >> ... >> ### New ACPI data section: >> modified: 00000000cff55000 - 00000000cff5912c (ACPI data) >> ### BIOS e820 provided ACPI data section: >> modified: 00000000cff60000 - 00000000cff69000 (ACPI data) >> ... >> ### Total size of all ACPI tables glued to initrd >> ### The address is initrd_start which gets updated to >> ### initrd_start = initrd_start + "size of all ACPI tables glued to initrd" >> Found acpi tables of size: 16684 at 0xffff8800374c4000 >> >> Disabling lock debugging due to kernel taint >> ### initrd provided FACP and DSDT tables are used instead of BIOS provided ones >> ACPI: FACP @ 0x00000000cff68dd8 Phys table override, replaced with: >> ACPI: FACP 00000000cff58f6a 00074 (v01 INTEL TUMWATER 06040000 PTL 00000003) >> ACPI: DSDT @ 0x00000000cff649d4 Phys table override, replaced with: >> ACPI: DSDT 00000000cff55000 04404 (v01 Intel BLAKFORD 06040000 MSFT 0100000E) >> ... >> ### Much later, the 9th (/sys/firmware/acpi/table/dynamic/SSDT9) dynamically >> ### loaded ACPI table matches and gets overridden: >> ACPI: SSDT @ 0x00000000cff64824 Phys table override, replaced with: >> ACPI: SSDT 00000000cff58fde 0014E (v01 PmRef Cpu7Ist 00003000 INTL 20110316) >> ACPI: Dynamic OEM Table Load: >> ACPI: SSDT (null) 0014E (v01 PmRef Cpu7Ist 00003000 INTL 20110316) >> ... >> >> If the initrd does not start with a valid ACPI table signature or the ACPI >> table's checksum is wrong, there is no functional change. updated it a little bit to remove 2 #ifdef. please check attached. Yinghai
From: Thomas Renninger <trenn@xxxxxxx> Cc: eric.piel@xxxxxxxxxxxxxxxx, vojcek@xxxxxxx, dsdt@xxxxxxxxxxx, Thomas Renninger <trenn@xxxxxxx>, linux-acpi@xxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx, x86@xxxxxxxxxx, Lin Ming <ming.m.lin@xxxxxxxxx>, lenb@xxxxxxxxxx, robert.moore@xxxxxxxxx, hpa@xxxxxxxxx Subject: [PATCH] ACPI: Implement overriding of arbitrary ACPI tables via initrd Date: Fri, 23 Mar 2012 15:29:44 +0100 Message-Id: <1332512984-79664-1-git-send-email-trenn@xxxxxxx> Details can be found in: Documentation/acpi/initrd_table_override.txt Additional dmesg output of a booted system with FACP (FADT), DSDT and SSDT (the 9th dynamically loaded one) tables overridden (with ### marked comments): ### ACPI tables found glued to initrd DSDT ACPI table found in initrd - size: 16234 FACP ACPI table found in initrd - size: 116 SSDT ACPI table found in initrd - size: 334 ### Re-printed e820 map via e820_update() with additionally created ### ACPI data section at 0xcff55000 where the ACPI tables passed via ### initrd where copied to modified physical RAM map: ... ### New ACPI data section: modified: 00000000cff55000 - 00000000cff5912c (ACPI data) ### BIOS e820 provided ACPI data section: modified: 00000000cff60000 - 00000000cff69000 (ACPI data) ... ### Total size of all ACPI tables glued to initrd ### The address is initrd_start which gets updated to ### initrd_start = initrd_start + "size of all ACPI tables glued to initrd" Found acpi tables of size: 16684 at 0xffff8800374c4000 Disabling lock debugging due to kernel taint ### initrd provided FACP and DSDT tables are used instead of BIOS provided ones ACPI: FACP @ 0x00000000cff68dd8 Phys table override, replaced with: ACPI: FACP 00000000cff58f6a 00074 (v01 INTEL TUMWATER 06040000 PTL 00000003) ACPI: DSDT @ 0x00000000cff649d4 Phys table override, replaced with: ACPI: DSDT 00000000cff55000 04404 (v01 Intel BLAKFORD 06040000 MSFT 0100000E) ... ### Much later, the 9th (/sys/firmware/acpi/table/dynamic/SSDT9) dynamically ### loaded ACPI table matches and gets overridden: ACPI: SSDT @ 0x00000000cff64824 Phys table override, replaced with: ACPI: SSDT 00000000cff58fde 0014E (v01 PmRef Cpu7Ist 00003000 INTL 20110316) ACPI: Dynamic OEM Table Load: ACPI: SSDT (null) 0014E (v01 PmRef Cpu7Ist 00003000 INTL 20110316) ... If the initrd does not start with a valid ACPI table signature or the ACPI table's checksum is wrong, there is no functional change. -v2: remove 2 #ifdef. Signed-off-by: Thomas Renninger <trenn@xxxxxxx> CC: linux-acpi@xxxxxxxxxxxxxxx CC: linux-kernel@xxxxxxxxxxxxxxx CC: x86@xxxxxxxxxx CC: Lin Ming <ming.m.lin@xxxxxxxxx> CC: lenb@xxxxxxxxxx CC: robert.moore@xxxxxxxxx CC: hpa@xxxxxxxxx --- Documentation/acpi/initrd_table_override.txt | 110 +++++++++++++++++++++ arch/x86/kernel/e820.c | 24 ++++ arch/x86/kernel/setup.c | 15 ++ arch/x86/mm/init.c | 4 drivers/acpi/Kconfig | 10 + drivers/acpi/osl.c | 136 ++++++++++++++++++++++++++- include/linux/acpi.h | 9 + include/linux/initrd.h | 5 8 files changed, 304 insertions(+), 9 deletions(-) create mode 100644 Documentation/acpi/initrd_table_override.txt Index: linux-2.6/Documentation/acpi/initrd_table_override.txt =================================================================== --- /dev/null +++ linux-2.6/Documentation/acpi/initrd_table_override.txt @@ -0,0 +1,110 @@ +Overriding ACPI tables via initrd +================================= + +1) Introduction (What is this about) +2) What is this for +3) How does it work +4) References (Where to retrieve userspace tools) + +1) What is this about +--------------------- + +If ACPI_INITRD_TABLE_OVERRIDE compile option is true, it is possible to +override nearly any ACPI table provided by the BIOS with an instrumented, +modified one. + +Up to 10 arbitrary ACPI tables can be passed. +For a full list of ACPI tables that can be overridden, take a look at +the char *table_sigs[MAX_ACPI_SIGNATURE]; definition in drivers/acpi/osl.c +All ACPI tables iasl (Intel's ACPI compiler and disassembler) knows should +be overridable, except: + - ACPI_SIG_RSDP (has a signature of 6 bytes) + - ACPI_SIG_FACS (does not have an ordinary ACPI table header) +Both could get implemented as well. + + +2) What is this for +------------------- + +Please keep in mind that this is a debug option. +ACPI tables should not get overridden for productive use. +If BIOS ACPI tables are overridden the kernel will get tainted with the +TAINT_OVERRIDDEN_ACPI_TABLE flag. +Complain to your platform/BIOS vendor if you find a bug which is that sever +that a workaround is not accepted in the Linus kernel. + +Still, it can and should be enabled in any kernel, because: + - There is no functional change with not instrumented initrds + - It provides a powerful feature to easily debug and test ACPI BIOS table + compatibility with the Linux kernel. + +Until now it was only possible to override the DSDT by compiling it into +the kernel. This is a nightmare when trying to work on ACPI related bugs +and a lot bugs got stuck because of that. +Even for people with enough kernel knowledge, building a kernel to try out +things is very time consuming. Also people may have to browse and modify the +ACPI interpreter code to find a possible BIOS bug. With this feature, people +can correct the ACPI tables and try out quickly whether this is the root cause +that needs to get addressed in the kernel. + +This could even ease up testing for BIOS providers who could flush their BIOS +to test, but overriding table via initrd is much easier and quicker. +For example one could prepare different initrds overriding NUMA tables with +different affinity settings. Set up a script, let the machine reboot and +run tests over night and one can get a picture how these settings influence +the Linux kernel and which values are best. + +People can instrument the dynamic ACPI (ASL) code (for example with debug +statements showing up in syslog when the ACPI code is processed, etc.), +to better understand BIOS to OS interfaces, to hunt down ACPI BIOS code related +bugs quickly or to easier develop ACPI based drivers. + +Intstrumenting ACPI code in SSDTs is now much easier. Before, one had to copy +all SSDTs into the DSDT to compile it into the kernel for testing +(because only DSDT could get overridden). That's what the acpi_no_auto_ssdt +boot param is for: the BIOS provided SSDTs are ignored and all have to get +copied into the DSDT, complicated and time consuming. + +Much more use cases, depending on which ACPI parts you are working on... + + +3) How does it work +------------------- + +# Extract the machine's ACPI tables: +acpidump >acpidump +acpixtract -a acpidump +# Disassemble, modify and recompile them: +iasl -d *.dat +# For example add this statement into a _PRT (PCI Routing Table) function +# of the DSDT: +Store("Hello World", debug) +iasl -sa *.dsl +# glue them together with the initrd. ACPI tables go first, original initrd +# goes on top: +cat TBL1.dat >>instrumented_initrd +cat TBL2.dat >>instrumented_initrd +cat TBL3.dat >>instrumented_initrd +cat /boot/initrd >>instrumented_initrd +# reboot with increased acpi debug level, e.g. boot params: +acpi.debug_level=0x2 acpi.debug_layer=0xFFFFFFFF +# and check your syslog: +[ 1.268089] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] +[ 1.272091] [ACPI Debug] String [0x0B] "HELLO WORLD" + +iasl is able to disassemble and recompile quite a lot different, +also static ACPI tables. + +4) Where to retrieve userspace tools +------------------------------------ + +iasl and acpixtract are part of Intel's ACPICA project: +http://acpica.org/ +and should be packaged by distributions (for example in the acpica package +on SUSE). + +acpidump can be found in Len Browns pmtools: +ftp://kernel.org/pub/linux/kernel/people/lenb/acpi/utils/pmtools/acpidump +This tool is also part of the acpica package on SUSE. +Alternatively used ACPI tables can be retrieved via sysfs in latest kernels: +/sys/firmware/acpi/tables Index: linux-2.6/arch/x86/kernel/setup.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/setup.c +++ linux-2.6/arch/x86/kernel/setup.c @@ -412,12 +412,19 @@ static void __init reserve_initrd(void) */ initrd_start = ramdisk_image + PAGE_OFFSET; initrd_end = initrd_start + ramdisk_size; - return; + } else { + relocate_initrd(); + memblock_free(ramdisk_image, ramdisk_end - ramdisk_image); } - relocate_initrd(); - - memblock_free(ramdisk_image, ramdisk_end - ramdisk_image); + acpi_initrd_offset = acpi_initrd_table_override((void *)initrd_start, + (void *)initrd_end); + if (!acpi_initrd_offset) + return; + printk(KERN_INFO "Found acpi tables of size: %lu at 0x%lx\n", + acpi_initrd_offset, initrd_start); + initrd_start += acpi_initrd_offset; + return; } #else static void __init reserve_initrd(void) Index: linux-2.6/arch/x86/mm/init.c =================================================================== --- linux-2.6.orig/arch/x86/mm/init.c +++ linux-2.6/arch/x86/mm/init.c @@ -407,7 +407,9 @@ void free_initrd_mem(unsigned long start * - relocate_initrd() * So here We can do PAGE_ALIGN() safely to get partial page to be freed */ - free_init_pages("initrd memory", start, PAGE_ALIGN(end)); + /* need to offset back acpi tables in acpi tables */ + free_init_pages("initrd memory", start - acpi_initrd_offset, + PAGE_ALIGN(end)); } #endif Index: linux-2.6/drivers/acpi/Kconfig =================================================================== --- linux-2.6.orig/drivers/acpi/Kconfig +++ linux-2.6/drivers/acpi/Kconfig @@ -261,6 +261,16 @@ config ACPI_CUSTOM_DSDT bool default ACPI_CUSTOM_DSDT_FILE != "" +config ACPI_INITRD_TABLE_OVERRIDE + bool + depends on X86 + default y + help + This option provides functionality to override arbitrary ACPI tables + via initrd. No functional change if no ACPI tables are glued to the + initrd, therefore it's safe to say Y. + See Documentation/acpi/initrd_table_override.txt for details + config ACPI_BLACKLIST_YEAR int "Disable ACPI for systems before Jan 1st this year" if X86_32 default 0 Index: linux-2.6/drivers/acpi/osl.c =================================================================== --- linux-2.6.orig/drivers/acpi/osl.c +++ linux-2.6/drivers/acpi/osl.c @@ -533,6 +533,92 @@ acpi_os_predefined_override(const struct return AE_OK; } +#ifdef CONFIG_ACPI_INITRD_TABLE_OVERRIDE + +#define ACPI_OVERRIDE_TABLES 10 + +static unsigned long acpi_table_override_offset[ACPI_OVERRIDE_TABLES]; +static u64 acpi_tables_inram; + +unsigned long acpi_initrd_offset; + +/* Copied from acpica/tbutils.c:acpi_tb_checksum() */ +u8 __init acpi_table_checksum(u8 *buffer, u32 length) +{ + u8 sum = 0; + u8 *end = buffer + length; + + while (buffer < end) + sum = (u8) (sum + *(buffer++)); + return sum; +} + +/* All but ACPI_SIG_RSDP and ACPI_SIG_FACS: */ +#define MAX_ACPI_SIGNATURE 35 +static const char *table_sigs[MAX_ACPI_SIGNATURE] = { + ACPI_SIG_BERT, ACPI_SIG_CPEP, ACPI_SIG_ECDT, ACPI_SIG_EINJ, + ACPI_SIG_ERST, ACPI_SIG_HEST, ACPI_SIG_MADT, ACPI_SIG_MSCT, + ACPI_SIG_SBST, ACPI_SIG_SLIT, ACPI_SIG_SRAT, ACPI_SIG_ASF, + ACPI_SIG_BOOT, ACPI_SIG_DBGP, ACPI_SIG_DMAR, ACPI_SIG_HPET, + ACPI_SIG_IBFT, ACPI_SIG_IVRS, ACPI_SIG_MCFG, ACPI_SIG_MCHI, + ACPI_SIG_SLIC, ACPI_SIG_SPCR, ACPI_SIG_SPMI, ACPI_SIG_TCPA, + ACPI_SIG_UEFI, ACPI_SIG_WAET, ACPI_SIG_WDAT, ACPI_SIG_WDDT, + ACPI_SIG_WDRT, ACPI_SIG_DSDT, ACPI_SIG_FADT, ACPI_SIG_PSDT, + ACPI_SIG_RSDT, ACPI_SIG_XSDT, ACPI_SIG_SSDT }; + +u64 __weak early_alloc_ram_for_acpi(u64 size, u64 align); +int __init acpi_initrd_table_override(void *start_addr, void *end_addr) +{ + int table_nr, sig; + unsigned long offset = 0, max_len = end_addr - start_addr; + char *p; + + for (table_nr = 0; table_nr < ACPI_OVERRIDE_TABLES; table_nr++) { + struct acpi_table_header *table; + if (max_len < offset + sizeof(struct acpi_table_header)) { + WARN_ON(1); + return 0; + } + table = start_addr + offset; + + for (sig = 0; sig < MAX_ACPI_SIGNATURE; sig++) + if (!memcmp(table->signature, table_sigs[sig], 4)) + break; + + if (sig >= MAX_ACPI_SIGNATURE) + break; + + if (max_len < offset + table->length) { + WARN_ON(1); + return 0; + } + + if (acpi_table_checksum(start_addr + offset, table->length)) { + WARN(1, "%4.4s has invalid checksum\n", + table->signature); + continue; + } + printk(KERN_INFO "%4.4s ACPI table found in initrd" + " - size: %d\n", table->signature, table->length); + + offset += table->length; + acpi_table_override_offset[table_nr] = offset; + } + if (!offset) + return 0; + + acpi_tables_inram = early_alloc_ram_for_acpi(offset, PAGE_SIZE); + + if (!acpi_tables_inram) + panic("Cannot find place for ACPI override tables\n"); + + p = early_ioremap(acpi_tables_inram, offset); + memcpy(p, start_addr, offset); + early_iounmap(p, offset); + return offset; +} +#endif + acpi_status acpi_os_table_override(struct acpi_table_header * existing_table, struct acpi_table_header ** new_table) @@ -558,13 +644,55 @@ acpi_os_table_override(struct acpi_table acpi_status acpi_os_physical_table_override(struct acpi_table_header *existing_table, - acpi_physical_address * new_address, - u32 *new_table_length) + acpi_physical_address *address, + u32 *table_length) { - return AE_SUPPORT; +#ifndef CONFIG_ACPI_INITRD_TABLE_OVERRIDE + *table_length = 0; + *address = 0; + return AE_OK; +#else + int table_nr = 0; + *table_length = 0; + *address = 0; + for (; table_nr < ACPI_OVERRIDE_TABLES && + acpi_table_override_offset[table_nr]; table_nr++) { + int table_offset; + int table_len; + struct acpi_table_header *table; + + if (table_nr == 0) + table_offset = 0; + else + table_offset = acpi_table_override_offset[table_nr - 1]; + + table_len = acpi_table_override_offset[table_nr] - table_offset; + + table = acpi_os_map_memory(acpi_tables_inram + table_offset, + table_len); + + if (memcmp(existing_table->signature, table->signature, 4)) { + acpi_os_unmap_memory(table, table_len); + continue; + } + + /* Only override tables with matching oem id */ + if (memcmp(table->oem_table_id, existing_table->oem_table_id, + ACPI_OEM_TABLE_ID_SIZE)) { + acpi_os_unmap_memory(table, table_len); + continue; + } + + acpi_os_unmap_memory(table, table_len); + *address = acpi_tables_inram + table_offset; + *table_length = table_len; + add_taint(TAINT_OVERRIDDEN_ACPI_TABLE); + break; + } + return AE_OK; +#endif } - static irqreturn_t acpi_irq(int irq, void *dev_id) { u32 handled; Index: linux-2.6/include/linux/acpi.h =================================================================== --- linux-2.6.orig/include/linux/acpi.h +++ linux-2.6/include/linux/acpi.h @@ -82,6 +82,15 @@ struct acpi_subtable_proc { int count; }; +#ifdef CONFIG_ACPI_INITRD_TABLE_OVERRIDE +int acpi_initrd_table_override(void *start_addr, void *end_addr); +#else +static inline int acpi_initrd_table_override(void *start_addr, void *end_addr) +{ + return 0; +} +#endif + char * __acpi_map_table (unsigned long phys_addr, unsigned long size); void __acpi_unmap_table(char *map, unsigned long size); int early_acpi_boot_init(void); Index: linux-2.6/include/linux/initrd.h =================================================================== --- linux-2.6.orig/include/linux/initrd.h +++ linux-2.6/include/linux/initrd.h @@ -16,5 +16,10 @@ extern int initrd_below_start_ok; /* free_initrd_mem always gets called with the next two as arguments.. */ extern unsigned long initrd_start, initrd_end; extern void free_initrd_mem(unsigned long, unsigned long); +#ifdef CONFIG_ACPI_INITRD_TABLE_OVERRIDE +extern unsigned long acpi_initrd_offset; +#else +#define acpi_initrd_offset 0 +#endif extern unsigned int real_root_dev; Index: linux-2.6/arch/x86/kernel/e820.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/e820.c +++ linux-2.6/arch/x86/kernel/e820.c @@ -746,6 +746,30 @@ u64 __init early_reserve_e820(u64 size, return addr; } +u64 early_alloc_ram_for_acpi(u64 size, u64 align) +{ + u64 addr = memblock_find_in_range(0, max_low_pfn_mapped << PAGE_SHIFT, + size, align); + + if (!addr) + return 0; + + /* + * Only calling e820_add_reserve does not work and the + * tables are invalid (memory got used) later. + * memblock_x86_reserve_range works as expected and the tables + * won't get modified. But it's not enough because ioremap will + * complain later (used by acpi_os_map_memory) that the pages + * that should get mapped are not marked "reserved". + * Both memblock_x86_reserve_range and e820_add_region works fine. + */ + memblock_reserve(addr, addr + size); + e820_update_range(addr, size, E820_RAM, E820_ACPI); + update_e820(); + + return addr; +} + #ifdef CONFIG_X86_32 # ifdef CONFIG_X86_PAE # define MAX_ARCH_PFN (1ULL<<(36-PAGE_SHIFT))