On Воскресенье 25 февраля 2007, Rafael J. Wysocki wrote: > On Sunday, 25 February 2007 11:37, Andrey Borzenkov wrote: > > On Воскресенье 25 февраля 2007, Rafael J. Wysocki wrote: > > > On Sunday, 25 February 2007 00:26, Andrey Borzenkov wrote: > > > > On Суббота 24 февраля 2007, Rafael J. Wysocki wrote: > > > > > Hi, > > > > > > > > > > On Saturday, 24 February 2007 10:55, Andrey Borzenkov wrote: > > > > > > On Вторник 13 февраля 2007, Andrey Borzenkov wrote: > > > > > > > On Четверг 07 декабря 2006, Lebedev, Vladimir P wrote: > > > > > > > > Please register new bug, attach acpidump and dmesg. > > > > > > > > > > > > > > http://bugzilla.kernel.org/show_bug.cgi?id=7995 > > > > > > > > > > > > > > regards > > > > > > > > > > > > Well, this starts looking like ACPI is not at fault. > > > > > > > > > > > > When reporting AC state ACPI just reads contents of system memory > > > > > > (I presume it gets updated by BIOS/ACPI when AC state changes). > > > > > > It looks like this memory area is restored during resume from > > > > > > STD. I updated mentioned bug report with more detailed > > > > > > description. Now if someone could suggest a way to catch if > > > > > > specific physical address gets saved/restored this would finally > > > > > > explain it. > > > > > > > > > > First, if you want the reserved memory areas to be left alone by > > > > > swsusp, you need to mark them as 'nosave'. On x86_64 this is done > > > > > by the function e820_mark_nosave_range() in > > > > > arch/x86_64/kernel/e820.c that can be ported to i386 with no > > > > > problems. However, we haven't found that very useful, so far, > > > > > since no one has ever reported any problems with the current > > > > > approach, which is to save and restore them. > > > > > > > > Well, the following proof of concept patch fixes this issue for me. > > > > Please notice that original version of e820_mark_nosave_range() could > > > > fail to exclude some areas due to alignment issues (exactly what > > > > happened to me on first try) so it still can explain your problem > > > > too. > > > > > > Great job, thanks for the patch! It looks good, so I'm going to > > > forward it for merging. > > > > Please no; I'm currently testing slightly more polished version; I will > > send it later. > > OK > > > Could anybody explain (or give pointer to) what happens which region that > > is not page-aligned? In particular, the very first one: > > > > BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) > > BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) > > > > Will the kernel allocate partial page (how?) or will the kernel ignore > > last (first) incomplete page? In the former case how those incomplete > > pages can be detected? > > Well, on x86_64, if I understand e820_register_active_regions() correctly, > the partial pages won't be registered. > It appears that for low memory kernel will ignore incomplete pages for sure. I hope it does the same for high memory - but for now I just throw this in and pray :) This also significantly simplifies patch. As this touches quite sensitive field, I Cc Andrew - if he considers this appropriate for mm; or would you do it as part of your tree? Also he probably can easily clarify memory allocation questions :p regards -andrey
Subject: [PATCH] Mark e820 reserved memory areas as nosave for suspend/resume From: Andrey Borzenkov <<arvidjaar@xxxxxxx>> This fixes bug http://bugzilla.kernel.org/show_bug.cgi?id=7995. From lkml discussion (http://marc.theaimsgroup.com/?t=116514878500001&r=1&w=2): ============= > When reporting AC state ACPI just reads contents of system memory (I > presume > it gets updated by BIOS/ACPI when AC state changes). It looks like this > memory area is restored during resume from STD. I updated mentioned bug > report with more detailed description. Now if someone could suggest a way > to > catch if specific physical address gets saved/restored this would finally > explain it. First, if you want the reserved memory areas to be left alone by swsusp, you need to mark them as 'nosave'. On x86_64 this is done by the function e820_mark_nosave_range() in arch/x86_64/kernel/e820.c that can be ported to i386 with no problems. However, we haven't found that very useful, so far, since no one has ever reported any problems with the current approach, which is to save and restore them. ============= The patch adds adapted x84_64 version for i386. It differs from x86_64 in that - it does not touch memory regions not covered by e820 table (I simply have no idea if this is appropriate to do) - it properly marks also partial pages (like the initial one in second line below). Apparently kernel won't allocate and use such pages, so there is nothing to preserve there. BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) The region in question starts at ee800. This is likely true for x86_64 as well. Signed-off-by: Andrey Borzenkov <<arvidjaar@xxxxxxx>> --- arch/i386/kernel/e820.c | 41 +++++++++++++++++++++++++++++++++++++++++ arch/i386/kernel/setup.c | 1 + include/asm-i386/e820.h | 1 + 3 files changed, 43 insertions(+), 0 deletions(-) diff --git a/arch/i386/kernel/e820.c b/arch/i386/kernel/e820.c index f391abc..adf0e6f 100644 --- a/arch/i386/kernel/e820.c +++ b/arch/i386/kernel/e820.c @@ -311,6 +311,47 @@ static int __init request_standard_resources(void) subsys_initcall(request_standard_resources); +/* + * Mark pages corresponding to given pfn range as nosaves + * + * For low memory kernel definitely won't use partial pages; + * I hope the same happens for high memory too. That is why + * round in outer direction to be sure to preserve those partial + * pages if they contain reserved regions. + */ +static void __init +e820_mark_nosave_range(unsigned long long start, unsigned long long end) +{ + unsigned long pfn, max_pfn = PFN_UP(end); + + if (start >= end) + return; + + printk("Nosave address range: %016Lx - %016Lx\n", start, end); + for (pfn = PFN_DOWN(start); pfn < max_pfn; pfn++) + if (pfn_valid(pfn)) + SetPageNosave(pfn_to_page(pfn)); +} + +/* + * Find the ranges of physical addresses that do not correspond to + * e820 RAM areas and mark the corresponding pages as nosave for software + * suspend and suspend to RAM. + * + * This assumes kernel won't use partial pages. + */ +void __init e820_mark_nosave_regions(void) +{ + int i; + + for (i = 0; i < e820.nr_map; i++) { + struct e820entry *ei = &e820.map[i]; + + if (ei->type != E820_RAM) + e820_mark_nosave_range(ei->addr, ei->addr + ei->size); + } +} + void __init add_memory_region(unsigned long long start, unsigned long long size, int type) { diff --git a/arch/i386/kernel/setup.c b/arch/i386/kernel/setup.c index 4b31ad7..4f43e46 100644 --- a/arch/i386/kernel/setup.c +++ b/arch/i386/kernel/setup.c @@ -640,6 +640,7 @@ void __init setup_arch(char **cmdline_p) #endif e820_register_memory(); + e820_mark_nosave_regions(); #ifdef CONFIG_VT #if defined(CONFIG_VGA_CONSOLE) diff --git a/include/asm-i386/e820.h b/include/asm-i386/e820.h index c5b8fc6..80e49bc 100644 --- a/include/asm-i386/e820.h +++ b/include/asm-i386/e820.h @@ -43,6 +43,7 @@ extern void register_bootmem_low_pages(unsigned long max_low_pfn); extern void e820_register_memory(void); extern void limit_regions(unsigned long long size); extern void print_memory_map(char *who); +extern void e820_mark_nosave_regions(void); #endif/*!__ASSEMBLY__*/
Attachment:
pgpYUAjoOdtCn.pgp
Description: PGP signature