CC linux-renesas-soc On Tue, Oct 29, 2024 at 1:41 PM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote: > On Wed, Oct 9, 2024 at 12:08 AM Oreoluwa Babatunde > <quic_obabatun@xxxxxxxxxxx> wrote: > > Reserved memory regions defined in the devicetree can be broken up into > > two groups: > > i) Statically-placed reserved memory regions > > i.e. regions defined with a static start address and size using the > > "reg" property. > > ii) Dynamically-placed reserved memory regions. > > i.e. regions defined by specifying an address range where they can be > > placed in memory using the "alloc_ranges" and "size" properties. > > > > These regions are processed and set aside at boot time. > > This is done in two stages as seen below: > > > > Stage 1: > > At this stage, fdt_scan_reserved_mem() scans through the child nodes of > > the reserved_memory node using the flattened devicetree and does the > > following: > > > > 1) If the node represents a statically-placed reserved memory region, > > i.e. if it is defined using the "reg" property: > > - Call memblock_reserve() or memblock_mark_nomap() as needed. > > - Add the information for that region into the reserved_mem array > > using fdt_reserved_mem_save_node(). > > i.e. fdt_reserved_mem_save_node(node, name, base, size). > > > > 2) If the node represents a dynamically-placed reserved memory region, > > i.e. if it is defined using "alloc-ranges" and "size" properties: > > - Add the information for that region to the reserved_mem array with > > the starting address and size set to 0. > > i.e. fdt_reserved_mem_save_node(node, name, 0, 0). > > Note: This region is saved to the array with a starting address of 0 > > because a starting address is not yet allocated for it. > > > > Stage 2: > > After iterating through all the reserved memory nodes and storing their > > relevant information in the reserved_mem array,fdt_init_reserved_mem() is > > called and does the following: > > > > 1) For statically-placed reserved memory regions: > > - Call the region specific init function using > > __reserved_mem_init_node(). > > 2) For dynamically-placed reserved memory regions: > > - Call __reserved_mem_alloc_size() which is used to allocate memory > > for each of these regions, and mark them as nomap if they have the > > nomap property specified in the DT. > > - Call the region specific init function. > > > > The current size of the resvered_mem array is 64 as is defined by > > MAX_RESERVED_REGIONS. This means that there is a limitation of 64 for > > how many reserved memory regions can be specified on a system. > > As systems continue to grow more and more complex, the number of > > reserved memory regions needed are also growing and are starting to hit > > this 64 count limit, hence the need to make the reserved_mem array > > dynamically sized (i.e. dynamically allocating memory for the > > reserved_mem array using membock_alloc_*). > > > > On architectures such as arm64, memory allocated using memblock is > > writable only after the page tables have been setup. This means that if > > the reserved_mem array is going to be dynamically allocated, it needs to > > happen after the page tables have been setup, not before. > > > > Since the reserved memory regions are currently being processed and > > added to the array before the page tables are setup, there is a need to > > change the order in which some of the processing is done to allow for > > the reserved_mem array to be dynamically sized. > > > > It is possible to process the statically-placed reserved memory regions > > without needing to store them in the reserved_mem array until after the > > page tables have been setup because all the information stored in the > > array is readily available in the devicetree and can be referenced at > > any time. > > Dynamically-placed reserved memory regions on the other hand get > > assigned a start address only at runtime, and hence need a place to be > > stored once they are allocated since there is no other referrence to the > > start address for these regions. > > > > Hence this patch changes the processing order of the reserved memory > > regions in the following ways: > > > > Step 1: > > fdt_scan_reserved_mem() scans through the child nodes of > > the reserved_memory node using the flattened devicetree and does the > > following: > > > > 1) If the node represents a statically-placed reserved memory region, > > i.e. if it is defined using the "reg" property: > > - Call memblock_reserve() or memblock_mark_nomap() as needed. > > > > 2) If the node represents a dynamically-placed reserved memory region, > > i.e. if it is defined using "alloc-ranges" and "size" properties: > > - Call __reserved_mem_alloc_size() which will: > > i) Allocate memory for the reserved region and call > > memblock_mark_nomap() as needed. > > ii) Call the region specific initialization function using > > fdt_init_reserved_mem_node(). > > iii) Save the region information in the reserved_mem array using > > fdt_reserved_mem_save_node(). > > > > Step 2: > > 1) This stage of the reserved memory processing is now only used to add > > the statically-placed reserved memory regions into the reserved_mem > > array using fdt_scan_reserved_mem_reg_nodes(), as well as call their > > region specific initialization functions. > > > > 2) This step has also been moved to be after the page tables are > > setup. Moving this will allow us to replace the reserved_mem > > array with a dynamically sized array before storing the rest of > > these regions. > > > > Signed-off-by: Oreoluwa Babatunde <quic_obabatun@xxxxxxxxxxx> > > Thanks for your patch, which is now commit 8a6e02d0c00e7b62 > ("of: reserved_mem: Restructure how the reserved memory regions > are processed") in dt-rh/for-next. > > I have bisected a boot issue on RZ/Five to this commit. > With "earlycon keep_bootcon" (else there is no output): > > Oops - store (or AMO) access fault [#1] > CPU: 0 UID: 0 PID: 1 Comm: swapper Not tainted > 6.12.0-rc1-00015-g8a6e02d0c00e #201 > Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT) > epc : __memset+0x60/0x100 > ra : __dma_alloc_from_coherent+0x150/0x17a > epc : ffffffff8062d2bc ra : ffffffff80053a94 sp : ffffffc60000ba20 > gp : ffffffff812e9938 tp : ffffffd601920000 t0 : ffffffc6000d0000 > t1 : 0000000000000000 t2 : ffffffffe9600000 s0 : ffffffc60000baa0 > s1 : ffffffc6000d0000 a0 : ffffffc6000d0000 a1 : 0000000000000000 > a2 : 0000000000001000 a3 : ffffffc6000d1000 a4 : 0000000000000000 > a5 : 0000000000000000 a6 : ffffffd601adacc0 a7 : ffffffd601a841a8 > s2 : ffffffd6018573c0 s3 : 0000000000001000 s4 : ffffffd6019541e0 > s5 : 0000000200000022 s6 : ffffffd6018f8410 s7 : ffffffd6018573e8 > s8 : 0000000000000001 s9 : 0000000000000001 s10: 0000000000000010 > s11: 0000000000000000 t3 : 0000000000000000 t4 : ffffffffdefe62d1 > t5 : 000000001cd6a3a9 t6 : ffffffd601b2aad6 > status: 0000000200000120 badaddr: ffffffc6000d0000 cause: 0000000000000007 > [<ffffffff8062d2bc>] __memset+0x60/0x100 > [<ffffffff80053e1a>] dma_alloc_from_global_coherent+0x1c/0x28 > [<ffffffff80053056>] dma_direct_alloc+0x98/0x112 > [<ffffffff8005238c>] dma_alloc_attrs+0x78/0x86 > [<ffffffff8035fdb4>] rz_dmac_probe+0x3f6/0x50a > [<ffffffff803a0694>] platform_probe+0x4c/0x8a > [<ffffffff8039ea16>] really_probe+0xe4/0x1c8 > [<ffffffff8039ebc4>] __driver_probe_device+0xca/0xce > [<ffffffff8039ec48>] driver_probe_device+0x34/0x92 > [<ffffffff8039ede8>] __driver_attach+0xb4/0xbe > [<ffffffff8039ce58>] bus_for_each_dev+0x60/0xa0 > [<ffffffff8039e26a>] driver_attach+0x1a/0x22 > [<ffffffff8039dc20>] bus_add_driver+0xa4/0x184 > [<ffffffff8039f65c>] driver_register+0x8a/0xb4 > [<ffffffff803a051c>] __platform_driver_register+0x1c/0x24 > [<ffffffff808202f6>] rz_dmac_driver_init+0x1a/0x22 > [<ffffffff80800ef6>] do_one_initcall+0x64/0x134 > [<ffffffff8080122e>] kernel_init_freeable+0x200/0x202 > [<ffffffff80638126>] kernel_init+0x1e/0x10a > [<ffffffff8063d58e>] ret_from_fork+0xe/0x18 > Code: 1007 82b3 40e2 0797 0000 8793 00e7 8305 97ba 8782 (b023) 00b2 > ---[ end trace 0000000000000000 ]--- > Kernel panic - not syncing: Fatal exception in interrupt > ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- > > Nothing really stands out in the kernel log, except for a delayed > initialization of the reserved mem nodes (they are the same > before/after): > > printk: debug: ignoring loglevel setting. > -OF: reserved mem: 0x0000000000030000..0x000000000003ffff (64 KiB) > nomap non-reusable mmode_resv0@30000 > -OF: reserved mem: 0x0000000000040000..0x000000000004ffff (64 KiB) > nomap non-reusable mmode_resv1@40000 > -OF: reserved mem: 0x0000000044000000..0x000000004403ffff (256 KiB) > nomap non-reusable mmode_resv3@44000000 > -OF: reserved mem: 0x0000000044040000..0x000000004405ffff (128 KiB) > nomap non-reusable mmode_resv2@44040000 > +earlycon: scif0 at MMIO 0x000000001004b800 (options '115200n8') > +printk: legacy bootconsole [scif0] enabled > +printk: debug: skip boot console de-registration. > Reserved memory: created DMA memory pool at 0x0000000058000000, size 128 MiB > OF: reserved mem: initialized node pma_resv0@58000000, compatible id > shared-dma-pool > OF: reserved mem: 0x0000000058000000..0x000000005fffffff (131072 KiB) > nomap non-reusable pma_resv0@58000000 > +OF: reserved mem: 0x0000000000030000..0x000000000003ffff (64 KiB) > nomap non-reusable mmode_resv0@30000 > +OF: reserved mem: 0x0000000000040000..0x000000000004ffff (64 KiB) > nomap non-reusable mmode_resv1@40000 > +OF: reserved mem: 0x0000000044040000..0x000000004405ffff (128 KiB) > nomap non-reusable mmode_resv2@44040000 > +OF: reserved mem: 0x0000000044000000..0x000000004403ffff (256 KiB) > nomap non-reusable mmode_resv3@44000000 > Zone ranges: > DMA32 [mem 0x0000000048000000-0x000000007fffffff] > Normal empty > > Reverting commits 00c9a452a235c61f ("of: reserved_mem: Add code to > dynamically allocate reserved_mem array") and 8a6e02d0c00e7b62 fixes > the issue. > > root@smarc-rzfive:/sys/firmware/devicetree/base/reserved-memory# ls -l > total 0 > -r--r--r-- 1 root root 4 Oct 29 12:37 #address-cells > -r--r--r-- 1 root root 4 Oct 29 12:37 #size-cells > drwxr-xr-x 2 root root 0 Oct 29 12:37 mmode_resv0@30000 > drwxr-xr-x 2 root root 0 Oct 29 12:37 mmode_resv1@40000 > drwxr-xr-x 2 root root 0 Oct 29 12:37 mmode_resv2@44040000 > drwxr-xr-x 2 root root 0 Oct 29 12:37 mmode_resv3@44000000 > -r--r--r-- 1 root root 16 Oct 29 12:37 name > drwxr-xr-x 2 root root 0 Oct 29 12:37 pma_resv0@58000000 > -r--r--r-- 1 root root 0 Oct 29 12:37 ranges > > > diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c > > index 4d528c10df3a..d0dbc8183ac4 100644 > > --- a/drivers/of/fdt.c > > +++ b/drivers/of/fdt.c > > @@ -511,8 +511,6 @@ void __init early_init_fdt_scan_reserved_mem(void) > > break; > > memblock_reserve(base, size); > > } > > - > > - fdt_init_reserved_mem(); > > } > > > > /** > > @@ -1212,6 +1210,9 @@ void __init unflatten_device_tree(void) > > { > > void *fdt = initial_boot_params; > > > > + /* Save the statically-placed regions in the reserved_mem array */ > > + fdt_scan_reserved_mem_reg_nodes(); > > + > > /* Don't use the bootloader provided DTB if ACPI is enabled */ > > if (!acpi_disabled) > > fdt = NULL; > > diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h > > index 04aa2a91f851..29525c0b9939 100644 > > --- a/drivers/of/of_private.h > > +++ b/drivers/of/of_private.h > > @@ -9,6 +9,7 @@ > > */ > > > > #define FDT_ALIGN_SIZE 8 > > +#define MAX_RESERVED_REGIONS 64 > > > > /** > > * struct alias_prop - Alias property in 'aliases' node > > @@ -180,7 +181,7 @@ static inline struct device_node *__of_get_dma_parent(const struct device_node * > > #endif > > > > int fdt_scan_reserved_mem(void); > > -void fdt_init_reserved_mem(void); > > +void __init fdt_scan_reserved_mem_reg_nodes(void); > > > > bool of_fdt_device_is_available(const void *blob, unsigned long node); > > > > diff --git a/drivers/of/of_reserved_mem.c b/drivers/of/of_reserved_mem.c > > index 46e1c3fbc769..2011174211f9 100644 > > --- a/drivers/of/of_reserved_mem.c > > +++ b/drivers/of/of_reserved_mem.c > > @@ -27,7 +27,6 @@ > > > > #include "of_private.h" > > > > -#define MAX_RESERVED_REGIONS 64 > > static struct reserved_mem reserved_mem[MAX_RESERVED_REGIONS]; > > static int reserved_mem_count; > > > > @@ -56,6 +55,7 @@ static int __init early_init_dt_alloc_reserved_memory_arch(phys_addr_t size, > > return err; > > } > > > > +static void __init fdt_init_reserved_mem_node(struct reserved_mem *rmem); > > /* > > * fdt_reserved_mem_save_node() - save fdt node for second pass initialization > > */ > > @@ -74,6 +74,9 @@ static void __init fdt_reserved_mem_save_node(unsigned long node, const char *un > > rmem->base = base; > > rmem->size = size; > > > > + /* Call the region specific initialization function */ > > + fdt_init_reserved_mem_node(rmem); > > + > > reserved_mem_count++; > > return; > > } > > @@ -106,7 +109,6 @@ static int __init __reserved_mem_reserve_reg(unsigned long node, > > phys_addr_t base, size; > > int len; > > const __be32 *prop; > > - int first = 1; > > bool nomap; > > > > prop = of_get_flat_dt_prop(node, "reg", &len); > > @@ -134,10 +136,6 @@ static int __init __reserved_mem_reserve_reg(unsigned long node, > > uname, &base, (unsigned long)(size / SZ_1M)); > > > > len -= t_len; > > - if (first) { > > - fdt_reserved_mem_save_node(node, uname, base, size); > > - first = 0; > > - } > > } > > return 0; > > } > > @@ -165,12 +163,77 @@ static int __init __reserved_mem_check_root(unsigned long node) > > return 0; > > } > > > > +static void __init __rmem_check_for_overlap(void); > > + > > +/** > > + * fdt_scan_reserved_mem_reg_nodes() - Store info for the "reg" defined > > + * reserved memory regions. > > + * > > + * This function is used to scan through the DT and store the > > + * information for the reserved memory regions that are defined using > > + * the "reg" property. The region node number, name, base address, and > > + * size are all stored in the reserved_mem array by calling the > > + * fdt_reserved_mem_save_node() function. > > + */ > > +void __init fdt_scan_reserved_mem_reg_nodes(void) > > +{ > > + int t_len = (dt_root_addr_cells + dt_root_size_cells) * sizeof(__be32); > > + const void *fdt = initial_boot_params; > > + phys_addr_t base, size; > > + const __be32 *prop; > > + int node, child; > > + int len; > > + > > + if (!fdt) > > + return; > > + > > + node = fdt_path_offset(fdt, "/reserved-memory"); > > + if (node < 0) { > > + pr_info("Reserved memory: No reserved-memory node in the DT\n"); > > + return; > > + } > > + > > + if (__reserved_mem_check_root(node)) { > > + pr_err("Reserved memory: unsupported node format, ignoring\n"); > > + return; > > + } > > + > > + fdt_for_each_subnode(child, fdt, node) { > > + const char *uname; > > + > > + prop = of_get_flat_dt_prop(child, "reg", &len); > > + if (!prop) > > + continue; > > + if (!of_fdt_device_is_available(fdt, child)) > > + continue; > > + > > + uname = fdt_get_name(fdt, child, NULL); > > + if (len && len % t_len != 0) { > > + pr_err("Reserved memory: invalid reg property in '%s', skipping node.\n", > > + uname); > > + continue; > > + } > > + base = dt_mem_next_cell(dt_root_addr_cells, &prop); > > + size = dt_mem_next_cell(dt_root_size_cells, &prop); > > + > > + if (size) > > + fdt_reserved_mem_save_node(child, uname, base, size); > > + } > > + > > + /* check for overlapping reserved regions */ > > + __rmem_check_for_overlap(); > > +} > > + > > +static int __init __reserved_mem_alloc_size(unsigned long node, const char *uname); > > + > > /* > > * fdt_scan_reserved_mem() - scan a single FDT node for reserved memory > > */ > > int __init fdt_scan_reserved_mem(void) > > { > > int node, child; > > + int dynamic_nodes_cnt = 0; > > + int dynamic_nodes[MAX_RESERVED_REGIONS]; > > const void *fdt = initial_boot_params; > > > > node = fdt_path_offset(fdt, "/reserved-memory"); > > @@ -192,8 +255,24 @@ int __init fdt_scan_reserved_mem(void) > > uname = fdt_get_name(fdt, child, NULL); > > > > err = __reserved_mem_reserve_reg(child, uname); > > - if (err == -ENOENT && of_get_flat_dt_prop(child, "size", NULL)) > > - fdt_reserved_mem_save_node(child, uname, 0, 0); > > + /* > > + * Save the nodes for the dynamically-placed regions > > + * into an array which will be used for allocation right > > + * after all the statically-placed regions are reserved > > + * or marked as no-map. This is done to avoid dynamically > > + * allocating from one of the statically-placed regions. > > + */ > > + if (err == -ENOENT && of_get_flat_dt_prop(child, "size", NULL)) { > > + dynamic_nodes[dynamic_nodes_cnt] = child; > > + dynamic_nodes_cnt++; > > + } > > + } > > + for (int i = 0; i < dynamic_nodes_cnt; i++) { > > + const char *uname; > > + > > + child = dynamic_nodes[i]; > > + uname = fdt_get_name(fdt, child, NULL); > > + __reserved_mem_alloc_size(child, uname); > > } > > return 0; > > } > > @@ -253,8 +332,7 @@ static int __init __reserved_mem_alloc_in_range(phys_addr_t size, > > * __reserved_mem_alloc_size() - allocate reserved memory described by > > * 'size', 'alignment' and 'alloc-ranges' properties. > > */ > > -static int __init __reserved_mem_alloc_size(unsigned long node, > > - const char *uname, phys_addr_t *res_base, phys_addr_t *res_size) > > +static int __init __reserved_mem_alloc_size(unsigned long node, const char *uname) > > { > > int t_len = (dt_root_addr_cells + dt_root_size_cells) * sizeof(__be32); > > phys_addr_t start = 0, end = 0; > > @@ -334,9 +412,8 @@ static int __init __reserved_mem_alloc_size(unsigned long node, > > return -ENOMEM; > > } > > > > - *res_base = base; > > - *res_size = size; > > - > > + /* Save region in the reserved_mem array */ > > + fdt_reserved_mem_save_node(node, uname, base, size); > > return 0; > > } > > > > @@ -425,48 +502,37 @@ static void __init __rmem_check_for_overlap(void) > > } > > > > /** > > - * fdt_init_reserved_mem() - allocate and init all saved reserved memory regions > > + * fdt_init_reserved_mem_node() - Initialize a reserved memory region > > + * @rmem: reserved_mem struct of the memory region to be initialized. > > + * > > + * This function is used to call the region specific initialization > > + * function for a reserved memory region. > > */ > > -void __init fdt_init_reserved_mem(void) > > +static void __init fdt_init_reserved_mem_node(struct reserved_mem *rmem) > > { > > - int i; > > - > > - /* check for overlapping reserved regions */ > > - __rmem_check_for_overlap(); > > - > > - for (i = 0; i < reserved_mem_count; i++) { > > - struct reserved_mem *rmem = &reserved_mem[i]; > > - unsigned long node = rmem->fdt_node; > > - int err = 0; > > - bool nomap; > > + unsigned long node = rmem->fdt_node; > > + int err = 0; > > + bool nomap; > > > > - nomap = of_get_flat_dt_prop(node, "no-map", NULL) != NULL; > > + nomap = of_get_flat_dt_prop(node, "no-map", NULL) != NULL; > > > > - if (rmem->size == 0) > > - err = __reserved_mem_alloc_size(node, rmem->name, > > - &rmem->base, &rmem->size); > > - if (err == 0) { > > - err = __reserved_mem_init_node(rmem); > > - if (err != 0 && err != -ENOENT) { > > - pr_info("node %s compatible matching fail\n", > > - rmem->name); > > - if (nomap) > > - memblock_clear_nomap(rmem->base, rmem->size); > > - else > > - memblock_phys_free(rmem->base, > > - rmem->size); > > - } else { > > - phys_addr_t end = rmem->base + rmem->size - 1; > > - bool reusable = > > - (of_get_flat_dt_prop(node, "reusable", NULL)) != NULL; > > - > > - pr_info("%pa..%pa (%lu KiB) %s %s %s\n", > > - &rmem->base, &end, (unsigned long)(rmem->size / SZ_1K), > > - nomap ? "nomap" : "map", > > - reusable ? "reusable" : "non-reusable", > > - rmem->name ? rmem->name : "unknown"); > > - } > > - } > > + err = __reserved_mem_init_node(rmem); > > + if (err != 0 && err != -ENOENT) { > > + pr_info("node %s compatible matching fail\n", rmem->name); > > + if (nomap) > > + memblock_clear_nomap(rmem->base, rmem->size); > > + else > > + memblock_phys_free(rmem->base, rmem->size); > > + } else { > > + phys_addr_t end = rmem->base + rmem->size - 1; > > + bool reusable = > > + (of_get_flat_dt_prop(node, "reusable", NULL)) != NULL; > > + > > + pr_info("%pa..%pa (%lu KiB) %s %s %s\n", > > + &rmem->base, &end, (unsigned long)(rmem->size / SZ_1K), > > + nomap ? "nomap" : "map", > > + reusable ? "reusable" : "non-reusable", > > + rmem->name ? rmem->name : "unknown"); > > } > > } Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds