Re: [PATCH v10 1/2] of: reserved_mem: Restructure how the reserved memory regions are processed

Hi Oreoluwa,

On Wed, Oct 9, 2024 at 12:08 AM Oreoluwa Babatunde
<quic_obabatun@xxxxxxxxxxx> wrote:
> Reserved memory regions defined in the devicetree can be broken up into
> two groups:
> i) Statically-placed reserved memory regions
> i.e. regions defined with a static start address and size using the
>      "reg" property.
> ii) Dynamically-placed reserved memory regions.
> i.e. regions defined by specifying an address range where they can be
>      placed in memory using the "alloc-ranges" and "size" properties.
>
> These regions are processed and set aside at boot time.
> This is done in two stages as seen below:
>
> Stage 1:
> At this stage, fdt_scan_reserved_mem() scans through the child nodes of
> the reserved_memory node using the flattened devicetree and does the
> following:
>
> 1) If the node represents a statically-placed reserved memory region,
>    i.e. if it is defined using the "reg" property:
>    - Call memblock_reserve() or memblock_mark_nomap() as needed.
>    - Add the information for that region into the reserved_mem array
>      using fdt_reserved_mem_save_node().
>      i.e. fdt_reserved_mem_save_node(node, name, base, size).
>
> 2) If the node represents a dynamically-placed reserved memory region,
>    i.e. if it is defined using "alloc-ranges" and "size" properties:
>    - Add the information for that region to the reserved_mem array with
>      the starting address and size set to 0.
>      i.e. fdt_reserved_mem_save_node(node, name, 0, 0).
>    Note: This region is saved to the array with a starting address of 0
>    because a starting address is not yet allocated for it.
>
> Stage 2:
> After iterating through all the reserved memory nodes and storing their
> relevant information in the reserved_mem array, fdt_init_reserved_mem() is
> called and does the following:
>
> 1) For statically-placed reserved memory regions:
>    - Call the region specific init function using
>      __reserved_mem_init_node().
> 2) For dynamically-placed reserved memory regions:
>    - Call __reserved_mem_alloc_size() which is used to allocate memory
>      for each of these regions, and mark them as nomap if they have the
>      nomap property specified in the DT.
>    - Call the region specific init function.
>
> The current size of the reserved_mem array is 64, as defined by
> MAX_RESERVED_REGIONS. This means that at most 64 reserved memory
> regions can be specified on a system.
> As systems continue to grow more complex, the number of reserved
> memory regions needed is also growing and is starting to hit this
> limit of 64, hence the need to make the reserved_mem array
> dynamically sized (i.e. dynamically allocating memory for the
> reserved_mem array using memblock_alloc_*).
>
> On architectures such as arm64, memory allocated using memblock is
> writable only after the page tables have been set up. This means that if
> the reserved_mem array is going to be dynamically allocated, it needs to
> happen after the page tables have been set up, not before.
>
> Since the reserved memory regions are currently being processed and
> added to the array before the page tables are set up, there is a need to
> change the order in which some of the processing is done to allow for
> the reserved_mem array to be dynamically sized.
>
> It is possible to process the statically-placed reserved memory regions
> without needing to store them in the reserved_mem array until after the
> page tables have been set up because all the information stored in the
> array is readily available in the devicetree and can be referenced at
> any time.
> Dynamically-placed reserved memory regions, on the other hand, get
> assigned a start address only at runtime, and hence need a place to be
> stored once they are allocated since there is no other reference to the
> start address for these regions.
>
> Hence this patch changes the processing order of the reserved memory
> regions in the following ways:
>
> Step 1:
> fdt_scan_reserved_mem() scans through the child nodes of
> the reserved_memory node using the flattened devicetree and does the
> following:
>
> 1) If the node represents a statically-placed reserved memory region,
>    i.e. if it is defined using the "reg" property:
>    - Call memblock_reserve() or memblock_mark_nomap() as needed.
>
> 2) If the node represents a dynamically-placed reserved memory region,
>    i.e. if it is defined using "alloc-ranges" and "size" properties:
>    - Call __reserved_mem_alloc_size() which will:
>      i) Allocate memory for the reserved region and call
>      memblock_mark_nomap() as needed.
>      ii) Call the region specific initialization function using
>      fdt_init_reserved_mem_node().
>      iii) Save the region information in the reserved_mem array using
>      fdt_reserved_mem_save_node().
>
> Step 2:
> 1) This stage of the reserved memory processing is now only used to add
>    the statically-placed reserved memory regions into the reserved_mem
>    array using fdt_scan_reserved_mem_reg_nodes(), as well as call their
>    region specific initialization functions.
>
> 2) This step has also been moved to after the page tables are set
>    up. Moving it allows us to replace the reserved_mem array with a
>    dynamically sized array before storing the rest of these regions.
>
> Signed-off-by: Oreoluwa Babatunde <quic_obabatun@xxxxxxxxxxx>
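
For readers following the ordering argument in the quoted commit
message, here is a minimal user-space sketch in C of why the
dynamically-placed regions are only allocated after every "reg" region
has been reserved. It is not the kernel code: struct reserved_list,
reserve_static() and alloc_dynamic() are hypothetical names, and the
linear first-fit search merely stands in for memblock. Address overflow
is not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REGIONS 64 /* same value as MAX_RESERVED_REGIONS in the patch */

struct region {
	uint64_t base;
	uint64_t size;
};

/* Hypothetical stand-in for the list of ranges memblock has reserved. */
struct reserved_list {
	struct region r[MAX_REGIONS];
	size_t count;
};

static bool overlaps(const struct region *a, uint64_t base, uint64_t size)
{
	return base < a->base + a->size && a->base < base + size;
}

/* Pass 1: record a statically-placed ("reg") region at its fixed address. */
static bool reserve_static(struct reserved_list *l, uint64_t base, uint64_t size)
{
	if (l->count >= MAX_REGIONS || size == 0)
		return false;
	l->r[l->count++] = (struct region){ base, size };
	return true;
}

/*
 * Pass 2: first-fit search inside [start, end) for an aligned block that
 * avoids everything reserved so far. On success the block is recorded
 * and its base is stored in *out_base.
 */
static bool alloc_dynamic(struct reserved_list *l, uint64_t start, uint64_t end,
			  uint64_t size, uint64_t align, uint64_t *out_base)
{
	if (size == 0 || align == 0 || l->count >= MAX_REGIONS)
		return false;

	/* Round the start of the window up to the requested alignment. */
	uint64_t base = (start + align - 1) / align * align;

	while (base + size <= end) {
		bool clash = false;

		for (size_t i = 0; i < l->count; i++) {
			if (overlaps(&l->r[i], base, size)) {
				clash = true;
				break;
			}
		}
		if (!clash) {
			l->r[l->count++] = (struct region){ base, size };
			*out_base = base;
			return true;
		}
		base += align;
	}
	return false;
}
```

If alloc_dynamic() ran before the reserve_static() calls, it could hand
out an address that a later "reg" node also claims, which is the
situation the deferred dynamic_nodes[] loop in the patch is described as
avoiding.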

Thanks for your patch, which is now commit 8a6e02d0c00e7b62
("of: reserved_mem: Restructure how the reserved memory regions
are processed") in dt-rh/for-next.

I have bisected a boot issue on RZ/Five to this commit.
With "earlycon keep_bootcon" (else there is no output):

    Oops - store (or AMO) access fault [#1]
    CPU: 0 UID: 0 PID: 1 Comm: swapper Not tainted 6.12.0-rc1-00015-g8a6e02d0c00e #201
    Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
    epc : __memset+0x60/0x100
     ra : __dma_alloc_from_coherent+0x150/0x17a
    epc : ffffffff8062d2bc ra : ffffffff80053a94 sp : ffffffc60000ba20
     gp : ffffffff812e9938 tp : ffffffd601920000 t0 : ffffffc6000d0000
     t1 : 0000000000000000 t2 : ffffffffe9600000 s0 : ffffffc60000baa0
     s1 : ffffffc6000d0000 a0 : ffffffc6000d0000 a1 : 0000000000000000
     a2 : 0000000000001000 a3 : ffffffc6000d1000 a4 : 0000000000000000
     a5 : 0000000000000000 a6 : ffffffd601adacc0 a7 : ffffffd601a841a8
     s2 : ffffffd6018573c0 s3 : 0000000000001000 s4 : ffffffd6019541e0
     s5 : 0000000200000022 s6 : ffffffd6018f8410 s7 : ffffffd6018573e8
     s8 : 0000000000000001 s9 : 0000000000000001 s10: 0000000000000010
     s11: 0000000000000000 t3 : 0000000000000000 t4 : ffffffffdefe62d1
     t5 : 000000001cd6a3a9 t6 : ffffffd601b2aad6
    status: 0000000200000120 badaddr: ffffffc6000d0000 cause: 0000000000000007
    [<ffffffff8062d2bc>] __memset+0x60/0x100
    [<ffffffff80053e1a>] dma_alloc_from_global_coherent+0x1c/0x28
    [<ffffffff80053056>] dma_direct_alloc+0x98/0x112
    [<ffffffff8005238c>] dma_alloc_attrs+0x78/0x86
    [<ffffffff8035fdb4>] rz_dmac_probe+0x3f6/0x50a
    [<ffffffff803a0694>] platform_probe+0x4c/0x8a
    [<ffffffff8039ea16>] really_probe+0xe4/0x1c8
    [<ffffffff8039ebc4>] __driver_probe_device+0xca/0xce
    [<ffffffff8039ec48>] driver_probe_device+0x34/0x92
    [<ffffffff8039ede8>] __driver_attach+0xb4/0xbe
    [<ffffffff8039ce58>] bus_for_each_dev+0x60/0xa0
    [<ffffffff8039e26a>] driver_attach+0x1a/0x22
    [<ffffffff8039dc20>] bus_add_driver+0xa4/0x184
    [<ffffffff8039f65c>] driver_register+0x8a/0xb4
    [<ffffffff803a051c>] __platform_driver_register+0x1c/0x24
    [<ffffffff808202f6>] rz_dmac_driver_init+0x1a/0x22
    [<ffffffff80800ef6>] do_one_initcall+0x64/0x134
    [<ffffffff8080122e>] kernel_init_freeable+0x200/0x202
    [<ffffffff80638126>] kernel_init+0x1e/0x10a
    [<ffffffff8063d58e>] ret_from_fork+0xe/0x18
    Code: 1007 82b3 40e2 0797 0000 8793 00e7 8305 97ba 8782 (b023) 00b2
    ---[ end trace 0000000000000000 ]---
    Kernel panic - not syncing: Fatal exception in interrupt
    ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
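
For reference when reading the "cause:" field in the oops, a small C
sketch of how that value decodes. The exception codes come from the
RISC-V privileged specification; the function name
riscv_exception_name() is made up for this example and only a few
fault-related codes are listed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Map a 64-bit scause value to a short description. The top bit marks
 * an interrupt; otherwise the value is a synchronous exception code.
 */
static const char *riscv_exception_name(uint64_t cause)
{
	if (cause >> 63)
		return "interrupt";

	switch (cause) {
	case 1:
		return "instruction access fault";
	case 5:
		return "load access fault";
	case 7:
		return "store/AMO access fault";
	case 12:
		return "instruction page fault";
	case 13:
		return "load page fault";
	case 15:
		return "store/AMO page fault";
	default:
		return "other";
	}
}
```

With cause 0x7 this yields "store/AMO access fault", matching the
"Oops - store (or AMO) access fault" headline: an access fault rather
than a page fault, taken on the store to badaddr.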

Nothing really stands out in the kernel log, except for a delayed
initialization of the reserved mem nodes (they are the same
before/after):

 printk: debug: ignoring loglevel setting.
-OF: reserved mem: 0x0000000000030000..0x000000000003ffff (64 KiB) nomap non-reusable mmode_resv0@30000
-OF: reserved mem: 0x0000000000040000..0x000000000004ffff (64 KiB) nomap non-reusable mmode_resv1@40000
-OF: reserved mem: 0x0000000044000000..0x000000004403ffff (256 KiB) nomap non-reusable mmode_resv3@44000000
-OF: reserved mem: 0x0000000044040000..0x000000004405ffff (128 KiB) nomap non-reusable mmode_resv2@44040000
+earlycon: scif0 at MMIO 0x000000001004b800 (options '115200n8')
+printk: legacy bootconsole [scif0] enabled
+printk: debug: skip boot console de-registration.
 Reserved memory: created DMA memory pool at 0x0000000058000000, size 128 MiB
 OF: reserved mem: initialized node pma_resv0@58000000, compatible id shared-dma-pool
 OF: reserved mem: 0x0000000058000000..0x000000005fffffff (131072 KiB) nomap non-reusable pma_resv0@58000000
+OF: reserved mem: 0x0000000000030000..0x000000000003ffff (64 KiB) nomap non-reusable mmode_resv0@30000
+OF: reserved mem: 0x0000000000040000..0x000000000004ffff (64 KiB) nomap non-reusable mmode_resv1@40000
+OF: reserved mem: 0x0000000044040000..0x000000004405ffff (128 KiB) nomap non-reusable mmode_resv2@44040000
+OF: reserved mem: 0x0000000044000000..0x000000004403ffff (256 KiB) nomap non-reusable mmode_resv3@44000000
 Zone ranges:
   DMA32    [mem 0x0000000048000000-0x000000007fffffff]
   Normal   empty

Reverting commits 00c9a452a235c61f ("of: reserved_mem: Add code to
dynamically allocate reserved_mem array") and 8a6e02d0c00e7b62 fixes
the issue.

root@smarc-rzfive:/sys/firmware/devicetree/base/reserved-memory# ls -l
total 0
-r--r--r-- 1 root root  4 Oct 29 12:37 #address-cells
-r--r--r-- 1 root root  4 Oct 29 12:37 #size-cells
drwxr-xr-x 2 root root  0 Oct 29 12:37 mmode_resv0@30000
drwxr-xr-x 2 root root  0 Oct 29 12:37 mmode_resv1@40000
drwxr-xr-x 2 root root  0 Oct 29 12:37 mmode_resv2@44040000
drwxr-xr-x 2 root root  0 Oct 29 12:37 mmode_resv3@44000000
-r--r--r-- 1 root root 16 Oct 29 12:37 name
drwxr-xr-x 2 root root  0 Oct 29 12:37 pma_resv0@58000000
-r--r--r-- 1 root root  0 Oct 29 12:37 ranges

> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> index 4d528c10df3a..d0dbc8183ac4 100644
> --- a/drivers/of/fdt.c
> +++ b/drivers/of/fdt.c
> @@ -511,8 +511,6 @@ void __init early_init_fdt_scan_reserved_mem(void)
>                         break;
>                 memblock_reserve(base, size);
>         }
> -
> -       fdt_init_reserved_mem();
>  }
>
>  /**
> @@ -1212,6 +1210,9 @@ void __init unflatten_device_tree(void)
>  {
>         void *fdt = initial_boot_params;
>
> +       /* Save the statically-placed regions in the reserved_mem array */
> +       fdt_scan_reserved_mem_reg_nodes();
> +
>         /* Don't use the bootloader provided DTB if ACPI is enabled */
>         if (!acpi_disabled)
>                 fdt = NULL;
> diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
> index 04aa2a91f851..29525c0b9939 100644
> --- a/drivers/of/of_private.h
> +++ b/drivers/of/of_private.h
> @@ -9,6 +9,7 @@
>   */
>
>  #define FDT_ALIGN_SIZE 8
> +#define MAX_RESERVED_REGIONS    64
>
>  /**
>   * struct alias_prop - Alias property in 'aliases' node
> @@ -180,7 +181,7 @@ static inline struct device_node *__of_get_dma_parent(const struct device_node *
>  #endif
>
>  int fdt_scan_reserved_mem(void);
> -void fdt_init_reserved_mem(void);
> +void __init fdt_scan_reserved_mem_reg_nodes(void);
>
>  bool of_fdt_device_is_available(const void *blob, unsigned long node);
>
> diff --git a/drivers/of/of_reserved_mem.c b/drivers/of/of_reserved_mem.c
> index 46e1c3fbc769..2011174211f9 100644
> --- a/drivers/of/of_reserved_mem.c
> +++ b/drivers/of/of_reserved_mem.c
> @@ -27,7 +27,6 @@
>
>  #include "of_private.h"
>
> -#define MAX_RESERVED_REGIONS   64
>  static struct reserved_mem reserved_mem[MAX_RESERVED_REGIONS];
>  static int reserved_mem_count;
>
> @@ -56,6 +55,7 @@ static int __init early_init_dt_alloc_reserved_memory_arch(phys_addr_t size,
>         return err;
>  }
>
> +static void __init fdt_init_reserved_mem_node(struct reserved_mem *rmem);
>  /*
>   * fdt_reserved_mem_save_node() - save fdt node for second pass initialization
>   */
> @@ -74,6 +74,9 @@ static void __init fdt_reserved_mem_save_node(unsigned long node, const char *un
>         rmem->base = base;
>         rmem->size = size;
>
> +       /* Call the region specific initialization function */
> +       fdt_init_reserved_mem_node(rmem);
> +
>         reserved_mem_count++;
>         return;
>  }
> @@ -106,7 +109,6 @@ static int __init __reserved_mem_reserve_reg(unsigned long node,
>         phys_addr_t base, size;
>         int len;
>         const __be32 *prop;
> -       int first = 1;
>         bool nomap;
>
>         prop = of_get_flat_dt_prop(node, "reg", &len);
> @@ -134,10 +136,6 @@ static int __init __reserved_mem_reserve_reg(unsigned long node,
>                                uname, &base, (unsigned long)(size / SZ_1M));
>
>                 len -= t_len;
> -               if (first) {
> -                       fdt_reserved_mem_save_node(node, uname, base, size);
> -                       first = 0;
> -               }
>         }
>         return 0;
>  }
> @@ -165,12 +163,77 @@ static int __init __reserved_mem_check_root(unsigned long node)
>         return 0;
>  }
>
> +static void __init __rmem_check_for_overlap(void);
> +
> +/**
> + * fdt_scan_reserved_mem_reg_nodes() - Store info for the "reg" defined
> + * reserved memory regions.
> + *
> + * This function is used to scan through the DT and store the
> + * information for the reserved memory regions that are defined using
> + * the "reg" property. The region node number, name, base address, and
> + * size are all stored in the reserved_mem array by calling the
> + * fdt_reserved_mem_save_node() function.
> + */
> +void __init fdt_scan_reserved_mem_reg_nodes(void)
> +{
> +       int t_len = (dt_root_addr_cells + dt_root_size_cells) * sizeof(__be32);
> +       const void *fdt = initial_boot_params;
> +       phys_addr_t base, size;
> +       const __be32 *prop;
> +       int node, child;
> +       int len;
> +
> +       if (!fdt)
> +               return;
> +
> +       node = fdt_path_offset(fdt, "/reserved-memory");
> +       if (node < 0) {
> +               pr_info("Reserved memory: No reserved-memory node in the DT\n");
> +               return;
> +       }
> +
> +       if (__reserved_mem_check_root(node)) {
> +               pr_err("Reserved memory: unsupported node format, ignoring\n");
> +               return;
> +       }
> +
> +       fdt_for_each_subnode(child, fdt, node) {
> +               const char *uname;
> +
> +               prop = of_get_flat_dt_prop(child, "reg", &len);
> +               if (!prop)
> +                       continue;
> +               if (!of_fdt_device_is_available(fdt, child))
> +                       continue;
> +
> +               uname = fdt_get_name(fdt, child, NULL);
> +               if (len && len % t_len != 0) {
> +                       pr_err("Reserved memory: invalid reg property in '%s', skipping node.\n",
> +                              uname);
> +                       continue;
> +               }
> +               base = dt_mem_next_cell(dt_root_addr_cells, &prop);
> +               size = dt_mem_next_cell(dt_root_size_cells, &prop);
> +
> +               if (size)
> +                       fdt_reserved_mem_save_node(child, uname, base, size);
> +       }
> +
> +       /* check for overlapping reserved regions */
> +       __rmem_check_for_overlap();
> +}
> +
> +static int __init __reserved_mem_alloc_size(unsigned long node, const char *uname);
> +
>  /*
>   * fdt_scan_reserved_mem() - scan a single FDT node for reserved memory
>   */
>  int __init fdt_scan_reserved_mem(void)
>  {
>         int node, child;
> +       int dynamic_nodes_cnt = 0;
> +       int dynamic_nodes[MAX_RESERVED_REGIONS];
>         const void *fdt = initial_boot_params;
>
>         node = fdt_path_offset(fdt, "/reserved-memory");
> @@ -192,8 +255,24 @@ int __init fdt_scan_reserved_mem(void)
>                 uname = fdt_get_name(fdt, child, NULL);
>
>                 err = __reserved_mem_reserve_reg(child, uname);
> -               if (err == -ENOENT && of_get_flat_dt_prop(child, "size", NULL))
> -                       fdt_reserved_mem_save_node(child, uname, 0, 0);
> +               /*
> +                * Save the nodes for the dynamically-placed regions
> +                * into an array which will be used for allocation right
> +                * after all the statically-placed regions are reserved
> +                * or marked as no-map. This is done to avoid dynamically
> +                * allocating from one of the statically-placed regions.
> +                */
> +               if (err == -ENOENT && of_get_flat_dt_prop(child, "size", NULL)) {
> +                       dynamic_nodes[dynamic_nodes_cnt] = child;
> +                       dynamic_nodes_cnt++;
> +               }
> +       }
> +       for (int i = 0; i < dynamic_nodes_cnt; i++) {
> +               const char *uname;
> +
> +               child = dynamic_nodes[i];
> +               uname = fdt_get_name(fdt, child, NULL);
> +               __reserved_mem_alloc_size(child, uname);
>         }
>         return 0;
>  }
> @@ -253,8 +332,7 @@ static int __init __reserved_mem_alloc_in_range(phys_addr_t size,
>   * __reserved_mem_alloc_size() - allocate reserved memory described by
>   *     'size', 'alignment'  and 'alloc-ranges' properties.
>   */
> -static int __init __reserved_mem_alloc_size(unsigned long node,
> -       const char *uname, phys_addr_t *res_base, phys_addr_t *res_size)
> +static int __init __reserved_mem_alloc_size(unsigned long node, const char *uname)
>  {
>         int t_len = (dt_root_addr_cells + dt_root_size_cells) * sizeof(__be32);
>         phys_addr_t start = 0, end = 0;
> @@ -334,9 +412,8 @@ static int __init __reserved_mem_alloc_size(unsigned long node,
>                 return -ENOMEM;
>         }
>
> -       *res_base = base;
> -       *res_size = size;
> -
> +       /* Save region in the reserved_mem array */
> +       fdt_reserved_mem_save_node(node, uname, base, size);
>         return 0;
>  }
>
> @@ -425,48 +502,37 @@ static void __init __rmem_check_for_overlap(void)
>  }
>
>  /**
> - * fdt_init_reserved_mem() - allocate and init all saved reserved memory regions
> + * fdt_init_reserved_mem_node() - Initialize a reserved memory region
> + * @rmem: reserved_mem struct of the memory region to be initialized.
> + *
> + * This function is used to call the region specific initialization
> + * function for a reserved memory region.
>   */
> -void __init fdt_init_reserved_mem(void)
> +static void __init fdt_init_reserved_mem_node(struct reserved_mem *rmem)
>  {
> -       int i;
> -
> -       /* check for overlapping reserved regions */
> -       __rmem_check_for_overlap();
> -
> -       for (i = 0; i < reserved_mem_count; i++) {
> -               struct reserved_mem *rmem = &reserved_mem[i];
> -               unsigned long node = rmem->fdt_node;
> -               int err = 0;
> -               bool nomap;
> +       unsigned long node = rmem->fdt_node;
> +       int err = 0;
> +       bool nomap;
>
> -               nomap = of_get_flat_dt_prop(node, "no-map", NULL) != NULL;
> +       nomap = of_get_flat_dt_prop(node, "no-map", NULL) != NULL;
>
> -               if (rmem->size == 0)
> -                       err = __reserved_mem_alloc_size(node, rmem->name,
> -                                                &rmem->base, &rmem->size);
> -               if (err == 0) {
> -                       err = __reserved_mem_init_node(rmem);
> -                       if (err != 0 && err != -ENOENT) {
> -                               pr_info("node %s compatible matching fail\n",
> -                                       rmem->name);
> -                               if (nomap)
> -                                       memblock_clear_nomap(rmem->base, rmem->size);
> -                               else
> -                                       memblock_phys_free(rmem->base,
> -                                                          rmem->size);
> -                       } else {
> -                               phys_addr_t end = rmem->base + rmem->size - 1;
> -                               bool reusable =
> -                                       (of_get_flat_dt_prop(node, "reusable", NULL)) != NULL;
> -
> -                               pr_info("%pa..%pa (%lu KiB) %s %s %s\n",
> -                                       &rmem->base, &end, (unsigned long)(rmem->size / SZ_1K),
> -                                       nomap ? "nomap" : "map",
> -                                       reusable ? "reusable" : "non-reusable",
> -                                       rmem->name ? rmem->name : "unknown");
> -                       }
> -               }
> +       err = __reserved_mem_init_node(rmem);
> +       if (err != 0 && err != -ENOENT) {
> +               pr_info("node %s compatible matching fail\n", rmem->name);
> +               if (nomap)
> +                       memblock_clear_nomap(rmem->base, rmem->size);
> +               else
> +                       memblock_phys_free(rmem->base, rmem->size);
> +       } else {
> +               phys_addr_t end = rmem->base + rmem->size - 1;
> +               bool reusable =
> +                       (of_get_flat_dt_prop(node, "reusable", NULL)) != NULL;
> +
> +               pr_info("%pa..%pa (%lu KiB) %s %s %s\n",
> +                       &rmem->base, &end, (unsigned long)(rmem->size / SZ_1K),
> +                       nomap ? "nomap" : "map",
> +                       reusable ? "reusable" : "non-reusable",
> +                       rmem->name ? rmem->name : "unknown");
>         }
>  }
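
As an aid for reading fdt_scan_reserved_mem_reg_nodes() in the quoted
diff, the following is a user-space sketch in C of its "reg" handling:
check that the property length is a whole number of (address, size)
entries, then read the first base and size as big-endian cells. The
helpers next_cells() and parse_first_reg() are hypothetical and only
modelled on what dt_mem_next_cell() is used for here. Unlike the quoted
code, the sketch also rejects an empty property; that stricter check is
a choice made for this example, not something the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read `cells` big-endian 32-bit cells from *p and advance the cursor. */
static uint64_t next_cells(int cells, const uint8_t **p)
{
	uint64_t v = 0;

	for (int i = 0; i < cells * 4; i++)
		v = (v << 8) | *(*p)++;
	return v;
}

/*
 * Parse the first (base, size) pair of a "reg" property of `len` bytes.
 * Returns 0 on success, or -1 if the cell counts are unsupported or the
 * length is not a non-zero multiple of one entry.
 */
static int parse_first_reg(const uint8_t *prop, int len,
			   int addr_cells, int size_cells,
			   uint64_t *base, uint64_t *size)
{
	int t_len;

	if (addr_cells < 1 || addr_cells > 2 ||
	    size_cells < 1 || size_cells > 2)
		return -1;

	t_len = (addr_cells + size_cells) * 4; /* bytes per entry */
	if (len < t_len || len % t_len != 0)
		return -1;

	*base = next_cells(addr_cells, &prop);
	*size = next_cells(size_cells, &prop);
	return 0;
}
```

For example, assuming two address cells and two size cells, the 16 bytes
00 00 00 00 58 00 00 00 00 00 00 00 08 00 00 00 decode to base
0x58000000 and size 0x08000000, the range printed for pma_resv0 in the
log above.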

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds




