On Mon, Dec 2, 2019 at 9:08 PM John Stultz <john.stultz@xxxxxxxxxx> wrote:
>
> On Wed, Sep 11, 2019 at 11:26 AM Nicolas Saenz Julienne
> <nsaenzjulienne@xxxxxxx> wrote:
> > So far all arm64 devices have supported 32 bit DMA masks for their
> > peripherals. This is not true anymore for the Raspberry Pi 4 as most of
> > it's peripherals can only address the first GB of memory on a total of
> > up to 4 GB.
> >
> > This goes against ZONE_DMA32's intent, as it's expected for ZONE_DMA32
> > to be addressable with a 32 bit mask. So it was decided to re-introduce
> > ZONE_DMA in arm64.
> >
> > ZONE_DMA will contain the lower 1G of memory, which is currently the
> > memory area addressable by any peripheral on an arm64 device.
> > ZONE_DMA32 will contain the rest of the 32 bit addressable memory.
> >
> > Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@xxxxxxx>
> > Reviewed-by: Catalin Marinas <catalin.marinas@xxxxxxx>
>
> Hey Nicolas,
>   Testing the db845c with linus/master, I found a regression causing
> system hangs in early boot:
>
> [    0.000000] Booting Linux on physical CPU 0x0000000000 [0x517f803c]
> [    0.000000] Linux version 5.4.0-mainline-10675-g957a03b9e38f
> (docker@a4ec90a1e72c) (gcc version 7.4.0 (Ubuntu/Linaro
> 7.4.0-1ubuntu1~18.04.1)) #1209 SMP PREEMPT Tue Dec 3 00:23:15 UTC 2019
> [    0.000000] Machine model: Thundercomm Dragonboard 845c
> [    0.000000] earlycon: qcom_geni0 at MMIO 0x0000000000a84000
> (options '115200n8')
> [    0.000000] printk: bootconsole [qcom_geni0] enabled
> [    0.000000] efi: Getting EFI parameters from FDT:
> [    0.000000] efi: UEFI not found.
> [    0.000000] cma: Reserved 16 MiB at 0x00000000ff000000
> [    0.000000] psci: probing for conduit method from DT.
> [    0.000000] psci: PSCIv1.1 detected in firmware.
> [    0.000000] psci: Using standard PSCI v0.2 function IDs
> [    0.000000] psci: MIGRATE_INFO_TYPE not supported.
> [    0.000000] psci: SMC Calling Convention v1.0
> [    0.000000] psci: OSI mode supported.
> [    0.000000] percpu: Embedded 31 pages/cpu s87512 r8192 d31272 u126976
> [    0.000000] Detected VIPT I-cache on CPU0
> [    0.000000] CPU features: detected: GIC system register CPU interface
> [    0.000000] CPU features: kernel page table isolation forced ON by KASLR
> [    0.000000] CPU features: detected: Kernel page table isolation (KPTI)
> [    0.000000] ARM_SMCCC_ARCH_WORKAROUND_1 missing from firmware
> [    0.000000] CPU features: detected: Hardware dirty bit management
> [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: -188245
> [    0.000000] Kernel command line: earlycon
> firmware_class.path=/vendor/firmware/ androidboot.hardware=db845c
> init=/init androidboot.boot_devices=soc/1d84000.ufshc
> printk.devkmsg=on buildvariant=userdebug root=/dev/sda2
> androidboot.bootdevice=1d84000.ufshc androidboot.serialno=c4e1189c
> androidboot.baseband=sda
> msm_drm.dsi_display0=dsi_lt9611_1080_video_display:
> androidboot.slot_suffix=_a skip_initramfs rootwait ro init=/init
>
> <hangs indefinitely here>
>
> I bisected the issue down to this patch (1a8e1cef7603 upstream - the
> previous patch a573cdd7973d works though I need to apply the
> arm64_dma_phys_limit bit from this one as the previous patch doesn't
> build on its own).
>
> In the above log:
> [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: -188245
> looks the most suspect, and going back to the working a573cdd7973d +
> build fix I see:
> [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 957419
>
> Do you have any suggestions for what might be going wrong?

Digging further, it seems the error is found in calculate_node_totalpages()

    real_size = size - zone_absent_pages_in_node(pgdat->node_id, i,
                                                  node_start_pfn,
                                                  node_end_pfn,
                                                  zholes_size);

Where for zone DMA32 size is 262144, but real_size is calculated as -883520.

I've not traced through to figure out why zone_absent_pages_in_node is
coming up with such a large number yet, but I'm about to crash so I
wanted to share.

thanks
-john
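
As a cross-check on the two DMA32 numbers quoted above, the arithmetic can
be replayed outside the kernel. This is only a standalone sketch, not
kernel code: the helper names are hypothetical, and it assumes the
reported values (size 262144, real_size -883520) are exact. Under those
assumptions, zone_absent_pages_in_node() would have had to return 1145664
for DMA32, which is more than four times the zone's size.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not a kernel function: mirrors the subtraction
 * "real_size = size - absent" from the report, using signed arithmetic
 * so a negative result is visible directly.
 */
static long zone_real_size(long size, long absent)
{
    return size - absent;
}

/*
 * Hypothetical helper: work backwards from the reported size and
 * real_size to the absent-page count that must have been subtracted.
 */
static long implied_absent_pages(long size, long real_size)
{
    return size - real_size;
}

/* Print the values for the DMA32 zone as quoted in the report. */
static void print_dma32_numbers(void)
{
    long size = 262144;       /* DMA32 size, from the report */
    long real_size = -883520; /* DMA32 real_size, from the report */
    long absent = implied_absent_pages(size, real_size);

    printf("implied absent pages: %ld\n", absent);
    printf("recomputed real_size: %ld\n", zone_real_size(size, absent));
}
```

Calling print_dma32_numbers() from a small main() prints the implied
absent count and confirms that subtracting it from size gives back the
reported real_size. An absent count larger than the zone itself suggests
the hole calculation is being fed a range that does not match the zone
boundaries, though the sketch itself cannot show where that happens.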