On Fri, Mar 29, 2024 at 7:56 PM Maximilian Luz <luzmaximilian@xxxxxxxxx> wrote: > > > > On 3/29/24 7:53 PM, Maximilian Luz wrote: > > On 3/29/24 11:22 AM, Bartosz Golaszewski wrote: > >> On Thu, Mar 28, 2024 at 7:55 PM Bartosz Golaszewski <brgl@xxxxxxxx> wrote: > >>> > >>> On Thu, Mar 28, 2024 at 5:50 PM Maximilian Luz <luzmaximilian@xxxxxxxxx> wrote: > >>>> > >>>> If I understand correctly, it enters an atomic section in > >>>> qcom_tzmem_alloc() and then tries to schedule somewhere down the line. > >>>> So this shouldn't be qseecom specific. > >>>> > >>>> I should probably also say that I'm currently testing this on a patched > >>>> v6.8 kernel, so there's a chance that it's my fault. However, as far as > >>>> I understand, it enters an atomic section in qcom_tzmem_alloc() and then > >>>> later tries to expand the pool memory with dma_alloc_coherent(). Which > >>>> AFAIK is allowed to sleep with GFP_KERNEL (and I guess that that's the > >>>> issue here). > >>>> > >>>> I've also tried the shmem allocator option, but that seems to get stuck > >>>> quite early at boot, before I even have usb-serial access to get any > >>>> logs. If I can find some more time, I'll try to see if I can get some > >>>> useful output for that. > >>>> > >>> > >>> Ah, I think it happens here: > >>> > >>> + guard(spinlock_irqsave)(&pool->lock); > >>> + > >>> +again: > >>> + vaddr = gen_pool_alloc(pool->genpool, size); > >>> + if (!vaddr) { > >>> + if (qcom_tzmem_try_grow_pool(pool, size, gfp)) > >>> + goto again; > >>> > >>> We were called with GFP_KERNEL so this is what we pass on to > >>> qcom_tzmem_try_grow_pool() but we're now holding the spinlock. I need > >>> to revisit it. Thanks for the catch! > >>> > >>> Bart > >> > >> Can you try the following tree? > >> > >> https://git.codelinaro.org/bartosz_golaszewski/linux.git > >> topic/shm-bridge-v10 > >> > >> gen_pool_alloc() and gen_pool_add_virt() can be used without external > >> serialization. We only really need to protect the list of areas in the > >> pool when adding a new element. We could possibly even use > >> list_add_tail_rcu() as it updates the pointers atomically and go > >> lockless. > > > > Thanks! That fixes the allocations for CONFIG_QCOM_TZMEM_MODE_GENERIC=y. > > Unfortunately, with the shmbridge mode it still gets stuck at boot (and > > I haven't had the time to look into it yet). > > > > And for more bad news: It looks like the new allocator now fully exposes > > a bug that I've been tracking down the last couple of days. In short, > > uefisecapp doesn't seem to be happy when we split the allocations for > > request and response into two, causing commands to fail. Instead it > > wants a single buffer for both. Before, it seemed to be fairly sporadic > > (likely because kzalloc in sequence just returned consecutive memory > > almost all of the time) but now it's basically every call that fails. > > > > I have a fix for that almost ready and I'll likely post it in the next > > hour. But that means that you'll probably have to rebase this series > > on top of it... > > Forgot to mention: I tested it with the fix and this series, and that > works. > Both with and without SHM bridge? If so, please Cc me on the fix. Bart