The big change was actually snps,aal. Per the TRM, DMA channels that are not address-aligned have severe limitations, if they work at all. Setting the DMA ops as address-aligned fixed my 30 Mbps TX issue when combined with your snps,txpbl = <0x4>.
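For reference, here is roughly what that combination looks like in the device tree. This is only a sketch: the `&gmac` label is an assumption and will differ per board, and only the two properties mentioned above are shown.

```dts
/* Hypothetical label: replace &gmac with your board's Ethernet controller node. */
&gmac {
	snps,aal;              /* boolean: use address-aligned beats for DMA */
	snps,txpbl = <0x4>;    /* TX programmable burst length */
};
```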
Honestly, I don't notice any difference either way with aal. So what happens without it? If you only set txpbl to 0x4 and remove thresh dma mode (just those two changes), do you still get bad TX?