Re: [RFC PATCH] arm64: remove CONFIG_DEBUG_ALIGN_RODATA feature

On Thu, 2 Apr 2020 at 13:30, Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
>
> On Mon, Mar 30, 2020 at 04:32:31PM +0200, Ard Biesheuvel wrote:
> > On Mon, 30 Mar 2020 at 16:28, Will Deacon <will@xxxxxxxxxx> wrote:
> > > > On Mon, 30 Mar 2020 at 16:04, Will Deacon <will@xxxxxxxxxx> wrote:
> > > > > On Mon, Mar 30, 2020 at 03:53:04PM +0200, Ard Biesheuvel wrote:
> > > > > > On Mon, 30 Mar 2020 at 15:51, Will Deacon <will@xxxxxxxxxx> wrote:
> > > > > > > But I would really like to go a step further and rip out the block mapping
> > > > > > > support altogether so that we can fix non-coherent DMA aliases:
> > > > > > >
> > > > > > > https://lore.kernel.org/lkml/20200224194446.690816-1-hch@xxxxxx
> > > > > >
> > > > > > I'm not sure I follow - is this about mapping parts of the static
> > > > > > kernel Image for non-coherent DMA?
> > > > >
> > > > > Sorry, it's not directly related to your patch, just that if we're removing
> > > > > options relating to kernel mappings then I'd be quite keen on effectively
> > > > > forcing page-granularity on the linear map, as is currently done by default
> > > > > thanks to RODATA_FULL_DEFAULT_ENABLED, so that we can nobble cacheable
> > > > > aliases for non-coherent streaming DMA mappings by hooking into Christoph's
> > > > > series above.
>
> Have we ever hit this issue in practice? At least from the CPU
> perspective, we've assumed that a non-cacheable access would not hit in
> the cache. Reading the ARM ARM rules, it doesn't seem to state this
> explicitly but we can ask for clarification (I dug out an email from
> 2015, left unanswered).
>

There is some wording in D4.4.5 (Behavior of caches at reset) that
suggests that implementations may permit cache hits in regions that
are mapped Non-cacheable, although the paragraph in question talks
about global controls and not page table attributes.

> Assuming that the CPU is behaving as we'd expect, are there other issues
> with peripherals/SMMU?
>

There is the NoSnoop PCIe issue as well: PCIe masters that are DMA
coherent in general can generate transactions with non-cacheable
attributes. I guess this is mostly orthogonal, but I'm sure it would
be much easier to reason about correctness if it is guaranteed that no
mappings with mismatched attributes exist anywhere.

> > > Fair enough, but I'd still like to see some numbers. If they're compelling,
> > > then we could explore something like CONFIG_OF_DMA_DEFAULT_COHERENT, but
> > > that doesn't really help the kconfig maze :(
>
> I'd prefer not to have a config option, we could easily break single
> Image at some point.
>
> > Could we make this a runtime thing? E.g., remap the entire linear
> > region down to pages under stop_machine() the first time we probe a
> > device that uses non-coherent DMA?
>
> That could be pretty expensive at run-time. With the ARMv8.4-TTRem
> feature, I wonder whether we could do this lazily when allocating
> non-coherent DMA buffers.
>
> (I still hope there isn't a problem at all with this mismatch ;)).
>

Now that we have the pieces to easily remap the linear region down to
pages, and [apparently] some generic infrastructure to manage the
linear aliases, the only downside is the alleged performance hit
resulting from increased TLB pressure. This is obviously highly
microarchitecture-dependent, but with Xgene1 and ThunderX1 out of the
picture, I wonder if the tradeoffs are different now. Maybe by now we
should just suck it up. (Note that, afaik, we have had no complaints
about the fact that we map the linear map down to pages by default now.)
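
For a rough sense of scale on the TLB pressure argument, here is a
back-of-the-envelope sketch. It is illustrative only: it assumes a 4 KiB
granule (so block mappings are 2 MiB), ignores the contiguous hint and
any TLB hierarchy, and the helper name tlb_entries_needed() is made up
for this example rather than taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes assumed for a 4 KiB translation granule. */
#define SZ_4K (UINT64_C(4) << 10)   /* page mapping */
#define SZ_2M (UINT64_C(2) << 20)   /* level 2 block mapping */
#define SZ_1G (UINT64_C(1) << 30)   /* example linear map span */

/*
 * Hypothetical helper: number of distinct translations of size
 * `granule` needed to cover `region` bytes, rounding up.
 */
static inline uint64_t tlb_entries_needed(uint64_t region, uint64_t granule)
{
	assert(granule != 0);
	return (region + granule - 1) / granule;
}
```

Under these assumptions, covering 1 GiB of linear map takes
tlb_entries_needed(SZ_1G, SZ_4K) = 262144 translations with pages,
versus tlb_entries_needed(SZ_1G, SZ_2M) = 512 with blocks. Whether that
factor of 512 shows up as a measurable slowdown is exactly the
microarchitecture-dependent part, hence the request for numbers.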


