On Tue, Jul 07, 2015 at 11:50:12AM +0200, Luis R. Rodriguez wrote: > mcgrof@ergon ~/linux-next (git::kill-mtrr)$ git grep ioremap_nocache drivers/| wc -l > 359 Yes, it's because we have: (a) LDD telling people they should be using ioremap_nocache() for mapping devices. (b) We have documentation in the Documentation/ subdirectory telling people to use ioremap_nocache() for the same. > This is part of the work I've been doing lately. The > eventual goal once we have the write-combing areas properly split with > ioremap_wc() and using the new proper preferred architecture agnostic modifier > (arch_phys_wc_add()) is to change the default ioremap behaviour on x86 to use > strong UC for PAT enabled systems for *both* ioremap() and ioremap_nocache(). Please note that on ARM, ioremap_wc() gives what's termed in ARM ARM speak "normal memory, non-cacheable" - which can be subject to speculation, write combining, multiple accesses, etc. The important point is that such mapping is not suitable for device registers, but is suitable for device regions that have "memory like" properties (iow, a chunk of RAM, like video drivers.) It does support unaligned accesses. > Because of these grammatical issues and the issues with > unaligned access with ARM I think its important we put some effort > to care a bit more about defining clear semantics through grammar > for new APIs or as we rewrite APIs. We have tools to do this these > days, best make use of them. I'm in support of anything which more clearly specifies the requirements for these APIs. > While we're at it and reconsidering all this, a few items I wish for > us to address as well then, most of them related to grammar, some > procedural clarification: > > * Document it as not supported to have overlapping ioremap() calls. > No one seems to have a clue if this should work, but clearly this > is just a bad idea. I don't see why we should support the complexity > of having this. It seems we can write grammar rules to prevent this. On ARM, we (probably) have a lot of cases where ioremap() is used multiple times for the same physical address space, so we shouldn't rule out having multiple mappings of the same type. However, differing types would be a problem on ARM. > * We seem to care about device drivers / kernel code doing unaligned > accesses with certain ioremap() variants. At least for ARM you should > not do unaligned accesses on ioremap_nocache() areas. ... and ioremap() areas. If we can stop the "abuse" of ioremap_nocache() to map device registers, then we could potentially switch ioremap_nocache() to be a normal-memory like mapping, which would allow it to support unaligned accesses. > I am not sure > if we can come up with grammar to vet for / warn for unaligned access > type of code in driver code on some memory area when some ioremap() > variant is used, but this could be looked into. I believe we may > want rules for unaligned access maybe in general, and not attached > to certain calls due to performance considerations, so this work > may be welcomed regardless (refer to > Documentation/unaligned-memory-access.txt) > > * We seem to want to be pedantic about adding new ioremap() variants, the > unaligned issue on ARM is one reason, do we ideally then want *all* > architecture maintainers to provide an Acked-by for any new ioremap > variants ? /If/ we get the current mess sorted out so that we have a safe fallback, and we have understanding of the different architecture variants (iow, documented what the safe fallback is) I don't see any reason why we'd need acks from arch maintainers. Unfortunately, we're not in that situation today, because of the poorly documented mess that ioremap*() currently is (and yes, I'm partly to blame for that too by not documenting ARMs behaviour here.) I have some patches (prepared last week, I was going to push them out towards the end of the merge window) which address that, but unfortunately the ARM autobuilders have been giving a number of seemingly random boot failures, and I'm not yet sure what's going on... so I'm holding that back until stuff has settled down. Another issue is... the use of memcpy()/memset() directly on memory returned from ioremap*(). The pmem driver does this. This fails sparse checks. However, years ago, x86 invented the memcpy_fromio()/memcpy_toio() memset_io() functions, which took a __iomem pointer (which /presumably/ means they're supposed to operate on the memory associated with an ioremap'd region.) Should these functions always be used for mappings via ioremap*(), and the standard memcpy()/memset() be avoided? To me, that sounds like a very good thing, because that gives us more control over the implementation of the functions used to access ioremap'd regions, and the arch can decide to prevent GCC inlining its own memset() or memcpy() code if desired. Note that on x86, these three functions are merely wrappers around standard memcpy()/memset(), so there should be no reason why pmem.c couldn't be updated to use these accessors instead. -- FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up according to speedtest.net. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>