Hi Peter,

> -----Original Message-----
> From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Sent: Thursday, February 14, 2019 1:32 PM
> To: Vineet Gupta <vineet.gupta1@xxxxxxxxxxxx>
> Cc: David Laight <David.Laight@xxxxxxxxxx>; Alexey Brodkin <alexey.brodkin@xxxxxxxxxxxx>;
> linux-snps-arc@xxxxxxxxxxxxxxxxxxx; Arnd Bergmann <arnd.bergmann@xxxxxxxxxx>;
> linux-kernel@xxxxxxxxxxxxxxx; stable@xxxxxxxxxxxxxxx; Mark Rutland <mark.rutland@xxxxxxx>
> Subject: Re: [PATCH] ARC: Explicitly set ARCH_SLAB_MINALIGN = 8
>
> On Wed, Feb 13, 2019 at 03:23:36PM -0800, Vineet Gupta wrote:
> > On 2/13/19 4:56 AM, Peter Zijlstra wrote:
> > >
> > > Personally I think u64 and company should already force natural
> > > alignment; but alas.
> >
> > But there is an ISA/ABI angle here too. e.g. On 32-bit ARC, LDD (load double) is
> > allowed to take a 32-bit aligned address to load a register pair. Thus all u64
> > need not be 64-bit aligned (unless attribute aligned 8 etc) hence the relaxation
> > in ABI (alignment of long long is 4). You could certainly argue that we end up
> > undoing some of it anyways by defining things like ARCH_KMALLOC_MINALIGN to 8, but
> > still...
>
> So what happens if the data is then split across two cachelines; will a
> STD vs LDD still be single-copy-atomic? I don't _think_ we rely on that
> for > sizeof(unsigned long), with the obvious exception of atomic64_t,
> but yuck...

STD and LDD are simple store/load instructions, so there is no problem if
their 64-bit data comes from two consecutive cache lines, or even from
two pages (if we're that unlucky). Or do you mean something else?

> So even though it is allowed by the chip; does it really make sense to
> use this?

It gives a performance benefit when dealing with 64-bit or larger
buffers; see how we use it in our string routines, for example here [1].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arc/lib/memset-archs.S#n81

-Alexey
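
A minimal sketch of the cache-line question discussed above, for readers who want to see the arithmetic. It only shows when an 8-byte access (such as one LDD/STD register pair) crosses a line boundary; it says nothing about whether the hardware performs that access atomically. The 64-byte line size, the helper names straddles_line() and demo_checks(), and the typedef u64_aligned8 are assumptions made for illustration and do not come from the kernel or from this thread. The aligned attribute is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed line size, for illustration only; not taken from this thread. */
#define ASSUMED_CACHE_LINE 64u

/* Hypothetical 64-bit type with alignment forced to 8, whatever the ABI default is. */
typedef uint64_t u64_aligned8 __attribute__((aligned(8)));

/* Return 1 if the object occupying [addr, addr + size) crosses a line boundary. */
static int straddles_line(uintptr_t addr, size_t size, size_t line)
{
	return (addr / line) != ((addr + size - 1) / line);
}

/* Worked cases for an 8-byte access. */
static void demo_checks(void)
{
	/* Only 4-byte aligned: bytes 60..67 span line 0 and line 1. */
	assert(straddles_line(60, 8, ASSUMED_CACHE_LINE) == 1);
	/* 8-byte aligned: bytes 56..63 stay inside line 0. */
	assert(straddles_line(56, 8, ASSUMED_CACHE_LINE) == 0);
	/* The forced-alignment typedef reports an alignment of 8. */
	assert(_Alignof(u64_aligned8) == 8);
}
```

Calling demo_checks() from any test harness exercises the three cases. The point of the sketch is that an 8-byte object that is naturally aligned can never cross a line whose size is a multiple of 8, while one that is only 4-byte aligned can.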