On Tue, Jan 3, 2023, at 17:43, Yann Sionneau wrote: > This patch series adds support for the kv3-1 CPU architecture of the kvx family > found in the Coolidge (aka MPPA3-80) SoC of Kalray. > > This is an RFC, since kvx support is not yet upstreamed into gcc/binutils, > therefore this patch series cannot be merged into Linux for now. > > The goal is to have preliminary reviews and to fix problems early. > > The Kalray VLIW processor family (kvx) has the following features: > * 32/64 bits execution mode > * 6-issue VLIW architecture > * 64 x 64bits general purpose registers > * SIMD instructions > * little-endian > * deep learning co-processor Thanks for posting these, I had been wondering about the state of the port. Overall this looks really nice, I can see that you and the team have looked at other ports and generally made the right decisions. I commented on the syscall patch directly, I think it's important to stop using the deprecated syscalls as soon as possible to avoid having dependencies in too many libc binaries. Almost everything else can be changed easily as you get closer to upstream inclusion. I did not receive most of the other patches as I'm not subscribed to all the mainline lists. For future submissions, can you add the linux-arch list to Cc for all patches? Reading the rest of the series through lore.kernel.org, most of the comments I have are for improvements that you may find valuable rather than serious mistakes: - the {copy_to,copy_from,clear}_user functions are well worth optimizing better than the byte-at-a-time version you have, even just a C version built around your __get_user/__put_user inline asm should help, and could be added to lib/usercopy.c. - The __raw_{read,write}{b,w,l,q} helpers should normally be defined as inline asm instead of volatile pointer dereferences, I've seen cases where the compiler ends up splitting the access or does other things you may not want on MMIO areas. - I would recomment implementing HAVE_ARCH_VMAP_STACK as well as IRQ stacks, both of these help to avoid data corruption from stack overflow that you will eventually run into. - You use qspinlock as the only available spinlock implementation, but only support running on a single cluster of 16 cores. It may help to use the generic ticket spinlock instead, or leave it as a Kconfig option, in particular since you only have the emulated xchg16() atomic for qspinlock. - Your defconfig file enables CONFIG_EMBEDDED, which in turn enables CONFIG_EXPERT. This is probably not what you want, so better turn off both of these. - The GENERIC_CALIBRATE_DELAY should not be necessary since you have a get_cycles() based delay loop. Just set loops_per_jiffy to the correct value based on the frequency of the cycle counter, to save a little time during boot and get a more accurate delay loop. Arnd