Hi arnd, On Wed, Mar 21, 2018 at 03:36:43PM +0800, Arnd Bergmann wrote: > If the clocksource depends on a driver rather than a feature of the > architecture, > this may not be worth optimizing though, so maybe leave it as it is for now. > Ok, I'll keep it. > >> Usually the kernel should allow multiple CPU types to be selected > >> together, or ask for a "minimum architecture" level to be selected > >> by allow newer cores to be used as a superset. > > No, I need keep them seperate. > > Can you explain? What is it that makes them all incompatible? ck610 is abiv1 and its gcc is different from abiv2, they are: - csky-linux-abiv1-gcc - csky-linux-abiv2-gcc ck807/810/860 use the same csky-linux-abiv2-gcc, but their instruction-sets and pipeline schedule are different. So our gcc must&only use '-mcpu=' to determine which cpu series is. They are different cpu series. ck610 only have: ck610 ck807 could have: ck807 ck807f ck807vf ck807ef ck810 could have: ck810 ck810f ck810vf ck810ef ck860 could have: ck860 ck860f ck860vf f: means FPU co-processor v: means VDSP co-processor just like "ARM-NEON" e: is our old DSP co-processor which use HI-LO regs for operation. In current ck807/ck810 they default have HI-LO regs, so ck807&ck807e is the same for me. For this patch-set, we support: ck610 (ck807/ck807f/ck807ef) (ck810/ck810e/ck810ef) > >> > +config CPU_TLB_SIZE > >> > + int > >> > + default "128" if(CPU_CK610 || CPU_CK807 || CPU_CK810) > >> > + default "1024" if(CPU_CK860) > >> > + > >> > +config L1_CACHE_SHIFT > >> > + int > >> > + default "4" if(CPU_CK610) > >> > + default "5" if(CPU_CK807 || CPU_CK810) > >> > + default "6" if(CPU_CK860) > >> > >> I think you then need to reverse the order of the list here: When e.g. CK860 > >> and CK810 are both enabled, L1_CACHE_SHIFT should be the largest > >> possible size. > > No, I use L1_CACHE_SHIFT to determine the size of cache_line. > > When I flush cache for a range of memory, I need the size to loop flush cache line. > > This is still relatively easy to fix, you just need a cpu specific loop > that uses the actual line size rather than the maximum size. Here is my cacheflush code in mm/cachev2.c: #define L1_CACHE_BYTES (1<<L1_CACHE_SHIFT) for(i=start; i<end; i+=L1_CACHE_BYTES) asm volatile("dcache.cval1 %0\n"::"r"(i)); asm volatile("sync.is\n"); I use L1_CACHE_BYTES as the loop element to increase. So it must be the current CPU cache line size for "correct&performance". Each of our CPU-series has a fixed cache line size: ck610 is 16Bytes ck807/ck810 is 32Bytes ck860 is 64Bytes So I don't need determine them in .dts or detect on boot, just define them in Kconfig. > >> > +config SSEG0_BASE > >> > + hex "Direct mapping physical address" > >> > + default 0x0 > >> > + help > >> > + There are MSAx regs can be used to change the base physical address > >> > + of direct mapping. The default base physical address is 0x0. > >> > + > >> > +config RAM_BASE > >> > + hex "DRAM base address offset from SSEG0_BASE, it must be the same with dts memory." > >> > + default 0x08000000 > >> > >> To allow one kernel to run on multiple boards, it's better to detect > >> these two at runtime. > > CK-CPUs have a mips-like direct-mapping, and I use the macros to calculate the virtual-addr > > in headers. > > On many architectures, we detect the offsets at boot time and pass > them as variables. On > ARM, we go as far as patching the kernel at boot time to have constant > offsets, but usually > it's not worth the effort. I know it's duplicate setting with dts for users. But now, I still want to keep them. I'll consider your advices in future. > >> > +config CSKY_NR_IRQS > >> > + int "NR_IRQS to max virtual interrupt numbers of the whole system" > >> > + range 64 8192 > >> > + default "128" > >> > +endmenu > >> > >> This should no longer be needed, with the IRQ domain code, any number > >> of interrupts > >> can be used without noticeable overhead. > > Not I use it, some of our users need it to expand the GPIO irqs. Because > > they don't use irq domain code properly. I move it to Kconfig.debug, OK? > > It sounds like your GPIO driver should get fixed to use irq domains right, > it should not be too hard. The number of GPIOs is typically a compile > time constant today, but we also try to turn it into a dynamic allocation > that we have for IRQs on most targets. Got it, remove the CSKY_NR_IRQS. > What I meant here is that you can get the same behavior by > appending the dtb to the kernel rather than linking it into the > kernel. The reason for preferring the appended one is that you > can more easily use the same kernel binary across boards with > different bootloaders. Ok, got it. I'll use the dtb passed by bootloader first. > I'd say it's better to get rid of it for the upstream port, more importantly > getting rid of the code that checks for this symbol. Usually what happens > with version checks like this one is that they get out of sync quickly > as a new kernel version does things differently and diverges more > from the old release you were comparing against. In device drivers, > we tend to remove all those checks. Ok, follow the rules. > > >> -fno-tree-dse? > > This is from "gcc-4.5 compile linux-4.7" and it will cause wrong code without > > -fno-tree-dse for list.h. Now we use gcc-6.3, so I will try to remove it. > > You can also use the cc-ifversion Makefile macro to apply it on > the old compiler. That way you can still use gcc-4.5 as long as > that remains relevant but don't get worse code generation on > modern versions. Thx for advice, but we don't need support gcc-4.5 now. I'll just remove them. > >> Can you explain why you need the alignement fixups? > > For abiv1 ck610 couldn't handle the unalignment address access, so we > > need soft-alignment exception to fixup. There is no problem in abiv2 cpus. > > Ok. Generally speaking, architectures that don't allow unaligned access > should have all code built in a way that uses aligned access (gcc normally > falls back to byte access when it encounters an unaligned pointer at > compile time), but if this is just for old CPUs that are not used in future > products, having the fixup does sound simpler, as it allows you to still > run new binaries on the old machines. I haven't looked at the implementation > for the fixup here, but I remember the same thing from the nds32 port. > In that case, we ended up keeping the fixup as an option for old > user space, but disabled to softalign fixups for kernel code. Can you do > the same thing here? Ok. I got it, I'll do the same as nds32. Best Regards Guo Ren