Hi, Thomas > > On 2023-07-08 23:29:58+0800, Zhangjin Wu wrote: > > As the environ and _auxv support added for nolibc, the assembly _start > > function becomes more and more complex and therefore makes the porting > > of nolibc to new architectures harder and harder. > > > > To simplify portability, this c version of _start_c() is added to do > > most of the assembly start operations in C, which reduces the complexity > > a lot and will eventually simplify the porting of nolibc to the new > > architectures. > > > > The new _start_c() only requires a stack pointer argument, it will find > > argv, envp and _auxv for us, and then call main(), finally, it exit() > > with main's return status. With this new _start_c(), the future new > > architectures only require to add very few assembly instructions. > > I like it! > > A quick test indicates that the initialization of the stackprotectors > could also be moved into the C function. > Cool, do you mean directly call __stack_chk_init() at the beginning of _start_c()? > It also seems like a good opportunity to add some tests for > argv/environment variable passing. Yes, and even further, we can do more on auxv, just like musl does in src/env/__libc_start_main.c, not that urgent currently: libc.auxv = auxv = (void *)(envp+i+1); ... __hwcap = aux[AT_HWCAP]; if (aux[AT_SYSINFO]) __sysinfo = aux[AT_SYSINFO]; ... libc.page_size = aux[AT_PAGESZ]; if (!pn) pn = (void*)aux[AT_EXECFN]; if (!pn) pn = ""; __progname = __progname_full = pn; for (i=0; pn[i]; i++) if (pn[i]=='/') __progname = pn+i+1; __init_tls(aux); __init_ssp((void *)aux[AT_RANDOM]); if (aux[AT_UID]==aux[AT_EUID] && aux[AT_GID]==aux[AT_EGID] && !aux[AT_SECURE]) return; ... libc.secure = 1; > > And as general note to the full series I think that splitting the arch > files is not necessary and confusing. > Ok, welcome to discuss more in this thread: https://lore.kernel.org/lkml/20230710072340.10798-1-falcon@xxxxxxxxxxx/ and let's choose a better method as possible as we can, Just replied Willy to explain more. > > Signed-off-by: Zhangjin Wu <falcon@xxxxxxxxxxx> > > --- > > tools/include/nolibc/crt.h | 44 ++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 44 insertions(+) > > > > diff --git a/tools/include/nolibc/crt.h b/tools/include/nolibc/crt.h > > index 221b7c5346ca..b269294e9664 100644 > > --- a/tools/include/nolibc/crt.h > > +++ b/tools/include/nolibc/crt.h > > @@ -13,4 +13,48 @@ > > char **environ __attribute__((weak)); > > The old code seems to avoid putting "environ" into the global symbol > namespace. Could this declaration be moved into the function like in > getenv()? > ok, do you mean just move it to stdlib.h like this? I moved _auxv (used by getauxv()) to stdlib.h too: tools/nolibc: move environ and _auxv from crt.h to stdlib.h Move the definitions of environ and _auxv from crt.h to stdlib.h, where the place who uses those definitions. - getenv uses environ - getauxv uses _auxcv Signed-off-by: Zhangjin Wu <falcon@xxxxxxxxxxx> diff --git a/tools/include/nolibc/crt.h b/tools/include/nolibc/crt.h index b269294e9664..d2f84cbe73d0 100644 --- a/tools/include/nolibc/crt.h +++ b/tools/include/nolibc/crt.h @@ -10,14 +10,13 @@ #include "compiler.h" #include "crt_arch.h" -char **environ __attribute__((weak)); -const unsigned long *_auxv __attribute__((weak)); - int main(int argc, char *argv[], char **envp); static void exit(int); void _start_c(long *sp) { + extern char **environ; + extern const unsigned long *_auxv; int argc, i; char **argv; char **envp; diff --git a/tools/include/nolibc/stdlib.h b/tools/include/nolibc/stdlib.h index 2f9b4b3c6d26..5eadadc2d0f5 100644 --- a/tools/include/nolibc/stdlib.h +++ b/tools/include/nolibc/stdlib.h @@ -14,6 +14,9 @@ #include "string.h" #include <linux/auxvec.h> +char **environ __attribute__((weak)); +const unsigned long *_auxv __attribute__((weak)); + struct nolibc_heap { size_t len; char user_p[] __attribute__((__aligned__)); > > const unsigned long *_auxv __attribute__((weak)); > > > > +int main(int argc, char *argv[], char **envp); > > This will lead to conflicting declarations if the users use a different > signature. I'm not (yet?) sure how to work around this. > Ah yes, I forgot this critical case, people may use something like: int main(void) int main(int argc, char *argv[]) Just applied the method suggested by you in anothe reply [1]: diff --git a/tools/include/nolibc/crt.h b/tools/include/nolibc/crt.h index d2f84cbe73d0..8fe38ef8c5b2 100644 --- a/tools/include/nolibc/crt.h +++ b/tools/include/nolibc/crt.h @@ -10,7 +10,7 @@ #include "compiler.h" #include "crt_arch.h" -int main(int argc, char *argv[], char **envp); +typedef int (_nolibc_main_fn)(int, char **, char **); static void exit(int); void _start_c(long *sp) @@ -20,6 +20,7 @@ void _start_c(long *sp) int argc, i; char **argv; char **envp; + _nolibc_main_fn _nolibc_main __asm__ ("main"); /* * sp : argc <-- argument count, required by main() @@ -53,7 +54,7 @@ void _start_c(long *sp) _auxv = (void *)(envp + i + 1); /* go to application */ - exit(main(argc, argv, envp)); + exit(_nolibc_main(argc, argv, envp)); } #endif /* _NOLIBC_CRT_H */ It works as expected, thanks very much! > Also how is the case handled where main() returns "void"? Do you mean the main() without an explicit "return 0"? int main(int argc, char *argv[], char **envp) { printf("Hello\n"); } Tested c89, c99, c2x, c23, gnu11, the same result as yours: std | return value ----------|------------ c89 | 6 /* return value of printf() */ c99 | 0 /* set by compiler with: xorl %eax, %eax */ c2x | 0 gnu11 | 0 > I'm not sure how this is currently handled or if the compiler takes care > of returning 0 in this case. void main(void) { printf("Hello\n"); } Tested c89, c99, c2x, c23, gnu11, the same result as yours, the return value is 6, the return value is the return status of the last called functon. Thanks, Zhangjin --- [1]: https://lore.kernel.org/lkml/90fdd255-32f4-4caf-90ff-06456b53dac3@xxxxxxxx/