Hi, Willy > On Sun, Aug 13, 2023 at 09:26:20PM +0800, Zhangjin Wu wrote: > [...] > > With this exception, s390 no long need to provide its own mmap > > definition, it (seems i386 too, but it uses mmap2 currently) can simply > > define '__ARCH_WANT_SYS_OLD_MMAP' as the '__ARCH_WANT_SYS_OLD_SELECT' we > > are using for old_select. > > > > The same method applies to the selection of the different backward > > version of the sys_clone() syscall (from kernel/fork.c): > (...) > > > #ifdef __NR_clone > > #undef sys_clone > > #define __sys_clone(...) __sysdef(clone, __VA_ARGS__) > > > > static __attribute__((unused)) > > int sys_clone(unsigned long clone_flags, unsigned long newsp, > > int __attribute__((unused)) stack_size, > > int parent_tidptr, int child_tidptr, unsigned long tls) > > { > > long ret; > > #ifdef __ARCH_WANT_SYS_CLONE_BACKWARDS > > ret = __sys_clone(clone_flags, newsp, parent_tidptr, tls, child_tidptr); > > #elif defined(__ARCH_WANT_SYS_CLONE_BACKWARDS2) > > ret = __sys_clone(newsp, clone_flags, parent_tidptr, child_tidptr, tls); > > #elif defined(__ARCH_WANT_SYS_CLONE_BACKWARDS3) > > ret = __sys_clone(clone_flags, newsp, stack_size, parent_tidptr, child_tidptr, tls); > > #else > > ret = __sys_clone(clone_flags, newsp, parent_tidptr, child_tidptr, tls); > > #endif > > return ret; > > } > > #endif /* __NR_clone */ > > > > s390 only requires to define '__ARCH_WANT_SYS_CLONE_BACKWARDS2', no need > > to provide its own sys_fork() version, in the __NR_clone branch of > > fork(), __ARCH_WANT_SYS_CLONE_BACKWARDS2 can directly select the right > > version of sys_clone() for s390). > > Maybe but with much less #define indirections it would be significantly > better. > > (...) > > We only have these three exceptions currently, with this normalization, > > the library routines from sys.h can directly think sys_* macros are > > generic, if not, let syscall.h take care of the right exceptions. > > I see the point. But that doesn't remove the need to write the exported > function itself. I'm not saying there's nothing to save here, I see your > point, I'm just wondering if we really gain something in terms of ease > of declaring new syscalls especially for first-time contributors and if > we're not losing in maintenance. If at least it's easy enough to implement > exceptions, maybe it could be worth further investigating. > I will delay the whole work about __sysdef(), but work on some more generic parts (like the exceptions above) at first. > > > > static __attribute__((unused)) > > > > int dup2(int old, int new) > > > > { > > > > int ret = sys_dup3(old, new, 0); > > > > > > > > if (ret == -ENOSYS) > > > > ret = sys_dup2(old, new); > > > > > > > > return __sysret(ret); > > > > } > > > > > > But this will add a useless test after all such syscalls, we'd rather > > > not do that! > > > > > > > Indeed, I found this issue too, when __NR_dup3 not defined, it returns > > -ENOSYS, than, no size issue, otherwise, the compiler will not be able > > to learn what the ret of sys_dup3() will be, so, it can not optimize the > > second call to sys_dup2(). > > > > So, the '#ifdef' logic must be used like we did in sys_* functions, but > > it is really not that meaningful (no big gain as you mentioned above) if > > we only move them from the sys_* functions to the library routines. > > > > At last, I found the ternary operation together with the initialization > > of the not-defined __NR_* as NOLIBC__NR_NOSYS help this a lot, at last, > > we get something like this: > > > > /* __systry2() is used to select one of two provided low level syscalls */ > > #define __systry2(a, sys_a, sys_b) \ > > ((NOLIBC__NR_##a != NOLIBC__NR_NOSYS) ? (sys_a) : (sys_b)) > > But this supposes that all of them are manually defined as you did above. > I'd rather implement an ugly is_numeric() macro based on argument > resolution. I've done it once in another project, I don't remember > precisely where it is but I vaguely remember that it used to check > that the string resolution of the argument gave a letter (when it > does not exist) or a digit (when it does). I can look into that later > if needed. But please avoid extra macro definitions as much as possible, > they're a real pain to handle in the code. There's no error when one is > missing or has a typo, it's difficult to follow them and they don't > appear in the debugger. > Yeah, your reply inspired me to look into the IS_ENABLED() from ../include/linux/kconfig.h macro again, there was a __is_defined() there, let's throw away the ugly sysnr.h. I thought of IS_ENABLED() was only for y/n/m before, but it does return 0 when the macro is not defined, it uses the same trick in syscall() to calculate the number of arguments, if the macro is not defined, then, 0 "argument". > > It can eliminate all of the '#ifdef' stuffs, using the chmod example you > > mentioned above, it becomes something like this: > > > > /* > > * int chmod(const char *path, mode_t mode); > > */ > > > > static __attribute__((unused)) > > int chmod(const char *path, mode_t mode) > > { > > return __sysret(__systry2(chmod, sys_chmod(path, mode), sys_fchmodat(AT_FDCWD, path, mode, 0))); > > } > > > > Purely clean and clear. > > That's a matter of taste and it may explain why we often disagree. For me > it's horrid. If I'm the one implementing chmod for my platform and it does > not work, what should I do when facing that, except want to cry ? Think > that right now we have this: > > static __attribute__((unused)) > int sys_chmod(const char *path, mode_t mode) > { > #ifdef __NR_fchmodat > return my_syscall4(__NR_fchmodat, AT_FDCWD, path, mode, 0); > #elif defined(__NR_chmod) > return my_syscall2(__NR_chmod, path, mode); > #else > return -ENOSYS; > #endif > } > > Sure it can be called not pretty, but I think it has the merit of being > totally explicit, and whoever sees chmod() fail can quickly check based > on the test in what situation they're supposed to be and what to check. > > One thing I'm worried about also regarding using my_syscall() without the > number is that it's easy to miss an argument and have random values taken > from registers (or the stack) passed as argument. For example above we can > see that the flags part is 0 in fchmodat(). It's easy to miss themn and > while the syscall4() will complain, syscall() will silently turn that > into syscall3(). That's not necessarily a big deal, but we've seen during > the development that it's easy to make mistakes and they're not trivial > to spot. So again I'm really wondering about the benefits in such a case. > > This is well illustrated in your example below: > > > return __sysret(__systry2(newselect, sys_newselect(nfds, rfds, wfds, efds, timeout), > > sys_pselect6(nfds, rfds, wfds, efds, timeout ? &t : NULL, NULL))); > > How many attempts to get it right ? Just skip one NULL and you don't > see it. Yeah, seems we have missed the last 0 in ppoll() before and the test may not report about it either. [...] > > > Sure it's not pretty, and I'd rather just go back to SET_ERRNO() to be > > > honest, because we're there just because of the temptation to remove > > > lines that were not causing any difficulties :-/ > > > > > > I think we can do something in-between and deal only with signed returns, > > > and explicitly place the test for MAX_ERRNO on the two unsigned ones > > > (brk and mmap). It should look approximately like this: > > > > > > #define __sysret(arg) \ > > > ({ \ > > > __typeof__(arg) __sysret_arg = (arg); \ > > > (__sysret_arg < 0) ? ({ /* error ? */ \ > > > SET_ERRNO(-__sysret_arg); /* yes: errno != -ret */ \ > > > ((__typeof__(arg)) -1); /* return -1 */ \ > > > }) : __sysret_arg; /* return original value */ \ > > > }) > > > > > > > I like this one very much, a simple test shows, it saves one more byte. > > I'm going to do that and revert the 3 affected syscalls. > Ok. > > Only a quesiton, why 'errno != -ret' has a '!'? and we have post-tab in > > above two lines of __sysret() too, I have changed them to whitespaces. > [...] > > > But let them align with the others may be better, so, most of the sys_* > > macros can be simply mapped with a simple line (all of them are > > generated automatically), without the care of the return types changing. > > > > So, Willy, as a summary: > > > > - one solution is your new __sysret() + restore the original SET_ERRNO > > for mmap and brk [1]. > > > > - another solution is your new __sysret() + my patch [2] to let mmap and brk > > return 'long' as the other sys_* function does. > > No, because it will completely break them when they'll need to access the > second half of the memory, as I already explained somewhere else in one > of these numerous discussions. > Sorry, will recheck this part later, please ignore it. [...] > > > I will resell my proposed patchset above, at > > least a RFC patchset, please ignore this currently ;-) > > Please allow me to breathe a little bit. Really I mean, I'm already worn > by having to constantly review breaking changes that either introduce > bugs or break maintainability, and to have to justify myself for things > that I thought would be obvious to anyone. Massive changes are extremely > time consuming to review, and trying to figure subtle breakage in such > low-level stuff is even harder. I simply can't assign more time to this, > particularly for the expected gains which for me or often perceived as > losses of maintainability instead :-/ > Take a rest, I will delay the whole __sysdef() stuff and continue to work on the last tinyconfig patchset after v6.5, it is the last one before our early rv32 work ;-) Thanks, Zhangjin > Thanks, > Willy