On 8/18/21 10:20 AM, Thomas Huth wrote: > On 13/08/2021 13.31, Janosch Frank wrote: >> On 8/13/21 10:32 AM, Claudio Imbrenda wrote: >>> On Fri, 13 Aug 2021 07:36:08 +0000 >>> Janosch Frank <frankja@xxxxxxxxxxxxx> wrote: >>> >>>> Bit setting and clearing is never bad to have. >>>> >>>> Signed-off-by: Janosch Frank <frankja@xxxxxxxxxxxxx> >>>> --- >>>> lib/s390x/asm/bitops.h | 102 >>>> +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 102 >>>> insertions(+) >>>> >>>> diff --git a/lib/s390x/asm/bitops.h b/lib/s390x/asm/bitops.h >>>> index 792881ec..f5612855 100644 >>>> --- a/lib/s390x/asm/bitops.h >>>> +++ b/lib/s390x/asm/bitops.h >>>> @@ -17,6 +17,78 @@ >>>> >>>> #define BITS_PER_LONG 64 >>>> >>>> +static inline unsigned long *bitops_word(unsigned long nr, >>>> + const volatile unsigned >>>> long *ptr) +{ >>>> + unsigned long addr; >>>> + >>>> + addr = (unsigned long)ptr + ((nr ^ (nr & (BITS_PER_LONG - >>>> 1))) >> 3); >>>> + return (unsigned long *)addr; >>> >>> why not just >>> >>> return ptr + (nr / BITS_PER_LONG); >>> >>>> +} >>>> + >>>> +static inline unsigned long bitops_mask(unsigned long nr) >>>> +{ >>>> + return 1UL << (nr & (BITS_PER_LONG - 1)); >>>> +} >>>> + >>>> +static inline uint64_t laog(volatile unsigned long *ptr, uint64_t >>>> mask) +{ >>>> + uint64_t old; >>>> + >>>> + /* load and or 64bit concurrent and interlocked */ >>>> + asm volatile( >>>> + " laog %[old],%[mask],%[ptr]\n" >>>> + : [old] "=d" (old), [ptr] "+Q" (*ptr) >>>> + : [mask] "d" (mask) >>>> + : "memory", "cc" ); >>>> + return old; >>>> +} >>> >>> do we really need the artillery (asm) here? >>> is there a reason why we can't do this in C? >> >> Those are the interlocked/atomic instructions and even though we don't >> exactly need them right now I wanted to add them for completeness. > > I think I agree with Claudio - unless we really need them, we should not > clog the sources with arbitrary inline assembly functions. Alright I can trim it down > >> We might be able to achieve the same via compiler functionality but this >> is not my expertise. Maybe Thomas or David have a few pointers for me? > > I'm not an expert with atomic builtins either, but what's the point of this > at all? Loading a value and OR-ing something into the value in one go? > What's that good for? Well it's a block-concurrent interlocked-update load, or and store. I.e. it loads the data from the ptr and copies it into [old] then ors the mask and stores it back to the ptr address. The instruction name "load and or" does not represent the full actions of the instruction. > > Thomas >