On 2/28/20 1:19 AM, LIU Zhiwei wrote:
>>> +#define GEN_VEXT_AMO_NOATOMIC_OP(NAME, ETYPE, MTYPE, H, DO_OP, SUF) \
>>> +static void vext_##NAME##_noatomic_op(void *vs3, target_ulong addr, \
>>> +        uint32_t wd, uint32_t idx, CPURISCVState *env, uintptr_t retaddr)\
>>> +{ \
>>> +    ETYPE ret; \
>>> +    target_ulong tmp; \
>>> +    int mmu_idx = cpu_mmu_index(env, false); \
>>> +    tmp = cpu_ld##SUF##_mmuidx_ra(env, addr, mmu_idx, retaddr); \
>>> +    ret = DO_OP((ETYPE)(MTYPE)tmp, *((ETYPE *)vs3 + H(idx))); \
>>> +    cpu_st##SUF##_mmuidx_ra(env, addr, ret, mmu_idx, retaddr); \
>>> +    if (wd) { \
>>> +        *((ETYPE *)vs3 + H(idx)) = (target_long)(MTYPE)tmp; \
>> The target_long cast is wrong; should be ETYPE.
> "If the AMO memory element width is less than SEW, the value returned from
> memory is sign-extended to fill SEW"
>
> So just use (target_long) to sign-extend.  As you see, instructions like
>
> vamominud
>
> have uint64_t as ETYPE, and it can't sign-extend the value from memory by
> (ETYPE)(MTYPE)tmp.

Casting to target_long doesn't help -- it becomes signed at a variable size,
possibly larger than MTYPE.

In addition, I think you're performing the operation at the wrong length.
The text of the ISA document could be clearer, but

# If SEW > 32 bits, the value returned from memory
# is sign-extended to fill SEW.

You are performing the operation in ETYPE, but it should be done in MTYPE and
only afterward extended to ETYPE.

For minu/maxu, you're right that you need an unsigned type for the operation.
But then you need a signed type of the same width for the extension.

One possibility is to *always* make MTYPE a signed type, but for the two
cases that require an unsigned type, provide it.  E.g.

#define GEN_VEXT_AMO_NOATOMIC_OP(NAME, ESZ, MSZ, H, DO_OP, SUF)     \
static void vext_##NAME##_noatomic_op(void *vs3, target_ulong addr, \
                                      uint32_t wd, uint32_t idx,    \
                                      CPURISCVState *env,           \
                                      uintptr_t retaddr)            \
{                                                                   \
    typedef int##ESZ##_t ETYPE;                                     \
    typedef int##MSZ##_t MTYPE;                                     \
    typedef uint##MSZ##_t UMTYPE;                                   \
    ETYPE *pe3 = (ETYPE *)vs3 + H(idx);                             \
    MTYPE a = *pe3, b = cpu_ld##SUF##_data(env, addr);              \
    a = DO_OP(a, b);                                                \
    cpu_st##SUF##_data(env, addr, a);                               \
    if (wd) {                                                       \
        /* MTYPE -> ETYPE assignment sign-extends to SEW */         \
        *pe3 = b;                                                   \
    }                                                               \
}

/* Signed min/max */
#define DO_MAX(N, M)  ((N) >= (M) ? (N) : (M))
#define DO_MIN(N, M)  ((N) >= (M) ? (M) : (N))

/* Unsigned min/max */
#define DO_MAXU(N, M)  DO_MAX((UMTYPE)N, (UMTYPE)M)
#define DO_MINU(N, M)  DO_MIN((UMTYPE)N, (UMTYPE)M)

GEN_VEXT_AMO_NOATOMIC_OP(vamomaxuw_v_d, 64, 32, H8, DO_MAXU, l)
GEN_VEXT_AMO_NOATOMIC_OP(vamomaxud_v_d, 64, 64, H8, DO_MAXU, q)

>> The missing aligned address check is the only remaining exception that the
>> helper_atomic_* functions would raise, since you have properly checked for
>> read+write.  So it might be possible to get away with using the helpers,
>> but I don't like it.
> Do you mean write my own helpers to implement atomic operations?
>
> What's the meaning of "but I don't like it"?

I don't like re-using helpers in an incorrect way.

>> But I do think it would be better to write your own helpers for the atomic
>> paths.  They need not check quite so much, since we have already done the
>> validation above.  You pretty much only need to use tlb_vaddr_to_host.
>>
>> If that gets too ugly, we can talk about rearranging
>> accel/tcg/atomic_template.h so that it could be reused.
> Good idea.  Perhaps use tlb_vaddr_to_host instead of atomic_mmu_lookup to
> define another macro like GEN_ATOMIC_HELPER?
>> Alternately, we could simply *always* use the non-atomic helpers, and raise
>> exit_atomic if PARALLEL.
> Yes, it's the simplest way.
> However, I prefer to try to define something like GEN_ATOMIC_HELPER in
> vector_helper.c.

I'll think about this some more.  In the short term, I think non-atomic is
the best we can do.

r~
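
For the short-term plan (always take the non-atomic path), the translator
can bail out to the serialized loop whenever the TB was generated with
CF_PARALLEL, reusing the existing exit_atomic runtime helper.  A minimal
sketch of what that check could look like on the trans_rvv.inc.c side; the
function name and where it gets called from are assumptions, not part of
the patch under review:

/* Sketch only: serialize vector AMOs under multi-threaded TCG. */
static bool vamo_check_parallel(DisasContext *s)
{
    if (tb_cflags(s->base.tb) & CF_PARALLEL) {
        /* Raise EXCP_ATOMIC; cpu_exec_step_atomic() re-executes the
         * instruction with exclusive access, where the non-atomic
         * helper is safe to use as-is. */
        gen_helper_exit_atomic(cpu_env);
        return true;
    }
    return false;
}

The cost is that every vector AMO in a parallel TB becomes a serialization
point, but that keeps correctness simple and leaves room to add a real
atomic fast path later.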
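
For the GEN_ATOMIC_HELPER-style direction, a rough sketch of an atomic
element function built on tlb_vaddr_to_host, assuming it lives in
vector_helper.c next to the non-atomic variant (so the H4/H8 index macros
and the usual includes are available), that the caller has already probed
the page for read+write and checked alignment, and that guest and host
endianness match (the cross-endian swap done by accel/tcg/atomic_template.h
is omitted).  The macro name, the FETCH_OP plumbing, and the use of the
host __atomic builtins are illustrative only:

#define GEN_VEXT_AMO_ATOMIC_OP(NAME, MTYPE, H, FETCH_OP)                 \
static void vext_##NAME##_atomic_op(void *vs3, target_ulong addr,        \
        uint32_t wd, uint32_t idx, CPURISCVState *env, uintptr_t retaddr)\
{                                                                        \
    int mmu_idx = cpu_mmu_index(env, false);                             \
    MTYPE *host = tlb_vaddr_to_host(env, addr, MMU_DATA_STORE, mmu_idx); \
    MTYPE b = *((MTYPE *)vs3 + H(idx));                                  \
    MTYPE old;                                                           \
                                                                         \
    if (host == NULL) {                                                  \
        /* Not plain host RAM (MMIO, watchpoint, ...): fall back to      \
           the serialized loop and use the non-atomic helper there. */   \
        cpu_loop_exit_atomic(env_cpu(env), retaddr);                     \
    }                                                                    \
    old = FETCH_OP(host, b, __ATOMIC_SEQ_CST);                           \
    if (wd) {                                                            \
        *((MTYPE *)vs3 + H(idx)) = old;                                  \
    }                                                                    \
}

/* e.g. 32-bit vamoadd.w with SEW == 32 */
GEN_VEXT_AMO_ATOMIC_OP(vamoaddw_v_w, int32_t, H4, __atomic_fetch_add)

Widening the result when SEW is larger than the memory width would still
need the ETYPE/MTYPE split from the sketch above, and min/max would need a
compare-and-swap loop since there is no fetch-max builtin -- both reasons
the non-atomic version is the simpler starting point.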