On Thu, Nov 09, 2023 at 03:19:27AM +0000, Jan Bottorff wrote: > When running on a many core ARM64 server, errors were > happening in the ISR that looked like corrupted memory. These > corruptions would fix themselves if small delays were inserted > in the ISR. Errors reported by the driver included "i2c_designware > APMC0D0F:00: i2c_dw_xfer_msg: invalid target address" and > "i2c_designware APMC0D0F:00:controller timed out" during > in-band IPMI SSIF stress tests. > > The problem was determined to be memory writes in the driver were not > becoming visible to all cores when execution rapidly shifted between > cores, like when a register write immediately triggers an ISR. > Processors with weak memory ordering, like ARM64, make no > guarantees about the order normal memory writes become globally > visible, unless barrier instructions are used to control ordering. > > To solve this, regmap accessor functions configured by this driver > were changed to use non-relaxed forms of the low-level register > access functions, which include a barrier on platforms that require > it. This assures memory writes before a controller register access are > visible to all cores. The community concluded defaulting to correct > operation outweighed defaulting to the small performance gains from > using relaxed access functions. Being a low speed device added weight to > this choice of default register access behavior. > > Signed-off-by: Jan Bottorff <janb@xxxxxxxxxxxxxxxxxxxxxx> Applied to for-current, thanks!
Attachment:
signature.asc
Description: PGP signature