On Wed, Nov 6, 2024 at 6:36 PM Saeed Mahameed <saeed@xxxxxxxxxx> wrote:
>
> On 06 Nov 15:44, Caleb Sander wrote:
> >On Tue, Nov 5, 2024 at 9:44 PM Parav Pandit <parav@xxxxxxxxxx> wrote:
> >>
> >>
> >> > From: Caleb Sander <csander@xxxxxxxxxxxxxxx>
> >> > Sent: Tuesday, November 5, 2024 9:36 PM
> >> >
> >> > On Mon, Nov 4, 2024 at 9:22 PM Parav Pandit <parav@xxxxxxxxxx> wrote:
> >> > >
> >> > >
> >> > >
> >> > > > From: Caleb Sander <csander@xxxxxxxxxxxxxxx>
> >> > > > Sent: Monday, November 4, 2024 3:49 AM
> >> > > >
> >> > > > On Sat, Nov 2, 2024 at 8:55 PM Parav Pandit <parav@xxxxxxxxxx> wrote:
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > > From: Caleb Sander Mateos <csander@xxxxxxxxxxxxxxx>
> >> > > > > > Sent: Friday, November 1, 2024 9:17 AM
> >> > > > > >
> >> > > > > > The logic of eq_update_ci() is duplicated in mlx5_eq_update_ci().
> >> > > > > > The only additional work done by mlx5_eq_update_ci() is to
> >> > > > > > increment eq->cons_index. Call eq_update_ci() from
> >> > > > > > mlx5_eq_update_ci() to avoid the duplication.
> >> > > > > >
> >> > > > > > Signed-off-by: Caleb Sander Mateos <csander@xxxxxxxxxxxxxxx>
> >> > > > > > ---
> >> > > > > >  drivers/net/ethernet/mellanox/mlx5/core/eq.c | 9 +--------
> >> > > > > >  1 file changed, 1 insertion(+), 8 deletions(-)
> >> > > > > >
> >> > > > > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> >> > > > > > b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> >> > > > > > index 859dcf09b770..078029c81935 100644
> >> > > > > > --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> >> > > > > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> >> > > > > > @@ -802,19 +802,12 @@ struct mlx5_eqe *mlx5_eq_get_eqe(struct
> >> > > > > > mlx5_eq *eq, u32 cc)
> >> > > > > >  }
> >> > > > > >  EXPORT_SYMBOL(mlx5_eq_get_eqe);
> >> > > > > >
> >> > > > > >  void mlx5_eq_update_ci(struct mlx5_eq *eq, u32 cc, bool arm)
> >> > > > > >  {
> >> > > > > > -	__be32 __iomem *addr = eq->doorbell + (arm ? 0 : 2);
> >> > > > > > -	u32 val;
> >> > > > > > -
> >> > > > > >  	eq->cons_index += cc;
> >> > > > > > -	val = (eq->cons_index & 0xffffff) | (eq->eqn << 24);
> >> > > > > > -
> >> > > > > > -	__raw_writel((__force u32)cpu_to_be32(val), addr);
> >> > > > > > -	/* We still want ordering, just not swabbing, so add a barrier */
> >> > > > > > -	wmb();
> >> > > > > > +	eq_update_ci(eq, arm);
> >> > > > > Long ago I had similar rework patches to get rid of
> >> > > > > __raw_writel(), which I never got a chance to push.
> >> > > > >
> >> > > > > eq_update_ci() is using a full memory barrier,
> >> > > > > while mlx5_eq_update_ci() is using only a write memory barrier.
> >> > > > >
> >> > > > > So it is not 100% deduplication by this patch.
> >> > > > > Please have a pre-patch improving eq_update_ci() to use wmb().
> >> > > > > Followed by this patch.
> >> > > >
> >> > > > Right, patch 1/2 in this series is changing eq_update_ci() to use
> >> > > > writel() instead of __raw_writel() and avoid the memory barrier:
> >> > > > https://lore.kernel.org/lkml/20241101034647.51590-1-csander@xxxxxxxxxxxxxxx/
> >> > > This patch has two bugs.
> >> > > 1. writel() writes the MMIO space in LE order. EQ updates are in BE order.
> >> > > So this will break on ppc64 BE.
> >> >
> >> > Okay, so this should be writel(cpu_to_le32(val), addr)?
> >> >
> >> That would break the x86 side because the device should receive the
> >> value in BE format regardless of CPU endianness.
> >> The above code will write in the LE format.
> >>
> >> So an API foo_writel() is needed which does
> >> a. write memory barrier
> >> b. write to MMIO space but without endianness conversion.
> >
> >Got it, thanks. writel(bswap_32(val), addr) should work, then? I
> >suppose it may introduce a second bswap on BE architectures, but
> >that's probably worth it to avoid the memory barrier.
> >
>
> The existing mb() needs to be changed to wmb(), this will provide a more
> efficient fence on most architectures.
>
> I don't understand why you are still discussing the use of writel(), yes
> it will work but you are introducing two unconditional swaps per doorbell
> write.

Well, having no memory fence at all is still cheaper than a wmb(). But
it's your driver, so if you prefer to use wmb() rather than switch to
writel(), that's fine. I'll update the patch series.

As for the byte swaps in writel(bswap_32(val), addr): there would still
be 1 on LE architectures, but 2 instead of 0 on BE architectures.
Certainly a bit inefficient, but probably less overhead than the memory
barrier currently adds on strongly-ordered architectures.

>
> Just replace the existing mb with wmb() in eq_update_ci()
>
> And if you have time to write one extra patch, please reuse eq_update_ci()
> inside mlx5_eq_update_ci().
>
> mlx5_eq_update_ci(eq, cc, arm) {
>         eq->cons_index += cc;
>         eq_update_ci(eq, arm);
> }
>
> So we won't have two different implementations of EQ doorbell ringing
> anymore.

Isn't this what my patch 2 (at the start of this reply chain) already
does? If you are suggesting something else, please clarify.

Thanks for the reviews,
Caleb
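
For anyone following the byte-order argument, here is a minimal userspace
sketch (not kernel code, and not part of the patch series) of why a
little-endian MMIO write of a byte-swapped value should put the same bytes
on the bus as a raw write of the big-endian value. All helper names below
(eq_doorbell_val, swab32_sketch, store_be32, store_le32,
doorbell_bytes_match) are made up for illustration. The byte arrays only
model what the device would see, and swab32_sketch() stands in for a
32-bit swap helper such as the kernel's swab32().

```c
#include <stdint.h>
#include <string.h>

/* Doorbell word as computed in the quoted diff:
 * low 24 bits = consumer index, top 8 bits = EQ number. */
static uint32_t eq_doorbell_val(uint32_t cons_index, uint8_t eqn)
{
	return (cons_index & 0xffffff) | ((uint32_t)eqn << 24);
}

/* Plain 32-bit byte swap (illustrative stand-in for a swab32()-style helper). */
static uint32_t swab32_sketch(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Models __raw_writel(cpu_to_be32(val), addr): whatever the host
 * endianness, the device sees the most significant byte first. */
static void store_be32(uint8_t out[4], uint32_t val)
{
	out[0] = (uint8_t)(val >> 24);
	out[1] = (uint8_t)(val >> 16);
	out[2] = (uint8_t)(val >> 8);
	out[3] = (uint8_t)val;
}

/* Models writel(x, addr): whatever the host endianness, the device
 * sees the least significant byte of x first. */
static void store_le32(uint8_t out[4], uint32_t x)
{
	out[0] = (uint8_t)x;
	out[1] = (uint8_t)(x >> 8);
	out[2] = (uint8_t)(x >> 16);
	out[3] = (uint8_t)(x >> 24);
}

/* Returns 1 if both approaches yield identical bytes for the device. */
static int doorbell_bytes_match(uint32_t cons_index, uint8_t eqn)
{
	uint8_t raw_be[4], le_swapped[4];
	uint32_t val = eq_doorbell_val(cons_index, eqn);

	store_be32(raw_be, val);                    /* current approach */
	store_le32(le_swapped, swab32_sketch(val)); /* writel() + swap  */
	return memcmp(raw_be, le_swapped, sizeof(raw_be)) == 0;
}
```

The sketch only covers byte order. It says nothing about the ordering
guarantees of mb(), wmb() or writel(), which are the other half of the
discussion above.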