RE: Possible ACPI abuse in Mellanox BlueField Gigabit Ethernet driver

-----Original Message-----
From: Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx> 
Sent: Friday, August 13, 2021 9:50 AM
To: Asmaa Mnebhi <asmaa@xxxxxxxxxx>
Cc: Linux GPIO <linux-gpio@xxxxxxxxxxxxxxx>; linux-acpi@xxxxxxxxxxxxxxx; Rafael J. Wysocki <rjw@xxxxxxxxxxxxx>; Linus Walleij <linus.walleij@xxxxxxxxxx>; Bartosz Golaszewski <bgolaszewski@xxxxxxxxxxxx>; David Thompson <davthompson@xxxxxxxxxx>; Liming Sun <limings@xxxxxxxxxx>; David S. Miller <davem@xxxxxxxxxxxxx>
Subject: Re: Possible ACPI abuse in Mellanox BlueField Gigabit Ethernet driver
Importance: High

On Thu, Aug 12, 2021 at 08:07:49PM +0000, Asmaa Mnebhi wrote:
> From: Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx>
> Sent: Thursday, August 12, 2021 12:26 PM
>
> On Thu, Aug 12, 2021 at 03:54:26PM +0000, Asmaa Mnebhi wrote:

> My first question: is there already firmware in the wild that does this?
> I.o.w., is there any time to amend it if needed?

> > Are you asking if it is possible to change the ACPI table's GPIO pin 
> > on the fly at boot time in UEFI code?

I'm asking whether any device with these tables is already on the market.

Yes, there is.

...


> > We have 1 image common to all our board types. The ACPI tables are 
> > selected based on the board id. Some board types have PHY_INT pin 
> > connected to GPIO pin 9 and other boards have it connected to GPIO 
> > pin 12. So we have 2 ssdt.asl files:
> 
> Okay (You may have one and actually choose it based on some [NVS] 
> variable)
> 
> Asmaa> Ok!
> 
> > // first file: PHY_INT -> GPIO pin 12
> > Device(OOB) {
> >         Name(_HID, "MLNXBF17")
> >         Name(_UID, 0)
> >         Name(_CCA, 1)
> >         Name (_CRS, ResourceTemplate () {
> >            // OOB Ethernet
> >            Memory32Fixed (ReadWrite, 0x03000000, 0x00000600)
> >            // mdio[9]
> >            Memory32Fixed (ReadWrite, 0x028004C8, 0x00000008)
> >            // gpio[0]
> >            Memory32Fixed (ReadWrite, 0x0280c000, 0x00000100)
> >            // OOB LLU
> >            Memory32Fixed (ReadWrite, 0x039C0000, 0x0000A100)
> >            // OOB PLU
> >            Memory32Fixed (ReadWrite, 0x04000000, 0x00001100)
> >            Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive) { BF_RSH0_DEVICE_OOB_INT }
> >            Interrupt (ResourceConsumer, Edge, ActiveHigh, Exclusive) { BF_RSH0_DEVICE_OOB_LLU_INT }
> >            Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive) { BF_RSH0_DEVICE_OOB_PLU_INT }
> >            Interrupt (ResourceConsumer, Edge, ActiveHigh, Shared) { BF_RSH0_DEVICE_YU_INT }
> >
> >            // GPIO PHY interrupt
> >            GpioInt (Edge, ActiveHigh, Exclusive, PullUp, , "\\_SB.GPI0") {12}

Just a side note I forgot in previous reply: The tables themselves look good in my opinion.

> PullUp with Edge/Rise seems a bit awkward. Recently I have added a 
> corresponding paragraph to the 
> https://www.kernel.org/doc/html/latest/firmware-guide/acpi/gpio-properties.html.
> But it's just to double check that you got the idea how your hardware 
> works (maybe it uses open-drain or so and it's indeed the correct setting).
> 
> > I forgot to cp/paste one more line from the ACPI tables. I have 
> > created a DSD entry and named the gpio (although as you pointed 
> > below, it is not really needed in this case):

> Name (_DSD, Package () {
>            ToUUID ("daffd814-6eba-4d8c-8a91-bc9bbf4aa301") /* Device Properties for _DSD */,
>            Package ()
>            {
>               Package () { "phy-gpios", Package() {^OOB, 0, 0, 0 }},
>            }
>        })

Yes, in this case it's not needed. The only case where it might be is when you have a bunch of platforms where several GpioInt() resources are present and they have no mapping; the property then helps to figure out which resource relates to which IRQ line. Overall, it's not a bad decision to add one.

> The interrupt that we care about (which signals link up/link down 
> events) is actually the shared HW irq BF_RSH0_DEVICE_YU_INT (edge 
> triggered, active high whenever there is an i2c, mdio or gpio 
> interrupt).  We get that interrupt value from the ACPI table as follows:
> priv->hw_phy_irq = platform_get_irq(pdev, MLXBF_GIGE_PHY_INT_N);

Wait, are you saying that the GpioInt() resource is a duplicate of one of the
Interrupt() resources? Is that the correct interpretation?

Yes, GpioInt is not needed, since the ACPI entry (from the above code snippet) defines the shared interrupt as:
           Interrupt (ResourceConsumer, Edge, ActiveHigh, Shared) { BF_RSH0_DEVICE_YU_INT }

> Although it is overkill, I only used
> GpioInt (Edge, ActiveHigh, Exclusive, PullUp, , "\\_SB.GPI0") {12}
> to retrieve the GPIO pin number (12 or 9) in mlxbf-gige.

Yes, but why do you need to know this pin in software?

I need to know this pin in software to be able to access the corresponding GPIO bits in the control registers. Each GPIO register (there are more than 30 HW GPIO-related registers) is a 32-bit register, and each bit in those registers corresponds to a different GPIO pin.
For example, in each of the following registers (which are used in mlxbf_gige_gpio.c), we only care about reading and writing bit 9 or 12 (depending on the board):
MLXBF_GIGE_GPIO_CAUSE_OR_CLRCAUSE
MLXBF_GIGE_GPIO_CAUSE_OR_CAUSE_EVTEN0
MLXBF_GIGE_GPIO_CAUSE_FALL_EN
We don't want to modify any other bits since they are bound to other GPIO pins which have specific HW functionalities.
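
To make the one-bit-per-pin layout concrete, here is a minimal user-space sketch of touching only the PHY pin's bit in a 32-bit register. The fake_reg_t type and the gpio_reg_* helpers are hypothetical stand-ins, not code from mlxbf_gige_gpio.c, and a plain variable models the memory-mapped register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one 32-bit GPIO control register. */
typedef struct {
    uint32_t value;
} fake_reg_t;

static uint32_t reg_read(const fake_reg_t *r)
{
    return r->value;
}

static void reg_write(fake_reg_t *r, uint32_t v)
{
    r->value = v;
}

/* Set only the bit that belongs to `pin`; all other bits are preserved. */
static void gpio_reg_set_pin(fake_reg_t *r, unsigned pin)
{
    assert(pin < 32);
    uint32_t v = reg_read(r);
    v |= (uint32_t)1 << pin;
    reg_write(r, v);
}

/* Clear only the bit that belongs to `pin`; all other bits are preserved. */
static void gpio_reg_clear_pin(fake_reg_t *r, unsigned pin)
{
    assert(pin < 32);
    uint32_t v = reg_read(r);
    v &= ~((uint32_t)1 << pin);
    reg_write(r, v);
}

/* Return 1 if the bit for `pin` is set, 0 otherwise. */
static int gpio_reg_pin_is_set(const fake_reg_t *r, unsigned pin)
{
    assert(pin < 32);
    return (reg_read(r) >> pin) & 1u;
}
```

A write-one-to-clear cause register would typically want a plain write of the mask rather than a read-modify-write. Which registers behave that way is hardware specific and is not modelled here.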

> We could also have created a property (phy-gpio-pin) to pass the GPIO 
> pin and that would enable us to remove all code related to "GpioInt"
> code in the acpi and mlxbf-gige driver. But I thought that properties 
> are in general not the preferred approach?

Properties make sense when there is no standard ACPI approach or when it lacks some information. As far as I can tell, a property is the better choice here, but I would first like to understand why this information is needed at all (see the comment above).

Ok. Please see my reply above.

> So whenever that shared interrupt is triggered, the routine mlxbf_gige_gpio_handler is executed:
> ret = devm_request_irq(dev, priv->hw_phy_irq, mlxbf_gige_gpio_handler,
>                        IRQF_ONESHOT | IRQF_SHARED, "mlxbf_gige_phy", priv);
> It checks whether the interrupt is for GPIO pin 9 or 12 (depending on the
> board). If it is, it clears the interrupt accordingly and triggers the
> generic phy_interrupt routine (in phy.c), which is registered via
> phy_connect_direct.
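
The check described above can be sketched in user space as follows. This is an illustration only, not the driver's handler: the sketch_* names are hypothetical, a plain field stands in for the cause register, and a counter stands in for forwarding the event to phy_interrupt.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the kernel's IRQ_NONE / IRQ_HANDLED results. */
enum sketch_irqreturn {
    SKETCH_IRQ_NONE = 0,
    SKETCH_IRQ_HANDLED = 1
};

struct sketch_priv {
    uint32_t cause;      /* stand-in for the GPIO cause register */
    unsigned phy_pin;    /* 9 or 12, depending on the board */
    unsigned phy_events; /* stand-in for calls into the PHY layer */
};

/*
 * Shared-interrupt handler sketch: claim the interrupt only if the
 * cause bit of the PHY pin is set, and leave every other bit alone.
 */
static enum sketch_irqreturn sketch_gpio_handler(struct sketch_priv *p)
{
    assert(p->phy_pin < 32);
    uint32_t mask = (uint32_t)1 << p->phy_pin;

    if (!(p->cause & mask))
        return SKETCH_IRQ_NONE; /* raised by another source on the shared line */

    p->cause &= ~mask; /* models clearing the pending cause */
    p->phy_events++;   /* models handing the event to the PHY layer */
    return SKETCH_IRQ_HANDLED;
}
```

Returning the "none" result when the bit is not set matters on a shared line, because it lets the other handlers registered on the same interrupt claim their own events.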

This sounds strange to me. Are you saying that there is no hw register from which you can retrieve this information? So is it a workaround for a silicon bug?

Yes, there is no register from which I can retrieve this information :( This is why firmware decides it at boot time, based on the board id.
And that is a good idea, I will share this proposal with the HW team as this will make the code a lot cleaner for future generations.

> What I have seen here is a regular GpioInt() resource with a single pin.
> 
> Asmaa> Yes we only use one GPIO pin.
> 
> As far as I can see in the code it has the flaw that it actually will use the last GpioInt() resource available in _CRS.
> 
> Besides that, why do you need to know the pin number instead of simply requesting an IRQ as every other driver does (yours is the only exception in the entire kernel)? The acpi_dev_gpio_irq_get() call can get the Linux vIRQ for you, the same way you got it for the Interrupt() resources via platform_get_irq().
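
The "last GpioInt() resource wins" flaw mentioned above can be illustrated with a small user-space model of a resource walk. The sketch_* types and functions are hypothetical and are not the driver's or the ACPI core's code. They only contrast a walk that overwrites its result on every match with a lookup by index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of entries in a _CRS resource list. */
enum sketch_res_type {
    SKETCH_RES_MEMORY,
    SKETCH_RES_IRQ,
    SKETCH_RES_GPIO
};

struct sketch_res {
    enum sketch_res_type type;
    unsigned pin; /* only meaningful for SKETCH_RES_GPIO */
};

/* Overwrites *pin on every GPIO entry, so the last one silently wins. */
static int sketch_find_gpio_last(const struct sketch_res *res, size_t n,
                                 unsigned *pin)
{
    int found = 0;

    assert(pin != NULL);
    for (size_t i = 0; i < n; i++) {
        if (res[i].type == SKETCH_RES_GPIO) {
            *pin = res[i].pin;
            found = 1;
        }
    }
    return found;
}

/* Returns the index-th GPIO entry (0-based) and stops there. */
static int sketch_find_gpio_at(const struct sketch_res *res, size_t n,
                               unsigned index, unsigned *pin)
{
    assert(pin != NULL);
    for (size_t i = 0; i < n; i++) {
        if (res[i].type != SKETCH_RES_GPIO)
            continue;
        if (index-- == 0) {
            *pin = res[i].pin;
            return 1;
        }
    }
    return 0; /* fewer GPIO entries than requested */
}
```

With a single GpioInt() in _CRS both functions agree. The difference only shows up once a second GPIO resource is added to the table.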

> To understand this piece better, can you point to the GPIO driver
> code that implements the driver for _SB.GPI0 in the kernel?

Any comments on this? Do you have a GPIO driver available?

Yes please see below:
The gpio driver is gpio-mlxbf2.c

Here is the ACPI table for it:

Device(GPI0) {
        Name(_HID, "MLNXBF22")
        Name(_UID, Zero)
        Name(_CCA, 1)
        Name(_CRS, ResourceTemplate() {
          // for gpio[0] yu block
          Memory32Fixed(ReadWrite, 0x0280c000, 0x00000100)
        })
}

Device(GPI1) {
        Name(_HID, "MLNXBF22")
        Name(_UID, 1)
        Name(_CCA, 1)
        Name(_CRS, ResourceTemplate() {
          // for gpio[1] yu block
          Memory32Fixed(ReadWrite, 0x0280c100, 0x00000100)
        })
}

Device(GPI2) {
        Name(_HID, "MLNXBF22")
        Name(_UID, 2)
        Name(_CCA, 1)
        Name(_CRS, ResourceTemplate() {
          // for gpio[2] yu block
          Memory32Fixed(ReadWrite, 0x0280c200, 0x00000100)
        })
        Name(_DSD, Package() {
          ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
          Package() {
            Package () { "npins", 6 }, // Number of GPIO pins on gpio block 2
          }
        })
}

> >         }) // Name(_CRS)

> > From: Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx>
> > Sent: Thursday, August 12, 2021 10:14 AM

> > From time to time I do grep kernel for ACPI_RESOURCE_TYPE_GPIO usage.
> > Recently the 
> > drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_gpio.c
> > caught my eye.
> > 
> > Looking into the code, I see what looks like a misunderstanding of
> > how ACPI works with GPIOs. First of all, I would like to point out that
> > this code has not been properly reviewed by either the GPIO or the ACPI
> > maintainers. Second, before jumping to real conclusions (and a
> > potential revert of this), I would like to see the real ACPI tables
> > for this and some explanations from the authors of the driver about
> > GPIO usage here (from hw and sw perspectives).  It makes sense to
> > discuss this ASAP, otherwise I would really want to revert it.

--
With Best Regards,
Andy Shevchenko





