Re: chv-gpio interrupt storm on UMAX VisionBook 10Wi Pro

Hi,

On 7/8/20 1:02 PM, Jiri Slaby wrote:
Hi,

On 08. 07. 20, 12:15, Hans de Goede wrote:
Hi,

On 7/8/20 10:52 AM, Jiri Slaby wrote:
On 08. 07. 20, 10:23, Hans de Goede wrote:
Hi all,

On 7/8/20 9:47 AM, Linus Walleij wrote:
On Wed, Jul 8, 2020 at 9:18 AM Jiri Slaby <jirislaby@xxxxxxxxxx> wrote:

I installed Linux on UMAX VisionBook 10Wi Pro. It sometimes boots, but
even then it encounters lags, soft lockups:
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:0]
watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:0H:6]
watchdog: BUG: soft lockup - CPU#0 stuck for 21s! [kworker/0:2:133]
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:0]

Adding Hans de Goede to Cc, he often deals with this kind of weirdness
so he might have some ideas here.

Thank you for looping me in Linus. I've read up on the rest of the
thread in the archives.

So looking at this:
https://www.umax.cz/umax-visionbook-10wi/

This device appears to be a pretty standard Cherry Trail based 2-in-1
with detachable keyboard. Which usually means (with all the hw-enablement
I've been doing the last 2 years for these) that it will just work.
But no such luck this time it seems.
...
What I find interesting / weird is that you (Jiri) get an active
(/sys/bus/acpi/devices/INT3496\:00/status != 0) INT3496 device at
all. That typically only happens when the BIOS thinks you are booting
Android.

15 that is.

Right, that is normal for an enabled device. The ACPI method
implementing the status attribute for your tablet looks like this:

             Method (_STA, 0, NotSerialized)  // _STA: Status
             {
                 If (((BDID == One) && (OSID != One)))
                 {
                     Return (0x0F)
                 }

                 Return (Zero)
             }

So now we know that BDID == One and OSID != One, OSID == One
typically is Windows...
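
To make the 15 / 0x0F value concrete, here is a minimal Python sketch that decodes a _STA value into its flag bits and mirrors the condition of the method quoted above. The bit names follow the ACPI specification's definition of _STA; the function names decode_sta and int3496_sta are made up for this illustration and are not kernel or ACPICA APIs.

```python
# Bit meanings of an ACPI _STA return value (per the ACPI specification).
STA_BITS = {
    0: "present",
    1: "enabled",
    2: "shown in UI",
    3: "functioning",
    4: "battery present",
}


def decode_sta(value):
    """Return the names of the _STA flags that are set in value."""
    return [name for bit, name in STA_BITS.items() if value & (1 << bit)]


def int3496_sta(bdid, osid):
    """Mirror of the quoted _STA method: 0x0F only if BDID == 1 and OSID != 1."""
    return 0x0F if (bdid == 1 and osid != 1) else 0


print(decode_sta(15))
print(int3496_sta(bdid=1, osid=0), int3496_sta(bdid=1, osid=1))
```

So a status of 15 means the low four flags are all set, and with BDID == One the device only disappears once OSID becomes One.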

Looking at the buttons next, can you do:

cat /sys/bus/acpi/devices/ACPI0011:00/status

Gives 0

and:

cat /sys/bus/acpi/devices/INTCFD9:00/status

Gives 15


If the BIOS thinks you are booting normal Windows the first
one should output 15  (0xf) aka present and the second one
should output 0, but I suspect it is the other way around...

Right.
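
The same check can be scripted. This is only a sketch: it reads the status files used above and guesses which button device the firmware exposed. The helper names read_status and guess_profile are hypothetical, and the "windows" / "android" labels simply encode the expectation described in this mail (ACPI0011 active for Windows, INTCFD9 active for Android).

```python
from pathlib import Path


def read_status(device, root="/sys/bus/acpi/devices"):
    """Return the integer status of an ACPI device, or None if it is absent."""
    path = Path(root) / device / "status"
    try:
        return int(path.read_text().strip(), 0)
    except (OSError, ValueError):
        return None


def guess_profile(statuses):
    """Guess the firmware profile from a {device: status} mapping."""
    windows = bool(statuses.get("ACPI0011:00"))
    android = bool(statuses.get("INTCFD9:00"))
    if windows and not android:
        return "windows"
    if android and not windows:
        return "android"
    return "unknown"


if __name__ == "__main__":
    names = ["ACPI0011:00", "INTCFD9:00", "INT3496:00"]
    statuses = {name: read_status(name) for name in names}
    print(statuses, guess_profile(statuses))
```

With the values reported above (ACPI0011 gives 0, INTCFD9 gives 15) guess_profile() returns "android".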

So looking at the GPIO resource definitions for BDID == One
for the ACPI0011 device we have:

                 Name (PBUF, ResourceTemplate ()
                 {
                     GpioInt (Edge, ActiveBoth, ExclusiveAndWake, PullUp, 0x0BB8,
                         "\\_SB.GPO2", 0x00, ResourceConsumer, ,
                         )
                         {   // Pin list
                             0x0008
                         }
                     GpioInt (Edge, ActiveBoth, Exclusive, PullDefault, 0x0BB8,
                         "\\_SB.GPO0", 0x00, ResourceConsumer, ,
                         )
                         {   // Pin list
                             0x004E
                         }
                     GpioInt (Edge, ActiveBoth, Exclusive, PullDefault, 0x0BB8,
                         "\\_SB.GPO0", 0x00, ResourceConsumer, ,
                         )
                         {   // Pin list
                             0x0050
                         }
                 })

With a mapping of first resource to KEY_POWER, second resource to
KEY_VOLUMEUP and third resource to KEY_VOLUMEDOWN.

The INTCFD9 device OTOH has the following resource definitions for
BDID == One:

                 Name (PBUF, ResourceTemplate ()
                 {
                     GpioInt (Edge, ActiveBoth, ExclusiveAndWake, PullUp, 0x0BB8,
                         "\\_SB.GPO2", 0x00, ResourceConsumer, ,
                         )
                         {   // Pin list
                             0x0008
                         }
                     GpioInt (Edge, ActiveBoth, ExclusiveAndWake, PullDefault, 0x
                         "\\_SB.GPO1", 0x00, ResourceConsumer, ,
                         )
                         {   // Pin list
                             0x0008
                         }
                     GpioInt (Edge, ActiveBoth, Exclusive, PullDefault, 0x0BB8,
                         "\\_SB.GPO0", 0x00, ResourceConsumer, ,
                         )
                         {   // Pin list
                             0x004E
                         }
                     GpioInt (Edge, ActiveBoth, Exclusive, PullDefault, 0x0BB8,
                         "\\_SB.GPO0", 0x00, ResourceConsumer, ,
                         )
                         {   // Pin list
                             0x0050
                         }
                 })

With a mapping of resource1: power, resource2: home, resource3: volume_up,
resource4: volume_down.

So what we see here is that the "Android" style INTCFD9 device has
an extra entry for the home-button and I guess (hard to see on the
pictures) that there is no physical home-button.
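
The difference between the two resource lists can be shown mechanically. The sketch below copies the (controller, pin) pairs from the two PBUF templates and the INTCFD9 button order from the mapping given above; the helper name android_only is hypothetical and exists only for this illustration.

```python
# (controller, pin) pairs copied from the two PBUF resource templates.
ACPI0011_PINS = [("GPO2", 0x08), ("GPO0", 0x4E), ("GPO0", 0x50)]
INTCFD9_PINS = [("GPO2", 0x08), ("GPO1", 0x08), ("GPO0", 0x4E), ("GPO0", 0x50)]

# Button order of the INTCFD9 resources as described in the text.
INTCFD9_KEYS = ["power", "home", "volume_up", "volume_down"]


def android_only(android_pins, windows_pins, keys):
    """Return {key: (controller, pin)} for pins missing from the Windows list."""
    windows = set(windows_pins)
    return {key: pin for key, pin in zip(keys, android_pins) if pin not in windows}


extra = android_only(INTCFD9_PINS, ACPI0011_PINS, INTCFD9_KEYS)
print(extra)
```

The only entry left over is the home button on GPO1 pin 8.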

I don't know what it should look like, but there is no button with
house-like painting. There is only a standard "Home" button -- Fn+PgUp.
And that works even without that module.

The IRQ storm you are seeing on the home button is happening on
GPO1 pin 8, which is only listed as a button on the "Android" style
INTCFD9 device. I guess that the manufacturer started with a standard
ACPI DSDT for these devices and then hacked up the Windows entries until
they worked.

Likewise the INT3496 entry likely is nonsense too. So you are seeing
a storm from some floating GPIOs which are close enough to some
other signals to pick up interference from them.

Conclusion: we need to get your BIOS to stop setting OSID to 0
(Android) and get it to set it to 1 (Windows).

Now you may think that Android == Linux so that should be good,
but Intel did a real frankenstein solution for Android X86, see:
https://github.com/intel/ProductionKernelQuilts
for all the 5000 downstream patches in all their glory (hint: your
life will be better if you don't take a look).

The much saner support for these devices which eventually got added
to the mainline kernel actually works much better with the "Windows"
profile of the BIOS, since the mainline code expects sane ACPI tables
and the Android targeting ACPI tables are a bit of a mess.

So the first thing to do is to go into the BIOS setup and see if
there is a setting for this (this depends on if the BIOS is
unlocked and has like a gazillion settings, or if it is locked
to only show a few settings).

I just checked on one of my own CHT devices and there the option is
under Advanced -> System Component -> OS IMAGE ID

I had/have:
Advanced
    -> Droid boot = disabled
    -> Android boot = disabled
    -> OS selection = Windows 8.x (there is also GMIN and Android to select)

So there seems nothing I should change?

Ok, so some of these devices have some multi-boot code inside for
dualbooting both Android and Windows and they automatically override
the "OS selection" on every boot.

Since your device has only 1G of RAM it likely shipped with a 32
bit Windows to save RAM and thus has either a 32 bit only UEFI,
or a dual-mode UEFI. I'm guessing that it is the latter and when
you inserted the boot-medium you used to install, the BIOS saw an
EFI/BOOT/bootx64.efi binary on the boot-medium and switched to
64 bit mode which it associates with Android.

No, it has 2G of RAM.
# free -h
               total        used        free      shared  buff/cache   available
Mem:          1,8Gi       497Mi       786Mi       108Mi       567Mi       1,1Gi
Swap:         2,0Gi          0B       2,0Gi

It also has only 32 bit EFI. It doesn't recognize 64-bit binaries. I had
to load 32-bit grub first to load the installer from a USB. So this is
EFI-mixed mode as it is called.
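
The firmware bitness can also be read from a running system. A minimal sketch, assuming the kernel exposes /sys/firmware/efi/fw_platform_size (it contains 32 or 64 when booted via UEFI); the helper names firmware_bits and is_mixed_mode are made up for this example.

```python
from pathlib import Path


def firmware_bits(path="/sys/firmware/efi/fw_platform_size"):
    """Return 32 or 64 for UEFI firmware, or None if the file is unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def is_mixed_mode(fw_bits, kernel_machine):
    """True when a 64-bit x86 kernel runs on 32-bit UEFI firmware."""
    return fw_bits == 32 and kernel_machine == "x86_64"


if __name__ == "__main__":
    import platform

    bits = firmware_bits()
    print(bits, is_mixed_mode(bits, platform.machine()))
```

On the setup described here (32-bit EFI, 64-bit kernel) is_mixed_mode() would return True.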

Hmm, ok, with CHT I would really expect there to be a 64 bit UEFI.
Your DSDT and the fact that my untested patch broke your boot both
show that this is Cherry Trail / Cherryview and not a Bay Trail.

I guess that doing:

cat /proc/cpuinfo  | grep "model name"

Will output something like this:

model name      : Intel(R) Atom(TM) x5-Z8350  CPU @ 1.44GHz
model name      : Intel(R) Atom(TM) x5-Z8350  CPU @ 1.44GHz
model name      : Intel(R) Atom(TM) x5-Z8350  CPU @ 1.44GHz
model name      : Intel(R) Atom(TM) x5-Z8350  CPU @ 1.44GHz

Note the model could be some other Z8xxx number; likely it is a
Z8350, and if not, a Z8300, but any Z8xxx number is CHT.

Further confirming that this really is Cherry Trail. Which
at least means that my patches might help a bit.
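
The "any Z8xxx number is CHT" rule of thumb can be written down as a small check. This is a sketch only: the regular expression encodes just that rule and nothing more, and the function names model_names and looks_like_cht are hypothetical.

```python
import re

# Rule of thumb from the text: any Atom "Z8xxx" model number is Cherry Trail.
CHT_MODEL = re.compile(r"\bZ8\d{3}\b")


def model_names(cpuinfo_text):
    """Collect the distinct "model name" values from /proc/cpuinfo text."""
    names = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            names.add(value.strip())
    return names


def looks_like_cht(cpuinfo_text):
    """True if any model name carries a Z8xxx number."""
    return any(CHT_MODEL.search(name) for name in model_names(cpuinfo_text))


sample = "model name      : Intel(R) Atom(TM) x5-Z8350  CPU @ 1.44GHz\n"
print(looks_like_cht(sample))
```

On a real system the text would come from reading /proc/cpuinfo instead of the sample string.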

But ideally we would still be able to get the BIOS to see
us as Windows and set its OSID variable to 1. So we don't
try to use the wrong GPIOs as IRQ at all. Can you try loading
the BIOS setup-defaults / optimal defaults?

If that does not get rid of the INT3496 device (changes its
status to 0), then try this:

Maybe you have a "Boot Architecture" option under the "Boot"
menu in the BIOS? I know you are already at 32 bits, but
maybe changing it to 64 bits helps? (after installing a 64 bit
shim + grub)

If you run:
efibootmgr -v

You will likely see your current active boot entry point to
something with x64 in the name, e.g. I have:

Boot0000* Fedora
HD(1,GPT,a662134d-b40c-48de-8811-e43fee1adfa3,0x800,0x82000)/File(\EFI\fedora\shimx64.efi)

As I wrote above:
Boot0001* opensuse
HD(1,GPT,3f7cc368-0736-45a3-b23e-e1c0eda840be,0x800,0xfa000)/File(\EFI\opensuse\grub32.efi)
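
For comparing entries like the two above, the loader path can be pulled out of the efibootmgr -v text and the bitness guessed from the file name. This is a heuristic sketch: the name-based guess (x64 / ia32 / a trailing 32 or 64) is an assumption about common naming, not something efibootmgr reports, and loader_path and loader_bits are made-up helper names.

```python
import re

# Matches the File(...) part of an "efibootmgr -v" device path.
LOADER = re.compile(r"File\(([^)]*)\)")


def loader_path(entry_text):
    """Return the loader path from one boot entry, or None if there is none."""
    match = LOADER.search(entry_text)
    return match.group(1) if match else None


def loader_bits(entry_text):
    """Guess 32 or 64 from the loader file name; None if the name gives no hint."""
    path = loader_path(entry_text)
    if path is None:
        return None
    name = path.rsplit("\\", 1)[-1].lower()
    if "x64" in name or name.endswith("64.efi"):
        return 64
    if "ia32" in name or name.endswith("32.efi"):
        return 32
    return None


fedora = r"HD(1,GPT,a662134d-b40c-48de-8811-e43fee1adfa3,0x800,0x82000)/File(\EFI\fedora\shimx64.efi)"
opensuse = r"HD(1,GPT,3f7cc368-0736-45a3-b23e-e1c0eda840be,0x800,0xfa000)/File(\EFI\opensuse\grub32.efi)"
print(loader_bits(fedora), loader_bits(opensuse))
```

For the two entries quoted in this thread this yields 64 for the Fedora entry and 32 for the openSUSE one.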

I've tried your patches now, but it crashes the kernel due to omitted
chv_padreg(), so rebuilding with that fixed...

Oops, yeah I just noticed that too (while testing some other
kernel patches). I've this fixed locally now...

Regards,

Hans



