Re: [ath9k-devel] [PATCH] ath10k: Fix crash when using v1 hardware.

On 07/11/2013 02:36 AM, Kalle Valo wrote:
greearb@xxxxxxxxxxxxxxx writes:

From: Ben Greear <greearb@xxxxxxxxxxxxxxx>

I put a v1 NIC from a TP-LINK AC 1750 AP in
a 64-bit PC, and the OS crashes on bootup.  I'm not
sure how broken my hardware is (possibly completely
non-functional), but at least with this patch it will no longer
crash the OS.  Not sure it ever got far enough to try,
but I also do not have firmware for the NIC.

With this patch I get this info on module load:

ath10k_pci 0000:05:00.0: BAR 0: assigned [mem 0xf4400000-0xf45fffff 64bit]
ath10k_pci 0000:05:00.0: BAR 0: error updating (0xf4400004 != 0xffffffff)
ath10k_pci 0000:05:00.0: BAR 0: error updating (high 0x000000 != 0xffffffff)
ath10k_pci 0000:05:00.0: Refused to change power state, currently in D3
ath10k: MSI-X interrupt handling (8 intrs)
ath10k: Unable to wakeup target
ath10k: target takes too long to wake up (awake count 1)
ath10k: src_ring ffff88020c0d0a00:  write_index is out of bounds: 4294967295  nentries_mask: 15.
ath10k: dest_ring ffff88020db2c000:  write_index is out of bounds: 4294967295  nentries_mask: 511.
ath10k: dest_ring ffff880210d56400:  write_index is out of bounds: 4294967295  nentries_mask: 31.
ath10k: src_ring ffff880210d57600:  write_index is out of bounds: 4294967295  nentries_mask: 31.
ath10k: src_ring ffff88020fe70000:  write_index is out of bounds: 4294967295  nentries_mask: 2047.
ath10k: src_ring ffff880212989b40:  write_index is out of bounds: 4294967295  nentries_mask: 1.
ath10k: dest_ring ffff880212989960:  write_index is out of bounds: 4294967295  nentries_mask: 1.
ath10k: Failed to get pcie state addr: -5
ath10k: early firmware event indicated
------------[ cut here ]------------
WARNING: at /home/greearb/git/linux.wireless-testing/drivers/net/wireless/ath/ath10k/ce.c:771 ath10k_ce_per_engine_service+0x53/0x1b4 [ath10k_pci]()
....
(it hits the warning case about 5-6 times and then seems to quiesce OK).

I haven't seen this myself so it might be a hw problem, but difficult to
say.

+	/* On v1 hardware at least, setup can fail, causing ce_id_state to
+	 * be cleaned up, but this method is still called a few times.  Check
+	 * for NULL here so we don't crash.  Probably a better fix is to stop
+	 * the ath10k_pci_ce_tasklet sooner.
+	 */
+	if (WARN_ONCE(!ce_state, "ce_id_to_state[%i] is NULL\n", ce_id))
+		return;
+
+	ctrl_addr = ce_state->ctrl_addr;
+

The tests you add look like workarounds. I would prefer to try to fix these
by going to the source of the problem. Maybe we should add
ath10k_pci_wake() and ath10k_do_pci_wake()?

These are work-arounds, but you should not let a bad piece of hardware/firmware crash
the entire OS just because you don't want to do sanity checking on the
values you get from the firmware.  Perhaps there is a better fix for the
code above, but the warning splat should still provide incentive to get
it right, while not crashing the OS in the meantime.
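
As an illustration of the kind of sanity checking discussed here, the following is a minimal user-space sketch, not the actual ath10k code. The struct layouts, CE_COUNT, ring_init(), ring_index_valid() and ce_service() are hypothetical stand-ins; only the ideas (a NULL per-engine state after failed setup, and a write_index of 4294967295 that does not fit within nentries_mask) come from the patch and log above. A PCI device that is not responding typically reads back as all ones, which is why an index of 0xffffffff is a useful signal to bail out rather than index into the ring.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's structures; not the real ath10k types. */
struct ce_ring {
	uint32_t nentries_mask;	/* nentries - 1, with nentries a power of two */
	uint32_t write_index;	/* last value read back from the device */
};

struct ce_state {
	uint32_t ctrl_addr;
	struct ce_ring *src_ring;
};

#define CE_COUNT 8	/* assumed number of copy engines, for illustration only */

/* Table that setup fills in and that teardown clears back to NULL. */
static struct ce_state *ce_id_to_state[CE_COUNT];

/* A bad ring size is a programming error, so assert on it. */
static void ring_init(struct ce_ring *ring, uint32_t nentries)
{
	assert(nentries != 0 && (nentries & (nentries - 1)) == 0);
	ring->nentries_mask = nentries - 1;
	ring->write_index = 0;
}

/* A bad index is a device error, so report it instead of asserting.
 * An all-ones read (0xffffffff) from a dead device fails this check. */
static bool ring_index_valid(uint32_t index, uint32_t nentries_mask)
{
	return index <= nentries_mask;
}

/* Returns 0 if the engine could be serviced, -1 if it was skipped. */
static int ce_service(unsigned int ce_id)
{
	struct ce_state *state;

	if (ce_id >= CE_COUNT)
		return -1;

	state = ce_id_to_state[ce_id];
	if (state == NULL)
		return -1;	/* setup failed or state already cleaned up */

	if (state->src_ring == NULL ||
	    !ring_index_valid(state->src_ring->write_index,
			      state->src_ring->nentries_mask))
		return -1;	/* do not index the ring with a bogus value */

	return 0;
}
```

In this sketch a caller would treat -1 as "skip this engine and log a warning"; the real driver would still want to stop the tasklet earlier, as the patch comment notes.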


Can you enable a few debug logs, like ATH10K_DBG_PCI, and post them? That
would give more hints about where things are going wrong.


Yes, I can do that.

Thanks,
Ben

--
Ben Greear <greearb@xxxxxxxxxxxxxxx>
Candela Technologies Inc  http://www.candelatech.com




