bug: greybus: loopback

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

I got the following issue while I was testing loopack protocol on
Zephyr:
[76617.493324] greybus 1-2.2: Interface added (greybus)
[76617.493328] greybus 1-2.2: GMP VID=0x00000126, PID=0x00000126
[76617.493330] greybus 1-2.2: DDBL1 Manufacturer=0x00000126,
Product=0x00000126
[76618.515124] greybus greybus1: 3/2:0: synchronous operation id 0x0002
of type 0x01 failed: -110
[76618.515136] greybus 1-2.2: failed to get control-protocol version: -110
[76619.539015] greybus 1-2.2.ctrl: failed to send disconnecting: -110
[76619.539029] greybus greybus1: 3/2:0: failed to send disconnecting: -110
[76620.562921] greybus greybus1: 3/2:0: failed to send cport shutdown
(phase 1): -110
[76621.586890] greybus greybus1: 3/2:0: failed to send cport shutdown
(phase 2): -110
[76621.587036] greybus 1-2: failed to enable interface 2: -110
[81851.515405] greybus greybus1: 0/0:0: synchronous operation id 0x0c41
of type 0x19 failed: -110
[81852.539233] greybus greybus1: 0/0:0: synchronous operation id 0x0c43
of type 0x0e failed: -110
[81852.795204] greybus greybus1: 0/0:0: synchronous operation id 0x0c42
of type 0x13 failed: -110
[81852.795210] greybus 1-svc: SVC ping has returned -110, something is
wrong!!!
[81852.795212] greybus 1-svc: Resetting the greybus network, watch out!!!
[81852.795356] greybus 1-2.2: Interface removed
[81852.795389] loopback 1-1.1.1: failed to resume bundle: -22
[81852.795392] loopback 1-1.1.1: pm_runtime_get_sync failed: -22
[81852.795421] general protection fault: 0000 [#1] PREEMPT SMP
[81852.795449] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4
xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp iptable_filter
nf_nat_h323 nf_conntrack_h323 nf_nat_pptp nf_nat_proto_gre
nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_tftp nf_conntrack_tftp
nf_nat_sip nf_conntrack_sip nf_nat_irc nf_conntrack_irc nf_nat_ftp
nf_conntrack_ftp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables cdc_subset cdc_ether
usbnet mii hid_logitech_hidpp hid_logitech_dj usbhid hid snd_seq_dummy
snd_seq snd_seq_device vboxnetadp(O) pci_stub vboxpci(O) vboxnetflt(O)
vboxdrv(O) pl2303 usbserial gb_netlink(C) gb_loopback(C) greybus(C) fuse
ctr ccm bnep nls_utf8 nls_cp437 vfat fat joydev intel_rapl
x86_pkg_temp_thermal uvcvideo intel_powerclamp videobuf2_vmalloc coretemp
[81852.795794]  videobuf2_memops arc4 kvm_intel videobuf2_v4l2 kvm
videobuf2_core videodev irqbypass crct10dif_pclmul media btusb
crc32_pclmul btrtl ghash_clmulni_intel iwlmvm btbcm aesni_intel btintel
aes_x86_64 bluetooth lrw mac80211 gf128mul iTCO_wdt glue_helper crc16
ablk_helper iTCO_vendor_support snd_hda_codec_realtek cryptd efi_pstore
snd_hda_codec_hdmi snd_hda_codec_generic psmouse efivars pcspkr iwlwifi
rtsx_pci_ms cfg80211 memstick snd_hda_intel thermal snd_hda_codec wmi
thinkpad_acpi snd_hda_core e1000e nvram snd_hwdep rfkill snd_pcm battery
ac mei_me tpm_tis snd_timer ptp evdev i2c_i801 tpm_tis_core mei snd tpm
pps_core lpc_ich i2c_smbus shpchp soundcore nfsd auth_rpcgss
oid_registry nfs_acl lockd grace sunrpc sch_fq_codel sd_mod mmc_block
rtsx_pci_sdmmc mmc_core crc32c_intel ahci libahci
[81852.796155]  serio_raw i915 ehci_pci libata ehci_hcd scsi_mod usbcore
rtsx_pci mfd_core video button efivarfs
[81852.796205] CPU: 0 PID: 16298 Comm: gbridge Tainted: G        WC O
4.9.0+ #8
[81852.796231] Hardware name: LENOVO 20AQCTO1WW/20AQCTO1WW, BIOS
GJET77WW (2.27 ) 05/20/2014
[81852.796259] task: ffff8801656601c0 task.stack: ffffc90002660000
[81852.796282] RIP: 0010:[<ffffffff81098a45>]  [<ffffffff81098a45>]
kthread_stop+0x25/0x140
[81852.796317] RSP: 0018:ffffc90002663988  EFLAGS: 00010202
[81852.796336] RAX: 1da0ffffffffffc8 RBX: 1da0ffffffffffc8 RCX:
1da1000000000000
[81852.796361] RDX: 0000000000000005 RSI: 0000000000000005 RDI:
ffff880251aa1080
[81852.796386] RBP: ffff880251aa1080 R08: ffff880251aa1a08 R09:
0000000000ffff0a
[81852.796411] R10: 0000000000000000 R11: 0000000000000ffc R12:
ffff880241cc96f8
[81852.796436] R13: ffffffffa09fa020 R14: ffff8802caa1cee8 R15:
ffff88023ecf5200
[81852.796462] FS:  00007fb7e17917c0(0000) GS:ffff88031e200000(0000)
knlGS:0000000000000000
[81852.796490] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[81852.796517] CR2: 00007fae1131ca80 CR3: 0000000251a26000 CR4:
00000000001406f0
[81852.796533] Stack:
[81852.796540]  ffff8802ce703e00 ffff880241cc9400 ffff880241cc96f8
ffffffffa09f8606
[81852.796561]  ffffffff816f217a ffffffff816f1dcd ffffffffa09d67a6
ffff8800080e8300
[81852.796582]  ffff880241cc9400 00000000aa6146f5 ffff880241cc96c8
ffff880241cc9400
[81852.796602] Call Trace:
[81852.796613]  [<ffffffffa09f8606>] ? gb_loopback_disconnect+0x56/0x1a0
[gb_loopback]
[81852.796632]  [<ffffffff816f217a>] ? _raw_spin_lock_irq+0x1a/0x40
[81852.796647]  [<ffffffff816f1dcd>] ? _raw_spin_unlock_irq+0x1d/0x30
[81852.796668]  [<ffffffffa09d67a6>] ?
gb_connection_disable_rx+0x36/0x140 [greybus]
[81852.796688]  [<ffffffffa09d20f4>] ? greybus_remove+0x84/0x190 [greybus]
[81852.796705]  [<ffffffff81555715>] ? __device_release_driver+0x95/0x140
[81852.796721]  [<ffffffff815557de>] ? device_release_driver+0x1e/0x30
[81852.796737]  [<ffffffff81554465>] ? bus_remove_device+0xf5/0x170
[81852.796753]  [<ffffffff815508f7>] ? device_del+0x127/0x260
[81852.796769]  [<ffffffffa09d626d>] ? gb_bundle_destroy+0x1d/0x90 [greybus]
[81852.796787]  [<ffffffffa09d4e17>] ?
gb_interface_disable.part.12+0x57/0x1a0 [greybus]
[81852.796807]  [<ffffffffa09d3f1b>] ? gb_module_del+0x5b/0xf0 [greybus]
[81852.796825]  [<ffffffffa09da8ee>] ? gb_svc_del+0x9e/0xe0 [greybus]
[81852.796841]  [<ffffffffa09d2d8b>] ? gb_hd_del+0x1b/0x80 [greybus]
[81852.796857]  [<ffffffffa090302a>] ? _gb_netlink_exit+0x1a/0x30
[gb_netlink]
[81852.796874]  [<ffffffffa09030b5>] ? gb_netlink_hd_reset+0x15/0x30
[gb_netlink]
[81852.796892]  [<ffffffff8162414f>] ? genl_family_rcv_msg+0x1bf/0x360
[81852.796907]  [<ffffffff816242f0>] ? genl_family_rcv_msg+0x360/0x360
[81852.796922]  [<ffffffff81624360>] ? genl_rcv_msg+0x70/0xb0
[81852.797918]  [<ffffffff81623931>] ? netlink_rcv_skb+0xa1/0xc0
[81852.798909]  [<ffffffff81623f74>] ? genl_rcv+0x24/0x40
[81852.799975]  [<ffffffff8162331c>] ? netlink_unicast+0x16c/0x200
[81852.801015]  [<ffffffff816236ed>] ? netlink_sendmsg+0x33d/0x3b0
[81852.801985]  [<ffffffff815cf380>] ? sock_sendmsg+0x30/0x40
[81852.802956]  [<ffffffff815d01df>] ? ___sys_sendmsg+0x27f/0x290
[81852.803950]  [<ffffffff811eff86>] ?
mem_cgroup_event_ratelimit.isra.34+0x36/0xa0
[81852.804926]  [<ffffffff811f2e1a>] ? memcg_check_events+0x2a/0x1b0
[81852.805900]  [<ffffffff8118d8c5>] ? __lru_cache_add+0x75/0xa0
[81852.806880]  [<ffffffff816f1d56>] ? _raw_spin_unlock+0x16/0x30
[81852.807881]  [<ffffffff811b48ec>] ? handle_mm_fault+0x3dc/0x13e0
[81852.808861]  [<ffffffff8108734f>] ? do_send_specific+0x7f/0x90
[81852.809843]  [<ffffffff8121db60>] ? __fget+0x70/0xc0
[81852.810819]  [<ffffffff815d0b01>] ? __sys_sendmsg+0x51/0x90
[81852.811817]  [<ffffffff816f247b>] ? entry_SYSCALL_64_fastpath+0x1e/0xad
[81852.812794] Code: 1f 80 00 00 00 00 0f 1f 44 00 00 41 54 55 48 89 fd
53 0f 1f 44 00 00 f0 ff 45 18 48 89 ef e8 e3 f2 ff ff 48 85 c0 48 89 c3
74 28 <f0> 80 08 02 48 89 c6 48 89 ef e8 4c ff ff ff 48 89 ef e8 64 ae
[81852.813896] RIP  [<ffffffff81098a45>] kthread_stop+0x25/0x140
[81852.814952]  RSP <ffffc90002663988>
[81852.822091] ---[ end trace 73645b448dac6603 ]---
[81853.307086] svc_watchdog: calling "/system/bin/start unipro_reset" to
reset greybus network!

The issue happens when I rmmod gb_loopback after some control
operations failed (because of firmware issues).
Actually, the thread in loopback exits when pm_runtime fail.
But when we unload the loopback module, we call kthread_stop(),
whereas the thread has already exited.

The following workaround unblocked me:
diff --git a/drivers/staging/greybus/loopback.c
b/drivers/staging/greybus/loopback.c
index ada00cb..156b09d 100644
--- a/drivers/staging/greybus/loopback.c
+++ b/drivers/staging/greybus/loopback.c
@@ -994,8 +994,11 @@ static int gb_loopback_fn(void *data)
                        wait_event_interruptible(gb->wq, gb->type ||
                                                 kthread_should_stop());
                        ret = gb_pm_runtime_get_sync(bundle);
-                       if (ret)
+                       if (ret) {
+                               wait_event_interruptible(gb->wq,
+
kthread_should_stop());
                                return ret;
+                       }
                }

                if (kthread_should_stop())


I guess that's not a good way to fix the issue.
I think the best thing to do will be to just trigger a disconnect.
Exiting the thread let loopback in incorrect state:
nothing will work because the thread stopped but sysfs / debugfs files
are still there.

I'm volunteer to fix the issue but I'm not sure how to do it,
and may be there is other better solutions I might have missed.
So, all suggestion are welcome.

Thanks,
Alexandre
_______________________________________________
greybus-dev mailing list
greybus-dev@xxxxxxxxxxxxxxxx
https://lists.linaro.org/mailman/listinfo/greybus-dev




[Index of Archives]     [Asterisk App Development]     [PJ SIP]     [Gnu Gatekeeper]     [IETF Sipping]     [Info Cyrus]     [ALSA User]     [Fedora Linux Users]     [Linux SCTP]     [DCCP]     [Gimp]     [Yosemite News]     [Deep Creek Hot Springs]     [Yosemite Campsites]     [ISDN Cause Codes]     [Asterisk Books]

  Powered by Linux