On 24/06/2021 13:11, Jonathan Billings wrote:
On Thu, Jun 24, 2021 at 12:15:48PM +0100, Christopher Ross wrote:
Fedora 34 on my i7 with 32G RAM and nvidia RTX2060 card takes minutes to boot, and when it finally does there are a number of "something went wrong" notifications. How best can I diagnose and fix this so that it boots quickly and without errors?
Do you have the RPMFusion nvidia packages installed? Are you using any 3rd-party nvidia drivers? Or are you using the nouveau driver?
Yes, RPMFusion is enabled and nvidia drivers installed.
The nvidia driver might be compiling on boot (dkms) which takes a long time, and if it fails, will cause GL issues that can break 'nautilus', and if 'nautilus' crashes, the GNOME session will do the 'Something went wrong' alert.
It happens on every boot, not just after a kernel or driver update, so on an ordinary boot there should be nothing to recompile.
I should have mentioned that I'm running KDE, not GNOME, but at this stage the desktop hasn't started yet. The 3½ minutes is the time from power-on until the GDM login screen appears.
I can't remember the exact message from abrtd. A good follow-up question is probably: how can I get more information about the oopses?
Looking in /var/spool/abrt/oops-2021-06-24-09:48:20-3565-0/dmesg, it does seem that the first oops was in the nvidia module:
...
[ 14.920735] intel_rapl_common: RAPL package-0 domain package locked by BIOS
[ 14.956559] pktcdvd: pktcdvd0: writer mapped to sr0
[ 15.073374] zram0: detected capacity change from 0 to 16777216
[ 15.095100] Adding 8388604k swap on /dev/zram0. Priority:100 extents:1 across:8388604k SSFS
[ 15.374420] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 1
[ 15.684579] nvidia-gpu 0000:01:00.3: i2c timeout error e0000000
[ 15.684583] ucsi_ccg 8-0008: i2c_transfer failed -110
[ 15.684585] ucsi_ccg 8-0008: ucsi_ccg_init failed - -110
[ 15.684590] ucsi_ccg: probe of 8-0008 failed with error -110
[ 40.291071] watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [plymouthd:445]
[ 40.291074] Modules linked in: intel_rapl_msr intel_rapl_common at24 mei_hdcp iTCO_wdt intel_pmc_bxt iTCO_vendor_support ucsi_ccg typec_ucsi typec pktcdvd x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass rapl intel_cstate intel_uncore raid0 eeepc_wmi asus_wmi sparse_keymap rfkill wmi_bmof pcspkr snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio nvidia_drm(POE) joydev snd_hda_codec_hdmi nvidia_modeset(POE) snd_hda_intel i2c_i801 snd_intel_dspcfg i2c_smbus snd_intel_sdw_acpi snd_usb_audio apple_mfi_fastcharge snd_hda_codec nvidia_uvm(POE) snd_usbmidi_lib snd_hda_core snd_rawmidi mc snd_hwdep snd_seq snd_seq_device lpc_ich snd_pcm nvidia(POE) mei_me snd_timer snd mei i2c_nvidia_gpu soundcore nfsd auth_rpcgss nfs_acl lockd grace binfmt_misc sunrpc nfs_ssc zram ip_tables i915 crct10dif_pclmul crc32_pclmul crc32c_intel cdc_mbim cdc_wdm e1000e mxm_wmi i2c_algo_bit ghash_clmulni_intel drm_kms_helper cec cdc_ncm cdc_ether drm usbnet mii wmi video fuse
[ 40.291104] CPU: 3 PID: 445 Comm: plymouthd Tainted: P OE 5.12.11-300.fc34.x86_64 #1
[ 40.291106] Hardware name: System manufacturer System Product Name/MAXIMUS V FORMULA, BIOS 1903 08/19/2013
[ 40.291107] RIP: 0010:os_io_read_dword+0x8/0x10 [nvidia]
[ 40.291311] Code: 00 00 0f 1f 44 00 00 89 fa ec c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 89 fa 66 ed c3 66 0f 1f 44 00 00 0f 1f 44 00 00 89 fa ed <c3> 0f 1f 80 00 00 00 00 0f 1f 44 00 00 48 85 ff 75 0c 48 8b 05 c7
[ 40.291312] RSP: 0018:ffffaeb9c05cf7d8 EFLAGS: 00000202
[ 40.291313] RAX: 0000000003763580 RBX: 00000000000017d3 RCX: 00000000000c0000
[ 40.291314] RDX: 000000000000e00c RSI: 00000000000c40e4 RDI: 000000000000e00c
[ 40.291314] RBP: ffff8c7220e12b10 R08: ffffffffc2aa5380 R09: 0000000000000282
[ 40.291315] R10: 0000000000000202 R11: 0000000000000040 R12: ffff8c7220e12b3c
[ 40.291316] R13: ffff8c7220e12b38 R14: 000000000000c000 R15: 000000000000c000
[ 40.291316] FS: 00007f27013a4800(0000) GS:ffff8c790fec0000(0000) knlGS:0000000000000000
[ 40.291317] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 40.291318] CR2: 00007ffca2940078 CR3: 000000010a834001 CR4: 00000000001706e0
[ 40.291319] Call Trace:
[ 40.291321] _nv041000rm+0x4c/0x70 [nvidia]
[ 40.291545] ? _nv040998rm+0x30/0x30 [nvidia]
[ 40.291768] ? _nv000834rm+0x4f/0x130 [nvidia]
...
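For anyone wanting to pull the key facts out of a saved dmesg file like the one above, here is a minimal Python sketch. It assumes the log lines have exactly the shape shown in this excerpt (a "watchdog: BUG: soft lockup" line followed later by a "RIP:" line); the function name find_soft_lockups and the SAMPLE text are made up for illustration and are not part of abrt or any other tool.

```python
import re

# Matches lines such as:
# [ 40.291071] watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [plymouthd:445]
LOCKUP_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+watchdog: BUG: soft lockup - "
    r"CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)

# Matches the kernel instruction-pointer line, e.g.
# RIP: 0010:os_io_read_dword+0x8/0x10 [nvidia]
RIP_RE = re.compile(
    r"RIP: [0-9a-f]{4}:(?P<func>[^\s+]+)\+\S+(?: \[(?P<module>\w+)\])?"
)


def find_soft_lockups(dmesg_text):
    """Return one dict per soft lockup, with the RIP function/module if found."""
    lockups = []
    for line in dmesg_text.splitlines():
        m = LOCKUP_RE.search(line)
        if m:
            lockups.append({
                "timestamp": float(m["ts"]),
                "cpu": int(m["cpu"]),
                "stuck_seconds": int(m["secs"]),
                "comm": m["comm"],
                "pid": int(m["pid"]),
                "function": None,
                "module": None,
            })
            continue
        # Attach the first RIP line that follows the most recent lockup.
        m = RIP_RE.search(line)
        if m and lockups and lockups[-1]["function"] is None:
            lockups[-1]["function"] = m["func"]
            lockups[-1]["module"] = m["module"]
    return lockups


# Two lines copied from the excerpt above, used as illustrative input.
SAMPLE = """\
[ 40.291071] watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [plymouthd:445]
[ 40.291107] RIP: 0010:os_io_read_dword+0x8/0x10 [nvidia]
"""

for hit in find_soft_lockups(SAMPLE):
    print(hit)
```

To run it against the real file, read the text with open(path).read() and pass it to find_soft_lockups. A module name in the RIP line only says where the CPU was spinning when the watchdog fired, so it is a pointer for further digging rather than proof of the root cause.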
The top part of systemd-analyze blame is
1min 23.232s plymouth-quit-wait.service
53.077s cs-firewall-bouncer.service
52.219s dovecot.service
26.525s crowdsec.service
It appears you're using some sort of 3rd-party firewall driver called 'crowdsec'. Is that the problem? It does seem to be up there, although it could be waiting for something else to start.
Crowdsec is the modern replacement for fail2ban
https://crowdsec.net/
But disabling it doesn't help. It's presumably waiting on the network.
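To compare the blame figures quoted above more easily, here is a rough Python sketch that converts the durations to seconds and sorts the units. It assumes the output uses only the h, min, s and ms suffixes seen in listings like this one; parse_blame and BLAME are illustrative names, not part of systemd.

```python
import re

# Seconds per unit suffix. Assumption: only these suffixes appear.
UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "min": 60.0, "h": 3600.0}
TOKEN_RE = re.compile(r"(\d+(?:\.\d+)?)(ms|min|h|s)")


def parse_blame(text):
    """Parse blame-style lines into (unit, seconds) pairs, slowest first."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        # Everything before the last field is the duration, e.g. "1min 23.232s".
        *duration_tokens, unit = parts
        seconds = 0.0
        for token in duration_tokens:
            m = TOKEN_RE.fullmatch(token)
            if m is None:
                break  # not a duration, so skip this line entirely
            seconds += float(m.group(1)) * UNIT_SECONDS[m.group(2)]
        else:
            entries.append((unit, seconds))
    return sorted(entries, key=lambda e: e[1], reverse=True)


# The four lines quoted earlier in this thread.
BLAME = """\
1min 23.232s plymouth-quit-wait.service
53.077s cs-firewall-bouncer.service
52.219s dovecot.service
26.525s crowdsec.service
"""

for unit, seconds in parse_blame(BLAME):
    print(f"{seconds:8.3f}s  {unit}")
```

These numbers are per-unit start-up times and they overlap, so a unit near the top may simply have been waiting on something else, as noted above. `systemd-analyze critical-chain` shows the dependency chain and is usually the better tool for that question.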
Many thanks,
Chris R.