18.01.2022 20:52, Eric W. Biederman wrote:
> Dmitry Osipenko <digetx@xxxxxxxxx> writes:
>
>> 11.01.2022 20:20, Eric W. Biederman wrote:
>>> Dmitry Osipenko <digetx@xxxxxxxxx> writes:
>>>
>>>> 08.01.2022 21:13, Eric W. Biederman wrote:
>>>>> Dmitry Osipenko <digetx@xxxxxxxxx> writes:
>>>>>
>>>>>> 05.01.2022 22:58, Eric W. Biederman wrote:
>>>>>>>
>>>>>>> I have not yet been able to figure out how to run gst-plugin-scanner in
>>>>>>> a way that triggers this yet. In truth I can't figure out how to
>>>>>>> run gst-plugin-scanner in a useful way.
>>>>>>>
>>>>>>> I am going to set up some unit tests and see if I can reproduce your
>>>>>>> hang another way, but if you could give me some more information on what
>>>>>>> you are doing to trigger this I would appreciate it.
>>>>>>
>>>>>> Thanks, Eric. The distro is Arch Linux, but it's a development
>>>>>> environment where I'm running latest GStreamer from git master. I'll try
>>>>>> to figure out the reproduction steps and get back to you.
>>>>>
>>>>> Thank you.
>>>>>
>>>>> Until I can figure out why this is causing problems I have dropped the
>>>>> following two patches from my queue:
>>>>>     signal: Make SIGKILL during coredumps an explicit special case
>>>>>     signal: Drop signals received after a fatal signal has been processed
>>>>>
>>>>> I have replaced them with the following two patches that just do what
>>>>> is needed for the rest of the code in the series:
>>>>>     signal: Have prepare_signal detect coredumps using
>>>>>     signal: Make coredump handling explicit in complete_signal
>>>>>
>>>>> Perversely my failure to change the SIGKILL handling when coredumps are
>>>>> happening proves to me that I need to change the SIGKILL handling when
>>>>> coredumps are happening to make the code more maintainable.
>>>>
>>>> Eric, thank you again. I started to look at the reproduction steps and
>>>> haven't completed it yet.
>>>> Turned out the problem affects only older
>>>> NVIDIA Tegra2 Cortex-A9 CPU that lacks support of ARM NEON instructions
>>>> set, hence the problem isn't visible on x86 and other CPUs out of the
>>>> box. I'll need to check whether the problem could be simulated on all
>>>> arches or maybe it's specific to VFP exception handling of ARM32.
>>>
>>> It sounds like the gstreamer plugins only fail on certain hardware on
>>> arm32, and things don't hang in coredumps unless the plugins fail.
>>> That does make things tricky to minimize.
>>>
>>> I have just verified that the known problematic code is not
>>> in linux-next for Jan 11 2022.
>>>
>>> If folks as they have time can double check linux-next and verify all is
>>> well I would appreciate it. I don't expect that there are problems but
>>> sometimes one problem hides another.
>>
>> Hello Eric,
>>
>> I reproduced the trouble on x86_64.
>>
>> Here are the reproduction steps, using ArchLinux and linux-next-20211224:
>>
>> ```
>> sudo pacman -S base-devel git mesa glu meson wget
>> git clone https://github.com/grate-driver/gstreamer.git
>> cd gstreamer
>> git checkout sigill
>> meson --prefix=/usr -Dgst-plugins-base:playback=enabled -Dgst-devtools:validate=disabled build
>> cd build
>> sudo ninja install
>> wget https://www.peach.themazzone.com/big_buck_bunny_720p_h264.mov
>> rm -r ~/.cache/gstreamer-1.0
>> gst-play-1.0 ./big_buck_bunny_720p_h264.mov
>> ```
>>
>> The SIGILL, thrown by [1], causes the hang. There is no hang using v5.16.1 kernel.
>>
>> [1] https://github.com/grate-driver/gstreamer/commit/006f9a2ee6dcf7b31c9b5413815d6054d82a3b2f
>
> Thank you.
>
> I will verify this works before I add my updated version to
> my signal-for-v5.18 branch.
>
> Have you by any chance tried a newer version of linux-next without
> commit fbc11520b58a ("signal: Make SIGKILL during coredumps an explicit
> special case") in it?
>
> If not I will double check that my pulling the commit out does not break
> in the case you have documented.

Recent linux-next works fine.