From my earlier message on the mailing list:

[...]
"Hitting the bug corrupts the underlying filesystem very thoroughly,
wiping out large amount of data from the beginning of the partition
which leaves fsck sad with thousands of items lost. Bisection of the
IGT testlist was done with two root filesystems, where testable kernel
booted from 2. partition, and copy of the 2. partition was stored on 1.
partition and could be restored at will."

The CI public interface doesn't really show this: the hosts started
testing, died, and on boot got stuck at the grub menu because grub.cfg
(or anything else) wasn't available on the root disk.

The decision to shut down the extended testing was mine, when I saw ~1
host per shard dying each testing round (a couple of hosts per hour).

It's a kind of bug our CI is not handling well, because on the
catastrophic scale the effects are close to the maximum (where max
would be permanent hw damage), and the cause is not related to i915 at
all.

Regards,

Tomi Sarvela

> From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Sent: Wednesday, March 3, 2021 1:28 AM
> To: Dave Airlie <airlied@xxxxxxxxx>; Jens Axboe <axboe@xxxxxxxxx>;
> Christoph Hellwig <hch@xxxxxx>; Damien Le Moal
> <damien.lemoal@xxxxxxx>; Johannes Thumshirn
> <johannes.thumshirn@xxxxxxx>; Chaitanya Kulkarni
> <chaitanya.kulkarni@xxxxxxx>
> Cc: Sarvela, Tomi P <tomi.p.sarvela@xxxxxxxxx>; Linux Memory Management
> List <linux-mm@xxxxxxxxx>; Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>;
> intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> Subject: Re: [Intel-gfx] Public i915 CI shardruns are disabled
>
> Adding the right people.
>
> It seems that the three commits that needed reverting are
>
>   f885056a48cc ("mm: simplify swapdev_block")
>   3e3126cf2a6d ("mm: only make map_swap_entry available for
> CONFIG_HIBERNATION")
>   48d15436fde6 ("mm: remove get_swap_bio")
>
> and while they look very harmless to me, let's bring in Christoph and
> Jens who were actually involved with them.
>
> I'm assuming that it's that third one that is the real issue (and the
> two other ones were to get to it), but it would also be good to know
> what the actual details of the regression actually were.
>
> Maybe that's obvious to somebody who has more context about the i915
> CI runs and its web interface, but it sure isn't clear to me.
>
> Jens, Christoph?
>
>              Linus
>
> On Tue, Mar 2, 2021 at 11:31 AM Dave Airlie <airlied@xxxxxxxxx> wrote:
> >
> > On Wed, 3 Mar 2021 at 03:27, Sarvela, Tomi P <tomi.p.sarvela@xxxxxxxxx> wrote:
> > >
> > > The regression has been identified; Chris Wilson found commits touching
> > > swapfile.c, and after reverting them the issue couldn’t be reproduced
> > > any more.
> > >
> > > https://patchwork.freedesktop.org/series/87549/
> > >
> > > This revert will be applied to core-for-CI branch. When new CI_DRM has
> > > been built, shard-testing will be enabled again.
> >
> > Just making sure this is on the radar upstream.
> >
> > Dave.