On 1/23/25 1:05 PM, Salvatore Bonaccorso wrote:
> Hi all,
>
> On Wed, Jan 22, 2025 at 08:49:13PM +0100, Salvatore Bonaccorso wrote:
>> Control: forwarded -1 https://jira.mariadb.org/projects/MDEV/issues/MDEV-35886
>> Hi,
>>
>> On Tue, Jan 21, 2025 at 08:06:18PM +0100, Bernhard Schmidt wrote:
>>> Control: affects -1 src:mariadb
>>> Control: tags -1 + confirmed
>>> Control: severity -1 critical
>>>
>>> Seeing this too. We have two standalone systems running the stock
>>> bookworm MariaDB and the open-source network management system LibreNMS,
>>> which is quite write-heavy. After some time (sometimes a couple of
>>> hours, sometimes 1-2 days) all connection slots to the database are
>>> full.
>>>
>>> When you kill one client process you can connect and issue "show
>>> processlist"; you then see all slots busy with simple update/select
>>> queries that have been running for hours. You need to SIGKILL mariadbd
>>> to recover.
>>>
>>> The last two days our colleagues running a Galera cluster (unsure about
>>> the version, inquiring) have been affected by this as well. They found
>>> a MariaDB bug report about this.
>>>
>>> https://jira.mariadb.org/projects/MDEV/issues/MDEV-35886?filter=allopenissues
>>>
>>> Since there have been reports about data loss I think it warrants
>>> increasing the severity to critical.
>>>
>>> I'm not 100% sure about -30 though: we have downgraded the
>>> production system to -28 and upgraded the test system to -30, and both
>>> are working fine. The test system has less load though, and I trust the
>>> reports here that -30 is still broken.
>>
>> I would be interested to know if someone is able to reproduce the
>> issue under lab conditions, which would enable us to bisect
>> the issue.
>>
>> As a start I set the above issue as a forward, to have the issues
>> linked (and we can later update it to the Linux upstream report).
>
> I suspect this might have been introduced by one of the io_uring related
> changes between 6.1.119 and 6.1.123.
>
> But we need to be able to trigger the issue in a non-production
> environment, and then bisect those upstream changes. I'm already
> looping in Jens Axboe in case this rings a bell.
>
> Jens, for context, we have reports in Debian about MariaDB hangs after
> updating from 6.1.119 based kernel to 6.1.123 (and 6.1.144) as
> reported in https://bugs.debian.org/1093243

Thanks for the report, that's certainly unexpected. I'll take a look.

-- 
Jens Axboe