On Fri, May 29, 2020 at 03:41:57PM +0200, Jean Delvare wrote:
> On Thu, 28 May 2020 17:18:58 -0700, Darrick J. Wong wrote:
> > I vaguely remember that the adt7470 temperature inputs were connected to
> > the CPU, and the PWM outputs were connected to the CPU heatsink fans.
> > The BIOS appeared to set up the adt7470 for automatic thermal management
> > (i.e. when you cranked all four cores of the machine to maximum, it
> > would gradually raise the CPU fan speed, like you'd expect).
> >
> > The reality (again, vaguely remembered) was that the chip wouldn't run
> > its pwm control loop unless *something* poked it to reread the
> > temperature sensors.  A different model of the same machine had a BMC
> > which would talk to the adt7470 over i2c and take care of that.
>
> That I understand, and while it is poor design in my opinion, it makes
> sense to some degree.
>
> > The other problem was that /some/ of the machines for whatever reason
> > would adjust the pwm value that you could read out over i2c, but
> > wouldn't actually change the fan speed unless you whacked the adt into
> > manual mode.
>
> Ah. That would be the reason for the extra code. Automatic fan speed
> control that needs to be refreshed manually. Oh my.
>
> > Neither of those two behaviors were listed in the datasheet, and we
> > (IBM) could never get an answer out of either Analog or our own hardware
> > group about whether or not this was the expected behavior.  I
> > disassembled the BMC code to figure out what the other model computer
> > was doing, and (clumsily) wrote that into the driver.  For all I know we
> > got a bad batch of adt7470s and all these weird gymnastics aren't
> > supposed to be necessary.
> >
> > The next generation switched to a totally different chip and supplier,
> > so I surmise they weren't happy with the results either.  Those machines
> > tended to overheat if you were in Windows.
> >
> > > > 4* Why are you calling msleep_interruptible() in
> > > > adt7470_read_temperatures() to wait for the temperature conversions? We
> > > > return -EAGAIN if that happens, but then ignore that error code, and we
> > > > log a cryptic error message. Do I understand correctly that the only
> > > > case where this should happen is when the user unloads the kernel
> > > > driver, in which case we do not care about having been interrupted? I
> > > > can't actually get the error message to be logged when rmmod'ing the
> > > > module so I don't know what it would take to trigger it.
> >
> > Urrk, what a doof who wrote that.  /me smacks 2009-era djwong. :P
> >
> > kthread_stop blocks until the thread exits...
>
> My experiments seem to confirm this.
>
> > but strangely we don't
> > even try to interrupt the msleep_interruptible call.
>
> How would we do that if we wanted to? Later you say this is not
> possible?

You /could/ theoretically send the kthread a signal to interrupt the
sleep, though I can't remember if kthreads are sufficiently special that
signals don't work...

> > That's fine,
> > though device removal will take longer than it needs to.
>
> Yes, up to 2 seconds in my tests. Not pleasant, but also not
> necessarily something to worry about, as rmmod is usually not needed.

...but probably not necessary since nobody's complained about the 2s
yet.

> > We also don't
> > care about the return value of msleep_interruptible at all since one
> > cannot interrupt the kthread.
> >
> > I probably picked interruptible sleep to avoid triggering the hangcheck
> > timer.
>
> I don't understand that part. Is a 2 second uninterruptible sleep in a
> kthread considered bad somehow?

Not really, but the sysadmin can (probably ill-advisedly) set the
hangcheck timer to go off after 1 second.

> > > > 5* Is there any reason why the update thread is being started
> > > > unconditionally?
> > > > As I understand it, it is only needed if at least one
> > > > PWM output is configured in automatic mode, which (I think) is not the
> > > > default. It is odd that the bug reporter hits a problem with the
> >
> > Yes, the driver should only start the kthread loop if someone wants
> > automatic temp control.
>
> OK, I'll give it a try. I don't want to add too much complexity though.

<nod>

--D

> Thanks,
> --
> Jean Delvare
> SUSE L3 Support
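
For what it's worth, the start condition from point 5 could look roughly like the sketch below: the update thread is only wanted when at least one PWM output is in automatic mode. This is plain userspace C for illustration, not the adt7470 driver code; the `pwm_mode` enum and the `update_thread_needed()` helper are hypothetical names, and the caller supplies the channel count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-channel mode; not taken from the actual driver. */
enum pwm_mode {
	PWM_MANUAL = 0,
	PWM_AUTOMATIC = 1,
};

/*
 * Return true if at least one PWM channel is in automatic mode,
 * i.e. if the periodic temperature-refresh thread is needed at all.
 */
static bool update_thread_needed(const enum pwm_mode *modes, size_t count)
{
	size_t i;

	/* A NULL array is only acceptable when there are no channels. */
	assert(modes != NULL || count == 0);

	for (i = 0; i < count; i++) {
		if (modes[i] == PWM_AUTOMATIC)
			return true;
	}
	return false;
}
```

In a driver, a check like this would presumably gate the kthread_run() call at probe time and be re-evaluated whenever a PWM mode changes; that wiring is left out here to keep the sketch small.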