Hi Brad,
Out of morbid curiosity I grabbed an older MacOS AppleSMC.kext (10.7) and ran it through the disassembler.

Every read/write to the SMC starts the same way, with a check to make sure the SMC is in a sane state. If it's not, a read command is sent to try and kick it back into line:

 - Wait for 0x04 to clear. This is 1,000,000 iterations of "read status, check if 0x04 is set, delay 10uS".
 - If it clears, move on. If it doesn't, try and send a read command (just the command 0x10) and wait for the busy flag to clear again with the same loop.

So in theory if the SMC was locked up, it'd be into the weeds for 20 seconds before it pushed the error out.

So, let's say we've waited long enough and the busy flag dropped:

Each command write is:
 - Wait for 0x02 to clear. This is 1,000,000 iterations of "read status, check if 0x02 is set, delay 10uS".
 - Send command.

Each data byte write is:
 - Wait for 0x02 to clear. This is 1,000,000 iterations of "read status, check if 0x02 is set, delay 10uS".
 - Immediate and single status read, check 0x04. If not set, abort.
 - Send data byte.

Each data byte read is:
 - Read status, wait for 0x01 and 0x04 to be set. Delay 10uS and repeat. Abort if fail.

Each timeout is 1,000,000 loops with a 10uS delay.

So aside from the startup set which occurs on *every* read or write set, status checks happen before a command or data write, and not at all after. Under no circumstances are writes of any kind re-tried, but these timeouts are up to 10 seconds!
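For reference, the sequence described above could be sketched in C roughly as follows. This is only an illustration of the description, not the kext's code and not our driver: the function names, the struct smc_io hooks, the fake device and the names given to bits 0x01 and 0x02 are all made up for the sketch. Only the bit values, the 0x10 command, the 1,000,000 loop count and the 10uS delay come from the disassembly notes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Status bits from the description above. Only "busy" (0x04) is named
 * there; the other two names are hypothetical labels for this sketch. */
#define SMC_FLAG_READ_READY 0x01 /* hypothetical name */
#define SMC_FLAG_WRITE_PEND 0x02 /* hypothetical name */
#define SMC_FLAG_BUSY       0x04
#define SMC_CMD_READ        0x10
#define SMC_MAX_POLLS       1000000L
#define SMC_POLL_DELAY_US   10u

/* Hypothetical I/O hooks so the logic can run against a fake device. */
struct smc_io {
	uint8_t (*read_status)(void *ctx);
	uint8_t (*read_data)(void *ctx);
	void (*write_cmd)(void *ctx, uint8_t cmd);
	void (*write_data)(void *ctx, uint8_t byte);
	void (*delay_us)(void *ctx, unsigned us);
	void *ctx;
};

/* Poll until (status & mask) == want; give up after SMC_MAX_POLLS reads. */
static int smc_wait(const struct smc_io *io, uint8_t mask, uint8_t want)
{
	for (long i = 0; i < SMC_MAX_POLLS; i++) {
		if ((io->read_status(io->ctx) & mask) == want)
			return 0;
		io->delay_us(io->ctx, SMC_POLL_DELAY_US);
	}
	return -1;
}

/* Startup check: wait for busy to clear, else kick with 0x10 and wait again. */
static int smc_make_sane(const struct smc_io *io)
{
	if (smc_wait(io, SMC_FLAG_BUSY, 0) == 0)
		return 0;
	io->write_cmd(io->ctx, SMC_CMD_READ);
	return smc_wait(io, SMC_FLAG_BUSY, 0);
}

/* Command write: wait for 0x02 to clear, then send. No check afterwards. */
static int smc_send_command(const struct smc_io *io, uint8_t cmd)
{
	if (smc_wait(io, SMC_FLAG_WRITE_PEND, 0))
		return -1;
	io->write_cmd(io->ctx, cmd);
	return 0;
}

/* Data write: wait for 0x02 to clear, one status read for 0x04, then send. */
static int smc_send_byte(const struct smc_io *io, uint8_t byte)
{
	if (smc_wait(io, SMC_FLAG_WRITE_PEND, 0))
		return -1;
	if (!(io->read_status(io->ctx) & SMC_FLAG_BUSY))
		return -1; /* 0x04 not set: abort, no retry */
	io->write_data(io->ctx, byte);
	return 0;
}

/* Data read: wait until both 0x01 and 0x04 are set, then read the byte. */
static int smc_recv_byte(const struct smc_io *io, uint8_t *out)
{
	const uint8_t both = SMC_FLAG_READ_READY | SMC_FLAG_BUSY;

	if (smc_wait(io, both, both))
		return -1;
	*out = io->read_data(io->ctx);
	return 0;
}

/* Fake SMC (entirely hypothetical) for running the sketch without hardware. */
struct fake_smc {
	uint8_t status;               /* returned by every status read */
	uint8_t data;                 /* data register contents */
	uint8_t last_cmd;             /* last command byte written */
	bool read_cmd_unsticks;       /* does command 0x10 clear busy? */
	long polls;                   /* number of status reads so far */
	unsigned long long waited_us; /* total simulated delay */
};

static uint8_t fake_status(void *c) { struct fake_smc *f = c; f->polls++; return f->status; }
static uint8_t fake_rdata(void *c) { return ((struct fake_smc *)c)->data; }
static void fake_wdata(void *c, uint8_t b) { ((struct fake_smc *)c)->data = b; }
static void fake_delay(void *c, unsigned us) { ((struct fake_smc *)c)->waited_us += us; }
static void fake_wcmd(void *c, uint8_t cmd)
{
	struct fake_smc *f = c;

	f->last_cmd = cmd;
	if (cmd == SMC_CMD_READ && f->read_cmd_unsticks)
		f->status &= (uint8_t)~SMC_FLAG_BUSY;
}

static struct smc_io fake_io(struct fake_smc *f)
{
	struct smc_io io = { fake_status, fake_rdata, fake_wcmd, fake_wdata, fake_delay, f };
	return io;
}
```

With the fake's busy bit stuck, smc_make_sane() accumulates 2 x 1,000,000 x 10uS of simulated delay before failing, which is where the 20 second figure comes from. Note that nothing in the sketch retries a write.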
Great findings here. But from this, it would seem we are doing almost the right thing already, no? The essential difference seems to be that the kext does a read to wake up the SMC, while we retry the first command until it works. It would of course be very interesting to know if that makes a difference.
That would indicate that the requirement for retries on the early Mac means we're not waiting long enough somewhere. Not that I'm suggesting we do another re-work, but when I get back in front of my iMac which does 17 transactions per second with this driver, I might re-work it similar to the Apple driver and see what happens. Oh, and it looks like the 0x40 flag that is on mine is the "I have an interrupt pending" flag, and the result should be able to be read from 0x31F. I'll play with that when I get time. That probably explains why IRQ9 screams until the kernel gags it on this machine as it's not being given any love.
Sounds good, getting interrupts working would have been nice.

Henrik