Alexey Starikovskiy wrote:
> Alan,

Thanks for your robust response. Sorry for not discussing this more in
the first place. I wanted to write the code that fixes my system first,
and post it as soon as possible. I think I missed a flag to say "this
post is not an immediate request for submission".

> I don't think this is a very good idea -- now every interrupt will
> add a new entry to the workqueue, and given the rate of this
> interrupt, it could become quite long.

I confess I gave up too quickly on the synchronisation issues and
resorted to brute force.

Patch #1 prevents a narrow race:

 - acpi_ec_write_cmd() submits a QUERY command
 - the EC writes 0 (no event) to the input buffer
 - a new event arrives, triggering a GPE interrupt which is ignored
   because QUERY_PENDING is still set
 - QUERY_PENDING is cleared (but too late)

We have now ignored an event, and handling it is delayed until the
next GPE interrupt. (The first sketch at the end of this mail shows
this timeline in code.)

The original reason for patch #1 was to stop patch #2 driving me
insane. If you apply the second patch on its own, you break
EC_FLAGS_QUERY_PENDING: it gets cleared as soon as the _first_ query
command is submitted. That works, but it means the flag no longer has
a clearly defined meaning. I tried to fix that by only clearing it
once the loop had finished - and then realised I was widening the
pre-existing race described above. As you say, I took the easy way
out and pushed it all into the workqueue, so the workqueue ends up
acting as a buffer of GPE occurrences. I did look at reducing the
memory usage while avoiding the race conditions, but I couldn't find
a reasonable solution.

But I looked at it again now, and I have a better solution:

	...
	/* moved here from acpi_ec_transaction_unlocked() */
	clear_bit(EC_FLAGS_QUERY_PENDING, &ec->flags);

	while (acpi_ec_query(ec, &value) == 0)
		acpi_ec_gpe_run_handler(ec, value);
	}

The PENDING flag then reflects whether or not there's an item on the
workqueue: clearing it before the drain loop means a GPE that fires
mid-loop simply queues the work again. Does that make sense?

> Your second patch is redundant if you add a queue entry for each
> interrupt, and as it does not require any investment in memory, I
> like it more.

It's not quite redundant with the first patch. We still have GPE
polling mode - patch #1 doesn't affect that. In polling mode it is
essential to query all the pending events - otherwise, if they arrive
more frequently than the polling interval, you will inevitably drop
events. (The second sketch at the end of this mail shows the
difference.)

Patch #2 is also required to fix my buggy hardware. My laptop's EC
buffers multiple events, but clears SCI_EVT after every query. This
caused problems in polling mode: with multiple events arriving between
polling intervals, only one got queried each time - and after a while
the EC's buffer overflowed and it broke completely.

> Also, it is very cool to rip out things you don't care for, but that
> essentially reverts a fix done for 9998, so at the least you should
> ask those people whether you broke their setups.

I assume you're referring to the "would benefit from wider testing"
patch #3. Thanks for identifying the bugzilla entry - I had difficulty
separating the different GPE-related entries.

I'm optimistic that we can fix all these crazy buffering ECs without
having to disable GPE interrupts. The reason I like my proposed fix is
that it makes the code simple enough that I can understand it, without
making any assumptions about how many interrupts arrive per GPE. The
EC can be broken in lots of ways, so long as:

 1. We receive interrupts while one or more GPEs are pending.
 2. We don't get a constant interrupt storm.

I don't think I missed anything. Is there anything else I should check
before I ask for wider testing?
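
To make the race concrete, here is how I picture the two paths. This
is a sketch, not the actual drivers/acpi/ec.c code: the handler shape
is simplified, ec_query_transaction() is hypothetical shorthand for
"write the QUERY command and read back the event byte", and I'm
assuming the query work item lives in ec->work as in my patches.

	/* Interrupt side: only schedule a query if none is pending. */
	static u32 ec_gpe_handler_sketch(void *data)
	{
		struct acpi_ec *ec = data;

		if ((acpi_ec_read_status(ec) & ACPI_EC_FLAG_SCI) &&
		    !test_and_set_bit(EC_FLAGS_QUERY_PENDING, &ec->flags))
			schedule_work(&ec->work);

		return ACPI_INTERRUPT_HANDLED;
	}

	/* Query side, with the pre-patch ordering. */
	static int ec_query_sketch(struct acpi_ec *ec, u8 *event)
	{
		/* The EC answers the QUERY -- say with 0, "no event". */
		int ret = ec_query_transaction(ec, event);

		/*
		 * RACE WINDOW: an event arriving here raises a GPE, but
		 * the handler above sees QUERY_PENDING still set and
		 * schedules nothing.
		 */
		clear_bit(EC_FLAGS_QUERY_PENDING, &ec->flags); /* too late */

		/* The new event is stranded until the next interrupt. */
		return ret;
	}

With the rework quoted earlier, the flag is instead cleared before the
drain loop starts, so an event arriving in that window just queues the
work item again.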
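Second, the polling-mode problem on my EC. Again a sketch under stated
assumptions: ec_poll_sketch() is illustrative only, and acpi_ec_query()
and acpi_ec_gpe_run_handler() are used as in the fragment above (a
return of 0 from acpi_ec_query() means an event byte was read).

	static void ec_poll_sketch(struct acpi_ec *ec)
	{
		u8 value;

		/*
		 * Querying once per poll is not enough on my hardware:
		 * the EC buffers several events but drops SCI_EVT after
		 * each QUERY, so SCI_EVT can't be used as the loop
		 * condition, and the EC-internal buffer slowly fills up
		 * and eventually overflows.  Instead, drain until QUERY
		 * reports no more events:
		 */
		while (acpi_ec_query(ec, &value) == 0)
			acpi_ec_gpe_run_handler(ec, value);
	}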
Thanks
Alan