On Thu, 4 Aug 2022, Even Xu wrote: > There is a timing issue captured during ishtp client sending stress tests. > It was observed during stress tests that ISH firmware is getting out of > ordered messages. This is a rare scenario as the current set of ISH client > drivers don't send much data to firmware. But this may not be the case > going forward. > > When message size is bigger than IPC MTU, ishtp splits the message into > fragments and uses serialized async method to send message fragments. > The call stack: > ishtp_cl_send_msg_ipc->ipc_tx_callback(first fregment)-> > ishtp_send_msg(with callback)->write_ipc_to_queue-> > write_ipc_from_queue->callback->ipc_tx_callback(next fregment)...... > > When an ipc write complete interrupt is received, driver also calls > write_ipc_from_queue->ipc_tx_callback in ISR to start sending of next fragment. > > Through ipc_tx_callback uses spin_lock to protect message splitting, as the > serialized sending method will call back to ipc_tx_callback again, so it doesn't > put sending under spin_lock, it causes driver cannot guarantee all fragments > be sent in order. > > Considering this scenario: > ipc_tx_callback just finished a fragment splitting, and not call ishtp_send_msg > yet, there is a write complete interrupt happens, then ISR->write_ipc_from_queue > ->ipc_tx_callback->ishtp_send_msg->write_ipc_to_queue...... > > Because ISR has higher exec priority than normal thread, this causes the new > fragment be sent out before previous fragment. This disordered message causes > invalid message to firmware. I can imagine that this must have been nightmare to debug :) > The solution is, to send fragments synchronously: Use > ishtp_write_message writing fragments into tx queue directly one by one, > instead of ishtp_send_msg only writing one fragment with completion > callback. As no completion callback be used, so change ipc_tx_callback > to ipc_tx_send. Applied, thank you. -- Jiri Kosina SUSE Labs