From: Jiri Olsa > Sent: 14 January 2025 14:03 > > hi, > while checking on similar code for uprobes I was wondering if we > can merge first 2 steps of instruction update in text_poke_bp_batch > function. > > Basically the first step now would be to write int3 byte together > with the rest of the bytes of the new instruction instead of doing > that separately. And the second step would be to overwrite int3 > byte with first byte of the new instruction. > > Would that work or do I miss some x86 detail that could lead to crash? I suspect it will 'crash and burn'. Consider what happens if there is a cache-line boundary in the middle of an instruction. (Actually an instruction fetch boundary will do.) cpu0: reads the old instructions from the old cache line. cpu0: pipeline busy (or similar) so doesn't read the next cache line. cpu1: writes the new instructions. cpu0: reads the second cache line. cpu0 now has a mix of the old and new instruction bytes. Writing the int3 is safe - provided they don't return until all the patching is over. But between writing the int3 (over the first opcode byte) and updating anything else I suspect you need something that does a complete synchronise between the cpu that discards any bytes in the decode pipeline as well as flushing the I-cache (etc). I suspect that requires an acked IPI. Very long cpu stalls are easy to generate. Any read from PCIe will be slow (I've at fpga target that takes ~1us). You'd need to be unlucky to be patching an instruction while one was pending, but a DMA access might just be enough to cause grief. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)