[Hari, can you comment on the last paragraph below regarding the events limit? Would it hurt the runtime much to lose 8 signals out of 4096?]

Hi Oded,

I'm looking into upstreaming some event handling enhancements. I see that you actually worked on increasing the number of events to 4096 and working around some problems with debug events when you were at AMD.

As I'm looking at this, I'm confused by the code that allocates signal pages. It allocates a single signal page that's big enough to accommodate all signals. But allocate_signal_page looks like it was meant to allocate smaller signal pages incrementally that are accumulated in a kfd_process::signal_event_pages list. However, this feature never gets used. It would also break user mode because the Thunk expects to be able to map a single signal page for all its signals.

Therefore I'm inclined to simplify allocate_signal_page and related code to only deal with a single allocation. Do you have any concerns or objections about that?

Somewhat related to that, you also added a workaround that increases the signal number to 4096+512 to accommodate 8 debug events while maintaining page alignment. However, the signal page is allocated with __get_free_pages, which allocates in powers of 2. So in fact it needs to allocate 8192 signals, wasting about 28KB per process.

It's not a lot of waste. But an alternative would be to instead sacrifice 8 user signals. The runtime has a fallback for when it runs out of signals, and the difference between 4096 and 4088 seems insignificant.

What do you think?

Regards,
  Felix

-- 
Felix Kuehling
PMTS Software Development Engineer | Vertical Workstation/Compute
1 Commerce Valley Dr. East, Markham, ON L3T 7X6 Canada
(O) +1(289)695-1597
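
For reference, the "about 28KB" figure in the message can be reproduced with a small user-space sketch. It assumes 8-byte signal slots and 4KB pages, neither of which is stated in the message. The helper names are hypothetical and only mimic the power-of-two rounding that __get_free_pages applies; this is not the driver's code.

```c
#include <stddef.h>

/* Assumptions (not stated in the message): 4KB pages, 8-byte signal slots. */
#define PAGE_SIZE_BYTES ((size_t)4096)
#define SLOT_SIZE_BYTES ((size_t)8)

/* Bytes needed to hold nsignals signal slots. */
static size_t signal_page_bytes(size_t nsignals)
{
    return nsignals * SLOT_SIZE_BYTES;
}

/*
 * Hypothetical helper: size of the smallest power-of-two number of pages
 * that covers `bytes`, mimicking the rounding of an order-based allocator.
 */
static size_t pow2_alloc_bytes(size_t bytes)
{
    size_t pages = (bytes + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;
    size_t alloc_pages = 1;

    while (alloc_pages < pages)
        alloc_pages <<= 1;
    return alloc_pages * PAGE_SIZE_BYTES;
}

/* Bytes allocated but not used by any signal slot. */
static size_t wasted_bytes(size_t nsignals)
{
    size_t needed = signal_page_bytes(nsignals);

    return pow2_alloc_bytes(needed) - needed;
}
```

Under these assumptions, wasted_bytes(4096 + 512) evaluates to 28672 bytes (28KB): 4608 slots need 9 pages, which rounds up to 16 pages, or room for 8192 slots. wasted_bytes(4096) evaluates to 0, which corresponds to the alternative of taking the 8 debug events out of the existing 4096 slots.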