Hi,

Here's another issue that we are grappling with: we have a root cause, but don't currently have a good fix.

BLKSECDISCARD is an operation used for securely destroying a subset of the data on a device. Unfortunately, on SSDs, this is an operation with variable performance. It can be O(minutes) in the worst case. The pathological case is when many erase blocks on the flash contain a small amount of data that is part of the discard and a large amount of data that isn't. In such cases, the erase blocks have to be copied almost in their entirety to fresh blocks in order to erase the sectors to be discarded. This can be thought of as a defragmentation operation on the drive, and can be expected to cost in the same ballpark as rewriting most of the contents of the drive.

Therefore, it is possible for the thread waiting in the IOCTL, in the submit_bio_wait() call in blkdev_issue_discard(), to wait for several minutes. The hung task watchdog is usually configured for 2 minutes, and this can expire before the operation finishes.

This operation is very important to the security model of Chrome OS devices. Therefore, we would like the kernel to survive this even if it takes several minutes.

Three approaches come to mind:

One approach is to somehow avoid waiting for a single monolithic operation and instead wait on bits and pieces of the operation. These can be sized to finish within a reasonable timeframe. The exact size is likely device-specific. We already split these operations before issuing them to the device, but the IOCTL thread waits for the whole rather than the parts. The hung task watchdog only sees the total amount of time the thread slept, not the forward progress taking place quietly.

Another approach might be to do something in the spirit of the write system call: complete a partial operation (whatever the kernel thinks is reasonable), adjust the IOCTL argument, and have userspace reissue the syscall to continue the operation.
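
To make the second approach a bit more concrete, here is a rough userspace model of the retry loop. It is only a sketch: partial_secdiscard() stands in for the hypothetical new IOCTL and does no I/O, CHUNK_BYTES is an arbitrary number picked for illustration, and struct discard_range just mirrors the start/length pair that the existing IOCTL takes.

```c
/* Sketch only: models the proposed partial-discard semantics in
 * userspace. Nothing here is an existing kernel interface. */
#include <assert.h>
#include <stdint.h>

/* Mirrors the {start, length} pair passed to the discard IOCTL. */
struct discard_range {
	uint64_t start;	/* byte offset of the next region to discard */
	uint64_t len;	/* bytes still to be discarded */
};

/* Hypothetical per-call budget; a real value would be device-specific. */
#define CHUNK_BYTES (8ULL << 20)

/*
 * Stand-in for the hypothetical IOCTL: handle at most CHUNK_BYTES, then
 * advance the range so the caller can simply reissue the call.
 * Returns the number of bytes handled in this call.
 */
static uint64_t partial_secdiscard(struct discard_range *r)
{
	uint64_t done = r->len < CHUNK_BYTES ? r->len : CHUNK_BYTES;

	/* The actual discard of [start, start + done) would happen here. */
	r->start += done;
	r->len -= done;
	return done;
}

/* Userspace side: reissue until nothing is left; returns the call count. */
static unsigned int secdiscard_all(struct discard_range *r)
{
	unsigned int calls = 0;

	while (r->len > 0) {
		uint64_t done = partial_secdiscard(r);

		assert(done > 0);	/* every call must make progress */
		calls++;
	}
	return calls;
}
```

With these made-up numbers, a 20 MiB range would be covered in three calls, and no single call sleeps for longer than one chunk takes.
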
The second option should probably be done with a different IOCTL name to avoid breaking userspace.

A third approach, which is perhaps more adventurous, is to have a notion of forward progress that a thread can export and the hung task watchdog can evaluate. This can take the form of a function pointer and an argument. The result of the function is a monotonically decreasing unsigned value. When this value stops changing, we can conclude that the thread is hung. This can be used in place of the context switch count for tasks where this function is available. It could potentially solve other similar issues, where there is a way to tell whether there is forward progress, but it is not as straightforward as the context switch count.

What are your thoughts?

Thanks in advance,
Salman
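
P.S. In case it helps the discussion, here is a rough userspace model of the third approach. None of these names exist in the kernel: struct progress_probe, probe_init(), watchdog_check() and the bios_outstanding() example are all made up for illustration, and the real hook would presumably live wherever the hung task watchdog currently compares context switch counts.

```c
/* Sketch only: a userspace model of the idea, not kernel code.
 * All names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* What a sleeping thread would export: a function pointer and an argument. */
struct progress_probe {
	uint64_t (*remaining)(void *arg);	/* must never increase */
	void *arg;
	uint64_t last_seen;			/* value at the previous check */
};

/* Example probe for a discard: pieces of the operation still pending. */
static uint64_t bios_outstanding(void *arg)
{
	return *(uint64_t *)arg;
}

static void probe_init(struct progress_probe *p,
		       uint64_t (*fn)(void *), void *arg)
{
	assert(fn != NULL);
	p->remaining = fn;
	p->arg = arg;
	p->last_seen = fn(arg);
}

/*
 * Called once per watchdog period. Returns true if the task looks hung,
 * i.e. the exported value has not changed since the previous check.
 */
static bool watchdog_check(struct progress_probe *p)
{
	uint64_t now = p->remaining(p->arg);
	bool hung = (now == p->last_seen);

	p->last_seen = now;
	return hung;
}
```

The watchdog only needs to remember one value per task between checks, which is the same amount of state it would keep for the context switch count.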