Fredrik Andersson wrote:
>> To try to emulate, how does it write into the preallocated space; large or
>> small IOs? Sequential streaming? mmap writes? It may not be relevant but
>> would be nice to try to match it as closely as possible.
>
> This is a big file that is written sequentially using stdio buffered
> I/O (with a setvbuf of about 4K) in the drdbmake process. No mmap.
> It is regenerated from an earlier version of the same file, and we
> preallocate a file that is 25% bigger than the
> previous version, to allow for more data than was in the previous file
> and to utilize the extent concept in ext4.

FWIW, you do not need to preallocate to get extents. Preallocation
fundamentally only guarantees that space is available (somewhere), though
in practice it can lead to more contiguous allocation of that space, since
it's all done up front ...

> We then read the previous file sequentially, update some entries here
> and there and
> rewrite it sequentially into the new, fallocated file. There is one
> single instance of random I/O: Once the whole new
> file has been written, we seek back to the start to write a fixed-size
> header. We then ftruncate the file to the proper size.
> No process is concurrently reading from the file that is being
> written. There is however another process, nodeserv,
> that does random reads from the "previous" file (the one we're
> sequentially reading in drdbmake).
> The deadlock is always in the final ftruncate. It does not help to
> close the file and reopen it again before the ftruncate call.

Thanks. If I find time to think enough about the backtraces you sent,
it'll probably be obvious, but the complete description of your workload
is helpful.

Just out of curiosity, have you verified that the deadlock doesn't exist
if you skip the preallocation? I wonder about a fake test where you
simply write a bit extra, and truncate that back.

-Eric

> /Fredrik
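
For anyone wanting to try the "write a bit extra, and truncate that back"
test, here is a minimal sketch in C of the workload as described in the
thread. It is not the drdbmake code: the function name
write_then_truncate, the HEADER_SIZE value, the chunk size and the use of
posix_fallocate() for the preallocated variant are all assumptions made
for illustration. Only the 4K stdio buffer, the sequential writes, the
seek back to write a fixed-size header and the final ftruncate come from
the description.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define HEADER_SIZE 512   /* hypothetical fixed header size */
#define STDIO_BUF   4096  /* "a setvbuf of about 4K" */

/*
 * Write a file sequentially through stdio, rewrite the header at offset 0,
 * then ftruncate to payload_len.
 *   prealloc != 0: preallocate payload_len + extra_len, write payload_len.
 *   prealloc == 0: no preallocation; write payload_len + extra_len instead.
 * Returns the final file size, or -1 on error.
 */
long long write_then_truncate(const char *path, off_t payload_len,
                              off_t extra_len, int prealloc)
{
    char chunk[1024];
    char header[HEADER_SIZE];
    struct stat st;
    off_t target = prealloc ? payload_len : payload_len + extra_len;
    off_t written = 0;

    assert(payload_len >= HEADER_SIZE && extra_len >= 0);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Assumption: the real code may use a different preallocation call. */
    if (prealloc && posix_fallocate(fd, 0, payload_len + extra_len) != 0) {
        close(fd);
        return -1;
    }

    FILE *fp = fdopen(fd, "w");
    if (fp == NULL) {
        close(fd);
        return -1;
    }
    /* Fully buffered, 4K buffer allocated by the C library. */
    if (setvbuf(fp, NULL, _IOFBF, STDIO_BUF) != 0)
        goto fail;

    /* Sequential streaming write of the body. */
    memset(chunk, 'x', sizeof chunk);
    while (written < target) {
        size_t n = sizeof chunk;
        if ((off_t)n > target - written)
            n = (size_t)(target - written);
        if (fwrite(chunk, 1, n, fp) != n)
            goto fail;
        written += (off_t)n;
    }

    /* The single random I/O: seek back and write the fixed-size header. */
    memset(header, 'H', sizeof header);
    if (fseeko(fp, 0, SEEK_SET) != 0 ||
        fwrite(header, 1, sizeof header, fp) != sizeof header ||
        fflush(fp) != 0)
        goto fail;

    /* The step where the reported deadlock occurs. */
    if (ftruncate(fd, payload_len) != 0 || fstat(fd, &st) != 0)
        goto fail;

    fclose(fp);  /* also closes fd */
    return (long long)st.st_size;

fail:
    fclose(fp);
    return -1;
}
```

Calling it once with prealloc set to 0 and once with prealloc set to 1, on
the same ext4 filesystem and with the same sizes, gives the two cases to
compare. A sketch like this does not include the concurrent random reader
(nodeserv) on the previous file, so it may well not reproduce the hang on
its own.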