On Jul 16, 2014, at 11:16 AM, Mason <mpeg.blue@xxxxxxx> wrote:
> (I hope you'll forgive me for reformatting the quote characters
> to my taste.)

Thank you.

> On 16/07/2014 17:16, John Stoffel wrote:
>> Mason wrote:
>>> I'm using Linux (3.1.10 at the moment) on a embedded system
>>> similar in spec to a desktop PC from 15 years ago (256 MB RAM,
>>> 800-MHz CPU, USB).
>>
>> Sounds like a Raspberry Pi... And have you investigated using
>> something like XFS as your filesystem instead?
>
> The system is a set-top box (DVB-S2 receiver). The system CPU is
> MIPS 74K, not ARM (not that it matters, in this case).
>
> No, I have not investigated other file systems (yet).
>
>>> I need to be able to create large files (50-1000 GB) "as fast
>>> as possible". These files are created on an external hard disk
>>> drive, connected over Hi-Speed USB (typical throughput 30 MB/s).
>>
>> Really... so you just need to create allocations of space as quickly
>> as possible,
>
> I may not have been clear. The creation needs to be fast (in UX terms,
> so less than 5-10 seconds), but it only occurs a few times during the
> lifetime of the system.
>
>> which will then be filled in later with actual data?
>
> Yes. In fact, I use the loopback device to format the file as an
> ext4 partition.
>
> The use case is
> - allocate a large file
> - stick a file system on it
> - store stuff (typically video files) inside this "private" FS
> - when the user decides he doesn't need it anymore, unmount and unlink
> (I also have a resize operation in there, but I wanted to get the
> basics before taking the hard stuff head on.)
>
> So, in the limit, we don't store anything at all: just create and
> immediately delete. This was my test.

I would agree that LVM is the real solution that you want to use.
It is specifically designed for this, and has much less overhead than
a filesystem on a loopback device on a file on another filesystem.
The amount of space overhead is tuneable, but typically the volumes are
allocated in multiples of 4MB chunks.

That said, I think you've found some kind of strange performance problem,
and it is worthwhile to figure this out.

>>> /tmp # time ./foo /mnt/hdd/xxx 5
>>> posix_fallocate(fd, 0, size_in_GiB << 30): 0 [68 ms]
>>> unlink(filename): 0 [0 ms]
>>> 0.00user 1.86system 0:01.92elapsed 97%CPU (0avgtext+0avgdata 528maxresident)k
>>> 0inputs+0outputs (0major+168minor)pagefaults 0swaps
>>>
>>> /tmp # time ./foo /mnt/hdd/xxx 10
>>> posix_fallocate(fd, 0, size_in_GiB << 30): 0 [141 ms]
>>> unlink(filename): 0 [0 ms]
>>> 0.00user 3.71system 0:03.83elapsed 96%CPU (0avgtext+0avgdata 528maxresident)k
>>> 0inputs+0outputs (0major+168minor)pagefaults 0swaps
>>>
>>> /tmp # time ./foo /mnt/hdd/xxx 100
>>> posix_fallocate(fd, 0, size_in_GiB << 30): 0 [1882 ms]
>>> unlink(filename): 0 [0 ms]
>>> 0.00user 37.12system 0:38.93elapsed 95%CPU (0avgtext+0avgdata 528maxresident)k
>>> 0inputs+0outputs (0major+168minor)pagefaults 0swaps
>>>
>>> /tmp # time ./foo /mnt/hdd/xxx 300
>>> posix_fallocate(fd, 0, size_in_GiB << 30): 0 [3883 ms]
>>> unlink(filename): 0 [0 ms]
>>> 0.00user 111.38system 1:55.04elapsed 96%CPU (0avgtext+0avgdata 528maxresident)k
>>> 0inputs+0outputs (0major+168minor)pagefaults 0swaps

Firstly, have you tried using "fallocate()" directly, instead of
posix_fallocate()? It may be (depending on your userspace) that
posix_fallocate() is writing zeroes to the file instead of using
the fallocate() syscall, and the kernel is busy cleaning up all of
the dirty pages when the file is unlinked. You could try using strace
to see what system calls are actually being used.

Secondly, where is the process actually stuck? From your output above,
the unlink() call takes no measurable time before returning, so I don't
see where it is actually stuck. Again, running your test with
"strace -tt -T ./foo /mnt/hdd/xxx 300" will show which syscall is
actually taking so much time to complete.
I don't think it is unlink().

Cheers, Andreas
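
A minimal sketch of the first suggestion above, assuming Linux with glibc.
It is not the original ./foo program: the helper names ms_since() and
time_fallocate_unlink() are made up for illustration. It calls the
Linux-specific fallocate() wrapper directly, so if the filesystem cannot
preallocate, the call should fail (for example with EOPNOTSUPP) rather
than fall back to writing data. It also times close() separately, since
the output quoted above only timed the allocation and the unlink.

```c
#define _GNU_SOURCE            /* exposes fallocate() in <fcntl.h> on glibc */
#define _FILE_OFFSET_BITS 64   /* 64-bit off_t on 32-bit targets */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Milliseconds elapsed since *start on the monotonic clock. */
static long ms_since(const struct timespec *start)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (long)(now.tv_sec - start->tv_sec) * 1000L
         + (now.tv_nsec - start->tv_nsec) / 1000000L;
}

/*
 * Hypothetical helper: create `path`, reserve `size` bytes with
 * fallocate(), then close and unlink the file, printing the time
 * spent in each call. Returns 0 on success, -1 on the first failure.
 */
static int time_fallocate_unlink(const char *path, off_t size)
{
    struct timespec t;
    int rc, saved;
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);

    if (fd < 0) {
        perror("open");
        return -1;
    }

    clock_gettime(CLOCK_MONOTONIC, &t);
    rc = fallocate(fd, 0, 0, size);   /* mode 0: allocate and extend the file */
    saved = errno;                    /* keep errno before printf() can touch it */
    printf("fallocate(fd, 0, 0, size): %d [%ld ms]\n", rc, ms_since(&t));
    if (rc != 0) {
        fprintf(stderr, "fallocate: %s\n", strerror(saved));
        close(fd);
        unlink(path);
        return -1;
    }

    clock_gettime(CLOCK_MONOTONIC, &t);
    rc = close(fd);
    saved = errno;
    printf("close(fd): %d [%ld ms]\n", rc, ms_since(&t));
    if (rc != 0) {
        fprintf(stderr, "close: %s\n", strerror(saved));
        unlink(path);
        return -1;
    }

    clock_gettime(CLOCK_MONOTONIC, &t);
    rc = unlink(path);
    saved = errno;
    printf("unlink(path): %d [%ld ms]\n", rc, ms_since(&t));
    if (rc != 0) {
        fprintf(stderr, "unlink: %s\n", strerror(saved));
        return -1;
    }
    return 0;
}
```

Calling time_fallocate_unlink("/mnt/hdd/xxx", (off_t)300 << 30) from a
small main() would mirror the 300 GiB run quoted above. Running that
binary under "strace -tt -T" should still be the more direct way to see
where the time goes, including any time spent after the last timed call.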