Hi

>>>>> Alex Tomas (AT) writes:

>>> it depends on underlying storage and workload. mballoc uses buddy
>>> internally. it's much simpler and cheaper to find free 2^N blocks
>>> compared to bitmap.

AM> So mballoc's application is to save CPU cycles?

AFAIU, we don't implement complex scanning for a given size in balloc.c
because a bitmap isn't a very comfortable structure for this and it would
require many cycles. with mballoc it becomes possible. for example, to
find a 1MB free chunk one has to choose a group (mballoc tracks the number
of free chunks in every buddy) and then scan just a few bits. thus we can
produce a better layout and improve performance.

>>> this is especially important for arrays like
>>> DDN and raid5/6 because they require stripe-aligned/-sized requests
>>> for good throughput.

AM> Does this not imply that there needs to be new linkage between the
AM> filesystem and the lower layers? So that raid/etc can inform the
AM> filesystem driver about its alignment and striping requirements?

currently, we pass the preferred I/O size with a mount option (stripe=N).
I'd like that sort of communication between block driver and fs.
something like f_bsize.

>>> also, the latest mballoc takes the logical block into
>>> account and can preallocate a few chunks at different logical offsets
>>> for a file. imagine torrent downloading different pieces from a few peers.

AM> hm. You don't need anything as exotic as bittorrent to show up problems in
AM> that area:

AM> box:/usr/src/25> sudo bmap vmlinux | wc -l
AM> 1152

well, this can be (and will be, I very much hope :) solved by delayed
allocation. I mentioned torrent because it's often used to get really
large files. so large that they don't fit in cache, and delayed allocation
won't help much. preallocation can help, but then several preallocations
per file are required.
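to illustrate the "choose a group, then scan just a few bits" point, here is a toy sketch in C. it is not the mballoc code: the names (toy_group, toy_mark_free, toy_find_chunk), the 16-block group size and the one-bitmap-per-order layout are all made up for illustration. the only idea taken from the above is that per-order counters let us skip groups without touching their bitmaps.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_ORDER    4                    /* toy group: 2^4 = 16 blocks */
#define GROUP_BLOCKS (1u << MAX_ORDER)

/* hypothetical per-group buddy data, not the real mballoc structures */
struct toy_group {
    uint32_t free_map[MAX_ORDER + 1];     /* bit i set: chunk i of 2^order blocks is free */
    unsigned counters[MAX_ORDER + 1];     /* number of free chunks of each order */
};

/* record that chunk idx of size 2^order blocks is free */
static void toy_mark_free(struct toy_group *g, unsigned order, unsigned idx)
{
    assert(order <= MAX_ORDER && idx < (GROUP_BLOCKS >> order));
    assert(!(g->free_map[order] & (1u << idx)));
    g->free_map[order] |= 1u << idx;
    g->counters[order]++;
}

/*
 * find a free chunk of at least 2^order blocks.
 * returns 0 and fills *group and *start (first block of the chunk),
 * or -1 if no group has a large enough chunk.
 */
static int toy_find_chunk(const struct toy_group *groups, unsigned ngroups,
                          unsigned order, unsigned *group, unsigned *start)
{
    for (unsigned gi = 0; gi < ngroups; gi++) {
        for (unsigned o = order; o <= MAX_ORDER; o++) {
            if (groups[gi].counters[o] == 0)
                continue;                 /* counter says nothing of this size here */
            unsigned nbits = GROUP_BLOCKS >> o;
            for (unsigned i = 0; i < nbits; i++) {   /* only a few bits to scan */
                if (groups[gi].free_map[o] & (1u << i)) {
                    *group = gi;
                    *start = i << o;      /* chunk index to block number */
                    return 0;
                }
            }
        }
    }
    return -1;
}
```

with one free single block in group 0 and one free 4-block chunk (index 1) in group 1, asking for order 2 skips group 0 on the counters alone and returns group 1, start block 4. a plain block bitmap would need a scan for a run of 4 free, aligned bits in every group instead.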
thanks, Alex