correcting Christoph's email address - no other edits/comments On Wed, Apr 21, 2010 at 4:44 PM, Greg Freemyer <greg.freemyer@xxxxxxxxx> wrote: > On Wed, Apr 21, 2010 at 3:22 PM, Jeff Moyer <jmoyer@xxxxxxxxxx> wrote: >> Ric Wheeler <rwheeler@xxxxxxxxxx> writes: >> >>> On 04/21/2010 02:59 PM, Greg Freemyer wrote: >>>> On Tue, Apr 20, 2010 at 10:45 PM, Eric Sandeen<sandeen@xxxxxxxxxx> wrote: >>>>> Mark Lord wrote: >>>>>> On 20/04/10 05:21 PM, Greg Freemyer wrote: >>>>>>> Mark, >>>>>>> >>>>>>> This is the patch implementing the new discard logic. >>>>>> .. >>>>>>> Signed-off-by: Lukas Czerner<lczerner@xxxxxxxxxx> >>>>>> .. >>>>>>>> +void ext4_trim_extent(struct super_block *sb, int start, int count, >>>>>>>> + ext4_group_t group, struct ext4_buddy *e4b) >>>>>>>> +{ >>>>>>>> + ext4_fsblk_t discard_block; >>>>>>>> + struct ext4_super_block *es = EXT4_SB(sb)->s_es; >>>>>>>> + struct ext4_free_extent ex; >>>>>>>> + >>>>>>>> + assert_spin_locked(ext4_group_lock_ptr(sb, group)); >>>>>>>> + >>>>>>>> + ex.fe_start = start; >>>>>>>> + ex.fe_group = group; >>>>>>>> + ex.fe_len = count; >>>>>>>> + >>>>>>>> + mb_mark_used(e4b,&ex); >>>>>>>> + ext4_unlock_group(sb, group); >>>>>>>> + >>>>>>>> + discard_block = (ext4_fsblk_t)group * >>>>>>>> + EXT4_BLOCKS_PER_GROUP(sb) >>>>>>>> + + start >>>>>>>> + + le32_to_cpu(es->s_first_data_block); >>>>>>>> + trace_ext4_discard_blocks(sb, >>>>>>>> + (unsigned long long)discard_block, >>>>>>>> + count); >>>>>>>> + sb_issue_discard(sb, discard_block, count); >>>>>>>> + >>>>>>>> + ext4_lock_group(sb, group); >>>>>>>> + mb_free_blocks(NULL, e4b, start, ex.fe_len); >>>>>>>> +} >>>>>>> >>>>>>> Mark, unless I'm missing something, sb_issue_discard() above is going >>>>>>> to trigger a trim command for just the one range. I thought the >>>>>>> benchmarks you did showed that a collection of ranges needed to be >>>>>>> built, then a single trim command invoked that trimmed that group of >>>>>>> ranges. >>>>>> .. >>>>>> >>>>>> Mmm.. If that's what it is doing, then this patch set would be a >>>>>> complete disaster. >>>>>> It would take *hours* to do the initial TRIM. >> >> Except it doesn't. Lukas did provide numbers in his original email. >> > > Looking at the benchmarks (for the first time) at > http://people.redhat.com/jmoyer/discard/ext4_batched_discard/ > > I don't see anything that says how long the proposed trim ioctl takes > to complete on the full filesystem. > > What they do show is that with the 3 test SSDs used for this > benchmark, the current released discard implementation is a net loss. > ie. You are better off running without the discards for all 3 vendors. > (at least under the conditions tested.) > > After the patch is applied and optimizing the discards to large free > extents only, it works out to same performance with or without the > discards. ie. no net gain or loss. > > That is extremely cool because one assumes that the non-discard case > would degrade over time, but that the discard case will not. > > So that argues for the current proposed patch going in. > > But quoting from the first email: > > == > The basic idea behind my discard support is to create an ioctl which > walks through all the free extents in each allocating group and discard > those extents. As an addition to improve its performance one can specify > minimum free extent length, so ioctl will not bother with shorter extents. > > This of course means, that with each invocation the ioctl must walk > through whole file system, checking and discarding free extents, which > is not very efficient. The best way to avoid this is to keep track of > deleted (freed) blocks. Then the ioctl have to trim just those free > extents which were recently freed. > > In order to implement this I have added new bitmap into ext4_group_info > (bb_bitmap_deleted) which stores recently freed blocks. The ioctl then > walk through bb_bitmap_deleted, compare deleted extents with free > extents trim them and then removes it from the bb_bitmap_deleted. > > But you may notice, that there is one problem. bb_bitmap_deleted does > not survive umount. To bypass the problem the first ioctl call have to > walk through whole file system trimming all free extents. But there is a > better solution to this problem. The bb_bitmap_deleted can be stored on > disk an can be restored in mount time along with other bitmaps, but I > think it is a quite big change and should be discussed further. > == > > The above seems to argue against the patch going in until the > mount/umount issues are addressed. > > So in addition to this patch, Lukas is proposing a on disk change to > address the fact that calling trim upteen times at mount time is too > slow. > > Per Mark's testing of last summer, an alternative solution is to use a > vectored trim approach that is far more efficient. > > Mark's benchmarks showed this as doable in seconds which seems like a > reasonable amount of time for a mount time operation. > > Greg > -- Greg Freemyer Head of EDD Tape Extraction and Processing team Litigation Triage Solutions Specialist http://www.linkedin.com/in/gregfreemyer CNN/TruTV Aired Forensic Imaging Demo - http://insession.blogs.cnn.com/2010/03/23/how-computer-evidence-gets-retrieved/ The Norcross Group The Intersection of Evidence & Technology http://www.norcrossgroup.com -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html