Hi Martin,

I totally agree that it would be better to have a common block layer
infrastructure to handle such discard misfit cases.

For now, though, I do not have a good idea of how to aggregate small
discard chunks (< discard_granularity) in the block layer and later
issue a single large one (== discard_granularity) to the underlying
block device in a generic and persistent fashion.

To me, the current handling of discards by the block layer
[blk_stack_limits() + blk_bio_discard_split()] seems to be inconsistent
with the handling of normal (read/write) I/O.

It makes life harder for block driver developers, as they cannot rely
on blk_queue_split() doing its job on discard bios.

Regards,
Florian

On Tue, Aug 2, 2016 at 4:08 AM, Martin K. Petersen
<martin.petersen@xxxxxxxxxx> wrote:
>>>>>> Florian-Ewald Müller <florian-ewald.mueller@xxxxxxxxxxxxxxxx> writes:
>
> Florian-Ewald,
>
>> Now my experiments show that, at least, dm-cache doesn't complain nor
>> rejects those smaller discards than its discard_granularity, but
>> possibly turning them into noop (?).
>
> Correct. Anything smaller than (an aligned) multiple of the discard
> granularity will effectively be ignored.
>
> In practice this means that your device should allocate things in
> aligned units of the underlying device's discard granularity.
>
>> May be that the needed functionality of accumulating small discards to
>> a big one covering its own granularity (similar to SSDs block erasure)
>> should be done at that driver level.
>
> Do you allocate blocks in a predictable pattern between your nodes?
>
> For MD RAID0, for instance, we issue many small discard requests. But
> for I/Os that are bigger than the stripe width we'll wrap around and do
> merging so that for instance blocks 0, n, 2*n, 3*n, etc. become part of
> the same discard request sent to the device.
>
> If you want discards smaller than the underlying granularity to have an
> effect then I'm afraid the burden is on you to maintain a bitmap of each
> granularity sized region.
> And then issue a deferred discard when all
> blocks inside that region have been discarded by the application or
> filesystem above.
>
> If you want to pursue partial block tracking it would be good to come up
> with a common block layer infrastructure for it. dm-thin could benefit
> from it as well...
>
> --
> Martin K. Petersen      Oracle Linux Engineering

--
Florian-Ewald Mueller
Architecture Board

ProfitBricks GmbH
Greifswalder Str. 207
D - 10405 Berlin

Tel:   +49 30 577 008 331
Fax:   +49 30 577 008 598
Email: florian-ewald.mueller@xxxxxxxxxxxxxxxx
URL:   http://www.profitbricks.de

Registered office: Berlin.
Commercial register: Amtsgericht Charlottenburg, HRB 125506 B.
Managing directors: Andreas Gauger, Achim Weiss.
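
P.S. To make the bitmap idea from the quoted mail a bit more concrete,
here is a rough user-space sketch of per-region tracking with a deferred
discard. It is only an illustration under several assumptions: every
name in it (discard_tracker, tracker_discard(), tracker_write(),
BLOCKS_PER_REGION, NUM_REGIONS) is hypothetical and not an existing
kernel API, the bitmap lives in memory only (so it does not address the
persistence problem), and a region is fixed at 8 blocks so that one
byte can serve as its bitmap.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical geometry: discard_granularity == 8 blocks, 4 regions. */
#define BLOCKS_PER_REGION 8u
#define NUM_REGIONS       4u
#define TOTAL_BLOCKS      (NUM_REGIONS * BLOCKS_PER_REGION)
#define REGION_FULL       0xFFu

/* One bit per block; a set bit means "discarded since the last write". */
struct discard_tracker {
    uint8_t discarded[NUM_REGIONS];
};

void tracker_init(struct discard_tracker *t)
{
    memset(t, 0, sizeof(*t));
}

/*
 * Record a small discard of blocks [start, start + count).
 * Every region that becomes completely discarded by this call is
 * appended to ready[] (the caller provides NUM_REGIONS entries); those
 * are the regions for which one aligned, granularity-sized discard
 * could now be sent down. Returns the number of entries written.
 */
unsigned tracker_discard(struct discard_tracker *t, uint64_t start,
                         uint64_t count, unsigned ready[])
{
    unsigned n = 0;

    assert(start <= TOTAL_BLOCKS && count <= TOTAL_BLOCKS - start);
    for (uint64_t b = start; b < start + count; b++) {
        unsigned region = (unsigned)(b / BLOCKS_PER_REGION);
        uint8_t before = t->discarded[region];

        t->discarded[region] |= (uint8_t)(1u << (b % BLOCKS_PER_REGION));
        /* Report only the transition from "partial" to "full". */
        if (before != REGION_FULL && t->discarded[region] == REGION_FULL)
            ready[n++] = region;
    }
    return n;
}

/* A write makes the block live again, so its region is partial again. */
void tracker_write(struct discard_tracker *t, uint64_t block)
{
    assert(block < TOTAL_BLOCKS);
    t->discarded[block / BLOCKS_PER_REGION] &=
        (uint8_t)~(1u << (block % BLOCKS_PER_REGION));
}
```

With this, discarding blocks 0..3 and then 4..7 reports region 0 only on
the second call, and a later write to any block in that region makes it
partial again. A real driver would additionally have to keep the bitmap
consistent across restarts and serialize it against in-flight writes,
which is exactly the part I have no generic answer for.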