Kirill,

>> Correct. We shouldn't go down this path unless a device is thinly
>> provisioned (i.e. max_discard_sectors > 0).
>
> (I assumed it is a typo, and you mean max_allocate_sectors like bellow).

No, this was in the context of not having an explicit queue limit for
allocation. If a device does not have max_discard_sectors > 0 then it is
not thinly provisioned and therefore attempting allocation makes no
sense.

>> I don't like "write_zeroes_can_allocate" because that makes assumptions
>> about WRITE ZEROES being the command of choice. I suggest we call it
>> "max_allocate_sectors" to mirror "max_discard_sectors". I.e. put
>> emphasis on the semantic operation and not the plumbing.
>
> Hm. Do you mean "bool max_allocate_sectors" or "unsigned int
> max_allocate_sectors"?

unsigned int. At least for SCSI we could have a device which would use
UNMAP for discards and WRITE SAME for allocates. And therefore the range
limit could be different for the two operations. Sadly.

I have a patch in the pipeline which deals with some problems in this
department because some devices have a split brain wrt. their discard
limits.

> In the second case we should make all the
> q->limits.max_write_zeroes_sectors dereferencing as switches like the
> below (this is a partial patch and only several of places are
> converted to switches as examples):

Something like that, yes. This is getting a bit messy :(

However, I am not sure that scattering REQ_OP_ALLOCATE all over the I/O
stack is particularly attractive either. Both REQ_OP_DISCARD and
REQ_OP_WRITE_SAME come with some storage protocol baggage that forces us
to have special handling all over the stack. But REQ_OP_WRITE_ZEROES is
fairly clean and simple and, except for the potentially different block
count limit, an allocate operation would be a carbon copy of the
plumbing for write zeroes. A lot of duplication.
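
To make the "different block count limit" point concrete, here is a rough
userspace sketch of what a per-operation limit lookup might look like. This
is not kernel code and not a proposed patch: the sketch_* types and names are
made up for illustration, and both the allocate operation and the
max_allocate_sectors field are the hypothetical additions under discussion.

```c
/* Userspace sketch only: these types mimic, but are not, the kernel's. */
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the request operations discussed in this thread. */
enum sketch_req_op {
	SKETCH_OP_DISCARD,
	SKETCH_OP_WRITE_ZEROES,
	SKETCH_OP_ALLOCATE,	/* hypothetical, not an existing operation */
};

/* Stand-in for the relevant part of the queue limits. */
struct sketch_limits {
	unsigned int max_discard_sectors;
	unsigned int max_write_zeroes_sectors;
	unsigned int max_allocate_sectors;	/* proposed, hypothetical */
};

/* Return the range limit for one operation; 0 means "not supported". */
static unsigned int sketch_max_sectors(const struct sketch_limits *lim,
				       enum sketch_req_op op)
{
	assert(lim != NULL);

	switch (op) {
	case SKETCH_OP_DISCARD:
		return lim->max_discard_sectors;
	case SKETCH_OP_WRITE_ZEROES:
		return lim->max_write_zeroes_sectors;
	case SKETCH_OP_ALLOCATE:
		return lim->max_allocate_sectors;
	}
	return 0;
}
```

Apart from the one extra case in the switch, the allocate path here is
identical to the write zeroes path, which is the duplication concern.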

So even though I'm increasingly torn on whether introducing separate
REQ_OP_ALLOCATE plumbing throughout the stack or having a REQ_ALLOCATE
flag for REQ_OP_WRITE_ZEROES is best, I still think I'm leaning towards
the latter. That will also make it easier for me in the SCSI disk
driver.

-- 
Martin K. Petersen	Oracle Linux Engineering