On Tue, Jun 11, 2024 at 07:19:10AM +0200, Christoph Hellwig wrote:
blkfront always had a robust negotiation protocol for detecting a write cache. Stop simply disabling cache flushes when they fail as that is a grave error.
It's my understanding the current code attempts to cover up for the lack of guarantees the feature itself provides: * feature-barrier * Values: 0/1 (boolean) * Default Value: 0 * * A value of "1" indicates that the backend can process requests * containing the BLKIF_OP_WRITE_BARRIER request opcode. Requests * of this type may still be returned at any time with the * BLKIF_RSP_EOPNOTSUPP result code. * * feature-flush-cache * Values: 0/1 (boolean) * Default Value: 0 * * A value of "1" indicates that the backend can process requests * containing the BLKIF_OP_FLUSH_DISKCACHE request opcode. Requests * of this type may still be returned at any time with the * BLKIF_RSP_EOPNOTSUPP result code. So even when the feature is exposed, the backend might return EOPNOTSUPP for the flush/barrier operations. Such failure is tied on whether the underlying blkback storage supports REQ_OP_WRITE with REQ_PREFLUSH operation. blkback will expose "feature-barrier" and/or "feature-flush-cache" without knowing whether the underlying backend supports those operations, hence the weird fallback in blkfront. I'm unsure whether lack of REQ_PREFLUSH support is not something that we should worry about, it seems like it was when the code was introduced, but that's > 10y ago. Overall blkback should ensure that REQ_PREFLUSH is supported before exposing "feature-barrier" or "feature-flush-cache", as then the exposed features would really match what the underlying backend supports (rather than the commands blkback knows about). Thanks, Roger.