Chris Mason <chris.mason@xxxxxxxxxxxx> writes:

> Unfortunately, it's hard to say.  I think the fusionio cards are the
> only shipping devices that support this, but I've definitely heard that
> others plan to support it as well.  mariadb/percona already support the
> atomics via fusionio specific ioctls, and turning that into a real
> O_ATOMIC is a priority so other hardware can just hop on the train.
>
> This feature in general is pretty natural for the log structured squirrels
> they stuff inside flash, so I'd expect everyone to support it.  Matthew,
> how do you feel about all of this?
>
> With the fusionio drivers, we've recently increased the max atomic size.
> It's basically 1MB, disjoint or contig doesn't matter.  We're powercut
> safe at 1MB.
>
>>
>> Basically, I'd like to avoid requiring a trial and error programming
>> model to determine what an application can expect to work (like we have
>> with O_DIRECT right now).
>
> I'm really interested in ideas on how to provide that.  But, with dm,
> md, and a healthy assortment of flash vendors, I don't know how...

Well, we have control over dm and md, so I'm not worried about that.  For
the storage vendors, we'll have to see about influencing the standards
bodies.

The way I see it, there are 3 pieces of information that are required:
1) minimum size that is atomic (likely the physical block size, but
   maybe the logical block size?)
2) maximum size that is atomic (multiple of minimum size)
3) whether or not discontiguous ranges are supported

Did I miss anything?

> I've attached my current test program.  The basic idea is to fill
> buffers (1MB in size) with a random pattern.  Each buffer has a
> different random pattern.
>
> You let it run for a while and then pull the plug.  After the box comes
> back up, run the program again and it looks for consistent patterns
> filling each 1MB aligned region in the file.
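
For readers without the attachment, the check described above can be
sketched roughly as follows.  This is not the attached program, only a
minimal illustration of the idea.  It assumes each 1MB buffer is filled
by repeating a single random 8-byte stamp, so a region is "consistent"
when every stamp in it matches the first.  The names make_buffer,
find_torn_regions, REGION and STAMP are made up for this sketch, and
the actual write path (O_DIRECT, many parallel writers, pulling the
plug) is left out.

```python
import os

REGION = 1024 * 1024  # 1MB, the write size discussed in the thread
STAMP = 8             # bytes per repeated stamp (an assumption of this sketch)


def make_buffer(region=REGION):
    """Return one buffer filled with a single random stamp, repeated."""
    stamp = os.urandom(STAMP)
    return stamp * (region // STAMP)


def find_torn_regions(data, region=REGION):
    """Return the indices of aligned regions holding more than one stamp.

    A trailing partial region (len(data) not a multiple of region) is
    ignored rather than reported.
    """
    torn = []
    for idx in range(len(data) // region):
        chunk = data[idx * region:(idx + 1) * region]
        stamp = chunk[:STAMP]
        # A region written atomically is one stamp repeated end to end.
        if chunk != stamp * (region // STAMP):
            torn.append(idx)
    return torn
```

After the box comes back up, one would read the test file back and pass
its contents to find_torn_regions(); a non-empty result means some 1MB
aligned region holds a mix of two writes.
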
[snip]
> In order to reliably find torn blocks without O_ATOMIC, I had to bump
> the write size to 1MB and run 24 instances in parallel.

Thanks for the program (I actually have my own setup for verifying torn
writes, the veritable dainto[1], which nobody uses).  Just to be certain,
you did bump /sys/block/<dev>/queue/max_sectors_kb to 1MB, right?

Cheers,
Jeff

[1] http://people.redhat.com/jmoyer/dainto-0.99.4.tar.bz2