Quoting Jeff Moyer (2013-11-07 10:43:41)
> Chris Mason <chris.mason@xxxxxxxxxxxx> writes:
>
> > Unfortunately, it's hard to say. I think the fusionio cards are the
> > only shipping devices that support this, but I've definitely heard that
> > others plan to support it as well. mariadb/percona already support the
> > atomics via fusionio specific ioctls, and turning that into a real
> > O_ATOMIC is a priority so other hardware can just hop on the train.
> >
> > This feature in general is pretty natural for the log structured squirrels
> > they stuff inside flash, so I'd expect everyone to support it. Matthew,
> > how do you feel about all of this?
> >
> > With the fusionio drivers, we've recently increased the max atomic size.
> > It's basically 1MB, disjoint or contig doesn't matter. We're powercut
> > safe at 1MB.
> >
> >>
> >> Basically, I'd like to avoid requiring a trial and error programming
> >> model to determine what an application can expect to work (like we have
> >> with O_DIRECT right now).
> >
> > I'm really interested in ideas on how to provide that. But, with dm,
> > md, and a healthy assortment of flash vendors, I don't know how...
>
> Well, we have control over dm and md, so I'm not worried about that.
> For the storage vendors, we'll have to see about influencing the
> standards bodies.
>
> The way I see it, there are 3 pieces of information that are required:
> 1) minimum size that is atomic (likely the physical block size, but
>    maybe the logical block size?)
> 2) maximum size that is atomic (multiple of minimum size)
> 3) whether or not discontiguous ranges are supported
>
> Did I miss anything?

It'll vary from vendor to vendor. A discontig range of two 512KB areas
is different from 256 discontig 4KB areas.

And it's completely dependent on filesystem fragmentation. So, a given
IO might pass for one file and fail for the next.
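
To make the three pieces of information above concrete, here is a minimal
sketch of how an application might check a write against advertised
limits. Everything in it is hypothetical: no such interface is defined
in this thread, and the names AtomicLimits and fits_atomically are made
up for illustration. The max_segments field is an extra assumption, added
only to capture the point that two 512KB areas are not the same as 256
4KB areas.

```python
from dataclasses import dataclass
from typing import List, Tuple

KB = 1024
MB = 1024 * KB


@dataclass(frozen=True)
class AtomicLimits:
    """Hypothetical per-device limits; not an existing kernel interface."""
    min_size: int          # smallest atomic unit in bytes (e.g. a block size)
    max_size: int          # largest total atomic write in bytes
    discontiguous: bool    # may one atomic write cover several ranges?
    max_segments: int = 1  # assumed extra limit on the number of ranges


def merge_adjacent(ranges: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Sort (offset, length) ranges and merge ranges that touch end to start."""
    merged: List[Tuple[int, int]] = []
    for off, length in sorted(ranges):
        if merged and merged[-1][0] + merged[-1][1] == off:
            prev_off, prev_len = merged[-1]
            merged[-1] = (prev_off, prev_len + length)
        else:
            merged.append((off, length))
    return merged


def fits_atomically(limits: AtomicLimits, ranges: List[Tuple[int, int]]) -> bool:
    """Check non-overlapping on-disk ranges against the hypothetical limits."""
    if not ranges:
        return False
    # Every range must be a positive, aligned multiple of the minimum size.
    for off, length in ranges:
        if off < 0 or length <= 0:
            return False
        if off % limits.min_size or length % limits.min_size:
            return False
    segments = merge_adjacent(ranges)
    if sum(length for _, length in segments) > limits.max_size:
        return False
    if len(segments) > 1 and not limits.discontiguous:
        return False
    return len(segments) <= limits.max_segments


# Example limits, chosen arbitrarily for the sketch.
limits = AtomicLimits(min_size=4 * KB, max_size=1 * MB,
                      discontiguous=True, max_segments=16)
```

With these example limits, a 1MB write laid out as two 512KB extents
passes, while the same 1MB spread over 256 separate 4KB extents fails on
the segment count. The ranges are on-disk extents, so the answer depends
on how fragmented the file happens to be.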
In a DM/MD configuration, an atomic IO inside a single stripe on raid0
could succeed while it will fail if it spans two stripes to two
different devices.

> > I've attached my current test program. The basic idea is to fill
> > buffers (1MB in size) with a random pattern. Each buffer has a
> > different random pattern.
> >
> > You let it run for a while and then pull the plug. After the box comes
> > back up, run the program again and it looks for consistent patterns
> > filling each 1MB aligned region in the file.
> [snip]
> > In order to reliably find torn blocks without O_ATOMIC, I had to bump
> > the write size to 1MB and run 24 instances in parallel.
>
> Thanks for the program (I actually have my own setup for verifying torn
> writes, the veritable dainto[1], which nobody uses). Just to be certain,
> you did bump /sys/block/<dev>/queue/max_sectors_kb to 1MB, right?

Since the atomics patch does things as a list of bios, there's no
max_sectors_kb to worry about. Each individual bio was only 4K.

-chris
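
For readers without the attachment, the check described above (one
pattern per 1MB buffer, then a scan of each 1MB aligned region after the
power cut) could look roughly like the sketch below. This is not the
attached test program. The layout of the pattern is an assumption: each
buffer here is a single random 4K block repeated across the region, so a
torn write shows up as a region containing more than one distinct block.
The names make_region and find_torn_regions are made up for illustration.

```python
import os

BLOCK = 4 * 1024        # assumed pattern granularity
REGION = 1024 * 1024    # 1MB write size used in the test description


def make_region(seed_block=None):
    """Build one 1MB buffer: a 4K block repeated across the whole region."""
    if seed_block is None:
        seed_block = os.urandom(BLOCK)  # a different random pattern per buffer
    if len(seed_block) != BLOCK:
        raise ValueError("seed block must be exactly BLOCK bytes")
    return seed_block * (REGION // BLOCK)


def find_torn_regions(data):
    """Return indices of 1MB aligned regions whose 4K blocks do not all match.

    A trailing partial region, if any, is ignored.
    """
    torn = []
    for index in range(len(data) // REGION):
        region = data[index * REGION:(index + 1) * REGION]
        first = region[:BLOCK]
        # A fully written region is its first block repeated end to end.
        if region != first * (REGION // BLOCK):
            torn.append(index)
    return torn
```

In a real run, data would be read back from the test file after the box
comes back up. A region that mixes blocks from an old and a new pattern
is reported as torn; an empty list means every region is internally
consistent.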