Re: O_DIRECT bug, looking for advice

I hope this old mailing list thread can help you:
http://kerneltrap.org/mailarchive/linux-kernel/2007/1/11/44365/thread

O_DIRECT always seems to be evil.

2009/5/15 David Wuertele <dave+gmane@xxxxxxxxxxxx>:
> I'm developing an embedded mipsel system with linux-mips.org's
> linux-2.6.18, and I'm finding that reads of disk files opened with
> O_DIRECT end up corrupted.  I've googled O_DIRECT corruption and the
> only advice I've come up with is make sure the reads are aligned, and
> I've done that, to no avail.  I wonder if anyone has some clue for
> sale.
>
> Here are some of the details:
>
> 0.  The corruption happens the same way regardless of what filesystem
>    I use.
>
> 1.  The corruption consists of 64 contiguous unset bytes in specific
>    places in the buffer.  I.e., if the buffer was zeroed before the
>    read, there will be some regions of 64 zeros in place of the
>    expected file data.  If the buffer was memset() to the character
>    'z', there will be some regions of 64 'z' characters in place of
>    the expected file data.
>
> 2.  Reading the same file without O_DIRECT results in expected file
>    data, no corruption.
>
> 3.  Performing the same O_DIRECT read multiple times into the same
>    buffer results in the same corruption.
>
> 4.  Performing the same O_DIRECT read multiple times into different
>    buffers results in different patterns of corruption.
>
> 5.  The corruptions occur on 64-byte aligned offsets, but usually not
>    on page (4096 byte)-aligned offsets.
>
> 6.  The corruptions only occur within 48 pages (196608 bytes) of the
>    end of the buffer, regardless of buffer size, read size, or buffer
>    alignment.
>
> 7.  Two O_DIRECT reads of different offsets into the file into the
>    same buffer result in identical patterns of corruption!
>
> 8.  Create a big buffer.  Do an O_DIRECT read into offset 0 of that
>    buffer, then do an O_DIRECT read of the same file into offset X of
>    that buffer.  The pattern of corruption will be found at the same
>    offset into the buffer, which means that the pattern of corruption
>    will be shifted by an offset X between the reads.
>
> Here is a graphical representation of a series of small reads into
> different offsets of a single larger buffer.  The vertical bars ("|")
> represent a 64 page buffer at a 2MB alignment allocated as follows:
>
>    char *buf;
>    posix_memalign ((void **)&buf, 2097152, 262144);
>
> I open the same file twice, once with O_DIRECT and once without.  I do
> reads with the O_DIRECT filehandle using different offsets into buf,
> and compare each time with identical reads (into a separate, identical
> buffer) using the non-O_DIRECT filehandle.  Before each read, I use
> memset() to fill the buffer with a specific value, the "unset data
> character".  Sometimes I use zero, sometimes not.
>
> Each line of the following graph represents a read at a specific
> offset into buf.  The first line is read into buf with a zero offset.
> Each subsequent line is a read into buf with the offset increased by
> one page.  The read size happens to be 18 pages (73728 bytes), but
> that size is not significant --- the same style of corruption occurs
> regardless of the read size.
>
> A "." character represents a page which matches perfectly between the
> O_DIRECT read and the non-O_DIRECT read.  An "X" character represents a
> page of the O_DIRECT read containing one or more of the 64-byte
> regions of the "unset data character".  A " " (space) character
> represents the part of the buffer which was unused for this read.
>
> |..................                                              |
> | ...............X.X                                             |
> |  ..............XXX.                                            |
> |   .............XXX..                                           |
> |    ............XXX..X                                          |
> |     ...........XXX..XX                                         |
> |      ..........XXX..XXX                                        |
> |       .........XXX..XXXX                                       |
> |        ........XXX..XXXX.                                      |
> |         .......XXX..XXXX..                                     |
> |          ......XXX..XXXX..X                                    |
> |           .....XXX..XXXX..X.                                   |
> |            ....XXX..XXXX..X.X                                  |
> |             ...XXX..XXXX..X.X.                                 |
> |              ..XXX..XXXX..X.X.X                                |
> |               .XXX..XXXX..X.X.XX                               |
> |                XXX..XXXX..X.X.XXX                              |
> |                 XX..XXXX..X.X.XXX.                             |
> |                  XX.XXXX..X.X.XXX..                            |
> |                   X.XXXX..X.X.XXX..X                           |
> |                    .XXXX..X.X.XXX..X.                          |
> |                     XXXX..X.X.XXX..X..                         |
> |                      XXX..X.X.XXX..X..X                        |
> |                       XX..X.X.XXX..X..XX                       |
> |                        X..X.X.XXX..X..XXX                      |
> |                         ..X.X.XXX..X..XXXX                     |
> |                          .X.X.XXX..X..XXXX.                    |
> |                           X.X.XXX..X..XXXX..                   |
> |                            .X.XXX..X..XXXX..X                  |
> |                             X.XXX..X..XXXX..XX                 |
> |                              .XXX..X..XXXX..XX.                |
> |                               XXX..X..XXXX..XX.X               |
> |                                XX..X..XXXX..XX.XX              |
> |                                 X..X..XXXX..XX.XXX             |
> |                                  ..X..XXXX..XX.XXXX            |
> |                                   .X..XXXX..XX.XXXXX           |
> |                                    X..XXXX..XX.XXXXX.          |
> |                                     ..XXXX..XX.XXXXX.X         |
> |                                      .XXXX..XX.XXXXX.X.        |
> |                                       XXXX..XX.XXXXX.X.X       |
> |                                        XXX..XX.XXXXX.X.X.      |
> |                                         XX..XX.XXXXX.X.X.X     |
> |                                          X..XX.XXXXX.X.X.X.    |
> |                                           ..XX.XXXXX.X.X.X.X   |
> |                                            .XX.XXXXX.X.X.X.XX  |
> |                                             XX.XXXXX.X.X.X.XXX |
> |                                              X.XXXXX.X.X.X.XXXX|
>
> Note that the corruption never happens until you get within 48 pages
> of the end of the buffer.
>
> Any suggestions?
>
> Thanks,
> Dave
>
>
>
> --
> To unsubscribe from this list: send an email with
> "unsubscribe kernelnewbies" to ecartis@xxxxxxxxxxxx
> Please read the FAQ at http://kernelnewbies.org/FAQ
>
>
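
For anyone who wants to reproduce the comparison described in the quoted
message, here is a rough C sketch of the checking step only.  It is not
Dave's test program.  The names alloc_aligned, chunk_is_unset, page_map,
PAGE_SZ and LINE_SZ are made up for illustration, and the '?' marker (a
page that differs without containing an unset 64-byte region) is my own
addition to the "." / "X" notation.  It assumes both buffers were filled
by reads of the same file range, and that the O_DIRECT buffer was
memset() to the fill character beforehand.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SZ 4096u  /* page size quoted in the report */
#define LINE_SZ 64u    /* size of each corrupted region in the report */

/* Allocate len bytes at the given alignment; returns NULL on failure. */
static void *alloc_aligned(size_t align, size_t len)
{
    void *p = NULL;

    if (posix_memalign(&p, align, len) != 0)
        return NULL;
    return p;
}

/* Return 1 if every byte of the 64-byte chunk equals the fill character. */
static int chunk_is_unset(const unsigned char *chunk, unsigned char fill)
{
    for (size_t i = 0; i < LINE_SZ; i++)
        if (chunk[i] != fill)
            return 0;
    return 1;
}

/*
 * Write one map character per page into map (needs len / PAGE_SZ + 1 bytes):
 *   '.'  page is identical in both buffers
 *   'X'  page has a 64-byte chunk that is still all fill in the direct
 *        buffer while the reference buffer holds something else
 *   '?'  page differs in some other way (hypothetical extra marker)
 * Returns the number of 'X' pages.
 */
static size_t page_map(const unsigned char *direct, const unsigned char *ref,
                       size_t len, unsigned char fill, char *map)
{
    size_t pages, bad = 0;

    assert(len % PAGE_SZ == 0);
    pages = len / PAGE_SZ;

    for (size_t p = 0; p < pages; p++) {
        const unsigned char *dp = direct + p * PAGE_SZ;
        const unsigned char *rp = ref + p * PAGE_SZ;

        map[p] = (memcmp(dp, rp, PAGE_SZ) == 0) ? '.' : '?';
        for (size_t off = 0; off < PAGE_SZ; off += LINE_SZ) {
            if (chunk_is_unset(dp + off, fill) &&
                memcmp(dp + off, rp + off, LINE_SZ) != 0) {
                map[p] = 'X';
                bad++;
                break;
            }
        }
    }
    map[pages] = '\0';
    return bad;
}
```

The intended use would be: allocate both buffers with alloc_aligned(),
memset() the O_DIRECT buffer to the fill character, read the same range
through both file descriptors, then call page_map() and print the
resulting string, one line per buffer offset.  Running it with two
different fill characters should help tell untouched regions apart from
file data that happens to match the fill.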
