Re: [PATCH 2/2 resend] libata-sff: avoid byte swapping in ata_sff_data_xfer()

Jeff Garzik wrote:

Handling of the trailing byte in ata_sff_data_xfer() is suboptimal because:

- it always initializes the padding buffer to 0, which is not really needed in
  either the read or the write case;

- it has to use memcpy() to transfer a single byte from/to the padding buffer.

Have you looked at the assembly before deciding it is suboptimal?

   I'm judging the code itself, not what the compiler can do to fix it. :-)

gcc optimizes tiny arrays and structures quite well, and is well capable of seeing one path where the initialization is clobbered without a single read, and another code path where it is used.

The initializer just shouldn't have been there in the first place, clobbered or not. And let's look at what gcc gave me:

.L504:
        .loc 1 727 0
        testb   $1, %bl #, buflen
        jne     .L511   #,
[...]
.L511:
.LBB635:
        .loc 1 731 0
        movl    8(%ebp), %eax   # rw,
        .loc 1 729 0
        leal    (%esi,%ebx), %ebx       #, tmp72
.LVL440:
        .loc 1 728 0
        movw    $0, -14(%ebp)   #, align_buf
        .loc 1 731 0
        testl   %eax, %eax      #
        jne     .L507   #,
        .loc 1 732 0
        movl    -20(%ebp), %eax # data_addr, data_addr
        call    ioread16        #
        movw    %ax, -14(%ebp)  # D.29224, align_buf
.LBB629:
.LBB630:
        .loc 4 60 0
        movzbl  -14(%ebp), %eax #, tmp73
        movb    %al, -1(%ebx)   # tmp73,
.L509:
.LBE630:
.LBE629:
        .loc 1 738 0
        addl    $1, %edi        #, words
        jmp     .L505   #
.L507:
.LBB631:
.LBB632:
        .loc 4 60 0
        movzbl  -1(%ebx), %eax  #, tmp74
.LBE632:
.LBE631:
        .loc 1 736 0
        movzwl  -14(%ebp), %eax # align_buf, align_buf
        call    iowrite16       #
        jmp     .L509   #

As you can see, it happily assigned 0 to align_buf[0] at .LVL440, regardless of the value of 'rw'.
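For readers following the listing, the C being discussed has roughly the shape below. This is only a user-space sketch: stub_read16(), stub_write16(), fake_port and the XFER_* constants are hypothetical stand-ins for the real port accessors and 'rw' values, and byte-order handling is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the 16-bit port accessors (not kernel APIs). */
static uint16_t fake_port;              /* pretend data register */

static uint16_t stub_read16(void)       { return fake_port; }
static void stub_write16(uint16_t v)    { fake_port = v; }

enum { XFER_READ = 0, XFER_WRITE = 1 }; /* assumed values for 'rw' */

/*
 * Transfer the odd trailing byte of buf through a zero-initialized
 * 2-byte pad, using memcpy() for the single byte.
 */
static void xfer_tail_padded(unsigned char *buf, size_t buflen, int rw)
{
	uint16_t align_buf[1] = { 0 };  /* the initializer under discussion */
	unsigned char *trailing_buf = buf + buflen - 1;

	assert(buflen & 1);             /* only meaningful for odd lengths */

	if (rw == XFER_READ) {
		align_buf[0] = stub_read16();       /* overwrites the 0 */
		memcpy(trailing_buf, align_buf, 1); /* one-byte memcpy */
	} else {
		memcpy(align_buf, trailing_buf, 1);
		stub_write16(align_buf[0]);
	}
}
```

In this sketch the read path overwrites the zero before anything reads it, while in the write path the zero only determines the pad byte that goes out with the data byte.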

As for memcpy, for small and/or constant values that is quite often a compiler builtin. It is rarely useful, these days, to convert a memcpy() to a hand-rolled version of the same.

Here memcpy() just shouldn't have appeared in the first place. But indeed, gcc did optimize it away.
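One way to write the trailing-byte transfer without the initializer and without memcpy() is sketched below. It is an illustration of the idea only, not the patch itself: stub_read16_raw(), stub_write16_raw(), fake_port and the XFER_* constants are hypothetical user-space stand-ins that move one 16-bit word as two raw bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Pretend 16-bit data register, kept as raw bytes (hypothetical). */
static unsigned char fake_port[2];

/* Hypothetical accessors that move one raw 16-bit word, no swapping. */
static void stub_read16_raw(unsigned char dst[2])
{
	dst[0] = fake_port[0];
	dst[1] = fake_port[1];
}

static void stub_write16_raw(const unsigned char src[2])
{
	fake_port[0] = src[0];
	fake_port[1] = src[1];
}

enum { XFER_READ = 0, XFER_WRITE = 1 }; /* assumed values for 'rw' */

/* Transfer the odd trailing byte with plain byte assignments. */
static void xfer_tail_direct(unsigned char *buf, size_t buflen, int rw)
{
	unsigned char pad[2];           /* no initializer */
	unsigned char *tail = buf + buflen - 1;

	assert(buflen & 1);             /* only meaningful for odd lengths */

	if (rw == XFER_READ) {
		stub_read16_raw(pad);   /* fills both bytes */
		*tail = pad[0];         /* keep only the data byte */
	} else {
		pad[0] = *tail;
		/*
		 * The thread argues the pad byte needs no defined value;
		 * it is zeroed here only so this user-space sketch is
		 * deterministic.
		 */
		pad[1] = 0;
		stub_write16_raw(pad);
	}
}
```

Each path touches the pad exactly once, so nothing is left for the compiler to clean up.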

    Jeff

MBR, Sergei