Re: MAX3421E: device giving NAKs forever?

David Mosberger <davidm@xxxxxxxxxx> · Fri, 14 Mar 2014 10:21:35 -0600

After thinking about this some more, the MAX3421E behavior could be
triggered if a write to the SNDFIFO is not followed by a BULK_OUT
command to the HXFR register.  My driver always issues BULK_OUT after
writing the SNDFIFO so this should never happen, but a corrupted SPI
transfer could do this.
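
For illustration, the register sequence for one OUT packet could be
sketched in C as below.  This is a minimal sketch, not the actual
driver code: spi_write(), spi_log and send_bulk_out() are hypothetical
names, and the SPI bus is replaced by an in-memory log.  The register
numbers for SNDFIFO (2) and SNDBC (7) are the ones mentioned in this
thread; HXFR being register 30, the command-byte layout (register
number in bits 7..3, bit 1 set for a write) and the BULK_OUT encoding
(0x20 | endpoint) are assumed from the MAX3421E datasheet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* SNDFIFO and SNDBC as named in this thread; HXFR assumed from the datasheet. */
enum { REG_SNDFIFO = 2, REG_SNDBC = 7, REG_HXFR = 30 };

/* SPI command byte for a register write: register in bits 7..3, bit 1 = write. */
#define SPI_WRITE_CMD(reg)  ((uint8_t)(((reg) << 3) | 0x02))
/* HXFR value that launches a bulk OUT transfer to endpoint ep. */
#define HXFR_BULK_OUT(ep)   ((uint8_t)(0x20 | (ep)))

/* Hypothetical stand-in for the SPI bus: every byte sent is appended here. */
static uint8_t spi_log[256];
static size_t spi_log_len;

/* Record one SPI write transaction: command byte followed by the payload. */
static void spi_write(uint8_t reg, const uint8_t *buf, size_t len)
{
	assert(spi_log_len + 1 + len <= sizeof(spi_log));
	spi_log[spi_log_len++] = SPI_WRITE_CMD(reg);
	memcpy(&spi_log[spi_log_len], buf, len);
	spi_log_len += len;
}

/* Load the send FIFO, hand it over with SNDBC, then launch BULK_OUT. */
static void send_bulk_out(uint8_t ep, const uint8_t *data, uint8_t len)
{
	uint8_t hxfr = HXFR_BULK_OUT(ep);

	assert(len <= 64);                  /* one send FIFO holds 64 bytes */
	spi_write(REG_SNDFIFO, data, len);  /* 1. fill the FIFO             */
	spi_write(REG_SNDBC, &len, 1);      /* 2. byte count, switches FIFO */
	spi_write(REG_HXFR, &hxfr, 1);      /* 3. start the OUT transfer    */
}
```

The scenario described above corresponds to step 1 happening without
step 3 ever reaching the chip, for example because the HXFR write was
corrupted on the SPI bus.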

Also, and perhaps more plausible for my driver, after an OUT transfer
gets a response other than ACK (e.g., NAK or error), the MAX3421E
doesn't unload that FIFO (assuming that you'll want to retransmit the
data).  My driver never retransmits the data immediately, so I think
it has to issue a dummy write to the SNDBC register to switch back to
the original FIFO.  I know I tried that at one point and it didn't fix
the issue, but I should try this again as it seems the most plausible
explanation.
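
A rough sketch of that idea in C, under stated assumptions: the
function after_out_result() and struct reg_write are hypothetical and
only model the decision, not the real driver.  The result code is
assumed to be the low four bits of the HRSL register, with 0x00
meaning success and 0x04 meaning NAK (per the datasheet), and the
value 0 for the dummy SNDBC write is an arbitrary choice made for
this example.

```c
#include <stdbool.h>
#include <stdint.h>

enum { REG_SNDBC = 7 };  /* register number as given in this thread */

/* Result code in the low nibble of HRSL (assumed from the datasheet). */
enum { HRSLT_MASK = 0x0f, HRSLT_SUCCESS = 0x00, HRSLT_NAK = 0x04 };

/* Hypothetical description of a register write the driver should issue. */
struct reg_write {
	bool issued;   /* false: nothing needs to be written */
	uint8_t reg;
	uint8_t val;
};

/*
 * Decide what to do once an OUT transfer has completed with HRSL value
 * hrsl.  If the transfer was not ACKed and the data is not going to be
 * retransmitted right away, request a dummy SNDBC write.
 */
static struct reg_write after_out_result(uint8_t hrsl, bool retransmit_now)
{
	struct reg_write w = { false, 0, 0 };

	if ((hrsl & HRSLT_MASK) != HRSLT_SUCCESS && !retransmit_now) {
		w.issued = true;
		w.reg = REG_SNDBC;
		w.val = 0;  /* dummy byte count, an assumption of this sketch */
	}
	return w;
}
```

Whether such a dummy write actually makes the chip switch FIFOs is
exactly what still needs to be verified on hardware.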

  --david

On Thu, Mar 13, 2014 at 10:46 AM, David Mosberger <davidm@xxxxxxxxxx> wrote:
> OK, I finally know where the problem is coming from!  The MAX3421E
> chip uses double-buffering.  Specifically, it has two 64-byte send
> FIFOs.  You write up to 64 bytes to a send FIFO by repeatedly writing
> to SPI register 2 (SNDFIFO).  Then you tell the chip how many bytes
> you just put in the FIFO by writing SPI register 7 (the
> send-byte-count or SNDBC register).  Writing SNDBC is supposed to
> switch the FIFO to the USB-side so it can be transmitted on the USB
> bus.
>
> Unfortunately, it seems that under certain circumstances, writing the
> SNDBC fails to properly switch the FIFOs and we end up sending data to
> the USB bus from the wrong FIFO.
>
> In the USB mass-storage error situation we're seeing, the driver was
> trying to send a 31-byte "USBC" command and we see that command coming
> over the SPI-bus just fine.  However, on the USB-side, the MAX3421E
> chip instead writes a 64-byte packet full of zeroes (which is the data
> we were transmitting before).  The mass-storage peripheral afterwards
> NAKs any OUT request because it never saw the new SCSI WRITE_10
> command that was encapsulated in the "USBC" command.
>
> The workaround for now is to write outgoing packets twice, so that
> both FIFOs contain the same data.  With that workaround, we have been
> able to repeatedly write 5MB of data with dd without any issues (dd
> if=/dev/zero of=/dev/sda1 count=5000 bs=1024).
>
> I should mention this is with rev 0x12 of the MAX3421E chip.  The
> current rev is 0x13 so we'll try with that chip in the next few days.
> However, we are not aware of any errata for rev 0x12 that would
> explain this behavior.
>
> Also, for the record, we ran the SPI bus at only 4MHz for this testing
> so we could reliably capture the data with the Saleae Logic.  Given
> this low frequency and the fact that the Saleae was able to capture
> the correct data, I do not think that SPI corruption is to blame.  We
> saw the same error occur even with SPI at 1MHz.
>
> I have the full trace data if anyone is interested.  It captures the
> complete test (from loading the max3421 driver to when the error
> occurs), so at 55MiB it is too large to attach to email.
>
>   --david
>
> On Thu, Mar 13, 2014 at 8:55 AM, David Mosberger <davidm@xxxxxxxxxx> wrote:
>> Yeah, sorry, the READ_10s were a total red herring.  They're there
>> because I forgot to specify bs=1024. ;-(
>>
>> I'll try to capture better traces today and if they look interesting,
>> make them available.
>>
>>   --david
>> --
>> eGauge Systems LLC, http://egauge.net/, 1.877-EGAUGE1, fax 720.545.976

-- 
eGauge Systems LLC, http://egauge.net/, 1.877-EGAUGE1, fax 720.545.9768