Hi Cyril, On 09/10/2010 06:59 PM, Cyril Chemparathy wrote: > Hi Mike, > > I have merged your latest two emails and responded to both here. > [...] > One of the patches posted on my repo [1] replaces the dumb mdelay() with > a bit more logic that calculates the worst-case access time. This > mechanism may work a lot better for you. Would you mind trying it out? > You patch from [1] is working much more reliably now (well, 6 for 6 boot cycles as well as several ifup/ifdown cycles). I do get the "resetting idled controlled" console message every cycle, it seems that will be expected now. [...] >> >> Also, your while(1) loops with the continue conditions on the second wait_for_user_access() >> in the read and writes might need some consideration, i.e.: >> >> while (1) { >> ret = wait_for_user_access(data); >> if (ret == -EAGAIN) >> continue; >> if (ret < 0) >> break; >> >> __raw_writel(reg, &data->regs->user[0].access); >> >> ret = wait_for_user_access(data); >> if (ret == -EAGAIN) >> continue; >> ^^^^^^^^^ <--- this will re-issue the request.... what you want? > > Yes, the wait_for_user_access() would have reset the controller, > throwing the current transaction out. The intent here is to restart the > current transaction with a newly initialized controller. > OK. Makes sense. Looking at it felt like there was a chance for end endless spin, but that seems unlikely given how that condition might fire. [...] >> Also, on the shutdown, I get a major kernel trace. Here is the dump, as much >> as I could catch of it.... (I need a better terminal program) > > [...] >> WARNING: at drivers/net/davinci_emac.c:1025 __cpdma_chan_free+0xac/0xb0() > > The current code spits out a huge volume of stuff as a result of a > WARN_ON in the rx handler. The gitweb on [1] has a patch that fixes this. > Yes, these messages are no longer issued with the patches from [1]. Thanks. > [...] >>>> The MDIO module upgrade (rev 1.4 -> 1.5) could have something to do with >>>> this behavior. Even so, I can't explain why this issue wasn't seen on >>>> da8xx prior to this series. The original code should (at least in >>>> theory) have sporadically locked up on emac open. >>>> >> I think, if I understand it correctly, that in the previous version of >> this code, the emac was reset *prior* to enabling, scanning, and assigning >> the associated phy on the MDIO bus. The new implementation sets up and scans >> the MDIO bus first, then comes back around to the EMAC second... hits a reset, >> and doesn't re-ENABLE the MDIO. > > AFAICS, that isn't entirely accurate. In the previous version, the mdio > bus was being brought up at probe time in davinci_emac_probe(). The > soft-reset was happening later on when the device is opened, in > emac_hw_enable(). > > The difference, however, is that the original code forced an > emac_mii_reset() immediately after the emac_hw_enable(). This is not > being done with the separated mdio, and that is the problem. In terms > of behavior, with the current work around, the new and old versions > should be close to identical. More below... > >> Also, maybe hitting the EMAC reset while the MDIO state machine is up is *bad*, I >> seem to recall some text in the user's guide about waiting for the state >> machine to stop before disabling it. I wonder if that also applies to reset? > > You are correct. EMAC soft-reset stops the MDIO mid-transaction, quite > unlike disabling the module via the control register. Therefore, there > is a risk that a badly designed phy could be left hanging in an > arbitrary state. However, all earlier versions of the emac code have > been exposed to this very same vulnerability (i.e. arbitrary emac > soft-reset regardless of mdio state) all along. > Thanks for straightening me out on this, Cryil. Your patch series in [1] seems to have resolved the issues I've been able to see on the da850 based board I'm using here. I appreciate your patience and quick response. I may try to beat on it a bit more with some network performance tests (even though it's not at all related to the immediate problems you've fixed) -- we've had that on our list of todos anyway for this module. I think it would be good to float your patches over to davinci-next, if possible, before the 37 merge window opens... Speaking, of course, from a very partial perspective. > [1] > http://arago-project.org/git/people/?p=cyril/linux-tnetv107x.git;a=shortlog;h=refs/heads/emac-cpdma-mdio-fixes -Mike -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html