On Wed, Sep 28, 2011 at 3:59 PM, J Freyensee <james_p_freyensee@xxxxxxxxxxxxxxx> wrote:
> On 09/28/2011 03:24 PM, Praveen G K wrote:
>> On Wed, Sep 28, 2011 at 2:34 PM, J Freyensee
>> <james_p_freyensee@xxxxxxxxxxxxxxx> wrote:
>>> On 09/28/2011 02:03 PM, Praveen G K wrote:
>>>> On Wed, Sep 28, 2011 at 2:01 PM, J Freyensee
>>>> <james_p_freyensee@xxxxxxxxxxxxxxx> wrote:
>>>>> On 09/28/2011 01:34 PM, Praveen G K wrote:
>>>>>> On Wed, Sep 28, 2011 at 12:59 PM, J Freyensee
>>>>>> <james_p_freyensee@xxxxxxxxxxxxxxx> wrote:
>>>>>>> On 09/28/2011 12:06 PM, Praveen G K wrote:
>>>>>>>> On Tue, Sep 27, 2011 at 10:42 PM, Linus Walleij
>>>>>>>> <linus.walleij@xxxxxxxxxx> wrote:
>>>>>>>>> On Fri, Sep 23, 2011 at 7:05 AM, Praveen G K <praveen.gk@xxxxxxxxx> wrote:
>>>>>>>>>
>>>>>>>>>> I am working on the block driver module of the eMMC driver (SDIO 3.0
>>>>>>>>>> controller). I am seeing very low write speeds for eMMC transfers. On
>>>>>>>>>> further debugging, I observed that every 63rd and 64th transfer takes
>>>>>>>>>> a long time.
>>>>>>>>>
>>>>>>>>> Are you not just seeing the card-internal garbage collection?
>>>>>>>>> http://lwn.net/Articles/428584/
>>>>>>>>
>>>>>>>> Does this mean that, theoretically, I should be able to achieve higher
>>>>>>>> speeds if I am not using Linux?
>>>>>>>
>>>>>>> In theory, in a fairy-tale world, maybe; in reality, not really. In R/W
>>>>>>> performance measurements we have done, eMMC performance in products
>>>>>>> users would buy falls well, well short of any theoretical numbers. We
>>>>>>> believe that in theory the eMMC interface should be able to support up
>>>>>>> to 100 MB/s, but in reality, on real customer platforms, write
>>>>>>> bandwidths (for example) barely approach 20 MB/s, regardless of whether
>>>>>>> it's a Microsoft Windows environment or Android (the Linux OS
>>>>>>> environment we care about).
>>>>>>> So maybe it is software implementation issues of multiple OSes
>>>>>>> preventing higher eMMC performance numbers (hence the reason I
>>>>>>> sometimes ask basic coding questions about the MMC subsystem; the
>>>>>>> code isn't the easiest to follow); however, one need look no further
>>>>>>> than what Apple has done with the iPad2 to see that eMMC probably
>>>>>>> just is not a good solution to use in the first place. We have
>>>>>>> measured Apple's iPad2 write performance, on *WHAT A USER WOULD SEE*,
>>>>>>> at double what we see with products using eMMC solutions. The big
>>>>>>> difference? Apple doesn't use eMMC at all for the iPad2.
>>>>>>
>>>>>> Thanks for all the clarification. The problem is I am seeing write
>>>>>> speeds of about 5 MB/s on a SanDisk eMMC product, and I can clearly
>>>>>> see the time lost when measured between sending a command and
>>>>>> receiving a data irq. I am not sure what kind of an issue this is.
>>>>>> 5 MB/s feels really slow, but can the internal housekeeping of the
>>>>>> card take so much time?
>>>>>
>>>>> Have you tried to trace through all the structs used for an MMC
>>>>> operation? Good gravy: there are request, mmc_queue, mmc_card,
>>>>> mmc_host, mmc_blk_request, mmc_request, multiple mmc_command, and the
>>>>> multiple scatterlists that these other structs use... I've been playing
>>>>> around with caching some things to try to improve performance, and it
>>>>> blows me away how many variables and pointers I have to keep track of
>>>>> for one operation going to an LBA on an MMC. I keep wondering whether
>>>>> more of 'struct request' could have been used and a third of these
>>>>> structures eliminated. Another thing I wonder is how much of this
>>>>> infrastructure is really needed: when I ask a "what is this for?"
>>>>> question on the list and no one responds, I wonder whether anyone
>>>>> else understands whether it's needed either.
>>>>
>>>> I know I am not using the scatterlists, since the scatterlists are
>>>> aggregated into a 64k bounce buffer. Regarding the different structs,
>>>> I am just taking them at face value, assuming everything works "well".
>>>> But my concern is why it occasionally takes such a long time (250 ms)
>>>> to return a transfer-complete interrupt. During this time, the kernel
>>>> is just waiting for the txfer_complete interrupt. That's it.
>>>
>>> I think one fundamental problem with execution of the MMC commands is
>>> that even though the MMC has its own structures and its own
>>> DMA/host-controller, the OS's block subsystem and the MMC subsystem do
>>> not really run independently of each other; each is still tied to the
>>> other's fate, holding up performance of the kernel in general.
>>>
>>> In particular, I have found in 2.6.36+ kernels that the sooner you can
>>> retire the 'struct request *req' (i.e., using __blk_end_request()) with
>>> respect to when the mmc_wait_for_req() call is made, the higher the
>>> performance you are going to get out of the OS in terms of reads/writes
>>> using an MMC. mmc_wait_for_req() is a blocking call, so the OS's
>>> 'struct request *req' will just sit around and do nothing until
>>> mmc_wait_for_req() is done. I have been able to do some caching of
>>> some commands, calling __blk_end_request() before mmc_wait_for_req(),
>>> and have gotten much higher performance in a few experiments (but the
>>> work certainly is not ready for prime time).
>>>
>>> Now, in the 3.0 kernel I know mmc_wait_for_req() has changed, and the
>>> goal was to make that function a bit more non-blocking, but I have not
>>> played with it much because my current focus is on existing products,
>>> and no handheld product uses a 3.0 kernel yet (that I am aware of, at
>>> least).
>>> However, I still see the fundamental problem as being that the MMC
>>> stack, which was probably written with the intent of being independent
>>> of the OS block subsystem (struct request and other stuff), really
>>> isn't independent of it; the two cause holdups between one another,
>>> thereby dragging down read/write performance of the MMC.
>>>
>>> The other fundamental problem is the writes themselves. Way, WAY more
>>> writes occur on a handheld system in an end-user's hands than reads.
>>> A fundamental computer-architecture principle says "make the common
>>> case fast", so the focus should be on completing a write operation
>>> the fastest way possible.
>>
>> Thanks for the detailed explanation.
>> Please let me know if there is a fundamental issue with the way I am
>> inserting the high-res timers. In the block.c file, I am timing the
>> transfers as follows:
>>
>> 1. Start timer
>>    mmc_queue_bounce_pre()
>>    mmc_wait_for_req()
>>    mmc_queue_bounce_post()
>>    End timer
>>
>> So I don't really have to worry about blk_end_request(), right? Like
>> you said, mmc_wait_for_req() is a blocking wait. I don't see what is
>> wrong with that being a blocking wait, because until you get the data
>> transfer-complete irq, there is no point in going ahead.
>> blk_end_request() comes into the picture later, only once all the data
>> has been transferred to the card.
>
> Yes, that is correct.
>
> But if you can do some cache trickery or queue tricks, you can delay
> when you actually have to write to the MMC, so then __blk_end_request()
> and retiring the 'struct request *req' becomes the time sink. That is
> a reason why mmc_wait_for_req() got some work done on it in the 3.0
> kernel. The OS does not have to wait for the host controller to
> complete the operation (i.e., block on mmc_wait_for_req()) if there is
> no immediate dependency on that data; forcing it to wait is kind of
> dumb. This is why this can be a problem and a time sink.
> It's no different from out-of-order execution in CPUs.

Thanks, I'll look into the 3.0 code to see what the changes are and
whether they can improve the speed. Thanks for your suggestions.

>> My line of thought is that the card is taking a lot of time for its
>> internal housekeeping.
>
> Each 'write' to solid-state/NAND/flash storage requires an erase
> operation first, so yes, there is more housekeeping going on than a
> simple 'write'.
>
>> But I want to be absolutely sure of my analysis before I pass that
>> judgement.
>>
>> I have also used another Toshiba card that gives me about 12 MB/s
>> write speed for the same code, but my worry is whether I am masking
>> some issue by blaming it on the card. What if the Toshiba card can
>> ideally give a throughput of more than 12 MB/s?
>
> No clue... you'd have to talk to Toshiba.
>
>> Or could there be an issue that the irq handler (sdhci_irq) is called
>> with some kind of a delay, and is there a possibility that we are not
>> capturing the transfer-complete interrupt immediately?
>>
>>>>>> I mean, for the usual transfers it takes about 3 ms to transfer
>>>>>> 64 kB of data, but for the 63rd and 64th transfers it takes 250 ms.
>>>>>> The thing is, this is not on a file system. I am measuring the
>>>>>> speed using a basic "dd" command to write directly to the block
>>>>>> device.
>>>>>>
>>>>>>>> So, is this a software issue? Or is there a way to increase the
>>>>>>>> size of the bounce buffers to 4 MB?
>>>>>>>>> Yours,
>>>>>>>>> Linus Walleij
>
> --
> J (James/Jay) Freyensee
> Storage Technology Group
> Intel Corporation

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html