On Mon, Jan 02, 2017 at 10:12:33AM -0600, Andy Gross wrote: > On Mon, Jan 02, 2017 at 07:55:37PM +0530, Abhishek Sahu wrote: > > >>>>>>> So a couple of thoughts on how to deal with this: > > >>>>>>> > > >>>>>>> 1) Define a virtual channel for the command descriptors vs a normal DMA > > >>>>>>> transaction. This lets you use the same hardware channel, but lets you discern > > >>>>>>> which descriptor format you need to use. The only issue I see with this is the > > >>>>>>> required change in device tree binding to target the right type of channel (cmd > > >>>>>>> vs normal). > > >>>>>> > > >>>>>>Or mark the descriptor is cmd and write accordingly... > > >>>>> > > >>>>>The only issue i see with that is knowing how much to pre-allocate during > > >>>>>the > > >>>>>prep call. The flag set API would be called on the allocated tx > > >>>>>descriptor. So > > >>>>>you'd have to know up front and be able to specify it. > > >>>>> > > >>>>>> > > >>>>>>> > > >>>>>>> 2) Provide an API to set flags for the descriptor on a whole descriptor basis. > > >>>>>>> > > >>>>>>> 3) If you have a set of transactions described by an sgl that has disparate use > > >>>>>>> of flags, you split the list and use a separate transaction. In other words, we > > >>>>>>> need to enforce that the flag set API will be applied to all descriptors > > >>>>>>> described by an sgl. This means that the whole transaction may be comprised of > > >>>>>>> multiple async TX descriptors. > > >>>> > > >>>>Each async TX descriptor will generate the BAM interrupt so we are > > >>>>deviating > > >>>>from main purpose of DMA where ideally we should get the interrupt at > > >>>>the > > >>>>end > > >>>>of transfer. This is the main reason for going for this patch. > > >>> > > >>>If the client queues 1 descriptor or 5 descriptors, it doesn't matter that > > >>>much. > > >>>The client knows when it is done by waiting for the descriptors to > > >>>complete. It > > >>>is less efficient than grouping them all, but it should still work. > > >>> > > >>Yes. client will wait for final descriptor completion. But these > > >>interrupts > > >>will be overhead for CPU. For 1-2 page it won't matter much I guess it > > >>will > > >>be > > >>significant for complete chip read/write(during boot and FS i.e JFFS > > >>operations). > > >>>> > > >>>>With the submitted patch, only 1 interrupt per channel is required for > > >>>>complete NAND page and it solves the setting of BAM specific flags also. > > >>>>Only issue with this patch is adding new API in DMA framework itself. > > >>>>But > > >>>>this API can be used by other DMA engines in future for which mapping > > >>>>cannot > > >>>>be done with available APIs and if this mapping is vendor specific. > > >>> > > >>>I guess the key point in all of this is that the DMA operation being done > > >>>is not > > >>>a normal data flow to/from the device. It's direct remote register access > > >>>to > > >>>the device using address information contained in the sgl. And you are > > >>>collating the standard data access along with the special command access > > >>>in the > > >>>same API call. > > >>Yes. Normally DMA engine (QCA ADM DMA engine also) allows to read/write > > >>memory mapped > > >>registers just like data. But BAM is different (Since it is not a global > > >>DMA > > >>Engine > > >>and coupled with peripheral). Also, this different flag requirement is > > >>not > > >>just > > >>for command descriptors but for data descriptors also. > > >> > > >>BAM data access and command access differs only with flag and register > > >>read/write > > >>list. The register read and write list will be simply array of > > >>struct bam_cmd_element added in patch > > >>struct bam_cmd_element { > > >> __le32 addr:24; > > >> __le32 command:8; > > >> __le32 data; > > >> __le32 mask; > > >> __le32 reserved; > > >>}; > > >> > > >>The address and size of the array will be passed in data and size field > > >>of > > >>SGL. > > >>If we want to form the SGL for mentioned list then we will have SGL of > > >>size > > >>15 > > >>with just one descriptor. > > >> > > >>Now we require different flag for each SG entry. currently SG does not > > >>have > > >>flag parameter and we can't add flag parameter just for our requirement > > >>in > > >>generic SG. So we have added the custom mapping function and passed > > >>modified > > >>SG > > >>as parameter which is generic SG and flag. > > > > > >I really think that we need some additional API that allows for the flag > > >munging > > >for the descriptors instead of overriding the prep_slave_sg. We already > > >needed > > >to change the way the flags are passed anyway. And instead of building up > > >a > > >special sg list, the API should take a structure that has a 1:1 mapping of > > >the > > >flags to the descriptors. And you would call this API on your descriptor > > >before > > >issuing it. Munging wont be a good idea, but for some of the cases current flags can be used, and if need be, we can add additional flags > > > > > >So build up the sglist. Call the prep_slave_sg. You get back a tx > > >descriptor > > >that underneath is a bam descriptor. Then call the API giving the > > >descriptor > > >and the structure that defines the flags for the descriptors. Then submit > > >the > > >descriptor. > > > > > >Something like: > > >int qcom_bam_apply_descriptor_flags(struct dma_async_tx_descriptor *tx, > > > u16 *flags) > > >{ > > > struct bam_async_desc async_desc = container_of(tx, > > > struct bam_async_desc, > > > vd.tx); > > > int i; > > > > > > for (i = 0; i < async_desc->num_desc; i++) > > > async_desc->desc[i].flags = cpu_to_le16(flags[i]); > > >} > > > > > >EXPORT_SYMBOL(qcom_bam_apply_descriptor_flags) This makes it bam specific and causes issues if we want to use this code in generic libs, but yes this is an option but should be last resort. -- ~Vinod -- To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html