On 01-02-19, 11:23, Gustavo Pimentel wrote:
> On 01/02/2019 04:14, Vinod Koul wrote:
> > On 31-01-19, 11:33, Gustavo Pimentel wrote:
> >> On 23/01/2019 13:08, Vinod Koul wrote:
> >>> On 21-01-19, 15:48, Gustavo Pimentel wrote:
> >>>> On 20/01/2019 11:44, Vinod Koul wrote:
> >>>>> On 11-01-19, 19:33, Gustavo Pimentel wrote:
> >
> >>>>>> @@ -0,0 +1,1059 @@
> >>>>>> +// SPDX-License-Identifier: GPL-2.0
> >>>>>> +/*
> >>>>>> + * Copyright (c) 2018 Synopsys, Inc. and/or its affiliates.
> >>>>>
> >>>>> 2019 now
> >>>>
> >>>> I've changed it to "Copyright (c) 2018-present Synopsys, Inc. and/or its
> >>>> affiliates." That way it's always up to date. I also kept 2018, because
> >>>> that was the year I started to develop this driver, if you don't mind.
> >>>
> >>> Yeah, 18 is fine :) It needs to end with the current year always.
> >>
> >> Just to be sure, are you saying that it must be: "Copyright (c) 2018-2019
> >> Synopsys, Inc. and/or its affiliates."?
> >
> > Yup :)
> >
> >>>>>> +static struct dw_edma_chunk *dw_edma_alloc_chunk(struct dw_edma_desc *desc)
> >>>>>> +{
> >>>>>> +        struct dw_edma_chan *chan = desc->chan;
> >>>>>> +        struct dw_edma *dw = chan->chip->dw;
> >>>>>> +        struct dw_edma_chunk *chunk;
> >>>>>> +
> >>>>>> +        chunk = kvzalloc(sizeof(*chunk), GFP_NOWAIT);
> >>>>>> +        if (unlikely(!chunk))
> >>>>>> +                return NULL;
> >>>>>> +
> >>>>>> +        INIT_LIST_HEAD(&chunk->list);
> >>>>>> +        chunk->chan = chan;
> >>>>>> +        chunk->cb = !(desc->chunks_alloc % 2);
> >>>>>
> >>>>> cb ..?
> >>>>
> >>>> CB = change bit, a property of this eDMA IP. Basically it is a kind of
> >>>> handshake which serves to validate whether the linked list has been
> >>>> updated or not, especially useful in cases of recycled linked list
> >>>> elements (every linked list recycle is a new chunk, which allows
> >>>> differentiating each chunk).
> >>>
> >>> Okay, please add that somewhere. Also, it would help me if you explained
> >>> what a chunk is, along with the other terminology used in this driver.
> >>
> >> I'm thinking of putting the description below in the patch. Please check
> >> whether it is sufficiently explicit and clear to understand what this IP
> >> needs and does.
> >>
> >> In order to transfer data from point A to B as fast as possible this IP
> >> requires a dedicated memory space where will reside a linked list of
> >> elements.
> >
> > rephrasing: a dedicated memory space containing a linked list of elements
> >
> >> All elements of this linked list are contiguous and each one describes a
> >> data transfer (source and destination addresses, length and a control
> >> variable).
> >>
> >> For the sake of simplicity, let's assume a memory space for write channel
> >> 0 which holds about 42 elements.
> >>
> >> +---------+
> >> | Desc #0 |--+
> >> +---------+  |
> >>              V
> >>       +----------+
> >>       | Chunk #0 |---+
> >>       |  CB = 1  |   |   +----------+   +-----+   +-----------+   +-----+
> >>       +----------+   +-->| Burst #0 |-->| ... |-->| Burst #41 |-->| llp |
> >>             |            +----------+   +-----+   +-----------+   +-----+
> >>             V
> >>       +----------+
> >>       | Chunk #1 |---+
> >>       |  CB = 0  |   |   +-----------+   +-----+   +-----------+   +-----+
> >>       +----------+   +-->| Burst #42 |-->| ... |-->| Burst #83 |-->| llp |
> >>             |            +-----------+   +-----+   +-----------+   +-----+
> >>             V
> >>       +----------+
> >>       | Chunk #2 |---+
> >>       |  CB = 1  |   |   +-----------+   +-----+   +------------+   +-----+
> >>       +----------+   +-->| Burst #84 |-->| ... |-->| Burst #125 |-->| llp |
> >>             |            +-----------+   +-----+   +------------+   +-----+
> >>             V
> >>       +----------+
> >>       | Chunk #3 |---+
> >>       |  CB = 0  |   |   +------------+   +-----+   +------------+   +-----+
> >>       +----------+   +-->| Burst #126 |-->| ... |-->| Burst #129 |-->| llp |
> >>                          +------------+   +-----+   +------------+   +-----+
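To make the terminology concrete before the discussion continues, here is a
minimal C sketch of the objects the diagram implies. The struct and field
names follow the thread's vocabulary, but the exact layout is an assumption
for illustration, not the driver's actual definition:

#include <linux/list.h>
#include <linux/types.h>

/* Illustrative layout only -- every field here is an assumption. */
struct dw_edma_burst {
        struct list_head list;  /* node in the parent chunk's burst list */
        u64 sar;                /* source address of this sub-transfer */
        u64 dar;                /* destination address of this sub-transfer */
        u32 sz;                 /* length of this sub-transfer */
};

struct dw_edma_chunk {
        struct list_head list;          /* node in the descriptor's chunk list */
        struct dw_edma_chan *chan;      /* owning (unidirectional) channel */
        struct list_head burst;         /* bursts sharing one LL memory window */
        u32 bursts_alloc;               /* bursts queued so far, e.g. up to 42 */
        u8 cb;                          /* change bit: toggles 1/0/1/0 per chunk */
};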
> >
> > This is great and required to understand the driver.
> >
> > How does the controller move from Burst #41 of Chunk #0 to Chunk #1?
>
> I forgot to explain that...
>
> On every last Burst of a Chunk (Burst #41, Burst #83, Burst #125 or even
> Burst #129) some flags (the RIE and LIE bits) are set in its control
> variable, which trigger a "done" interrupt.
>
> In the interrupt callback, it is decided whether to recycle the linked list
> memory space by writing a new set of Burst elements (if there are still
> Chunks to transfer) or whether the transfer is considered complete (if there
> are no Chunks left to transfer).
>
> > Is Burst 0 to 129 a linked list with Burst 0, 42, 84 and 126 having a
> > callback (done bit set)..
>
> I didn't quite understand your question.
>
> It comes from the prep_slave_sg n elements (130 in the example), which are
> divided into several Chunks (#0, #1, #2 and #3 in the example): a linked
> list that contains another linked list for the Bursts, plus the CB and the
> Chunk size. The linked list inside each Chunk contains a number of Bursts
> (limited by the memory space size), each one holding the Source Address,
> Destination Address and size of that sub-transfer.

Thanks, this helps :) So in this case, what do the prep_slave_sg n elements
correspond to (130 bursts)..? If so, how do you split a transaction into
chunks and bursts?

> >
> > This sounds **very** similar to dw dma concepts!
>
> I believe some parts of the dw dma and dw edma behavior are similar, and
> that makes perfect sense since both DMAs are done by Synopsys and there may
> be shared knowledge; however, they are different IPs applied to different
> products.
>
> >
> >> Legend:
> >>
> >> *Linked list*, also known as Chunk.
> >> *Linked list element*, also known as Burst.
> >> *CB*, also known as Change Bit, is a control bit (typically toggled) that
> >> allows easily identifying and differentiating between the current linked
> >> list and the previous or the next one.
> >> *LLP* is a special element that indicates the end of the linked list
> >> element stream and also informs that the next CB should be toggled.
> >>
> >> In scatter-gather transfer mode, the client submits a scatter-gather list
> >> of n (in this case 130) elements, which is divided into multiple Chunks.
> >> Each Chunk has a limited number of Bursts (in this case 42); after all of
> >> them are transferred, an interrupt is triggered, which allows recycling
> >> the whole linked list dedicated memory with the new information for the
> >> next Chunk and its associated Bursts, repeating the whole cycle again.
> >>
> >> In cyclic transfer mode, the client submits a buffer pointer, its length
> >> and a number of repetitions; in this case each Burst corresponds directly
> >> to one repetition.
> >>
> >> Each Burst can describe a data transfer from point A (source) to point
> >> B (destination) with a length from 1 byte up to 4 GB.
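A rough sketch, in C, of how such a prep_slave_sg request might be split into
chunks and bursts as described above. EDMA_MAX_BURSTS and all sketch_*
helpers are invented for illustration; only the chunk/burst/CB mechanics
follow the thread:

#include <linux/scatterlist.h>

#define EDMA_MAX_BURSTS 42      /* bursts that fit the dedicated LL memory */

/* sketch_alloc_desc(), sketch_alloc_chunk() and sketch_add_burst() are
 * hypothetical stand-ins for the driver's real allocation paths. */
static struct dw_edma_desc *sketch_prep_sg(struct dw_edma_chan *chan,
                                           struct scatterlist *sgl,
                                           unsigned int sg_len)
{
        struct dw_edma_desc *desc = sketch_alloc_desc(chan);
        struct dw_edma_chunk *chunk = NULL;
        struct scatterlist *sg;
        unsigned int i;

        for_each_sg(sgl, sg, sg_len, i) {
                /* 130 sg entries with a 42-burst window -> chunks #0..#3 */
                if (!chunk || chunk->bursts_alloc == EDMA_MAX_BURSTS) {
                        chunk = sketch_alloc_chunk(desc);
                        /* the change bit just toggles 1/0/1/0 per chunk */
                        chunk->cb = !(desc->chunks_alloc++ % 2);
                }
                /* one burst per scatter-gather element (mem-to-dev shown) */
                sketch_add_burst(chunk, sg_dma_address(sg), chan->dst_addr,
                                 sg_dma_len(sg));
        }

        /*
         * When a chunk is written into the LL memory, its last burst gets
         * the RIE and LIE bits in its control word, so a "done" interrupt
         * fires and the callback can refill the window with the next chunk
         * (or declare the whole transfer complete).
         */
        return desc;
}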
> >> Since the dedicated memory space where the linked list resides is
> >> limited, the whole set of n Burst elements is organized into several
> >> Chunks, which are used later to recycle the dedicated memory space and
> >> initiate a new sequence of data transfers.
> >>
> >> The whole transfer is considered complete once all Bursts have been
> >> transferred.
> >>
> >> Currently this IP has a well-known register map, which includes support
> >> for legacy and unroll modes. Legacy mode is the version of this register
> >> map that has a
> >
> > what's unroll..
> >
> >> multiplexer register that allows switching registers between all write
> >> and read
>
> The unroll is explained here, see below.
>
> >> channels, while unroll mode repeats all write and read channel registers
> >> with an offset between them. This register map is called v0.
> >>
> >> The IP team is creating a new register map more suitable to the latest
> >> PCIe features, which will very likely change the register map; this
> >> version will be called v1. As soon as this new version is released by the
> >> IP team, support for it will be included in this driver.
> >>
> >> What do you think? Is there any gray area that I could clarify?
> >
> > This sounds good. But we are also catering to a WIP IP which can change,
> > right? Doesn't sound like a very good idea to me :)
>
> This IP has existed like this for several years and works quite well;
> however, because of the new features and requests that are coming (SR-IOV,
> more DMA channels, function segregation and isolation, performance
> improvements), it's natural for improvements to appear. The driver should
> follow that evolution and be robust enough to adapt to the new
> circumstances.

Kernel drivers should be modular by design; if we keep things simple then
adding/splitting to support future revisions should be easy!

> >>>>>> +        dev_dbg(chan2dev(chan), "addr(physical) src=%pa, dst=%pa\n",
> >>>>>> +                &config->src_addr, &config->dst_addr);
> >>>>>> +
> >>>>>> +        chan->src_addr = config->src_addr;
> >>>>>> +        chan->dst_addr = config->dst_addr;
> >>>>>> +
> >>>>>> +        err = ops->device_config(dchan);
> >>>>>
> >>>>> what does this do?
> >>>>
> >>>> This is an initialization procedure to set up interrupts (data and
> >>>> addresses) for each channel on the eDMA IP, so they are triggered after
> >>>> a transfer completes or aborts. Given that config() can be called at any
> >>>> time, it doesn't make sense to have this procedure here; I'll move it to
> >>>> probe().
> >>>
> >>> Yeah, I am still not convinced about having another layer! Have you
> >>> thought about using a common lib for the common parts..?
> >>
> >> Maybe I'm explaining myself wrongly. I don't have any clue about the new
> >> register map for the future versions. I honestly tried to implement a
> >> common lib for the whole process that interacts with the DMA engine
> >> controller, to ease any future addition of a register mapping, by having
> >> these common callbacks that are only responsible for interfacing with the
> >> HW according to the register map version. Maybe I can simplify something
> >> in the future, but I will only be able to conclude that after having some
> >> idea about the new register map.
> >>
> >> IMHO this is the easier and cleaner way to do it, in terms of code
> >> maintenance and architecture, but if you have another idea that you can
> >> show me, or a driver you can point out that implements something similar,
> >> I have no problem checking it out.
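A rough C illustration of the legacy vs. unroll difference described above.
All register and macro names here (VIEWPORT_*, CH_REGS_*, EDMA_MODE_LEGACY)
are invented for this sketch; only the mechanism -- a multiplexed shared
window versus per-channel replicated register blocks -- follows the thread:

#include <linux/bitfield.h>
#include <linux/io.h>

/* Hypothetical offsets and fields, invented for illustration. */
#define VIEWPORT_SEL_OFF        0x00
#define VIEWPORT_CH             GENMASK(2, 0)
#define VIEWPORT_DIR            BIT(31)
#define CH_REGS_OFF             0x100
#define CH_REGS_STRIDE          0x200

static void __iomem *sketch_chan_regs(struct dw_edma_chan *chan)
{
        struct dw_edma *dw = chan->chip->dw;

        if (dw->mode == EDMA_MODE_LEGACY) {
                /*
                 * Legacy: a single shared register window; the viewport
                 * (multiplexer) register selects which channel it exposes.
                 */
                writel(FIELD_PREP(VIEWPORT_CH, chan->id) |
                       FIELD_PREP(VIEWPORT_DIR, chan->dir == EDMA_DIR_READ),
                       dw->regs + VIEWPORT_SEL_OFF);
                return dw->regs + CH_REGS_OFF;
        }

        /*
         * Unroll: the write/read channel register block is replicated per
         * channel at a fixed stride, so no multiplexing is needed.
         */
        return dw->regs + CH_REGS_OFF + chan->id * CH_REGS_STRIDE;
}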
> >
> > That is what my fear was :)
> >
> > Let's step back and solve one problem at a time. Right now that is v0 of
> > the IP. Please write a simple driver which solves v0 without any layers
> > involved.
> >
> > Once v1 is in good shape, you would know what is required, and then we can
> > split the v0 driver into a common lib and a v0 driver, and then add the v1
> > driver.
>
> Can I keep the code segregation as it is now? With dw-edma-v0-core.c/h,
> dw-edma-v0-debugfs.c/h and dw-edma-v0-regs.h.
>
> That way I would only replace the callback calls with direct function calls
> and remove the switch-case callback selection based on the version.

That sounds better. Please keep the patches smallish (anything going beyond
600 lines becomes harder to review) and logically split.

> >>>>> what is the second part of the check, can you explain that, who sets
> >>>>> chan->dir?
> >>>>
> >>>> chan->dir is set in probe() during the process of configuring each
> >>>> channel.
> >>>
> >>> So you have channels that are unidirectional?
> >>
> >> Yes. That's another reason, IMHO, to keep dw-edma-test separate from
> >> dmatest, since this IP is pickier and has these particularities.
> >
> > That is okay; that should be handled by the prep calls. If you get a prep
> > call for a direction you don't support, return an error and move to the
> > next available one.
> >
> > That can be done generically in the dmatest driver, and to answer your
> > question, yes I would test that :)
>
> Like you said, let's do this in small steps. For now I would like to suggest
> leaving out the dw-dma-test driver and just focusing on the current driver,
> if you don't mind. I never thought this test driver would raise this kind of
> discussion.

Well, dmatest is just a testing aid and doesn't care about the internal design
of your driver. I would say keep it, as it is a useful aid and will help other
folks as well :)

-- 
~Vinod
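On the unidirectional-channel point above, a hedged sketch of how a prep call
could reject an unsupported direction so that a generic client such as
dmatest simply moves on to another channel. dchan2dw_edma_chan() and
sketch_build_desc() are invented names:

#include <linux/dmaengine.h>

static struct dma_async_tx_descriptor *
sketch_prep_slave_sg(struct dma_chan *dchan, struct scatterlist *sgl,
                     unsigned int sg_len, enum dma_transfer_direction dir,
                     unsigned long flags, void *context)
{
        struct dw_edma_chan *chan = dchan2dw_edma_chan(dchan);

        /* write channels only do MEM_TO_DEV, read channels only DEV_TO_MEM */
        if ((chan->dir == EDMA_DIR_WRITE && dir != DMA_MEM_TO_DEV) ||
            (chan->dir == EDMA_DIR_READ && dir != DMA_DEV_TO_MEM))
                return NULL;    /* caller can try the next available channel */

        /* ... build the desc/chunks/bursts as sketched earlier ... */
        return sketch_build_desc(chan, sgl, sg_len, dir, flags);
}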