RE: [PATCH v11 07/10] mtd: spi-nor: Add stacked memories support in spi-nor

"Mahapatra, Amit Kumar" <amit.kumar-mahapatra@xxxxxxx> · Wed, 7 Aug 2024 13:21:09 +0000

Hello Michael,

> On 8/5/24 10:27, Michael Walle wrote:
> > Hi,
> >
> >>>>> All I'm saying is that you shouldn't put burden on us (the SPI NOR
> >>>>> maintainers) for what seems to me at least as a niche. Thus I was
> >>>>> asking for performance numbers and users. Convince me that I'm
> >>>>> wrong and that is worth our time.
> >>>>
> >>>> No. It is not just a feature of our evaluation boards.
> >>>> Customers are using it. I was talking to someone from the field and
> >>>> he confirmed that these configurations are used by several of his
> >>>> customers in real products.
> >>>
> >>> Which begs the question, do we really have to support every feature
> >>> in the core (I'd like to hear Tudor's and Pratyush's opinions here).
> >>> Honestly, this just looks like a concatenation of two QSPI
> >>> controllers.
> >>
> >> Based on my understanding for stacked yes. For parallel no.
> >
> > See below.
> >
> >>> Why didn't you just use a normal octal controller which is a
> >>> protocol also backed by the JEDEC standard.
> >>
> >> On newer SOC octal IP core is used.
> >> Amit please comment.
> >>
> >>> Is it any faster?
> >>
> >> Amit: please provide numbers.
> >>
> >>> Do you get more capacity? Does anyone really use large SPI-NOR
> >>> flashes? If so, why?
> >>
> >> You get twice the capacity with that configuration. I can't
> >> answer the second question because I don't work with the field. But
> >> both of those configurations are used by customers. Adding Neal in
> >> case he wants to add something more.
> >>
> >>> I mean you've put that controller on your SoC, you must have some
> >>> convincing arguments why a customer should use it.
> >>
> >> I expect the recommendation is to use a single configuration, but if
> >> you need more space for your application, the only way to extend it
> >> is a stacked configuration with two identical flashes next to each
> >> other. If you want more space and also more speed, the answer is the
> >> parallel configuration.
> >
> > But who is using expensive NOR flash for bulk storage anyway?
> 
> I expect you understand that even if I know companies which do it, I am
> not allowed to share their names.
> 
> But customers may not have other free pins to connect, for example,
> eMMC.
> That's why adding one more "expensive flash" can be the only option
> for them.
> 
> Also, I bet the price of one more QSPI flash is nothing compared to the
> chip itself and other related expenses in low-volume production.
> 
> > You're
> > only mentioning parallel mode. Also the performance numbers were just
> > about the parallel mode. What about stacked mode? Because there's a
> > chance that parallel mode works without modification of the core (?).
> 
> I will let Amit comment on it.

The performance of the stacked configuration will be the same as that of 
the single mode. As Michal mentioned earlier, stacked mode is used for 
scenarios where the customer requires larger flash space while maintaining 
the same performance.

I want to provide some background on why I chose to handle stacked and
parallel modes through an additional layer or file, such as 
/mtd/spi-nor/stacked.c, rather than mtd-concat. Initially, when Miquel 
began upstreaming stacked support by extending the mtd-concat driver, 
the DT binding was not accepted. He proposed a couple of DT bindings 
[1] & [2] to support stacking through mtd-concat, but none were accepted. 
Additionally, after reviewing the MTD core code, he found that adding 
stacked support through mtd-concat could be complicated and involve many 
corner cases, which he mentioned in his RFC [3]. He then suggested 
concatenating the flashes instead of the mtd partitions, and eventually, 
the current DT bindings were added. This is why I propose handling the 
stacked and parallel configurations through an additional layer or file, 
as the mtd-concat approach was already discussed and rejected.

[1] https://lore.kernel.org/all/20191113171505.26128-4-miquel.raynal@xxxxxxxxxxx/
[2] https://lore.kernel.org/all/20191127105522.31445-5-miquel.raynal@xxxxxxxxxxx/
[3] https://lore.kernel.org/all/20211112152411.818321-1-miquel.raynal@xxxxxxxxxxx/

Regards,
Amit

> 
> 
> >
> >>>>> The first round of patches were really invasive regarding the core
> >>>>> code. So if there is a clean layering approach which can be
> >>>>> enabled as a module and you are maintaining it I'm fine with that
> >>>>> (even if the core code needs some changes then like hooks or so,
> >>>>> not sure).
> >>>>
> >>>> That discussion started with Miquel some years ago when he was
> >>>> trying to solve the description in DT, which has been merged in the
> >>>> kernel for a while.
> >>>
> >>> What's your point here? From what I can tell the DT binding is wrong
> >>> and needs to be reworked anyway.
> >>
> >> I am just saying that this is not an ad hoc new feature but a
> >> configuration which has already been discussed, with some steps made.
> >> If the DT binding is wrong, it can be deprecated and a new one used,
> >> but for that it has to be clear which way to go.
> >
> > Well, AMD could have side stepped all this if they had just integrated
> > a normal OSPI flash controller, which would have the same requirements
> > regarding the pins (if not even less) and it would have been *easy* to
> > integrate it into the already available ecosystem.
> > That was what my initial question was about. Why did you choose two
> > QSPI ports instead of one OSPI port?
> 
> Keep in mind that ZynqMP is a 9-year-old SoC, and Zynq is 12+ years old,
> with a lot of internal development happening before that. I am not sure
> OSPI even existed at that time, or whether any IP was available at the
> price they were targeting.
> I don't think it makes sense to discuss OSPI in this context because it
> is not in these SoCs.
> I have never worked with SPI, so I don't know the historical context
> and can't provide more details.
> 
> Thanks,
> Michal