[PATCH v11 07/10] mtd: spi-nor: Add stacked memories support in spi-nor

Thu Aug 1 02:28:54 PDT 2024

Hello,

> -----Original Message-----
> From: Frager, Neal <neal.frager at amd.com>
> Sent: Thursday, August 1, 2024 12:07 PM
> To: Simek, Michal <michal.simek at amd.com>; Michael Walle
> <michael at walle.cc>; Mahapatra, Amit Kumar <amit.kumar-
> mahapatra at amd.com>; Tudor Ambarus <tudor.ambarus at linaro.org>;
> broonie at kernel.org; pratyush at kernel.org; miquel.raynal at bootlin.com;
> richard at nod.at; vigneshr at ti.com; sbinding at opensource.cirrus.com;
> lee at kernel.org; james.schulman at cirrus.com; david.rhodes at cirrus.com;
> rf at opensource.cirrus.com; perex at perex.cz; tiwai at suse.com
> Cc: linux-spi at vger.kernel.org; linux-kernel at vger.kernel.org; linux-
> mtd at lists.infradead.org; nicolas.ferre at microchip.com;
> alexandre.belloni at bootlin.com; claudiu.beznea at tuxon.dev; linux-arm-
> kernel at lists.infradead.org; alsa-devel at alsa-project.org;
> patches at opensource.cirrus.com; linux-sound at vger.kernel.org; git (AMD-
> Xilinx) <git at amd.com>; amitrkcian2002 at gmail.com; Conor Dooley
> <conor.dooley at microchip.com>; beanhuo at micron.com
> Subject: RE: [PATCH v11 07/10] mtd: spi-nor: Add stacked memories support
> in spi-nor
> 
> Hi Michael,
> 
> >>> All I'm saying is that you shouldn't put burden on us (the SPI NOR
> >>> maintainers) for what seems to me at least as a niche. Thus I was
> >>> asking for performance numbers and users. Convince me that I'm wrong
> >>> and that is worth our time.
> >>
> >> No. It is not really just feature of our evaluation boards. Customers
> >> are using it. I was talking to one guy from field and he confirms me
> >> that these configurations are used by his multiple customers in real
> products.
> >
> > Which begs the question, do we really have to support every feature in
> > the core (I'd like to hear Tudors and Pratyush opinion here).
> > Honestly, this just looks like a concatenation of two QSPI
> > controllers.
> 
> > Based on my understanding for stacked yes. For parallel no.
> 
> > Why didn't you just use a normal octal controller which is a protocol
> > also backed by the JEDEC standard.
> 
> > On newer SOC octal IP core is used.
> > Amit please comment.
> 
> > Is it any faster?
> 
> > Amit: please provide numbers.
> 
Here are some QSPI performance numbers comparing a single flash mode and 
two flashes connected in parallel mode. I ran the test on a VCK190 Eval 
Board https://www.xilinx.com/products/boards-and-kits/vck190.html, 
measuring the timing for mtd_debug erase, write, and read operations. 
The QSPI bus clock was set to 150MHz, and the data size was 32MB, 
comparing a single flash setup with a two-flash parallel mode setup.

Single Flash mode:

xilinx-vck190-20242:/home/petalinux# time mtd_debug erase /dev/mtd2 0x00 0x1e00000
Erased 31457280 bytes from address 0x00000000 in flash

real	0m54.713s
user	0m0.000s
sys	0m32.639s
xilinx-vck190-20242:/home/petalinux# time mtd_debug write /dev/mtd2 0x00 0x1e00000 test.bin
Copied 31457280 bytes from test.bin to address 0x00000000 in flash

real	0m30.187s
user	0m0.000s
sys	0m16.359s
xilinx-vck190-20242:/home/petalinux# time mtd_debug read /dev/mtd2 0x00 0x1e00000 test-read.bin
Copied 31457280 bytes from address 0x00000000 in flash to test-read.bin

real	0m0.472s
user	0m0.001s
sys	0m0.040s

Two flashes connected in parallel mode:

xilinx-vck190-20242:/home/petalinux# time mtd_debug erase /dev/mtd2 0x00 0x1e00000
Erased 31457280 bytes from address 0x00000000 in flash

real	0m27.727s
user	0m0.004s
sys	0m14.923s
xilinx-vck190-20242:/home/petalinux# time mtd_debug write /dev/mtd2 0x00 0x1e00000 test.bin
Copied 31457280 bytes from test.bin to address 0x00000000 in flash

real	0m16.538s
user	0m0.000s
sys	0m8.512s
xilinx-vck190-20242:/home/petalinux# time mtd_debug read /dev/mtd2 0x00 0x1e00000 test-read.bin
Copied 31457280 bytes from address 0x00000000 in flash to test-read.bin

real	0m0.263s
user	0m0.000s
sys	0m0.044s

Regards,
Amit

> > Do you get more capacity? Does anyone really use large SPI-NOR
> > flashes? If so, why?
> 
> > You get twice more capacity based on that configuration. I can't
> > answer the second question because not working with field. But both of
> > that configurations are used by customers. Adding Neal if he wants to add
> something more to it.
> 
> Just to add a comment as I work directly with our customers.  The main
> reason this support is important is for our older SoCs, zynq and zynqmp.
> 
> Most of our customers are using QSPI flash as the first boot memory to get
> from the boot ROM to u-boot.  They then typically use other memories, such
> as eMMC for the Linux kernel, OS and file system.
> 
> The issue we have on the zynq and zynqmp SoCs is that the boot ROM (code
> that cannot be changed) will not boot from an OSPI flash.  It will only boot
> from a QSPI flash.  This is what is forcing many of our customers down the
> QSPI path.
> Since many of these customers are interested in additional speed and
> memory size, they then end up using a parallel or stacked configuration
> because they cannot use an OSPI with zynq or zynqmp.
> 
> All of our newer SoCs can boot from OSPI.  I agree with you that if someone
> could choose OSPI for performance, they would, so I do not expect parallel or
> stacked configurations with our newer SoCs.
> 
> I get why you see this configuration as a niche, but for us, it is a very large
> niche because zynq and zynqmp are two of our most successful SoC families.
> 
> > I mean you've put that controller on your SoC, you must have some
> > convincing arguments why a customer should use it.
> 
> > I expect recommendation is to use single configuration but if you need
> > bigger space for your application the only way to extend it is to use
> > stacked configuration with two the same flashes next to each other.
> > If you want to have bigger size and also be faster answer is parallel
> > configuration.
> 
> 
> >>> The first round of patches were really invasive regarding the core
> >>> code. So if there is a clean layering approach which can be enabled
> >>> as a module and you are maintaining it I'm fine with that (even if
> >>> the core code needs some changes then like hooks or so, not sure).
> >>
> >> That discussion started with Miquel some years ago when he was trying
> >> to to solve description in DT which is merged for a while in the kernel.
> >
> > What's your point here? From what I can tell the DT binding is wrong
> > and needs to be reworked anyway.
> 
> > I am just saying that this is not any adhoc new feature but
> > configuration which has been already discussed and some steps made. If
> > DT binding is wrong it can be deprecated and use new one but for that it
> has be clear which way to go.
> 
> 
> >> And Amit is trying to figure it out which way to go.
> >> I don't want to predict where that code should be going or how it
> >> should look like because don't have spi-nor experience. But I hope we
> >> finally move forward on this topic to see the path how to upstream
> support for it.
> >
> > You still didn't answer my initial question. Will AMD support and
> > maintain that code? I was pushing you towards putting that code into
> > your own driver because then that's up to you what you are doing
> > there.
> 
> > Of course. We care about our code and about supporting our SOC and
> > features which are related to it. It means yes, we will be regularly
> > testing it and taking care about it.
> 
> 
> > So how do we move forward? I'd like to see as little as core changes
> > possible to get your dual flash setup working. I'm fine with having a
> > layer in spi-nor/ (not sure that's how it will work with mtdcat which
> > looks like it's similar as your stacked flash) as long as it can be a
> > module (selected by the driver).
> 
> > ok.
> 
> Best regards,
> Neal Frager
> AMD