[PATCH v2 0/9] mtd: spi-nor: read while write support

Thu Dec 1 18:22:10 PST 2022

Hi Miquel,

>
> Hi Pratyush,
>
> Jaime, your input is welcome on the topic below.

Yes, as you understand it, the purpose is to reduce read latency at
minimal hardware cost.

In some MCPs sharing a common SPI bus, it is possible to program one chip
while erasing another. But for a single-die chip with multiple banks,
only RWW is offered to date in SPI NOR because of cost, as you suggested.

>
> pratyush at kernel.org wrote on Sun, 20 Nov 2022 16:41:16 +0100:
>
> > Hi Miquel,
> >
> > On 10/11/22 04:55PM, Miquel Raynal wrote:
> > > Hello folks,
> > >
> > > Here is the follow-up of the RFC trying to bring a little bit of
> > > parallelism to support SPI-NOR Read While Write feature on parts
> > > supporting it and featuring several banks.
> > >
> > > I have received some hardware to make it work, so since the RFC, the
> > > series has been updated to fix my mistakes, but the overall idea is the
> > > same.
> > >
> > > There is nothing Macronix specific in the implementation, the operations
> > > and opcodes are exactly the same as before. The only difference being:
> > > we may consider the chip usable when it is in the busy state during a
> > > write or an erase. Any chip with an internal split allowing to perform
> > > parallel operations might possibly leverage the benefits of this
> > > implementation.
> > >
> > > The first patches are just refactoring and preparation work, there is
> > > almost no functional change, it's just a way to prepare the introduction
> > > of the new locking mechanism and hopefully provide the cleanest and
> > > simplest diff possible for this new feature. The actual change is all
> > > contained in "mtd: spi-nor: Enhance locking to support reads while
> > > writes". The logic is described in the commit log and copy/pasted here
> > > for clarity:
> > >
> > > "
> > > The following constraints have to be taken into account:
> > > 1#: A single operation can be performed in a given bank.
> > > 2#: Only a single program or erase operation can happen on the entire chip.
> >
> > Is this a limitation of the chip you are working with? A chip with
> > multiple banks can in theory support parallel erases and programs on
> > each bank as well, right? If so then it might make sense to allow
> > parallel erases and program operations too, and then allow for extra
> > chip-specific constraints.
>
> Yes this is a limitation of the chip family I had a chance to play
> with, but there are two reasons why I would assume we won't need
> a more parallelized scheme:
>
> * I doubt any chip will ever be able to perform more than one erase or
>  program operation at a time due to the current it might draw. A read
>  is a rather inexpensive operation power-wise so I guess that is why
>  you can do it in parallel with others, but two programs or, even
>  worse, two erases might actually be too much (that is pure
>  speculation on my side).
>
> * The second reason is that the RWW feature serves one major purpose:
>  accessing your filesystem while you update it. So the goal really is
>  to be able to make an update, while still getting a rather good
>  responsiveness from the system that runs on the same device during
>  it. This actually only involves reading while performing any other
>  operation. It does not improve performance much, apart from the
>  read latencies.
>
> Thanks,
> Miquèl
>
> > I have not yet read the code so I am not sure how complex implementing
> > this would be. Just thinking out loud for now.
> >
> > > 3#: The I/O bus is unique and thus is the most constrained resource, all
> > >     spi-nor operations requiring access to the spi bus (through the spi
> > >     controller) must be serialized until the bus exchanges are over. So
> > >     we must ensure a single operation can be "sent" at a time.
> > > 4#: Any operation other than a read, a write or an erase is
> > >     considered to require access to the full chip and cannot be
> > >     parallelized; we then need to ensure the full chip is in the idle
> > >     state when this occurs.
> >
> > Makes sense.
> >
> > >
> > > All these constraints can easily be managed with a proper locking model:
> > > 1#: Is protected by a per-bank mutex. Only a single operation can happen
> > >     in a specific bank at any time. If the bank mutex is not available,
> > >     the operation cannot start.
> > > 2#: Is handled by the pe_mode mutex which is acquired before any write
> > >     or erase, and is released only at the very end of the
> > >     operation. This way, no other destructive operation on the chip can
> > >     start during this time frame.
> > > 3#: A device-wide mutex is introduced in order to capture and serialize
> > >     bus accesses. This is the one being released "sooner" than before,
> > >     because we only need to protect the chip against other SPI accesses
> > >     during the I/O phase, which for the destructive operations is the
> > >     beginning of the operation (when we send the command cycles and
> > >     possibly the data), while the second part of the operation (the
> > >     erase delay or the programming delay) is when we can do something
> > >     else with another bank.
> > > 4#: Is handled by the "generic" helpers which existed before, where
> > >     basically all the locks are taken before the operation can start,
> > >     and all locks are released once done.
> > >
> > > As many devices still do not support this feature, the original lock is
> > > also kept in a union: either the feature is available and we initialize
> > > and use the new locks, or it is not and we keep using the previous
> > > logic.
> > > "
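
To make the four rules quoted above concrete, here is a minimal sketch in
plain C of when an operation may start. It is not the code from the
series: the mutexes are replaced by boolean flags so the logic can be
stepped through by hand, and every name in it (rww_state, rww_try_start,
rww_io_done, rww_finish, RWW_NBANKS and its value) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RWW_NBANKS 4 /* hypothetical bank count */

enum rww_op { RWW_READ, RWW_PROGRAM, RWW_ERASE, RWW_OTHER };

struct rww_state {
	bool bank_busy[RWW_NBANKS]; /* rule 1: one operation per bank */
	bool pe_busy;               /* rule 2: one program/erase per chip */
	bool bus_busy;              /* rule 3: one bus transfer at a time */
};

static bool rww_any_bank_busy(const struct rww_state *s)
{
	for (size_t i = 0; i < RWW_NBANKS; i++)
		if (s->bank_busy[i])
			return true;
	return false;
}

static void rww_set_all_banks(struct rww_state *s, bool busy)
{
	for (size_t i = 0; i < RWW_NBANKS; i++)
		s->bank_busy[i] = busy;
}

/* Return true and claim the needed resources if 'op' may start now. */
bool rww_try_start(struct rww_state *s, enum rww_op op, size_t bank)
{
	if (s->bus_busy)
		return false;
	if (op == RWW_OTHER) {
		/* rule 4: needs the whole chip idle, then takes everything */
		if (s->pe_busy || rww_any_bank_busy(s))
			return false;
		rww_set_all_banks(s, true);
		s->pe_busy = true;
		s->bus_busy = true;
		return true;
	}
	if (bank >= RWW_NBANKS || s->bank_busy[bank])
		return false;
	if (op != RWW_READ) {
		if (s->pe_busy)
			return false;
		s->pe_busy = true;
	}
	s->bank_busy[bank] = true;
	s->bus_busy = true;
	return true;
}

/* End of the I/O phase: only the bus is handed back. */
void rww_io_done(struct rww_state *s)
{
	s->bus_busy = false;
}

/* Operation fully complete: release the bank(s) and the program/erase slot. */
void rww_finish(struct rww_state *s, enum rww_op op, size_t bank)
{
	if (op == RWW_OTHER) {
		rww_set_all_banks(s, false);
		s->pe_busy = false;
		return;
	}
	assert(bank < RWW_NBANKS);
	s->bank_busy[bank] = false;
	if (op != RWW_READ)
		s->pe_busy = false;
}
```

In this model an erase in bank 0 blocks everything until rww_io_done() is
called, after which a read in bank 1 may start while a read in bank 0, a
program in any bank, or a "generic" operation still may not.
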
> > >
> > > Here is now a benchmark with a Macronix MX25UW51245G with bank and RWW
> > > support:
> > >
> > >      // Testing the two accesses in the same bank
> > >      $ flash_speed -b0 -k0 -c10 -d /dev/mtd0
> > >      [...]
> > >      testing read while write latency
> > >      read while write took 51ms, read ended after 51ms
> > >
> > >      // Testing the two accesses within different banks
> > >      $ flash_speed -b0 -k4096 -c10 -d /dev/mtd0
> > >      [...]
> > >      testing read while write latency
> > >      read while write took 51ms, read ended after 20ms
> > >
> > > Here is a branch with the mtd-utils patch bringing support for this
> > > additional "-k" parameter (for the second block to use during RWW
> > > testing), used to get the above result:
> > > https://github.com/miquelraynal/mtd-utils/compare/master...rww
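
For readers wondering why "-k4096" lands in a different bank than "-b0",
the arithmetic can be sketched as below. The geometry is an assumption,
not something stated in this thread: a 64 MiB (512 Mbit) device split
into 4 equal banks with 4 KiB erase blocks. The helper name
rww_bank_of_block is hypothetical as well.

```c
#include <stdint.h>

/*
 * Map an erase block number to the bank that contains it, assuming
 * the chip is divided into 'nbanks' banks of equal size.
 */
unsigned int rww_bank_of_block(uint64_t block, uint64_t erase_size,
			       uint64_t chip_size, unsigned int nbanks)
{
	uint64_t bank_size = chip_size / nbanks;

	return (unsigned int)((block * erase_size) / bank_size);
}
```

Under these assumed numbers each bank is 16 MiB, so block 0 falls in the
first bank and block 4096 (offset 16 MiB) is the first block of the
second one. With a different geometry the block number would need to be
adjusted accordingly.
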
> > >
> > > Cheers,
> > > Miquèl
> > >
> > > Miquel Raynal (9):
> > >   mtd: spi-nor: Create macros to define chip IDs and geometries
> > >   mtd: spi-nor: Introduce the concept of bank
> > >   mtd: spi-nor: Add a macro to define more banks
> > >   mtd: spi-nor: Reorder the preparation vs locking steps
> > >   mtd: spi-nor: Separate preparation and locking
> > >   mtd: spi-nor: Prepare the introduction of a new locking mechanism
> > >   mtd: spi-nor: Add a RWW flag
> > >   mtd: spi-nor: Enhance locking to support reads while writes
> > >   mtd: spi-nor: macronix: Add support for mx25uw51245g with RWW
> > >
> > >  drivers/mtd/spi-nor/core.c     | 224 +++++++++++++++++++++++++++++----
> > >  drivers/mtd/spi-nor/core.h     |  61 +++++----
> > >  drivers/mtd/spi-nor/macronix.c |   3 +
> > >  include/linux/mtd/spi-nor.h    |  12 +-
> > >  4 files changed, 250 insertions(+), 50 deletions(-)
> > >
> > > --
> > > 2.34.1

Thanks
Jaime