[PATCH v4 0/8] mtd: spi-nor: read while write support

Fri Mar 24 06:51:40 PDT 2023

Hi Tudor,

tudor.ambarus at linaro.org wrote on Fri, 17 Mar 2023 04:13:27 +0000:

> On 2/1/23 11:35, Miquel Raynal wrote:
> > Hello folks,
> > 
> > Here is the follow-up of the RFC trying to bring a little bit of
> > parallelism to support the SPI-NOR Read While Write feature on parts
> > supporting it and featuring several banks.
> > 
> > I have received some hardware to make it work, so since the RFC, the
> > series has been updated to fix my mistakes, but the overall idea is the
> > same.
> > 
> > There is nothing Macronix specific in the implementation, the operations
> > and opcodes are exactly the same as before. The only difference is that
> > we may consider the chip usable when it is in the busy state during a
> > write or an erase. Any chip with an internal split that allows
> > parallel operations might benefit from this implementation.
> > 
> > The first patches are just refactoring and preparation work with
> > almost no functional change. They prepare the introduction
> > of the new locking mechanism and hopefully provide the cleanest and
> > simplest diff possible for this new feature. The actual change is all
> > contained in "mtd: spi-nor: Enhance locking to support reads while
> > writes". The logic is described in the commit log and copy/pasted here
> > for clarity:
> > 
> > "
> >     On devices featuring several banks, the Read While Write (RWW) feature
> >     is here to improve the overall performance when performing parallel
> >     reads and writes at different locations (different banks). The
> >     following constraints have to be taken into account:
> >     1#: A single operation can be performed in a given bank.
> >     2#: Only a single program or erase operation can happen on the entire
> >         chip (common hardware limitation to limit costs)
> >     3#: Reads must remain serialized even though reads on different banks
> >         might occur at the same time.
> >     4#: The I/O bus is unique and thus is the most constrained resource,
> >         all spi-nor operations requiring access to the spi bus (through
> >         the spi controller) must be serialized until the bus exchanges
> >         are over. So we must ensure a single operation can be "sent" at
> >         a time.
> >     5#: Any other operation that is not a read, a write or an erase is
> >         considered to require access to the full chip and cannot be
> >         parallelized, so we need to ensure the full chip is in the idle
> >         state when this occurs.
> >     
> >     All these constraints can easily be managed with a proper locking model:
> >     1#: Is enforced by a bitfield of the in-use banks, so that only a single
> >         operation can happen in a specific bank at any time.
> >     2#: Is handled by the ongoing_pe boolean which is set before any write
> >         or erase, and is released only at the very end of the
> >         operation. This way, no other destructive operation on the chip can
> >         start during this time frame.
> >     3#: An ongoing_rd boolean tracks the ongoing reads, so that
> >         only one can be performed at a time.
> >     4#: An ongoing_io boolean is introduced in order to capture and
> >         serialize bus accesses. This is the one being released "sooner"
> >         than before, because we only need to protect the chip against
> >         other SPI accesses during the I/O phase, which for the
> >         destructive operations is the beginning of the operation (when
> >         we send the command cycles and possibly the data), while the
> >         second part of the operation (the erase delay or the
> >         programming delay) is when we can do something else in another
> >         bank.
> >     5#: Is handled by the three booleans presented above, if any of them is
> >         set, the chip is not yet ready for the operation and must wait.
> >     
> >     All these internal variables are protected by the existing lock, so that
> >     changes in this structure are atomic. The serialization is handled with
> >     a wait queue."
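
For readers following along, here is a minimal userspace sketch of the
state machine described in the commit log above. It is not the kernel
code: a pthread mutex and condition variable stand in for the existing
lock and the wait queue, every rww_* name is hypothetical, and status
polling during the program/erase delay is left out. Only ongoing_io,
ongoing_rd, ongoing_pe and the bank bitfield come from the description.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only: the mutex stands in for the existing spi-nor
 * lock, the condition variable for the wait queue. */
struct rww_state {
	pthread_mutex_t lock;
	pthread_cond_t wait;
	uint8_t used_banks;	/* 1#: one bit per busy bank */
	bool ongoing_io;	/* 4#: a bus exchange is in flight */
	bool ongoing_rd;	/* 3#: a read is in flight */
	bool ongoing_pe;	/* 2#: a program or erase is in flight */
};

#define RWW_STATE_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, false, false, false }

/* Predicates: call with st->lock held. */
static bool rww_rd_allowed(const struct rww_state *st, uint8_t banks)
{
	return !st->ongoing_io && !st->ongoing_rd && !(st->used_banks & banks);
}

static bool rww_pe_allowed(const struct rww_state *st, uint8_t banks)
{
	return !st->ongoing_io && !st->ongoing_pe && !(st->used_banks & banks);
}

/* 5#: any other operation needs the whole chip idle. */
static bool rww_idle(const struct rww_state *st)
{
	return !st->ongoing_io && !st->ongoing_rd && !st->ongoing_pe;
}

/* A read owns the bus and its banks for its whole duration. */
static void rww_begin_rd(struct rww_state *st, uint8_t banks)
{
	pthread_mutex_lock(&st->lock);
	while (!rww_rd_allowed(st, banks))
		pthread_cond_wait(&st->wait, &st->lock);
	st->used_banks |= banks;
	st->ongoing_io = true;
	st->ongoing_rd = true;
	pthread_mutex_unlock(&st->lock);
}

static void rww_end_rd(struct rww_state *st, uint8_t banks)
{
	pthread_mutex_lock(&st->lock);
	assert(st->ongoing_rd && st->ongoing_io);
	st->used_banks &= (uint8_t)~banks;
	st->ongoing_io = false;
	st->ongoing_rd = false;
	pthread_cond_broadcast(&st->wait);	/* wake every sleeper to re-check */
	pthread_mutex_unlock(&st->lock);
}

/* A program/erase takes the bus for the command phase only. */
static void rww_begin_pe(struct rww_state *st, uint8_t banks)
{
	pthread_mutex_lock(&st->lock);
	while (!rww_pe_allowed(st, banks))
		pthread_cond_wait(&st->wait, &st->lock);
	st->used_banks |= banks;
	st->ongoing_io = true;
	st->ongoing_pe = true;
	pthread_mutex_unlock(&st->lock);
}

/* Command and data cycles sent: free the bus, keep the bank and ongoing_pe. */
static void rww_end_pe_io(struct rww_state *st)
{
	pthread_mutex_lock(&st->lock);
	assert(st->ongoing_pe && st->ongoing_io);
	st->ongoing_io = false;
	pthread_cond_broadcast(&st->wait);
	pthread_mutex_unlock(&st->lock);
}

static void rww_end_pe(struct rww_state *st, uint8_t banks)
{
	pthread_mutex_lock(&st->lock);
	assert(st->ongoing_pe);
	st->used_banks &= (uint8_t)~banks;
	st->ongoing_pe = false;
	pthread_cond_broadcast(&st->wait);
	pthread_mutex_unlock(&st->lock);
}
```

In this model a read in bank 1 becomes possible as soon as
rww_end_pe_io() has run for an erase in bank 0, while a second erase
anywhere on the chip has to wait for rww_end_pe().
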
> > 
> > Here is now a benchmark with a Macronix MX25UW51245G with 4 banks and RWW
> > support:
> > 
> >      // Testing the two accesses in the same bank
> >      $ flash_speed -b0 -k0 -c10 -d /dev/mtd0
> >      [...]
> >      testing read while write latency
> >      read while write took 51ms, read ended after 51ms
> > 
> >      // Testing the two accesses within different banks
> >      $ flash_speed -b0 -k4096 -c10 -d /dev/mtd0
> >      [...]
> >      testing read while write latency
> >      read while write took 51ms, read ended after 20ms
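
A small sketch of why the second run may land in another bank. It
assumes things not stated above: a 64 MiB device split into 4 equally
sized banks, and flash_speed block numbers counting 4 KiB erase blocks.
The helper names bank_of() and banks_mask() are hypothetical, not the
driver's.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the bank containing byte offset 'offs', assuming 'n_banks'
 * equally sized banks (an assumption of this sketch). */
static inline uint8_t bank_of(uint64_t offs, uint64_t flash_size,
			      uint8_t n_banks)
{
	assert(n_banks > 0 && offs < flash_size);
	return (uint8_t)(offs / (flash_size / n_banks));
}

/* Bitmask of every bank touched by the range [offs, offs + len). */
static inline uint8_t banks_mask(uint64_t offs, uint64_t len,
				 uint64_t flash_size, uint8_t n_banks)
{
	uint8_t first, last, mask = 0;

	assert(len > 0);
	first = bank_of(offs, flash_size, n_banks);
	last = bank_of(offs + len - 1, flash_size, n_banks);
	for (uint8_t b = first; b <= last; b++)
		mask |= (uint8_t)(1u << b);
	return mask;
}
```

Under those assumptions each bank spans 16 MiB, so block 0 sits in
bank 0 and block 4096 (offset 4096 * 4 KiB = 16 MiB) is the first block
of bank 1, which would match the -k0 versus -k4096 results.
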
> > 
> > Parallel accesses have been validated with io_paral. A slight increase
> > of the time spent on this test has however been noticed. With my  
> 
> how do the other tests look? Is there any change in performance for
> flashes that do not support RWW?

The current implementation takes care not to change anything for
existing flashes. When I resend, I'll provide all the logs you asked
for, plus another quick test without the RWW feature bit set.

> 
> > configuration, over a limited number of blocks, the overall operation
> > went from 22 min without any RWW changes to 27 min with these changes,
> > maybe due to the number of additional scheduling situations involved.
> > 
> > Here is a branch with the mtd-utils patch bringing support for this
> > additional "-k" parameter in flash_speed (for the second block to use
> > during RWW testing), used to get the above results:
> > https://github.com/miquelraynal/mtd-utils/compare/master...rww
> > 
> > Cheers,
> > Miquèl
> > 
> > Changes in v4:
> > * Dropped patch 1/9 which got applied.
> > * s/SPI-NOR/SPI NOR/
> > * Turned n_banks into a u8 and moved it below in the struct to avoid
> >   padding.
> > * Updated the S3AN_INFO macro to set n_banks to 1 by default.
> > * Renamed the lock and prep helper to follow the order of each
> >   operation.
> > * Reworded a commit log to fit the recent changes upstream.
> > 
> > Changes in v3:
> > * Fix the bank offsets calculations by providing the same values when
> >   locking and when unlocking (might be changed by the functions themselves
> >   without us noticing).
> > * I completely changed the way the locking works because there was a new
> >   constraint: reads cannot be interrupted and status reads cannot happen
> >   during a read. Hence, as the multi-locks design was starting to be too
> >   messy, I changed the implementation to use a bunch of variables to
> >   track the read while write state, protected by the main spi-nor
> >   lock. If the internal state does not allow the operation, a sleep
> >   starts in a queue, until the threads are woken up after a state
> >   update. I know it is very verbose, I am open to suggestions.
> > 
> > 
> > Miquel Raynal (8):
> >   mtd: spi-nor: Introduce the concept of bank
> >   mtd: spi-nor: Add a macro to define more banks
> >   mtd: spi-nor: Reorder the preparation vs locking steps
> >   mtd: spi-nor: Separate preparation and locking
> >   mtd: spi-nor: Prepare the introduction of a new locking mechanism
> >   mtd: spi-nor: Add a RWW flag
> >   mtd: spi-nor: Enhance locking to support reads while writes
> >   mtd: spi-nor: macronix: Add support for mx25uw51245g with RWW
> > 
> >  drivers/mtd/spi-nor/core.c     | 396 +++++++++++++++++++++++++++++++--
> >  drivers/mtd/spi-nor/core.h     |  26 ++-
> >  drivers/mtd/spi-nor/macronix.c |   3 +
> >  drivers/mtd/spi-nor/xilinx.c   |   1 +
> >  include/linux/mtd/spi-nor.h    |  13 ++
> >  5 files changed, 409 insertions(+), 30 deletions(-)
> >   

Thanks,
Miquèl