[PATCH v4 0/8] mtd: spi-nor: read while write support
miquel.raynal at bootlin.com
Fri Mar 24 06:51:40 PDT 2023
tudor.ambarus at linaro.org wrote on Fri, 17 Mar 2023 04:13:27 +0000:
> On 2/1/23 11:35, Miquel Raynal wrote:
> > Hello folks,
> > Here is the follow-up of the RFC trying to bring a little bit of
> > parallelism to support the SPI-NOR Read While Write feature on parts
> > supporting it and featuring several banks.
> > I have received some hardware to make it work, so since the RFC, the
> > series has been updated to fix my mistakes, but the overall idea is the
> > same.
> > There is nothing Macronix specific in the implementation, the operations
> > and opcodes are exactly the same as before. The only difference is that
> > we may consider the chip usable when it is in the busy state during a
> > write or an erase. Any chip with an internal split that allows
> > parallel operations may benefit from this implementation.
> > The first patches are just refactoring and preparation work with
> > almost no functional change: they prepare the introduction
> > of the new locking mechanism and hopefully provide the cleanest and
> > simplest diff possible for this new feature. The actual change is all
> > contained in "mtd: spi-nor: Enhance locking to support reads while
> > writes". The logic is described in the commit log and copy/pasted here
> > for clarity:
> > "
> > On devices featuring several banks, the Read While Write (RWW) feature
> > is here to improve the overall performance when performing parallel
> > reads and writes at different locations (different banks). The
> > following constraints have to be taken into account:
> > 1#: A single operation can be performed in a given bank.
> > 2#: Only a single program or erase operation can happen on the entire
> > chip (common hardware limitation to limit costs)
> > 3#: Reads must remain serialized even though reads on different banks
> > might occur at the same time.
> > 4#: The I/O bus is unique and thus is the most constrained resource,
> > all spi-nor operations requiring access to the spi bus (through
> > the spi controller) must be serialized until the bus exchanges
> > are over. So we must ensure a single operation can be "sent" at
> > a time.
> > 5#: Any operation that is not a read, a write or an erase is
> > considered to require access to the full chip and cannot be
> > parallelized; we then need to ensure the full chip is in the idle
> > state when this occurs.
> > All these constraints can easily be managed with a proper locking model:
> > 1#: Is enforced by a bitfield of the in-use banks, so that only a single
> > operation can happen in a specific bank at any time.
> > 2#: Is handled by the ongoing_pe boolean which is set before any write
> > or erase, and is released only at the very end of the
> > operation. This way, no other destructive operation on the chip can
> > start during this time frame.
> > 3#: An ongoing_rd boolean tracks the ongoing reads, so that
> > only one can be performed at a time.
> > 4#: An ongoing_io boolean is introduced in order to capture and
> > serialize bus accesses. This is the one being released "sooner"
> > than before, because we only need to protect the chip against
> > other SPI accesses during the I/O phase, which for the
> > destructive operations is the beginning of the operation (when
> > we send the command cycles and possibly the data), while the
> > second part of the operation (the erase delay or the
> > programming delay) is when we can do something else in another
> > bank.
> > 5#: Is handled by the three booleans presented above, if any of them is
> > set, the chip is not yet ready for the operation and must wait.
> > All these internal variables are protected by the existing lock, so that
> > changes in this structure are atomic. The serialization is handled with
> > a wait queue."
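For readers following along, the five rules quoted above can be sketched as a small state machine. This is only a minimal userspace illustration, not the code from the series: the struct name, the function names and the used_banks field are assumptions made for the example (only ongoing_io, ongoing_rd and ongoing_pe come from the commit log). The spi-nor lock and the wait queue are left out, so each helper simply returns false where the driver would sleep.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state, assumed to be protected by the existing lock. */
struct rww_state {
	bool ongoing_io;	/* rule 4: the SPI bus is in use */
	bool ongoing_rd;	/* rule 3: a read is in progress */
	bool ongoing_pe;	/* rule 2: a program/erase is in progress */
	uint32_t used_banks;	/* rule 1: one bit per in-use bank */
};

/* Bus phase of a program/erase: only one exchange at a time. */
static bool rww_start_io(struct rww_state *s)
{
	if (s->ongoing_io)
		return false;
	s->ongoing_io = true;
	return true;
}

static void rww_end_io(struct rww_state *s)
{
	s->ongoing_io = false;
}

/* Program/erase: one per chip, and the target banks must be free. */
static bool rww_start_pe(struct rww_state *s, uint32_t banks)
{
	if (s->ongoing_pe || (s->used_banks & banks))
		return false;
	s->used_banks |= banks;
	s->ongoing_pe = true;
	return true;
}

static void rww_end_pe(struct rww_state *s, uint32_t banks)
{
	s->used_banks &= ~banks;
	s->ongoing_pe = false;
}

/* Read: needs the bus for its whole duration and free target banks. */
static bool rww_start_rd(struct rww_state *s, uint32_t banks)
{
	if (s->ongoing_io || s->ongoing_rd || (s->used_banks & banks))
		return false;
	s->used_banks |= banks;
	s->ongoing_io = true;
	s->ongoing_rd = true;
	return true;
}

static void rww_end_rd(struct rww_state *s, uint32_t banks)
{
	s->used_banks &= ~banks;
	s->ongoing_io = false;
	s->ongoing_rd = false;
}

/* Anything else (rule 5): the whole chip must be idle. */
static bool rww_start_exclusive(struct rww_state *s)
{
	if (s->ongoing_io || s->ongoing_rd || s->ongoing_pe)
		return false;
	s->ongoing_io = s->ongoing_rd = s->ongoing_pe = true;
	return true;
}

static void rww_end_exclusive(struct rww_state *s)
{
	s->ongoing_io = s->ongoing_rd = s->ongoing_pe = false;
}
```

In this sketch, an erase on bank 0 calls rww_start_pe(), then rww_start_io() for the command cycles, and rww_end_io() as soon as the bus exchange is over. From that point a read on bank 1 passes rww_start_rd() while the erase delay is still running, whereas a read on bank 0 or any exclusive operation is refused until rww_end_pe().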
> > Here is now a benchmark with a Macronix MX25UW51245G with 4 banks and RWW
> > support:
> > // Testing the two accesses in the same bank
> > $ flash_speed -b0 -k0 -c10 -d /dev/mtd0
> > [...]
> > testing read while write latency
> > read while write took 51ms, read ended after 51ms
> > // Testing the two accesses within different banks
> > $ flash_speed -b0 -k4096 -c10 -d /dev/mtd0
> > [...]
> > testing read while write latency
> > read while write took 51ms, read ended after 20ms
> > Parallel accesses have been validated with io_paral. A slight increase
> > of the time spent on this test has however been noticed. With my
> how do the other tests look? Is there any change in performance for
> flashes that do not support RWW?
The current implementation takes care not to change anything for
existing flashes. When I resend, I'll provide all the logs you asked
for, plus another quick test without the RWW feature bit set.
> > configuration, over a limited number of blocks, the overall operation
> > went from 22 min without any RWW changes up to 27 min with these changes,
> > maybe due to the number of additional scheduling situations involved.
> > Here is a branch with the mtd-utils patch bringing support for this
> > additional "-k" parameter in flash_speed (for the second block to use
> > during RWW testing), used to get the above results:
> > https://github.com/miquelraynal/mtd-utils/compare/master...rww
> > Cheers,
> > Miquèl
> > Changes in v4:
> > * Dropped patch 1/9 which got applied.
> > * s/SPI-NOR/SPI NOR/
> > * Turned n_banks into a u8 and moved it below in the struct to avoid
> > padding.
> > * Updated the S3AN_INFO macro to set n_banks to 1 by default.
> > * Renamed the lock and prep helper to follow the order of each
> > operation.
> > * Reworded a commit log to fit the recent changes upstream.
> > Changes in v3:
> > * Fix the bank offsets calculations by providing the same values when
> > locking and when unlocking (might be changed by the functions themselves
> > without us noticing).
> > * I completely changed the way the locking works because there was a new
> > constraint: reads cannot be interrupted and status reads cannot happen
> > during a read. Hence, as the multi-locks design was starting to be too
> > messy, I changed the implementation to use a bunch of variables to
> > track the read while write state, protected by the main spi-nor
> > lock. If the internal state does not allow the operation, a sleep
> > starts in a queue, until the threads are woken up after a state
> > update. I know it is very verbose, I am open to suggestions.
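As an illustration of the bank offset point in the v3 changelog, here is a minimal sketch of how an access (start, len) might be turned into a mask of touched banks. The helper name and the equal-size bank layout are assumptions made for the example, not taken from the patches. Computing the mask once, before the operation advances its offset and length, and reusing that same value at unlock time is one way to guarantee both sides agree.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return one bit per bank touched by the access
 * [start, start + len), assuming equally sized banks (at most 32).
 */
static uint32_t banks_touched(uint64_t bank_size, uint64_t start, uint64_t len)
{
	uint64_t first, last, bank;
	uint32_t mask = 0;

	if (bank_size == 0 || len == 0)
		return 0;

	first = start / bank_size;
	last = (start + len - 1) / bank_size;

	for (bank = first; bank <= last && bank < 32; bank++)
		mask |= UINT32_C(1) << bank;

	return mask;
}
```

With an assumed bank size of 16 MiB, an access of two bytes starting one byte before the 16 MiB boundary touches banks 0 and 1, so both would have to be free before the operation starts.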
> > Miquel Raynal (8):
> > mtd: spi-nor: Introduce the concept of bank
> > mtd: spi-nor: Add a macro to define more banks
> > mtd: spi-nor: Reorder the preparation vs locking steps
> > mtd: spi-nor: Separate preparation and locking
> > mtd: spi-nor: Prepare the introduction of a new locking mechanism
> > mtd: spi-nor: Add a RWW flag
> > mtd: spi-nor: Enhance locking to support reads while writes
> > mtd: spi-nor: macronix: Add support for mx25uw51245g with RWW
> > drivers/mtd/spi-nor/core.c | 396 +++++++++++++++++++++++++++++++--
> > drivers/mtd/spi-nor/core.h | 26 ++-
> > drivers/mtd/spi-nor/macronix.c | 3 +
> > drivers/mtd/spi-nor/xilinx.c | 1 +
> > include/linux/mtd/spi-nor.h | 13 ++
> > 5 files changed, 409 insertions(+), 30 deletions(-)