[PATCH v2 3/3] mtd: rawnand: Support for sequential cache reads

Miquel Raynal miquel.raynal at bootlin.com
Sun Jul 16 08:49:17 PDT 2023


Hello all,

So here is a summary of the situation:
- we have two bug reports regarding the use of sequential page reads
- both are on TI OMAP platforms: AM33XX and AM3517. I believe both are
  using the same omap2.c driver
- they use a Micron and a Samsung NAND chip

All this information hints that the problem is related to the
controller driver, which does something silly during the exec_op phase.

Alexander and Måns, can you please tell me:
- Are you using a GPIO for waitrdy, or do you rely on
  nand_soft_waitrdy()? If you are using the GPIO, can you both try the
  soft implementation and see if it changes anything?
- Are you using any POLL or DMA prefetch mode? Can you please force the
  default in and out helpers by using only omap_nand_data_in() and
  omap_nand_data_out() to see if it changes something?
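
For reference, here is a rough userspace sketch of what a software
ready-wait boils down to: poll the status byte until the ready bit
(bit 6, 0x40) is set or a budget runs out. It is only a model under
stated assumptions, not the kernel code: fake_chip, fake_read_status()
and soft_wait_ready() are hypothetical names, and the poll budget stands
in for the real timeout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ready bit of the NAND status byte (bit 6). */
#define STATUS_READY 0x40

/* Hypothetical stand-in for a chip: stays busy for a number of polls. */
struct fake_chip {
	unsigned int busy_polls;
};

/* Hypothetical status read: reports busy until busy_polls reaches zero. */
static inline uint8_t fake_read_status(struct fake_chip *chip)
{
	if (chip->busy_polls) {
		chip->busy_polls--;
		return 0x00;
	}
	return STATUS_READY;
}

/*
 * Poll the status byte at most max_polls times.
 * Returns 0 once the ready bit is seen, -1 if the budget is exhausted.
 */
static inline int soft_wait_ready(struct fake_chip *chip,
				  unsigned int max_polls)
{
	assert(chip != NULL);

	for (unsigned int i = 0; i < max_polls; i++) {
		if (fake_read_status(chip) & STATUS_READY)
			return 0;
	}
	return -1;
}
```

A GPIO-based wait watches the R/B# line instead of the status byte, so
comparing the two paths on the failing boards should tell us whether the
ready detection itself is involved.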

I believe something is wrong with the timings: although they look
properly implemented in theory, there might be some cases where we miss
a barrier or something like that. I would like you to try the following
two hacks to see if we can find which timing is not being observed,
despite the lack of probing. The first one is a real hack; the second
one might actually look like a real fix. Please let me know, both of
you, whether you see different behaviors.

*** HACK #1 ***

--- a/drivers/mtd/nand/raw/omap2.c
+++ b/drivers/mtd/nand/raw/omap2.c
@@ -2113,6 +2113,9 @@ static int omap_nand_exec_instr(struct nand_chip *chip,
        case NAND_OP_CMD_INSTR:
                iowrite8(instr->ctx.cmd.opcode,
                         info->reg.gpmc_nand_command);
+               if (instr->ctx.cmd.opcode == NAND_CMD_READCACHESEQ ||
+                   instr->ctx.cmd.opcode == NAND_CMD_READCACHEEND)
+                       udelay(50);
                break;
 
        case NAND_OP_ADDR_INSTR:

*** HACK #2 ***

--- a/drivers/mtd/nand/raw/omap2.c
+++ b/drivers/mtd/nand/raw/omap2.c
@@ -2143,8 +2146,10 @@ static int omap_nand_exec_instr(struct nand_chip *chip,
                break;
        }
 
-       if (instr->delay_ns)
+       if (instr->delay_ns) {
+               mb();
                ndelay(instr->delay_ns);
+       }
 
        return 0;
 }

Thanks a lot!
Miquèl

mans at mansr.com wrote on Thu, 22 Jun 2023 15:59:25+0100:

> Miquel Raynal <miquel.raynal at bootlin.com> writes:
> 
> > From: JaimeLiao <jaimeliao.tw at gmail.com>
> >
> > Add support for sequential cache reads for controllers using the generic
> > core helpers for their fast read/write helpers.
> >
> > Sequential reads may reduce the overhead when accessing physically
> > continuous data by loading in cache the next page while the previous
> > page gets sent out on the NAND bus.
> >
> > The ONFI specification provides the following additional commands to
> > handle sequential cached reads:
> >
> > * 0x31 - READ CACHE SEQUENTIAL:
> >   Requires the NAND chip to load the next page into cache while keeping
> >   the current cache available for host reads.
> > * 0x3F - READ CACHE END:
> >   Tells the NAND chip this is the end of the sequential cache read, the
> >   current cache shall remain accessible for the host but no more
> >   internal cache loading operation is required.
> >
> > On the bus, a multi page read operation is currently handled like this:
> >
> > 	00 -- ADDR1 -- 30 -- WAIT_RDY (tR+tRR) -- DATA1_IN
> > 	00 -- ADDR2 -- 30 -- WAIT_RDY (tR+tRR) -- DATA2_IN
> > 	00 -- ADDR3 -- 30 -- WAIT_RDY (tR+tRR) -- DATA3_IN
> >
> > Sequential cached reads may instead be achieved with:
> >
> > 	00 -- ADDR1 -- 30 -- WAIT_RDY (tR) -- \
> > 		       31 -- WAIT_RDY (tRCBSY+tRR) -- DATA1_IN \
> > 		       31 -- WAIT_RDY (tRCBSY+tRR) -- DATA2_IN \
> > 		       3F -- WAIT_RDY (tRCBSY+tRR) -- DATA3_IN
> >
> > Below are the read speed test results with regular reads and
> > sequential cached reads, on NXP i.MX6 VAR-SOM-SOLO in mapping mode with
> > a NAND chip characterized with the following timings:
> > * tR: 20 µs
> > * tRCBSY: 5 µs
> > * tRR: 20 ns
> > and the following geometry:
> > * device size: 2 MiB
> > * eraseblock size: 128 kiB
> > * page size: 2 kiB
> >
> > ============= Normal read @ 33MHz =================
> > mtd_speedtest: eraseblock read speed is 15633 KiB/s
> > mtd_speedtest: page read speed is 15515 KiB/s
> > mtd_speedtest: 2 page read speed is 15398 KiB/s
> > ===================================================
> >
> > ========= Sequential cache read @ 33MHz ===========
> > mtd_speedtest: eraseblock read speed is 18285 KiB/s
> > mtd_speedtest: page read speed is 15875 KiB/s
> > mtd_speedtest: 2 page read speed is 16253 KiB/s
> > ===================================================
> >
> > We observe an overall speed improvement of about 5% when reading
> > 2 pages, up to 15% when reading an entire block. This is due to the
> > ~14us gain on each additional page read (tR - (tRCBSY + tRR)).
> >
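
To make the arithmetic explicit, here is a small sketch of the wait-time
model implied by the two bus diagrams above. The helper names are
hypothetical and only the WAIT_RDY phases are counted (command, address
and data cycles are ignored), so treat it as an estimate rather than a
measurement.

```c
#include <assert.h>
#include <stdint.h>

/* Per-page gain as written in the commit message: tR - (tRCBSY + tRR). */
static inline int64_t seq_gain_per_page_ns(int64_t t_r, int64_t t_rcbsy,
					   int64_t t_rr)
{
	return t_r - (t_rcbsy + t_rr);
}

/* Regular reads: every page waits for tR + tRR. */
static inline int64_t regular_wait_ns(int pages, int64_t t_r, int64_t t_rr)
{
	assert(pages >= 1);
	return (int64_t)pages * (t_r + t_rr);
}

/*
 * Sequential cached reads: one initial tR, then tRCBSY + tRR before
 * each page is transferred (after every 0x31, and after the final 0x3F).
 */
static inline int64_t cached_wait_ns(int pages, int64_t t_r,
				     int64_t t_rcbsy, int64_t t_rr)
{
	assert(pages >= 1);
	return t_r + (int64_t)pages * (t_rcbsy + t_rr);
}
```

With the timings quoted above (tR = 20000 ns, tRCBSY = 5000 ns,
tRR = 20 ns), the per-page expression evaluates to 14980 ns, and this
model saves 25000 ns of waiting over three pages. For a single page the
cached path waits longer than a regular read in this model, so the gain
only shows up on multi-page reads.
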
> > Co-developed-by: Miquel Raynal <miquel.raynal at bootlin.com>
> > Signed-off-by: Miquel Raynal <miquel.raynal at bootlin.com>
> > Signed-off-by: JaimeLiao <jaimeliao.tw at gmail.com>
> > ---
> >  drivers/mtd/nand/raw/nand_base.c | 119 +++++++++++++++++++++++++++++--
> >  include/linux/mtd/rawnand.h      |   9 +++
> >  2 files changed, 124 insertions(+), 4 deletions(-)  
> 
> This change broke something on a TI AM3517 based system.  What I'm
> noticing is that the u-boot fw_setenv tool is failing due to the
> MEMGETBADBLOCK ioctl reporting some blocks as bad when they are not.
> Everything else is, somehow, working fine.  Reverting this commit fixes
> it, though I don't know why.  I'm seeing the same behaviour on multiple
> devices, so I doubt there is a problem with the flash memory.
> 
> Is there anything I can test to get more information?
> 


