[PATCH v2 3/3] mtd: rawnand: Support for sequential cache reads

Sun Jul 16 22:33:39 PDT 2023

Hello.

We are using a GPIO for waitrdy. However, even after switching to the
soft implementation (nand_soft_waitrdy()), the system still behaves
incorrectly.
We are using prefetch-dma mode for transfers. Switching to the default
helpers (omap_nand_data_in() and omap_nand_data_out()) does not improve
anything.

Neither patch helps :(

On Sun, 16 Jul 2023 at 18:49, Miquel Raynal <miquel.raynal at bootlin.com> wrote:
>
> Hello all,
>
> So here is a summary of the situation:
> - we have two bug reports regarding the use of sequential page reads
> - both are on TI OMAP platforms: AM33XX and AM3517. I believe both are
>   using the same omap2.c driver
> - they use a Micron and a Samsung NAND chip
>
> All this information suggests that the problem lies in the controller
> driver, which does something silly during the exec_op phase.
>
> Alexander and Måns, can you please tell me:
> - Are you using a gpio for the waitrdy thing or do you leverage
>   nand_soft_waitrdy()? If you are using the gpio, can you both try with
>   the soft implementation and see if it changes something?
> - Are you using any POLL or DMA prefetch mode? Can you please force the
>   default in and out helpers by using only omap_nand_data_in() and
>   omap_nand_data_out() to see if it changes something?
>
> I believe there is something wrong with the timings. While they are
> properly implemented in theory, there might be some cases where we miss
> a barrier or something like that. I would like to try the following two
> hacks and see if we can find which timing is not respected, despite
> the lack of probing. The first one is a real hack; the second one might
> actually look like a real fix. Please let me know, both of you, if you
> see different behaviors.
>
> *** HACK #1 ***
>
> --- a/drivers/mtd/nand/raw/omap2.c
> +++ b/drivers/mtd/nand/raw/omap2.c
> @@ -2113,6 +2113,9 @@ static int omap_nand_exec_instr(struct nand_chip *chip,
>         case NAND_OP_CMD_INSTR:
>                 iowrite8(instr->ctx.cmd.opcode,
>                          info->reg.gpmc_nand_command);
> +               if (instr->ctx.cmd.opcode == NAND_CMD_READCACHESEQ ||
> +                   instr->ctx.cmd.opcode == NAND_CMD_READCACHEEND)
> +                       udelay(50);
>                 break;
>
>         case NAND_OP_ADDR_INSTR:
>
> *** HACK #2 ***
>
> --- a/drivers/mtd/nand/raw/omap2.c
> +++ b/drivers/mtd/nand/raw/omap2.c
> @@ -2143,8 +2146,10 @@ static int omap_nand_exec_instr(struct nand_chip *chip,
>                 break;
>         }
>
> -       if (instr->delay_ns)
> +       if (instr->delay_ns) {
> +               mb();
>                 ndelay(instr->delay_ns);
> +       }
>
>         return 0;
>  }
>
> Thanks a lot!
> Miquèl
>
> mans at mansr.com wrote on Thu, 22 Jun 2023 15:59:25+0100:
>
> > Miquel Raynal <miquel.raynal at bootlin.com> writes:
> >
> > > From: JaimeLiao <jaimeliao.tw at gmail.com>
> > >
> > > Add support for sequential cache reads for controllers using the generic
> > > core helpers for their fast read/write helpers.
> > >
> > > > Sequential reads may reduce the overhead when accessing physically
> > > > contiguous data by loading the next page into the cache while the
> > > > previous page gets sent out on the NAND bus.
> > >
> > > The ONFI specification provides the following additional commands to
> > > handle sequential cached reads:
> > >
> > > * 0x31 - READ CACHE SEQUENTIAL:
> > >   Requires the NAND chip to load the next page into cache while keeping
> > >   the current cache available for host reads.
> > > * 0x3F - READ CACHE END:
> > >   Tells the NAND chip this is the end of the sequential cache read, the
> > >   current cache shall remain accessible for the host but no more
> > >   internal cache loading operation is required.
> > >
> > > On the bus, a multi page read operation is currently handled like this:
> > >
> > >     00 -- ADDR1 -- 30 -- WAIT_RDY (tR+tRR) -- DATA1_IN
> > >     00 -- ADDR2 -- 30 -- WAIT_RDY (tR+tRR) -- DATA2_IN
> > >     00 -- ADDR3 -- 30 -- WAIT_RDY (tR+tRR) -- DATA3_IN
> > >
> > > Sequential cached reads may instead be achieved with:
> > >
> > >     00 -- ADDR1 -- 30 -- WAIT_RDY (tR) -- \
> > >                    31 -- WAIT_RDY (tRCBSY+tRR) -- DATA1_IN \
> > >                    31 -- WAIT_RDY (tRCBSY+tRR) -- DATA2_IN \
> > >                    3F -- WAIT_RDY (tRCBSY+tRR) -- DATA3_IN
> > >
> > > Below are the read speed test results with regular reads and
> > > sequential cached reads, on NXP i.MX6 VAR-SOM-SOLO in mapping mode with
> > > a NAND chip characterized with the following timings:
> > > * tR: 20 µs
> > > * tRCBSY: 5 µs
> > > * tRR: 20 ns
> > > and the following geometry:
> > > * device size: 2 MiB
> > > * eraseblock size: 128 kiB
> > > * page size: 2 kiB
> > >
> > > ============= Normal read @ 33MHz =================
> > > mtd_speedtest: eraseblock read speed is 15633 KiB/s
> > > mtd_speedtest: page read speed is 15515 KiB/s
> > > mtd_speedtest: 2 page read speed is 15398 KiB/s
> > > ===================================================
> > >
> > > ========= Sequential cache read @ 33MHz ===========
> > > mtd_speedtest: eraseblock read speed is 18285 KiB/s
> > > mtd_speedtest: page read speed is 15875 KiB/s
> > > mtd_speedtest: 2 page read speed is 16253 KiB/s
> > > ===================================================
> > >
> > > > We observe an overall speed improvement of about 5% when reading
> > > > 2 pages, up to 15% when reading an entire block. This is due to the
> > > > ~15 µs gain on each additional page read (tR - (tRCBSY + tRR)).
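
The arithmetic behind these figures can be checked with a few lines of
C. The sketch below only reuses the timings and mtd_speedtest results
quoted above; the function names are hypothetical, and integer math is
used so the results are exact (improvements are truncated to tenths of
a percent).

```c
#include <stdio.h>

/* Timings from the quoted commit message, in nanoseconds. */
#define T_R_NS      20000L  /* tR: 20 us */
#define T_RCBSY_NS   5000L  /* tRCBSY: 5 us */
#define T_RR_NS        20L  /* tRR: 20 ns */

/* Time saved on each additional page: tR - (tRCBSY + tRR). */
static long per_page_gain_ns(void)
{
	return T_R_NS - (T_RCBSY_NS + T_RR_NS);
}

/* Relative improvement in tenths of a percent, truncated. */
static long gain_permille(long before_kib_s, long after_kib_s)
{
	return (after_kib_s - before_kib_s) * 1000 / before_kib_s;
}

/* Print the figures derived from the quoted measurements. */
static void print_summary(void)
{
	printf("per-page gain: %ld ns\n", per_page_gain_ns());
	printf("eraseblock:    %ld permille\n", gain_permille(15633, 18285));
	printf("page:          %ld permille\n", gain_permille(15515, 15875));
	printf("2 pages:       %ld permille\n", gain_permille(15398, 16253));
}
```

With the quoted numbers, the per-page saving is 14980 ns (14.98 µs), and
the measured improvements come out at roughly 16.9% for eraseblock
reads, 2.3% for single-page reads and 5.5% for 2-page reads.
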
> > >
> > > Co-developed-by: Miquel Raynal <miquel.raynal at bootlin.com>
> > > Signed-off-by: Miquel Raynal <miquel.raynal at bootlin.com>
> > > Signed-off-by: JaimeLiao <jaimeliao.tw at gmail.com>
> > > ---
> > >  drivers/mtd/nand/raw/nand_base.c | 119 +++++++++++++++++++++++++++++--
> > >  include/linux/mtd/rawnand.h      |   9 +++
> > >  2 files changed, 124 insertions(+), 4 deletions(-)
> >
> > This change broke something on a TI AM3517 based system.  What I'm
> > noticing is that the u-boot fw_setenv tool is failing due to the
> > MEMGETBADBLOCK ioctl reporting some blocks as bad when they are not.
> > Everything else is, somehow, working fine.  Reverting this commit fixes
> > it, though I don't know why.  I'm seeing the same behaviour on multiple
> > devices, so I doubt there is a problem with the flash memory.
> >
> > Is there anything I can test to get more information?
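
One thing that may help narrow this down is to query MEMGETBADBLOCK
directly for every eraseblock, with and without the commit applied, and
compare the results. Below is a minimal userspace sketch, assuming the
usual mtd/mtd-user.h ABI (MEMGETINFO, then MEMGETBADBLOCK with a loff_t
offset). The function name is hypothetical and the device path is up to
the caller.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>
#include <mtd/mtd-user.h>

/*
 * Hypothetical helper: walk all eraseblocks of the MTD character device
 * `dev` and print the offset of each block that MEMGETBADBLOCK reports
 * as bad. Returns the number of bad blocks, or -1 on any error.
 */
static int count_bad_blocks(const char *dev)
{
	struct mtd_info_user info;
	loff_t offs;
	int fd, ret, bad = 0;

	fd = open(dev, O_RDONLY);
	if (fd < 0)
		return -1;

	if (ioctl(fd, MEMGETINFO, &info) < 0 || info.erasesize == 0) {
		close(fd);
		return -1;
	}

	for (offs = 0; offs < (loff_t)info.size; offs += info.erasesize) {
		/* > 0: block is bad, 0: block is good, < 0: error */
		ret = ioctl(fd, MEMGETBADBLOCK, &offs);
		if (ret < 0) {
			close(fd);
			return -1;
		}
		if (ret > 0) {
			printf("bad block at 0x%llx\n",
			       (unsigned long long)offs);
			bad++;
		}
	}

	close(fd);
	return bad;
}
```

Calling count_bad_blocks() on the partition that fw_setenv uses, on a
kernel with the commit and on one with it reverted, should show whether
the set of reported blocks is stable or changes from run to run.
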
> >