[PATCH v2 3/3] mtd: rawnand: Support for sequential cache reads

Fri Sep 15 04:20:10 PDT 2023

Hi Miquel,

On Tue, 2023-09-12 at 17:59 +0200, Miquel Raynal wrote:
> Hi Martin,
> 
> martin at geanix.com wrote on Fri, 08 Sep 2023 14:25:59 +0200:
> 
> > Hi Miquel et al.
> > 
> > On Thu, 2023-01-12 at 10:36 +0100, Miquel Raynal wrote:
> > > From: JaimeLiao <jaimeliao.tw at gmail.com>
> > > 
> > > Add support for sequential cache reads for controllers using the
> > > generic
> > > core helpers for their fast read/write helpers.
> > > 
> > > Sequential reads may reduce the overhead when accessing
> > > physically
> > > continuous data by loading in cache the next page while the
> > > previous
> > > page gets sent out on the NAND bus.
> > > 
> > > The ONFI specification provides the following additional commands
> > > to
> > > handle sequential cached reads:
> > > 
> > > * 0x31 - READ CACHE SEQUENTIAL:
> > >   Requires the NAND chip to load the next page into cache while
> > > keeping
> > >   the current cache available for host reads.
> > > * 0x3F - READ CACHE END:
> > >   Tells the NAND chip this is the end of the sequential cache
> > > read,
> > > the
> > >   current cache shall remain accessible for the host but no more
> > >   internal cache loading operation is required.
> > > 
> > > On the bus, a multi page read operation is currently handled like
> > > this:
> > > 
> > >         00 -- ADDR1 -- 30 -- WAIT_RDY (tR+tRR) -- DATA1_IN
> > >         00 -- ADDR2 -- 30 -- WAIT_RDY (tR+tRR) -- DATA2_IN
> > >         00 -- ADDR3 -- 30 -- WAIT_RDY (tR+tRR) -- DATA3_IN
> > > 
> > > Sequential cached reads may instead be achieved with:
> > > 
> > >         00 -- ADDR1 -- 30 -- WAIT_RDY (tR) -- \
> > >                        31 -- WAIT_RDY (tRCBSY+tRR) -- DATA1_IN \
> > >                        31 -- WAIT_RDY (tRCBSY+tRR) -- DATA2_IN \
> > >                        3F -- WAIT_RDY (tRCBSY+tRR) -- DATA3_IN
> > > 
> > > Below are the read speed test results with regular reads and
> > > sequential cached reads, on NXP i.MX6 VAR-SOM-SOLO in mapping
> > > mode
> > > with
> > > a NAND chip characterized with the following timings:
> > > * tR: 20 µs
> > > * tRCBSY: 5 µs
> > > * tRR: 20 ns
> > > and the following geometry:
> > > * device size: 2 MiB
> > > * eraseblock size: 128 kiB
> > > * page size: 2 kiB
> > > 
> > > ============= Normal read @ 33MHz =================
> > > mtd_speedtest: eraseblock read speed is 15633 KiB/s
> > > mtd_speedtest: page read speed is 15515 KiB/s
> > > mtd_speedtest: 2 page read speed is 15398 KiB/s
> > > ===================================================
> > > 
> > > ========= Sequential cache read @ 33MHz ===========
> > > mtd_speedtest: eraseblock read speed is 18285 KiB/s
> > > mtd_speedtest: page read speed is 15875 KiB/s
> > > mtd_speedtest: 2 page read speed is 16253 KiB/s
> > > ===================================================
> > > 
> > > We observe an overall speed improvement of about 5% when reading
> > > 2 pages, up to 15% when reading an entire block. This is due to
> > > the
> > > ~14us gain on each additional page read (tR - (tRCBSY + tRR)).
> > > 
> > > Co-developed-by: Miquel Raynal <miquel.raynal at bootlin.com>
> > > Signed-off-by: Miquel Raynal <miquel.raynal at bootlin.com>
> > > Signed-off-by: JaimeLiao <jaimeliao.tw at gmail.com>  
> > 
> > This patch broke our imx6ull system after doing a kernel upgrade:
> > [    2.921886] ubi0: default fastmap pool size: 100
> > [    2.926612] ubi0: default fastmap WL pool size: 50
> > [    2.931421] ubi0: attaching mtd1
> > [    3.515799] ubi0: scanning is finished
> > [    3.525237] ubi0 error: vtbl_check: bad CRC at record 0:
> > 0x88cdfb6,
> > not 0xffffffff
> > [    3.532937] Volume table record 0 dump:
> > [    3.536783]  reserved_pebs   -1
> > [    3.539932]  alignment       -1
> > [    3.543101]  data_pad        -1
> > [    3.546251]  vol_type        255
> > [    3.549485]  upd_marker      255
> > [    3.552746]  name_len        65535
> > [    3.556155]  1st 5 characters of name: 
> > [    3.560429]  crc             0xffffffff
> > [    3.564294] ubi0 error: vtbl_check: bad CRC at record 0:
> > 0x88cdfb6,
> > not 0xffffffff
> > [    3.571892] Volume table record 0 dump:
> > [    3.575754]  reserved_pebs   -1
> > [    3.578906]  alignment       -1
> > [    3.582049]  data_pad        -1
> > [    3.585216]  vol_type        255
> > [    3.588452]  upd_marker      255
> > [    3.591687]  name_len        65535
> > [    3.595108]  1st 5 characters of name: 
> > [    3.599384]  crc             0xffffffff
> > [    3.603285] ubi0 error: ubi_read_volume_table: both volume
> > tables
> > are corrupted
> > [    3.611460] ubi0 error: ubi_attach_mtd_dev: failed to attach
> > mtd1,
> > error -22
> > [    3.618760] UBI error: cannot attach mtd1
> > [    3.622831] UBI: block: can't open volume on ubi0_4, err=-19
> > [    3.628505] UBI: block: can't open volume on ubi0_6, err=-19
> > [    3.634196] UBI: block: can't open volume on ubi0_7, err=-19
> > 
> > It fails consistently at every attach operation. As mentioned
> > above,
> > this is on an i.MX6ULL and a Toshiba NAND chip:
> > [    0.530121] nand: device found, Manufacturer ID: 0x98, Chip ID:
> > 0xdc
> > [    0.530173] nand: Toshiba NAND 512MiB 3,3V 8-bit
> > [    0.530194] nand: 512 MiB, SLC, erase size: 256 KiB, page size:
> > 4096, OOB size: 128
> > 
> > I'm happy to perform experiments to fix this.
> 
> I received two other reports using another controller and two
> different
> chips, we investigated the timings which could be the issue but found
> nothing relevant. I thought it was specific to the controller (or its
> driver) but if you reproduce on imx6 it must be something else.
> Especially since this series was also tested on imx6. So maybe some
> devices do not really support what they advertise? Or they expect
> another sequence? I need to investigate this further but I am a bit
> clueless.
> > 

For what it's worth: I've tested sequential read on an almost identical
board that has a Micron flash instead of the Toshiba one:

[    1.370336] nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xdc
[    1.376830] nand: Micron MT29F4G08ABAFAWP
[    1.380870] nand: 512 MiB, SLC, erase size: 256 KiB, page size:
4096, OOB size: 256

Fist, ubi seems to be happy:

[    2.702415] ubi0: default fastmap pool size: 100                   
[    2.707138] ubi0: default fastmap WL pool size: 50                 
[    2.711946] ubi0: attaching mtd1                                   
[    3.528830] ubi0: scanning is finished                             
[    3.540626] ubi0: attached mtd1 (name "ubi", size 504 MiB)         
[    3.546235] ubi0: PEB size: 262144 bytes (256 KiB), LEB size: 253952
bytes                                                                 
[    3.553169] ubi0: min./max. I/O unit sizes: 4096/4096, sub-page size
4096                                                                  
[    3.559972] ubi0: VID header offset: 4096 (aligned 4096), data
offset: 8192                                                          
[    3.566968] ubi0: good PEBs: 2012, bad PEBs: 4, corrupted PEBs: 0  
[    3.573096] ubi0: user volume: 9, internal volumes: 1, max. volumes
count: 128                                                            
[    3.580330] ubi0: max/mean erase counter: 5/2, WL threshold: 4096,
image sequence number: 1298270752                                     
[    3.589496] ubi0: available PEBs: 12, total reserved PEBs: 2000,
PEBs reserved for bad PEB handling: 36                                
[    3.598945] ubi0: background thread "ubi_bgt0d" started, PID 35    
[    3.607409] block ubiblock0_4: created from ubi0:4(rootfs.a)       
[    3.615190] block ubiblock0_6: created from ubi0:6(appfs.a)        
[    3.622888] block ubiblock0_7: created from ubi0:7(appfs.b)

But then squashfs fails (we're using ubiblock devices):

[    3.651818] VFS: Mounted root (squashfs filesystem) readonly on
device 254:0.                                                         
[    3.660646] devtmpfs: mounted                                      
[    3.665515] Freeing unused kernel image (initmem) memory: 1024K    
[    3.672816] Run /sbin/init as init process                         
[    3.678097] SQUASHFS error: Unable to read inode 0x8008115d        
[    3.683917] Starting init: /sbin/init exists but couldn't execute it
(error -22)                                                           
[    3.691340] Run /etc/init as init process                          
[    3.697553] Run /bin/init as init process                          
[    3.702982] Run /bin/sh as init process

[    2.702415] ubi0: default fastmap pool size: 100
[    2.707138] ubi0: default fastmap WL pool size: 50
[    2.711946] ubi0: attaching mtd1                                   
[    3.528830] ubi0: scanning is finished                             
[    3.540626] ubi0: attached mtd1 (name "ubi", size 504 MiB)
[    3.546235] ubi0: PEB size: 262144 bytes (256 KiB), LEB size: 253952
bytes
[    3.553169] ubi0: min./max. I/O unit sizes: 4096/4096, sub-page size
4096
[    3.559972] ubi0: VID header offset: 4096 (aligned 4096), data
offset: 8192
[    3.566968] ubi0: good PEBs: 2012, bad PEBs: 4, corrupted PEBs: 0
[    3.573096] ubi0: user volume: 9, internal volumes: 1, max. volumes
count: 128
[    3.580330] ubi0: max/mean erase counter: 5/2, WL threshold: 4096,
image sequence number: 1298270752
[    3.589496] ubi0: available PEBs: 12, total reserved PEBs: 2000,
PEBs reserved for bad PEB handling: 36
[    3.598945] ubi0: background thread "ubi_bgt0d" started, PID 35

I think the patch should be reverted, until we get some more insight.

// Martin

> > // Martin
> > 
> > > ---
> > >  drivers/mtd/nand/raw/nand_base.c | 119
> > > +++++++++++++++++++++++++++++--
> > >  include/linux/mtd/rawnand.h      |   9 +++
> > >  2 files changed, 124 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/mtd/nand/raw/nand_base.c
> > > b/drivers/mtd/nand/raw/nand_base.c
> > > index 34395d5d3a47..0b1fd6bbb36b 100644
> > > --- a/drivers/mtd/nand/raw/nand_base.c
> > > +++ b/drivers/mtd/nand/raw/nand_base.c
> > > @@ -1208,6 +1208,73 @@ static int
> > > nand_lp_exec_read_page_op(struct
> > > nand_chip *chip, unsigned int page,
> > >         return nand_exec_op(chip, &op);
> > >  }
> > >  
> > > +static int nand_lp_exec_cont_read_page_op(struct nand_chip
> > > *chip,
> > > unsigned int page,
> > > +                                         unsigned int
> > > offset_in_page, void *buf,
> > > +                                         unsigned int len, bool
> > > check_only)
> > > +{
> > > +       const struct nand_interface_config *conf =
> > > +               nand_get_interface_config(chip);
> > > +       u8 addrs[5];
> > > +       struct nand_op_instr start_instrs[] = {
> > > +               NAND_OP_CMD(NAND_CMD_READ0, 0),
> > > +               NAND_OP_ADDR(4, addrs, 0),
> > > +               NAND_OP_CMD(NAND_CMD_READSTART,
> > > NAND_COMMON_TIMING_NS(conf, tWB_max)),
> > > +               NAND_OP_WAIT_RDY(NAND_COMMON_TIMING_MS(conf,
> > > tR_max),
> > > 0),
> > > +               NAND_OP_CMD(NAND_CMD_READCACHESEQ,
> > > NAND_COMMON_TIMING_NS(conf, tWB_max)),
> > > +               NAND_OP_WAIT_RDY(NAND_COMMON_TIMING_MS(conf,
> > > tR_max),
> > > +                                NAND_COMMON_TIMING_NS(conf,
> > > tRR_min)),
> > > +               NAND_OP_DATA_IN(len, buf, 0),
> > > +       };
> > > +       struct nand_op_instr cont_instrs[] = {
> > > +               NAND_OP_CMD(page == chip->cont_read.last_page ?
> > > +                           NAND_CMD_READCACHEEND :
> > > NAND_CMD_READCACHESEQ,
> > > +                           NAND_COMMON_TIMING_NS(conf,
> > > tWB_max)),
> > > +               NAND_OP_WAIT_RDY(NAND_COMMON_TIMING_MS(conf,
> > > tR_max),
> > > +                                NAND_COMMON_TIMING_NS(conf,
> > > tRR_min)),
> > > +               NAND_OP_DATA_IN(len, buf, 0),
> > > +       };
> > > +       struct nand_operation start_op = NAND_OPERATION(chip-
> > > >cur_cs,
> > > start_instrs);
> > > +       struct nand_operation cont_op = NAND_OPERATION(chip-
> > > >cur_cs,
> > > cont_instrs);
> > > +       int ret;
> > > +
> > > +       if (!len) {
> > > +               start_op.ninstrs--;
> > > +               cont_op.ninstrs--;
> > > +       }
> > > +
> > > +       ret = nand_fill_column_cycles(chip, addrs,
> > > offset_in_page);
> > > +       if (ret < 0)
> > > +               return ret;
> > > +
> > > +       addrs[2] = page;
> > > +       addrs[3] = page >> 8;
> > > +
> > > +       if (chip->options & NAND_ROW_ADDR_3) {
> > > +               addrs[4] = page >> 16;
> > > +               start_instrs[1].ctx.addr.naddrs++;
> > > +       }
> > > +
> > > +       /* Check if cache reads are supported */
> > > +       if (check_only) {
> > > +               if (nand_check_op(chip, &start_op) ||
> > > nand_check_op(chip, &cont_op))
> > > +                       return -EOPNOTSUPP;
> > > +
> > > +               return 0;
> > > +       }
> > > +
> > > +       if (page == chip->cont_read.first_page)
> > > +               return nand_exec_op(chip, &start_op);
> > > +       else
> > > +               return nand_exec_op(chip, &cont_op);
> > > +}
> > > +
> > > +static bool rawnand_cont_read_ongoing(struct nand_chip *chip,
> > > unsigned int page)
> > > +{
> > > +       return chip->cont_read.ongoing &&
> > > +               page >= chip->cont_read.first_page &&
> > > +               page <= chip->cont_read.last_page;
> > > +}
> > > +
> > >  /**
> > >   * nand_read_page_op - Do a READ PAGE operation
> > >   * @chip: The NAND chip
> > > @@ -1233,10 +1300,16 @@ int nand_read_page_op(struct nand_chip
> > > *chip,
> > > unsigned int page,
> > >                 return -EINVAL;
> > >  
> > >         if (nand_has_exec_op(chip)) {
> > > -               if (mtd->writesize > 512)
> > > -                       return nand_lp_exec_read_page_op(chip,
> > > page,
> > > -                                                       
> > > offset_in_page, buf,
> > > -                                                        len);
> > > +               if (mtd->writesize > 512) {
> > > +                       if (rawnand_cont_read_ongoing(chip,
> > > page))
> > > +                               return
> > > nand_lp_exec_cont_read_page_op(chip, page,
> > > +                                                                
> > >     
> > > offset_in_page,
> > > +                                                                
> > >     
> > > buf, len, false);
> > > +                       else
> > > +                               return
> > > nand_lp_exec_read_page_op(chip, page,
> > > +                                                               
> > > offset_in_page, buf,
> > > +                                                               
> > > len);
> > > +               }
> > >  
> > >                 return nand_sp_exec_read_page_op(chip, page,
> > > offset_in_page,
> > >                                                  buf, len);
> > > @@ -3353,6 +3426,27 @@ static uint8_t *nand_transfer_oob(struct
> > > nand_chip *chip, uint8_t *oob,
> > >         return NULL;
> > >  }
> > >  
> > > +static void rawnand_enable_cont_reads(struct nand_chip *chip,
> > > unsigned int page,
> > > +                                     u32 readlen, int col)
> > > +{
> > > +       struct mtd_info *mtd = nand_to_mtd(chip);
> > > +
> > > +       if (!chip->controller->supported_op.cont_read)
> > > +               return;
> > > +
> > > +       if ((col && col + readlen < (3 * mtd->writesize)) ||
> > > +           (!col && readlen < (2 * mtd->writesize))) {
> > > +               chip->cont_read.ongoing = false;
> > > +               return;
> > > +       }
> > > +
> > > +       chip->cont_read.ongoing = true;
> > > +       chip->cont_read.first_page = page;
> > > +       if (col)
> > > +               chip->cont_read.first_page++;
> > > +       chip->cont_read.last_page = page + ((readlen >> chip-  
> > > > page_shift) & chip->pagemask);  
> > > +}
> > > +
> > >  /**
> > >   * nand_setup_read_retry - [INTERN] Set the READ RETRY mode
> > >   * @chip: NAND chip object
> > > @@ -3426,6 +3520,8 @@ static int nand_do_read_ops(struct
> > > nand_chip
> > > *chip, loff_t from,
> > >         oob = ops->oobbuf;
> > >         oob_required = oob ? 1 : 0;
> > >  
> > > +       rawnand_enable_cont_reads(chip, page, readlen, col);
> > > +
> > >         while (1) {
> > >                 struct mtd_ecc_stats ecc_stats = mtd->ecc_stats;
> > >  
> > > @@ -5009,12 +5105,27 @@ static void
> > > rawnand_early_check_supported_ops(struct nand_chip *chip)
> > >         rawnand_check_data_only_read_support(chip);
> > >  }
> > >  
> > > +static void rawnand_check_cont_read_support(struct nand_chip
> > > *chip)
> > > +{
> > > +       struct mtd_info *mtd = nand_to_mtd(chip);
> > > +
> > > +       if (chip->read_retries)
> > > +               return;
> > > +
> > > +       if (!nand_lp_exec_cont_read_page_op(chip, 0, 0, NULL,
> > > +                                           mtd->writesize,
> > > true))
> > > +               chip->controller->supported_op.cont_read = 1;
> > > +}
> > > +
> > >  static void rawnand_late_check_supported_ops(struct nand_chip
> > > *chip)
> > >  {
> > >         /* The supported_op fields should not be set by
> > > individual
> > > drivers */
> > > +       WARN_ON_ONCE(chip->controller->supported_op.cont_read);
> > >  
> > >         if (!nand_has_exec_op(chip))
> > >                 return;
> > > +
> > > +       rawnand_check_cont_read_support(chip);
> > >  }
> > >  
> > >  /*
> > > diff --git a/include/linux/mtd/rawnand.h
> > > b/include/linux/mtd/rawnand.h
> > > index 28c5dce782dd..1b0936fe3c6e 100644
> > > --- a/include/linux/mtd/rawnand.h
> > > +++ b/include/linux/mtd/rawnand.h
> > > @@ -67,6 +67,8 @@ struct gpio_desc;
> > >  
> > >  /* Extended commands for large page devices */
> > >  #define NAND_CMD_READSTART     0x30
> > > +#define NAND_CMD_READCACHESEQ  0x31
> > > +#define NAND_CMD_READCACHEEND  0x3f
> > >  #define NAND_CMD_RNDOUTSTART   0xE0
> > >  #define NAND_CMD_CACHEDPROG    0x15
> > >  
> > > @@ -1099,12 +1101,14 @@ struct nand_controller_ops {
> > >   * @supported_op.data_only_read: The controller supports reading
> > > more data from
> > >   *                     the bus without restarting an entire read
> > > operation nor
> > >   *                     changing the column.
> > > + * @supported_op.cont_read: The controller supports sequential
> > > cache
> > > reads.
> > >   */
> > >  struct nand_controller {
> > >         struct mutex lock;
> > >         const struct nand_controller_ops *ops;
> > >         struct {
> > >                 unsigned int data_only_read: 1;
> > > +               unsigned int cont_read: 1;
> > >         } supported_op;
> > >  };
> > >  
> > > @@ -1308,6 +1312,11 @@ struct nand_chip {
> > >         int read_retries;
> > >         struct nand_secure_region *secure_regions;
> > >         u8 nr_secure_regions;
> > > +       struct {
> > > +               bool ongoing;
> > > +               unsigned int first_page;
> > > +               unsigned int last_page;
> > > +       } cont_read;
> > >  
> > >         /* Externals */
> > >         struct nand_controller *controller;  
> >