Cached NAND reads and UBIFS

Artem Bityutskiy dedekind1 at gmail.com
Wed Jul 13 06:13:09 PDT 2016


On Wed, 2016-07-13 at 14:43 +0200, Boris Brezillon wrote:
> On Wed, 13 Jul 2016 14:30:01 +0200
> Richard Weinberger <richard at nod.at> wrote:
> 
> > Hi!
> > 
> > As discussed on IRC, Boris and I figured out that UBIFS is
> > sometimes very slow on our target, e.g. deleting a 1GiB file right
> > after a reboot takes more than 30 seconds.
> > 
> > When deleting a file with a cold TNC, UBIFS has to look up a lot
> > of znodes on the flash. For every single znode lookup UBIFS
> > requests only a few bytes from the flash. This is slow.
> > 
> > After some investigation we found out that the NAND read cache is
> > disabled when the NAND driver supports reading subpages.
> > So we removed the NAND_SUBPAGE_READ flag from the driver and
> > suddenly lookups were fast. Really fast. Deleting a 1GiB file took
> > less than 5 seconds.
> > Since a page on our MLC NAND is 16KiB, many znodes can be read
> > very fast directly out of the NAND read cache. The read cache
> > helps a lot here because in the regular case UBIFS' index nodes
> > are stored linearly in a LEB.
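
For reference, the experiment described above essentially amounts to
clearing the subpage-read capability in the driver, so that the NAND
core falls back to full-page reads and can keep using its one-page
cache. A minimal sketch; where exactly the real driver set the flag is
not shown in the mail, so the helper name below is made up:

#include <linux/mtd/nand.h>

/*
 * Hypothetical helper illustrating the experiment: clear the
 * subpage-read capability so the NAND core reads full pages and can
 * cache them in chip->buffers->databuf / chip->pagebuf.
 */
static void my_nand_disable_subpage_reads(struct nand_chip *chip)
{
	/* Was presumably set somewhere as: chip->options |= NAND_SUBPAGE_READ; */
	chip->options &= ~NAND_SUBPAGE_READ;
}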
> > 
> > The TNC seems to assume that it can do a lot of short reads since
> > the NAND read cache will help. But as soon as subpage reads are
> > possible this assumption is no longer true.
> > 
> > Now we're not sure what to do: should we implement bulk reading
> > in the TNC code or improve NAND read caching?
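
To make the first option a bit more concrete, "bulk reading" in the
TNC could mean reading one larger, write-size-aligned chunk of the
index LEB and serving the following znode lookups from it. The sketch
below is purely illustrative and nothing like it exists in UBIFS; the
structure and helper names are made up, only ubifs_leb_read() is the
real low-level read routine:

/*
 * Illustrative only (assumes it lives in fs/ubifs/, with ubifs.h
 * included).  rb->buf must hold at least 2 * c->max_write_size bytes,
 * which is plenty for znode-sized reads; rb->lnum is -1 while the
 * buffer is empty.
 */
struct tnc_read_buf {
	int lnum;	/* LEB currently held in @buf, -1 if none */
	int offs;	/* LEB offset of the cached chunk */
	int len;	/* valid bytes in @buf */
	void *buf;
};

static int tnc_bulk_read(struct ubifs_info *c, struct tnc_read_buf *rb,
			 int lnum, int offs, int len, void *dest)
{
	int err;

	if (rb->lnum != lnum || offs < rb->offs ||
	    offs + len > rb->offs + rb->len) {
		/* Miss: do one larger aligned read instead of a tiny one */
		rb->lnum = lnum;
		rb->offs = rounddown(offs, c->max_write_size);
		rb->len = min(c->leb_size - rb->offs,
			      roundup(offs + len - rb->offs,
				      c->max_write_size));
		err = ubifs_leb_read(c, lnum, rb->buf, rb->offs, rb->len, 0);
		if (err)
			return err;
	}
	/* Hit (or freshly filled): serve the znode from memory */
	memcpy(dest, (u8 *)rb->buf + (offs - rb->offs), len);
	return 0;
}

In effect this would move the "cache adjacent small reads" job from
the MTD layer into the TNC, which is exactly the trade-off being asked
about.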
> 
> Hm, NAND page caching is something I'd like to get rid of at some
> point, for several reasons:
> 
> 1/ it brings some confusion into NAND controller drivers, which don't
> know when they are allowed to use chip->buffer, and what to do with
> ->pagebuf in this case

Yes, it adds complexity because it is not a separate caching layer but
rather "built into" the logic, sprinkled around.

> 2/ caching is already implemented at the FS level, so I'm not sure we
> really need another level of caching at the MTD/NAND level (except
> for those specific use cases where the MTD user relies on this
> caching to improve accesses to small contiguous chunks)

Well, the FS is caching stuff, but device-level caching is still
useful. E.g., UBI decides to move things around, the data gets cached,
and when UBIFS later reads it, it is picked up from the cache.

Disk blocks are also cached in Linux separately from the FS-level
cache.

> 3/ it hides the real number of bitflips in a given page: if someone
> reads the same page over and over, the MTD user will never be able
> to detect when the number of bitflips exceeds the threshold. This
> should not be a problem in the real world, because MTD users are
> unlikely to always read the same page without reading other pages
> in the meantime, but still, I think it adds some confusion,
> especially if one wants to write a test that reads the same page
> over and over to see the impact of read-disturb.
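
For illustration, such a test could look roughly like the userspace
sketch below, using the ECCGETSTATS ioctl; the device node, offset,
read size and iteration count are placeholders. The read is
deliberately smaller than a page because, at least in the nand_base.c
of that time, the page cache only serves partial-page reads; with the
cache in place only the first read actually touches the flash, so the
corrected counter barely moves no matter how often the page is
re-read:

/* Rough sketch: hammer the same chunk of one page and watch the
 * ECC "corrected" counter.  All constants are placeholders. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <mtd/mtd-user.h>

int main(void)
{
	struct mtd_ecc_stats before, after;
	const off_t offs = 0;		/* page to hammer (placeholder) */
	const size_t len = 2048;	/* partial-page read */
	unsigned char buf[2048];
	int fd, i;

	fd = open("/dev/mtd0", O_RDONLY);	/* placeholder device */
	if (fd < 0)
		return 1;

	ioctl(fd, ECCGETSTATS, &before);
	for (i = 0; i < 100000; i++)
		if (pread(fd, buf, len, offs) != (ssize_t)len)
			break;
	ioctl(fd, ECCGETSTATS, &after);

	/* With MTD-level page caching, "corrected" barely grows here. */
	printf("corrected bitflips: %u -> %u\n",
	       before.corrected, after.corrected);
	close(fd);
	return 0;
}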

Well, I think this is not a blocker, more of a complication that
caching introduces. Indeed, I have worked with different kinds of
caches, e.g., implementing my own custom caching for my custom
user-space scripts, and caches always introduce extra complexity.
That's the price to pay.


