State of read-only filesystems in NAND / MTD bad blocks handling when reading

Tue May 8 06:44:19 EDT 2012

Thilo:

> Please keep in mind that re-write events (or even bit scrubbing)
> should be extremely rare since, well, we're r/o.

  I'm not sure this assumption holds on modern NAND Flashes,
  especially for the highest-density, lowest-cost MLC (and
  now, even a few TLC!) devices. On these devices, the
  difference between bits is down to 20-30 electrons on
  the floating gate. These devices also have relatively
  high rates of "read disturb" errors so there can't be
  any such thing as a read-only Flash, only a read-mostly
  Flash.

                          Atlant

-----Original Message-----
From: linux-mtd-bounces at lists.infradead.org [mailto:linux-mtd-bounces at lists.infradead.org] On Behalf Of Thilo Fromm
Sent: Tuesday, May 08, 2012 06:09
To: Ricard Wanderlof
Cc: linux-mtd at lists.infradead.org
Subject: Re: State of read-only filesystems in NAND / MTD bad blocks handling when reading

Hello Ricard,

>>> By putting a bunch of UBI volumes in one UBI partition you also get the
>>> benifit of having UBI manage bad blocks for you, and you can have one
>>> large
>>> pool of bad blocks rather than an individual pool of bad blocks for each
>>> mtd, which in the end translates to better usage of the flash.
>>>
>>> These things may or may not be important to you, depending on how much
>>> flash you have and how reliable it is.
>>
>>
>> The most important point for me is whether you use your flash, or a
>> specific partition of your flash, in read-only or read-write mode.
>> When operating in r/o mode I don't quite see a point in most of the
>> more advanced stuff UBI does, e.g. wear leveling.
>
>
> One could argue though that once you start using UBI for bit scrubbing, then
> wear levelling becomes an issue even on a r/o file system.

I do understand your argument and I can follow you with this - it's a
conclusion easily reached from the r/w point of view. That is, if you
consider r/o functionality a subset of r/w functionality, then it's
easy to re-use all the comfy functions from r/w that you already have
at your disposal.

However, from a read-only point of view even with scrubbing there's
still no absolute necessity for wear leveling. Since we do bad block
mapping anyways and since writes are very rare (bit scrubbing and
full-image writes) wearing off the blocks evenly is not a requirement.
If a block dies during image write, then it is mapped out and never
used again. If a block dies during bit scrub, one could map it, then
re-write all succeeding blocks. With a MTD only allowing for r/o
access even a robust implementation of scrubbing / block rewrite
should be easy to do. With r/w this would be madness.

Please keep in mind that re-write events (or even bit scrubbing)
should be extremely rare since, well, we're r/o. My argument here is
that when limiting oneself to r/o access a lot of these robustness
features can be provided in a much simpler way, allowing for code that
is both simple and easy to maintain. Also I can do this at a low level
(the MTD level) instead of a higher layer.

>> I do miss a very simple r/o layer, though. I think there is a strong
>> use case for read-only file systems; especially in the embedded world.
>> And all things necessary - bad-block handling, scrubbing -
>> can be abstracted well enough for a simple r/o block device at a very
>> low level.
>
>
> I'm not too sure. There is a r/o mode in UBI, so-called static volumes,
> which have a fixed size after creation or writing data to them, and have
> built-in checksums for validation. They are primarily intended for blobs
> such as a kernel, but can also be used for read/write file systems.

How would I put a squashfs into that and then mount it upon boot?

> My biggest gripe about adding another "simple" r/o layer is this: Since UBI
> is stable and maintained, it seems a bit pointless to dilute future
> development and maintenence efforts by introducing another layer which
> basically is a subset of UBI. For instance, future changes to the mtd API
> that affect the layers above will have to be maintained not just for UBI but
> for other mtd users as well. Sure, a parallel layer may be simpler, but with
> UBI already in place and working well, I'm not convinced it actually makes a
> practical difference in the end.

I do understand your skepticism. And you're right with the extra
maintenance effort required. I think we differ in opinion only
concerning the complexity of a read-only block access implementation
w/ bad block skipping and bit scrubbing. How about you take a look at
blockrom.c, and maybe at blockrom_scrub.c after I come up with this
(probably next week, or the week after that). Then we can decide about
the implementation's usability, its weaknesses, and most importantly,
its complexity (especially concerning maintainability).

You once wrote that flashes are actually not disks, and that some of
the specifics of flashes must be exposed to upper layers in order to
be handled adequately. From a r/w point of view I do follow your
argument completely. However, I think that a flash device allowing
only for r/o generic accesses can indeed be reasonably abstracted to
look exactly like a disk.

And I think it can be done with only few code, so it should be of low
complexity, easy to maintain, while still supporting a (imho) strong
use case (i.e. a generic r/o fs in a MTD block device).

> Especially if you are using UBI in your system anyway, but not necessarily
> for you root file system, adding another parallel layer seems rather
> pointless. You've got UBI in your system anyway, why add something else that
> does the same thing just because UBI seems overly convoluted for readonly
> file systems. It does the job and does it well and doesn't cost anything
> since it's there anyway.

I want a simple (and guaranteed r/o) root fs so that if anybody
power-circles our devices at any point I'd still end up with a working
root fs. I can guarantee this behavior with MTD + blockrom because the
stack is simple enough to see through. I would not be able to
guarantee this with an UBI root fs which is inherently writable, even
if mounted read-only. Writes would happen because wear leveling, bit
scrubbing and all the other maintenance operations must account for a
r/w device. Way more complex, way more risky.

> If you are really pressed for space in the kernel (in terms of required
> space on the flash or the amount of RAM it consumes run time) I could
> understand that replacing the Big and Grand UBI with something smaller might
> have some value.

It's not actually a small footprint requirement I'm trying to fulfill
but a low-complexity one for robustness reasons. But you're right,
another use case for the blockrom FTL would be if you're low on space.

Regards,
Thilo

--
Dipl.-Ing (FH) Thilo Fromm, MSc., Embedded Systems Architect
DResearch Fahrzeugelektronik GmbH
Otto-Schmirgal-Str. 3, D-10319 Berlin, Germany
Tel: +49 (30) 515 932 228   mailto:fromm at dresearch-fe.de
Fax: +49 (30) 515 932 77    http://www.dresearch.de
Amtsgericht: Berlin Charlottenburg, HRB 130120 B
Ust.-IDNr. DE273952058
Geschäftsführer: Dr. M. Weber, W. Mögle

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

 Click https://www.mailcontrol.com/sr/DWXDFJBQj0jTndxI!oX7UskfEqpntbSQkv+pTFoDsZWAF3dlp+5oPQ7EuxmfksU1rEhty3NkVVu6Rmyxrg+4Pw==  to report this email as spam.

This e-mail and the information, including any attachments, it contains are intended to be a confidential communication only to the person or entity to whom it is addressed and may contain information that is privileged. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please immediately notify the sender and destroy the original message.

Thank you.

Please consider the environment before printing this email.