Corrupted UBIFS, bad CRC

Artem Bityutskiy dedekind1 at gmail.com
Wed Jan 18 09:43:22 EST 2012


On Tue, 2012-01-17 at 04:23 -0800, Karsten Jeppesen wrote:
> Artem: Would it help you in any 
> way if I get a some of these units sent to you? They are like 15 x 15 cm
>  Single board. I would send it with all needed (battery included :-) ) 
> and never to be returned  (You can keep everything).

I would be happy to help, but I really have no time to do more than
make suggestions and give some advice - my employer and the baby take
all my time, sorry.

Besides, I am not a flash HW expert, and the issue you observe looks
like it is closely related to your HW and how it behaves when it loses
power while a write operation is ongoing. Or maybe an erase operation,
but it looks like that was a write operation. It does not look at all
like a UBI/UBIFS issue.

> 1. Where are the patches for 3.2? git://git.infradead.org/~dedekind/ubifs-v3.2.git

Yes.

>  ?? To get the max_write output I changed the dbg_msg to ubi_msg.
> 
> 2. UBI: max_write_size   64
> 3. Confirmed 64 from data sheets

OK.

> 4 Theory unfortunately bust.

Not necessarily. You need to dig deeper - what if your driver or the
controller is doing something you are not aware of? Better to ask the
vendor how the flash behaves on a power cut while writing.

> 5. See below
> 
> Now for the weird part: Setting the write buffer INCORRECTLY to 256
> does mount the system - but is that healthy???: And what are the
> implications of setting it to a 4 times wrong value?

You really need to dig deeper into this. Let me elaborate on the concept
of 'max_write_size'; you can also find it in the git log and by googling
- we discussed this on the mailing list.

So, UBI has a notion of min_io_size - this is the minimum number of
bytes you can write. For NAND this is often 2048. For NOR it is 1 byte.

NORs have an optimization called "write-buffer", which means that a NOR
can write many bytes at a time.

This "write buffer" size is called 'max_write_size' in UBIFS, to be
consistent with 'min_io_size', and also because UBIFS has its own
write-buffers, so that term was already taken when we added
'max_write_size'.

Note, we added 'max_write_size' to fix NOR issues after power cuts, I
think last year.

So what happens when you write data and have a power cut? On the driver
level, you write write-buffer after write-buffer - 'max_write_size'
bytes at a time. The experiments with NOR showed that after the power
cut the 'max_write_size' area which you were writing to during the
power cut will contain garbage, or unstable bits, or zeroes, or a few
zeroes, or any other anomaly.
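
As a rough illustration of the paragraph above, the sketch below computes
which 'max_write_size'-aligned area should be treated as suspect when the
power is cut while a given offset is being written. The function name and
the offset 150 are made up for the example; this is not UBI or UBIFS code.

```python
def suspect_area(cut_offset, max_write_size):
    """Return (start, end) of the aligned area being written at the power cut.

    Hypothetical helper: assumes the driver programs the flash in
    max_write_size-aligned chunks, as described in the text.
    """
    start = cut_offset - cut_offset % max_write_size
    return start, start + max_write_size


# Power cut while offset 150 was being programmed (made-up offset).
print(suspect_area(150, 64))    # (128, 192)
print(suspect_area(150, 256))   # (0, 256)
```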

When UBIFS recovers after a power cut, it has to scan the journal and
find the last node. The last node is the one which is followed by all
0xFF bytes.

In your case you have one good node from offset 0 to offset 112 (AFAIR),
then 32 bytes of 0xFFs, and then 32 bytes of zeroes, and then the rest
is all 0xFFs.

So my theory is that your write-buffer size is 256, or you have some
kind of striping, or something, so that when UBIFS submits a 112-byte
write request, on the driver level a 256-byte write buffer is used, and
actually the area (0, 256) is being programmed, but area 113-226 is
programmed with all 0xFFs.
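
The theory above can be sketched as follows, assuming the driver expands
every request to whole write buffers and fills the unused part with 0xFF
(on NOR, programming can only clear bits, so 0xFF leaves the cells
unchanged). The helper name and the 0x31 filler byte are hypothetical; this
only models the idea and is not taken from any real driver.

```python
def pad_to_write_buffer(offset, data, wbuf_size):
    """Expand a write request to whole wbuf_size-aligned write buffers.

    Returns (start, padded), where padded covers the aligned area and
    every byte outside the original request is 0xFF.
    """
    start = offset - offset % wbuf_size
    end = -(-(offset + len(data)) // wbuf_size) * wbuf_size  # round up
    padded = bytearray(b"\xff" * (end - start))
    padded[offset - start:offset - start + len(data)] = data
    return start, bytes(padded)


# A 112-byte request at offset 0 with a 256-byte write buffer.
start, padded = pad_to_write_buffer(0, b"\x31" * 112, 256)
print(start, len(padded))               # 0 256
print(padded[112:] == b"\xff" * 144)    # True
```

Under this assumption the whole (0, 256) area is in flight during the
112-byte write, so a power cut can disturb bytes well past the node.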

Or something like that, I do not know NOR well enough.

Anyway, what happens is that due to the power cut you end up with random
corruption within that 256-byte area, so you end up with those zeroes.

UBIFS is aware of this effect and it knows that it should only check for
all 0xFFs starting from the next 'max_write_size'-aligned offset after
the last node. But because in your case 'max_write_size' is 64, it hits
those zeroes and refuses to mount, because they are unexpected.
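
A minimal sketch of that check, applied to the layout described earlier
(good node in 0-112, 32 bytes of 0xFF, 32 bytes of zeroes, then 0xFF). The
1024-byte LEB size, the 0x31 filler standing in for node data, and the
function names are assumptions for the example; the real UBIFS recovery
code is more involved.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def tail_is_empty(leb, last_node_end, max_write_size):
    """Check for all 0xFF from the next aligned offset after the last node."""
    start = align_up(last_node_end, max_write_size)
    return all(byte == 0xFF for byte in leb[start:])


LEB_SIZE = 1024                          # hypothetical size
leb = bytearray(b"\xff" * LEB_SIZE)
leb[0:112] = b"\x31" * 112               # stand-in for the good node
leb[144:176] = b"\x00" * 32              # zeroes left by the power cut

print(tail_is_empty(leb, 112, 64))       # False: check starts at 128
print(tail_is_empty(leb, 112, 256))      # True: check starts at 256
```

With 64 the check starts at offset 128 and runs into the zeroes at 144;
with 256 it starts at offset 256, past the damaged area.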

I do not think there is a big downside to 'max_write_size' being 256,
from the performance POV at least. The downside is that in case of a
power cut you may lose a bit more data, because UBIFS has its own
write-buffers, but this is really minor.

You can also experiment by forcing your flash not to use the
write-buffer at all, to verify whether the corruptions are related to it.

-- 
Best Regards,
Artem Bityutskiy