jffs2: Excess summary entries
Thomas.Betker at rohde-schwarz.com
Thu Nov 5 10:36:47 PST 2015
We ran into a problem with jffs2 where the filesystem became unusable
after some specific MTD failures; summary was enabled, write buffering was
disabled:
1. For some reason, mtd_writev() returned -EINTR with *retlen == 0. No
INODE data was written to flash, but jffs2_flash_direct_writev() still
added a summary entry via jffs2_sum_add_kvec(). This happened twice in
a row:
    jffs2: Write of 4164 bytes at 0x00ec5b88 failed. returned -4, retlen 0
    jffs2: Not marking the space at 0x00ec5b88 as dirty because the flash driver returned retlen zero
    jffs2: Write of 4164 bytes at 0x00ec5b88 failed. returned -4, retlen 0
    jffs2: Not marking the space at 0x00ec5b88 as dirty because the flash driver returned retlen zero
2. When rebooting after the summary data was written to flash (including
the two excess entries), we got the following messages:
    jffs2: error: (79) jffs2_link_node_ref: Adding new ref c3048d18 at (0x00ec5b88-0x00ec6bcc) not immediately after previous (0x00ec5b88-0x00ec5b88)
    jffs2: error: (79) jffs2_link_node_ref: Adding new ref c3048d20 at (0x00ec5b88-0x00ec6000) not immediately after previous (0x00ec5b88-0x00ec5b88)
    ...
    jffs2: warning: (79) jffs2_sum_scan_sumnode: Free size 0xffffdf78 bytes in eraseblock @0x00ec0000 with summary?
    jffs2: Checked all inodes but still 0x2088 bytes of unchecked space?
    jffs2: No space for garbage collection. Aborting GC thread
The excess entries added up to "unchecked space", so that
jffs2_garbage_collect_pass() returned -ENOSPC.
3. Since the garbage collection thread had aborted, the flash filled up
over time until the filesystem could no longer be modified (we could not
add or write to files, or even delete them: "No space left on device").
Basically, we were bricked.
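The numbers in the logs above are consistent with each other, which can be checked with a few lines of C. This is only arithmetic on the logged values, not a trace through the jffs2 scan code; the helper names are hypothetical, and the assumption is that the reported free size is a 32-bit unsigned value that wrapped below zero.

```c
#include <stdint.h>

/* Hypothetical helper: space claimed by summary entries whose nodes
 * never reached flash (entries * node length). */
static uint32_t excess_space(uint32_t node_len, uint32_t entries)
{
    return node_len * entries;
}

/* Hypothetical helper: what a 32-bit unsigned free-size counter shows
 * after the accounting overshoots the eraseblock by 'excess' bytes
 * (assumption: the counter wraps around modulo 2^32). */
static uint32_t wrapped_free_size(uint32_t excess)
{
    return (uint32_t)0 - excess;
}
```

With the logged node length of 4164 bytes and two excess entries, excess_space(4164, 2) is 0x2088, the reported amount of unchecked space, and wrapped_free_size(0x2088) is 0xffffdf78, the reported free size.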
Here are the solutions I have considered so far:
The minimal solution would be to check *retlen == 0 in
jffs2_flash_direct_writev() and jffs2_flash_direct_write() before running
jffs2_sum_add_kvec().
    if (jffs2_sum_active() && *retlen) {
        ...
        res = jffs2_sum_add_kvec(...);
        ...
    }
The general failure case, though, is (ret != 0 || *retlen != len), where
'ret' is the return code of mtd_writev(), and 'len' is the data size to be
written. When write buffering is enabled, jffs2_flash_writev() in wbuf.c
skips the summary entry in this case; perhaps we should do this in
writev.c as well?
    if (jffs2_sum_active() && !ret && *retlen == len) {
        ...
        res = jffs2_sum_add_kvec(...);
        ...
    }
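To make the difference between the two checks concrete, here is a stand-alone model of the two conditions. It is a sketch only, not kernel code: struct write_result and both function names are hypothetical, and the struct merely bundles the 'ret', '*retlen' and 'len' values discussed above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the outcome of one mtd_writev() call;
 * this is not a kernel type. */
struct write_result {
    int ret;        /* 0 on success, negative errno on failure */
    size_t retlen;  /* bytes actually written to flash */
    size_t len;     /* bytes that were requested */
};

/* Minimal check: skip the summary entry only if nothing reached flash. */
static bool add_summary_minimal(const struct write_result *w)
{
    return w->retlen != 0;
}

/* General check: add the summary entry only after a complete write
 * that returned no error. */
static bool add_summary_general(const struct write_result *w)
{
    return w->ret == 0 && w->retlen == w->len;
}
```

For the failure seen in the logs (ret == -4, retlen == 0, len == 4164), both checks skip the summary entry. They differ only for partial writes (*retlen != 0 but short of len, or a nonzero return code): the minimal check still adds the entry, the general check does not.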
I ran some quick tests, simulating write failures, and it seems that
adding the summary entry does no harm when *retlen != 0 [so the minimal
solution would suffice]. This is because the calling function reserves
the node space and marks it as dirty, so there is no confusion about
unchecked space.
On the other hand, running the same quick tests _without_ adding the
summary entry did not seem to do any harm either [so the general solution
would work as well]. It is entirely possible that I have overlooked
something, though.
Any opinions on that? When in doubt, I would provide a patch for the
minimal solution, changing as little as possible. However, it may make
sense to go for the general solution to be consistent with write
buffering.
Best regards,
Thomas Betker