atomic file operations

Thu Mar 24 05:11:43 EST 2005

Hi Sergei,

more info below.

Sergei Sharonov wrote:
> > No write operation is guaranteed to be atomic. Have a look
> > at jffs2_write_inode_range in write.c : if there is not enough
> > space in the current block for the whole data, it may be split
> > into several chunks. Additionally write ops that overlap a
> > cache page boundary (not a flash page) are always split at
> > the page limit.
> 
> That means that one write may have several CRCs corresponding to
> splinter chunks?

Yes, when I write that the input buffer is split it means that
several data nodes are written to the flash - each data node
is an independent piece of data complete with header and CRC.
If a data node is only partly written to flash, its CRC check
will fail so the partial data will not be taken into account
when building the file at next mount. In this sense each data
node is an atomic write - but JFFS2 does not guarantee that
a write() input buffer will be written as a single data node.
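
To make the "partly written node is ignored" point concrete, here is a
minimal sketch in C. It is a toy model only: struct toy_node, toy_crc32
and the 64-byte limit are made up for illustration and are NOT the real
JFFS2 on-flash layout or its CRC code. It only shows why a node whose
tail never reached the flash fails its check.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_DATA 64u

/* Toy node (hypothetical, not the JFFS2 format): length, CRC, payload. */
struct toy_node {
    uint32_t data_len;
    uint32_t data_crc;
    uint8_t  data[TOY_MAX_DATA];
};

/* Plain bitwise CRC-32 (reflected polynomial 0xEDB88320). */
uint32_t toy_crc32(const uint8_t *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Build a complete node in memory. Returns 0 on success, -1 if too long. */
int toy_node_fill(struct toy_node *n, const void *buf, uint32_t len)
{
    if (len > TOY_MAX_DATA)
        return -1;
    memset(n, 0xFF, sizeof *n);          /* erased flash reads back as 0xFF */
    n->data_len = len;
    memcpy(n->data, buf, len);
    n->data_crc = toy_crc32(n->data, len);
    return 0;
}

/* A node counts only if its length is sane and the payload CRC matches. */
int toy_node_valid(const struct toy_node *n)
{
    if (n->data_len > TOY_MAX_DATA)
        return 0;
    return toy_crc32(n->data, n->data_len) == n->data_crc;
}
```

If only the header and the first few payload bytes are copied into an
0xFF-filled struct toy_node (simulating power loss in the middle of the
node), toy_node_valid() returns 0 and the node would simply be skipped.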

> > If you want to have atomic writes, you could:
> > 1) Mandatorily: ensure that your application will not
> > issue write ops which overlap a page boundary.
> > You should not tweak the JFFS2 code to write such
> > overlapping nodes, otherwise you must also tweak
> > the GC and it gets difficult.
> > 2) Either tweak jffs2_write_inode_range to forbid
> > splitting data which does not overlap a page boundary
> > or adjust JFFS2_MIN_DATA_LEN to reserve enough
> > space (difficult to estimate maybe if you have
> > compression...).
> >
> > The above tweaking should ensure that an input buffer
> > is written to JFFS2 FS as a single CRC-protected
> > data node.
> 
> Ok, got that. Does not seem like a promising idea considering
> how fast jffs2 evolves and therefore how bad forking would be.
> Thanks for the suggestion anyway.

You can always submit your patch to the list and then
either someone will merge it for you, or you can ask
for a CVS account to do it yourself.
It could be a conditionally-compiled option. Or maybe
there is an appropriate fcntl or open flag that could
be implemented in JFFS2?
Anyway I think it would be an interesting option to
have. The main problem is the cache page boundary,
which would require more thought to solve
and lots of testing...
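
On the application side, point 1) of the quoted suggestion (never issue
a write that overlaps a cache page boundary) can at least be checked
before calling write(). A minimal sketch, where write_crosses_page is a
hypothetical helper name and the page size is passed in by the caller:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns 1 if a write of len bytes at file offset off would touch more
 * than one page of page_size bytes, 0 otherwise.
 * page_size must be non-zero.
 */
int write_crosses_page(uint64_t off, size_t len, size_t page_size)
{
    assert(page_size > 0);
    if (len == 0)
        return 0;                         /* nothing written, nothing split */

    uint64_t first = off / page_size;             /* page of first byte */
    uint64_t last  = (off + len - 1) / page_size; /* page of last byte  */
    return first != last;
}
```

The page size could be taken from sysconf(_SC_PAGESIZE), assuming the
cache page size matches that value on your target. An application would
then pad or split its records itself so that this check never returns 1.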

> > You should be aware that on NAND flash JFFS2 uses
> > a (nand flash) page buffer (wbuf.c), which is flushed
> > only on fsync/sync/umount. So even though your write
> > ops will be atomic (with above code tweaks),
> > there is no guarantee that a buffer is effectively
> > committed to flash when write() returns, because the
> > end of the data node may remain in the buffer.
> > If you want that also, you can tweak JFFS2 again
> > by requiring a wbuf flush after each "atomic write",
> > or you can have your application call fsync after
> > each write.
> 
> Beg pardon if it is a FAQ, but if I open the file with the O_SYNC
> flag, wouldn't that guarantee a synchronous write that does not
> return until all the data is in flash?

I am not familiar with the Linux VFS; however, from previous
discussion on the list I was led to understand that O_SYNC
doesn't work with JFFS2. Probably you could implement
O_SYNC yourself without too much trouble.
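
In the meantime, the application-side alternative from the quoted text
(call fsync after each write) could look roughly like the sketch below.
append_and_flush is a hypothetical helper name; it uses only the
standard open/write/fsync/close calls and does not rely on O_SYNC.
It deliberately issues a single write() and treats a short write as an
error instead of retrying, since a retry would mean a second write op.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Append len bytes from buf to the file at path with one write() call,
 * then fsync() before returning. Returns 0 on success, -1 on any error
 * (including a short write).
 */
int append_and_flush(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd < 0)
        return -1;

    int rc = 0;
    ssize_t n = write(fd, buf, len);      /* one write op, no retry loop */
    if (n < 0 || (size_t)n != len)
        rc = -1;
    else if (fsync(fd) != 0)              /* ask for the data to be committed */
        rc = -1;

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Note that this only addresses the flush; whether the buffer ends up as
a single data node still depends on the points discussed above.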

bye
Estelle