mtdblock caching and syncing

Doug Graham dgraham at nortel.com
Thu Apr 9 12:02:47 EDT 2009


On Thu, Apr 09, 2009 at 10:51:00AM -0400, Josh Boyer wrote:
> On Thu, Apr 09, 2009 at 10:15:56AM -0400, Doug Graham wrote:
> >
> >The problem is that a sync() or fsync() on an mtdblock device does not
> >actually get the data all the way to the flash device.  The mtdblock
> >layer maintains its own cache of a single erase-unit (256KB in my case).
> >If I open /dev/mtdblock0 for writing, write some stuff to it, then call
> >fsync() but do not close the device, up to one erase-unit's worth of
> >data may still be buffered in memory.  This data is only flushed when
> >the device is actually closed (by mtdblock_release).  I think that
> >this violates the intended semantics of sync and fsync.  I shouldn't be
> >required to do a close() to force the data to the device.
> 
> The device in question isn't the flash.  It's the mtdblock device.  So
> fsync semantics are preserved.  This is the same as writing to a file
> on a hard drive, calling fsync, and having it sit in the hard drive's
> cache.

That's a good point, and one I've wondered about before.  I don't know
much about how hard drives manage their cache, but I would assume that
they don't leave dirty data in their cache for an unbounded period
of time.  I'd guess that data is written to the actual disk within a
few tens of milliseconds after being sent to the device.

In the case of mtdblock, dirty data can stay in the cache forever.

> >I think this is a fairly serious bug in a flash-based system, where there
> >are frequently times that you want to make sure that data has actually
> >made it all the way to the device.  I think that a sync() or fsync()
> >really ought to somehow propagate all the way down to the mtdblock layer
> >so that mtdblock can flush its buffer.
> 
> Why are you using mtdblock in a serious flash-based system?  The fact
> that it buffers an entire eraseblock means you risk huge data loss in
> the event of an unclean shutdown anyway (power loss).  No amount of
> sync or fsync will fix that.

We don't use mtdblock during normal operations; we use squashfs and jffs2
(maybe ubifs sometime soon).  But one job that we do use mtdblock for is
burning loads.  We could, and perhaps should, be using the char device
instead to burn loads, except that those require specialized tools to do
erases before writes.  To avoid the need for such specialized tools, we
just use the equivalent of dd on the mtdblock device followed by a sync.
But that doesn't work given the behaviour I'm complaining about.

It's actually a little more complicated than that.  We have a system
comprised of multiple cards.  When upgrading the system from the master
card, we're using NBD to upgrade (some) loads on remote cards.  The NBD
server running on the remote cards never closes the mtdblock device that
it is managing, so the mtdblock_release() method never gets called.
The NBD server cannot use the MTD character device because it knows
nothing about the characteristics of flash, including the need to erase
before writing.  Even if it did know about erasing, we'd want it to do
exactly the same kind of caching that mtdblock already does, so mtdblock
does seem like a good match in this case.  We can certainly modify the
NBD server to close and reopen the device when it needs to be sure that
data has actually been written to flash, but that seems a bit on the
kludgy side, and doesn't help any other applications using mtdblock
(like the dd scheme I mention above).
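
To make the close-and-reopen workaround concrete, here is a minimal
sketch of what the NBD server would have to do.  The function name
reopen_to_flush() is made up for illustration, and the sketch assumes
that nothing else holds the device open, so that the close() really
does reach mtdblock_release():

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: close the mtdblock device so that the driver's
 * release hook runs (which, as described above, is what flushes the
 * cached erase unit), then reopen it.
 *
 * Returns the new descriptor, or -1 on error.  The old descriptor is
 * no longer usable after this call, whether or not it succeeds.
 */
int reopen_to_flush(int fd, const char *path, int flags)
{
    if (close(fd) < 0)
        return -1;

    return open(path, flags);
}
```

A caller would use it as fd = reopen_to_flush(fd, "/dev/mtdblock0",
O_RDWR) at each point where the data must be on flash, which is exactly
the kind of kludge I'd rather avoid.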

> >Thoughts?  Suggestions?  Patches?
> 
> Word-weaseling aside, if you have patches that fix the behavior you don't
> like, they would certainly be looked at.  Setting pdflush to 5 seconds
> instead of 30 would help a bit, or using the ioctl on the mtdblock device
> that already exists to flush would help too.  However you might want to
> really look at a system design that relies on mtdblock for data integrity.
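
For reference, a rough sketch of the ioctl approach follows.  It
assumes the ioctl meant here is BLKFLSBUF (the generic block-device
"flush buffers" request) and that the mtdblock driver responds to it by
writing out its cached erase unit; neither is confirmed in this thread.
The helper names write_all() and flush_mtdblock() are invented for the
example:

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>    /* BLKFLSBUF */
#include <sys/ioctl.h>
#include <unistd.h>

/* Write len bytes, retrying on short writes and EINTR.
 * Returns 0 on success, -1 on error (errno set by write()). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: first push dirty pages down to the block
 * device with fsync(), then ask the driver to flush its own buffer
 * (assumption: BLKFLSBUF reaches the mtdblock erase-unit cache).
 * Returns 0 on success, -1 on error. */
int flush_mtdblock(int fd)
{
    if (fsync(fd) < 0)
        return -1;
    if (ioctl(fd, BLKFLSBUF, 0) < 0)
        return -1;
    return 0;
}
```

The intended use is to open /dev/mtdblock0 with O_WRONLY, call
write_all() for the image, then flush_mtdblock() before carrying on,
without closing the descriptor in between.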

What's the point of mtdblock then?  All systems care about data integrity
to some degree (some more than others, obviously), so if mtdblock makes
no effort to preserve that integrity, where do you see it ever being
used legitimately?

Thanks very much for your comments.

--Doug.


