ubifs: sync() causes writes even if nothing is changed

Mon Jan 17 03:19:04 EST 2011

On 16/01/11 19:48, ext Artem Bityutskiy wrote:
> On Wed, 2010-10-13 at 18:30 +0200, Hans J. Koch wrote:
>> Running this command:
>>
>> # while true ; do sync; sleep 1; done
>>
>> causes two eraseblocks to be erased every second, although there
>> are no writes to the ubifs filesystem. I hacked some printks into
>> my NAND driver that print page_address and column for each erase.
>> With that, I get this output every second:
>>
>> ...
>> [   63.701765] erase p=0x0000ae40 c=0xffffffff
>> [   63.706534] erase p=0xffffffff c=0xffffffff
>> [   63.725492] erase p=0x0000ae80 c=0xffffffff
>> [   63.730260] erase p=0xffffffff c=0xffffffff
>> ...
>>
>> From a quick glance at the ubifs code, this might come out of the
>> garbage collector that is triggered on every sync() and writes
>> something even if nothing has changed.
>
> With nandsim I can see only one erase, but this is suboptimal anyway.
> The patch below should fix the issue; please test it if you can. I've also
> pushed it to ubifs-2.6.git.
>
>
> From dca0fe61489805e0eb4ada7c6922856ca91eae52 Mon Sep 17 00:00:00 2001
> From: Artem Bityutskiy<Artem.Bityutskiy at nokia.com>
> Date: Sun, 16 Jan 2011 19:22:02 +0200
> Subject: [PATCH] UBIFS: do not start the commit if there is nothing to commit
>
> This patch fixes suboptimal UBIFS 'sync_fs()' implementation which causes
> flash I/O even if the file-system is synchronized. E.g., a 'printk()'
> in the MTD erasure function (e.g., 'nand_erase_nand()') can show that
> for every 'sync' shell command UBIFS erases at least one eraseblock.
>
> So '$ while true; do sync; done' will cause a huge amount of flash I/O.
>
> The reason for this is that UBIFS commits in 'sync_fs()', and starts the
> commit even if there is nothing to commit, e.g., it changes the log
> anyway. This patch adds a check in the 'do_commit()' UBIFS function which
> prevents the commit if there are no dirty znodes (hence, nothing to
> commit).

Possibly the LPT should be checked as well.  Perhaps it can be dirty due
to trivial garbage collection even when the TNC is clean.

Also, have you checked that there are no degenerate cases where the commit
is required for some other reason, such as consolidating the log or the
recovery commit?
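
To make both questions concrete, here is a minimal user-space sketch in C
of a combined "nothing to commit" test.  It is a model only, not UBIFS
code: struct fs_model, its fields and commit_can_be_skipped() are
hypothetical names, and the dirty-LPT counter and the forced-commit flag
are assumptions about what such a check might need to consult.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state a "skip the commit" check might read. */
struct fs_model {
	bool root_znode_dirty;   /* stands in for DIRTY_ZNODE on the root znode */
	long dirty_znode_count;  /* stands in for c->dirty_zn_cnt */
	long dirty_lpt_nodes;    /* assumed: number of dirty LPT nodes */
	bool commit_forced;      /* assumed: log consolidation or recovery commit */
};

/* Return true only when skipping the commit loses nothing in this model. */
static bool commit_can_be_skipped(const struct fs_model *m)
{
	/* A commit needed for some other reason must never be skipped. */
	if (m->commit_forced)
		return false;

	/* A dirty root znode means the TNC has changes to commit. */
	if (m->root_znode_dirty)
		return false;

	/* A clean root implies no dirty znodes at all, as the patch asserts. */
	assert(m->dirty_znode_count == 0);

	/* The extra condition raised above: the LPT may be dirty on its own. */
	return m->dirty_lpt_nodes == 0;
}
```

In this model a clean TNC with a dirty LPT still commits, which is the
case the root-znode test on its own would not catch.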

>
> Reported-by: Hans J. Koch<hjk at linutronix.de>
> Signed-off-by: Artem Bityutskiy<Artem.Bityutskiy at nokia.com>
> ---
>   fs/ubifs/commit.c |   17 ++++++++++++++++-
>   1 files changed, 16 insertions(+), 1 deletions(-)
>
> diff --git a/fs/ubifs/commit.c b/fs/ubifs/commit.c
> index 02429d8..a963d96 100644
> --- a/fs/ubifs/commit.c
> +++ b/fs/ubifs/commit.c
> @@ -70,6 +70,21 @@ static int do_commit(struct ubifs_info *c)
>   		goto out_up;
>   	}
>
> +	/*
> +	 * Every file-system change changes the TNC, and makes the root znode
> +	 * dirty. So if the root znode is clean we can just return immediately
> +	 * because there must be nothing to commit. Note, we do not have to
> +	 * lock @c->tnc_mutex because we have @c->commit_sem in write mode,
> +	 * which guarantees that no one else can access TNC functions
> +	 * concurrently.
> +	 */
> +	if (!c->zroot.znode || !test_bit(DIRTY_ZNODE, &c->zroot.znode->flags)) {
> +		ubifs_assert(atomic_long_read(&c->dirty_zn_cnt) == 0);
> +		err = 0;
> +		up_write(&c->commit_sem);
> +		goto out_cancel;
> +	}
> +
>   	/* Sync all write buffers (necessary for recovery) */
>   	for (i = 0; i < c->jhead_cnt; i++) {
>   		err = ubifs_wbuf_sync(&c->jheads[i].wbuf);
> @@ -162,12 +177,12 @@ static int do_commit(struct ubifs_info *c)
>   	if (err)
>   		goto out;
>
> +out_cancel:
>   	spin_lock(&c->cs_lock);
>   	c->cmt_state = COMMIT_RESTING;
>   	wake_up(&c->cmt_wq);
>   	dbg_cmt("commit end");
>   	spin_unlock(&c->cs_lock);
> -
>   	return 0;
>
>   out_up: