Benchmarking JFFS2

Jarkko Lavinen jlavi at iki.fi
Thu May 2 08:56:02 EDT 2002


I have been running simple benchmark program that is based on over 10
years old Byte Unix file-system benchmark. I have modified the program
to measure time using gettimeofday(), report throughput better, collect
latency profiles and report about memory consumption.

I have then measured the write throughput relation to the block
size. I am using embedded device with Arm9 CPU @ 120 Mhz, 8MB of ram
and Intel 4MiB NOR flash (28F320B3T) and kernel 2.4.15. My JFFS2 code
is from CVS snapshot in February 2002 and may be too old as well as
the kernel.

I am attaching the test program, sample parameters and a figure of
results seen on particular device. The figure shows two curves. In the
first curve, an upside down "V" is show. In another curve there is a
bent "S" shape.

I first tried running the benchmark runs in sequence, with random
data. The best performance 27000 bytes/s was achieved using blocks
sizes of 1024 +- 512 bytes. With block size of 4KiB the performance is
only 50% of the peak throughput.

Then I run the test using natively formatted file-system. I erased the
flash partition, then mounted it, unmounted and then again mounted
it. After that I run only single benchmark run so that the file
created didn't fill the file-system and no overwriting occurred nor
garbage was produced. The performance increased steadily and at block
size multiples of 4KiB achieved about 56000 bytes/s. The raw writing
speed through /dev/mtd0 is 67000 B/s

Question 1:

Is the lack of performance at higher block sizes normal?

Question 2:

Is the lack of performance at higher blocks sizes due to garbage
collection?


I have also tried to run the tests using linear data that compresses
easily. I have encountered repeatedly very low memory and out of
memory condition and messages like "Memory fail", "deflateInit failed"
and when the memory really runs out repeated "Out of memory". I don't
think a benchmark program should be able to bring the system to its
knwws simply by exercising the file-system. I wouldn't bet on the
stability and maturity of the embedded device either.

To bring out this behavior I run the test with blocks sizes of 1K to
128K with proportional steps. Typically somewhere around 2K I start to
see first messages and at 4 KiB the system has run out of memory. It
is not enough to run single benchmark alone but effect seems to be
cumulative and requires sustained loading of the file-system.

Regards,
Jarkko Lavinen


------------------------------------------------------------------------------
Some output from the program when memory problems occur:

Running fstime 60 seconds, 1722 byte block size, linear data, max size 17500
Write test: 61.4s elapsed, 1218 blocks, 2.00 MB @ 33.4 Kbytes/sec written.
Read test: 60.0s elapsed, 628782 blocks, 1.01 GB @ 17.2 Mbytes/sec read.
Copy test: Memory fault

Running fstime 60 seconds, 3444 byte block size, linear data, max size 17500
Write test: 61.5s elapsed, 821 blocks, 2.70 MB @ 44.9 Kbytes/sec written.
Read test: 60.0s elapsed, 330471 blocks, 1.06 GB @ 18.1 Mbytes/sec read.
Copy test: deflateInit failed
deflateInit failed
deflateInit failed
60.2s elapsed, 1086 blocks, 3.57 MB @ 60.7 Kbytes/sec copied.
# 3444 45976.000000 18969034.000000 62160.281250 1777664 5234688 24576 2445312 1908736 5103616 24576 4820992 1581056 5431296 24576 5226496 2232320 4780032 24576 4587520 2232320 4780032 24576 2392064

Running fstime 60 seconds, 4096 byte block size, linear data, max size 17500
Write test: 60.8s elapsed, 1739 blocks, 6.79 MB @ 114 Kbytes/sec written.
Read test: 60.0s elapsed, 19406 blocks, 75.8 MB @ 1.26 Mbytes/sec read.
deflateInit failed
Copy test: deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
Out of Memory: Killed process 8 (sh).
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
Out of Memory: Killed process 8 (sh).
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
deflateInit failed
Out of Memory: Killed process 8 (sh).
Out of Memory: Killed process 8 (sh).
Out of Memory: Killed process 8 (sh).
-------------- next part --------------
A non-text attachment was scrubbed...
Name: combined.png
Type: image/png
Size: 5142 bytes
Desc: not available
Url : http://lists.infradead.org/pipermail/linux-mtd/attachments/20020502/54627a5b/attachment.png 
-------------- next part --------------
/* fstest.c. baed on old Byte Unix filsystem bechmark */

#include <stdio.h>
#include <signal.h>
#include <errno.h>
#include <sys/time.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define SECONDS      10
#define BUFF_SIZE    131072
#define BLOCK_SIZE   1024 /* default block size */
#define MAX_BLOCKS   2000  /* max number of BUFF_SIZE blocks in file */
#define MAX_SIZE     3584  /* Some default flash size limit */
#define LATENCYSCALE 6
#define FILE0        "testfile0"
#define FILE1        "testfile1"

/* Globals */
int  block_size = BLOCK_SIZE;
int  max_size =   MAX_SIZE;
int  max_blocks = MAX_BLOCKS;
int  seconds    = SECONDS;
int  sigalarm   = 0;
int  verbose    = 0;

char buf[BUFF_SIZE];
int  latencylogging = 0;
int  fd0, fd1;
unsigned w_latencies[LATENCYSCALE][100], 
  c_latencies[LATENCYSCALE][100];

struct fs_stat_st {
  float speed;
  unsigned long used[4];
} w_stat, r_stat, c_stat;

int w_test(void);
int r_test(void);
int c_test(void);
void stop_count(int);
void clean_up(int prob);

float elapsed_seconds(struct timeval *start, struct timeval *end)
{
  float secs = (end->tv_sec - start->tv_sec) + 
    (end->tv_usec - start->tv_usec) / 1000000.0;
  if (secs < 0) secs = 0.0;
  return secs;
}

static void floatprint(FILE *fp, float f, int decimals)
{
  char fmt[10];
  int intdigits;

  if (f >= 100.0)
    intdigits=3;
  else if(f >= 10.0)
    intdigits=2;
  else
    intdigits=1;
		
  decimals -= intdigits;
  if (decimals < 0)
    decimals = 0;
  sprintf(fmt, "%%.%df ", decimals);	
  fprintf(fp, fmt, f);
}

static void floatscaleprint(FILE *fp, float f, int decimals)
{
  if (f >= 1024*1024*1024.0) {
    floatprint(stderr, f/(1024*1024*1024.0), decimals);
    fputc('G', stderr);
  } else if (f >= 1024*1024.0) {
    floatprint(stderr, f/(1024*1024.0), decimals);
    fputc('M', stderr);
  } else if (f >= 1024.0) {
    floatprint(stderr, f/1024.0, decimals);
    fputc('K', stderr);
  } else floatprint(stderr, f, decimals);
}

void skipff(FILE *fp)
{
  int c;
	
  c = getc(fp);
  while (c != EOF && c != '\n')
    c = getc(fp);
}

static void freemem(unsigned long *used)
{
  FILE *fp;
  int n;
	
  if (used == NULL)
    return;

  if ((fp = fopen("/proc/meminfo", "r")) == NULL) {
    fprintf(stderr, "Cannot open /proc/meminfo\n");
    exit(10);
  }

  skipff(fp);

  n = fscanf(fp, "Mem: %*u %lu %lu %*u %lu %lu",
	     &used[0], &used[1], &used[2], &used[3]);

  if (n >= 4) {
    used[1] += used[2] + used[3];
    used[0] -= used[2] + used[3];
  }

  fclose(fp);
}

static void report(float elapsed, int n_blocks, char *opdonename,
		   struct fs_stat_st *stat)
{
  float size = ((float) n_blocks * block_size);
  float speed = size / elapsed;

  if (verbose) {
    fprintf(stderr, "%.1fs elapsed, %d blocks, ",
	    elapsed, n_blocks);

    floatscaleprint(stderr, size, 3);
    fprintf(stderr,"B @ ");

    floatscaleprint(stderr, speed, 3);
    fprintf(stderr, "bytes/sec %s.\n", opdonename);
  }

  if (stat != NULL) {
    stat->speed = speed;
    freemem(stat->used);
  }
	       
  return;
}

static void loglatency(unsigned latencies[LATENCYSCALE][100],
		       struct timeval *before,
		       struct timeval *after)
{
  long usecs = (after->tv_sec - before->tv_sec)*1000000+
    (after->tv_usec - before->tv_usec);
  unsigned ix;

  if (usecs < 0)
    usecs = 0;

  if (usecs < 1000) { /* below 1 ms */
    ix = usecs / 10; /* max 990 us => 99 */
    latencies[0][ix]++;
  } else if (usecs < 10000) { /* 1 .. 10 ms */
    ix = usecs / 100; /* min 1000 us, max 9900 us  */
    latencies[1][ix]++;
  } else if (usecs < 100000) { /* 10.0 .. 99.9 ms */
    ix = usecs / 1000; /* max 99000 us */
    latencies[2][ix]++;
  } else if (usecs < 1000000) { /* 100 .. 999 ms */
    ix = usecs / 10000; /* min 100 000 max 990 000 us */
    latencies[3][ix]++;
  } else if (usecs < 10000000) { /* 1.. 9.9s s */
    ix = usecs / 100000; /* max 9 900 000 us */
    latencies[4][ix]++;
  } else { /* 10 .. 999 s */
    ix = usecs / 1000000; /* max 99 0900 000 us */
    if (ix > 99)
      ix = 99 ;
    latencies[5][ix]++;
  }
}

void latencystats(char *name, unsigned latencies[LATENCYSCALE][100])
{
  float lat,sum = 0.0;
  float min = 1e6, max = 0.0;
  unsigned cnt = 0, medcnt = 0;
  int i, j;
  char *fmt[] = {"%.2f %u\n", "%.1f %u\n", "%.0f %u\n"};
  float mul[LATENCYSCALE] = {0.01, 0.1, 1.0, 10, 100, 1000}; 
  float prev, median=-1;
	
  printf("Latency profile for %s\n", name);

  for(j = 0; j < LATENCYSCALE; j++) {
    char *s = fmt[j < 2 ? j : 2 ];

    /* The first latency array covers 0..990 us with 10us granularity,
       the next 1.0 .. 9.9 ms with 0.1ms granularty, then 10 .. 99 ms with 1ms
       granularty, then 100 .. 990 with 10 ms and so on. */
    for(i = (j > 0 ? 10 : 0); i < 100; i++)
      if (latencies[j][i]) {
	lat = i * mul[j];
	sum += lat*latencies[j][i];
	cnt += latencies[j][i];
	if (lat < min)
	  min = lat;
	if (lat > max)
	  max = lat;

	printf(s, lat, latencies[j][i]);
      }	
  }

  if (cnt > 0) {		
    prev=0;
    for(j = 0; j < LATENCYSCALE; j++) {
      for(i = (j > 0 ? 10 : 0); i < 100; i++) {
	if ((medcnt + latencies[j][i])*2 > cnt) {
	  median = (prev*medcnt + 
		    (i+1)*mul[j]*(cnt - medcnt)) / cnt;
	  goto loopout;
	}
	medcnt += latencies[j][i];
	prev = i*mul[j];
      }
    }
  loopout:

    printf("Avg latency:     %9.3g ms\n", sum / cnt);
    printf("Median latency:  %9.3g ms\n", median);
    printf("Minimum latency: %9g ms\n", min);
    printf("Maximum latency: %9g ms\n", max);
  }
}



/******************** MAIN ****************************/

int main(int argc, char *argv[])
{
  int random = 0;
  unsigned long begin[4], end[4];
  int i, c;
  char *dir = NULL;
	
  char *usage="fstime [-t seconds] [-b blocks] [-d dir] [-m max size]\n";

  do {
    /* Use posixly correct flag '+' */
    c = getopt (argc, argv, "b:t:d:m:vrh?l");
    switch (c) {
    case -1:
      break;
    case 'v':
      verbose++;
      break;
    case 'b':
      block_size = atoi(optarg);
      break;
    case 'd':
      dir=optarg;
      break;

    case 't':
      seconds = atoi(optarg);
      break;
    case 'r':
      random=1;
      break;
    case 'm':
      max_size=atoi(optarg);
      break;
    case 'l':
      latencylogging = 1;
      break;
    case 'h':
    case '?':
      fputs(usage, stderr);
      exit(0);
	
    default:
      printf ("Unknown option '%c'\n", c);
      break;
    }
  } while (c != -1);

  /**** initialize ****/
  if (seconds <= 0) {
    fprintf(stderr, "No time given\n");
    exit(1);
  }
	
  if (dir == NULL) {
    fprintf(stderr, "No test directory given\n");
    exit(1);
  }
  if (chdir(dir) == -1) {
    perror("fstime: chdir");
    exit(1);
  }

  if (block_size <= 0 || block_size > BUFF_SIZE) {
    fprintf(stderr, 
	    "Illegal blocksize. Must be +- 1..%d\n",
	    BUFF_SIZE);
    exit(1);
  }

  max_blocks = max_size*1024 / block_size;

  freemem(&begin[0]);

  if (verbose)
    printf("Running fstime %d seconds, %d byte block size, %s data, max size %d\n",
	   seconds,
	   block_size,
	   random?"random":"linear",
	   max_size);
  
  if ((fd0 = open(FILE0, O_CREAT | O_TRUNC | O_RDWR, 0666)) == -1) {
    perror("fstime: open");
    exit(1);
  }

  if ((fd1 = open(FILE1, O_CREAT|O_TRUNC|O_RDWR, 0666)) == -1) {
    perror("fstime: open");
    exit(1);
  }

  /* fill buffer with random or linear data */
  for (i = 0; i < BUFF_SIZE; i++) {
    if (random)
      buf[i] = rand() & 0xff;
    else
      buf[i] = i & 0xff;
  }
	
  if (latencylogging) 
    for (i = 0; i < 100; i++) {
      int j;
      for(j = 0; j < LATENCYSCALE; j++) {
	w_latencies[j][i] = 0;
	c_latencies[j][i] = 0;
      }
    }
	

  signal(SIGKILL, clean_up);
  if(w_test() || r_test() || c_test()) { 
    clean_up(0);
    exit(1);
  }
  
  clean_up(0);
  freemem(&end[0]);
  
  if (verbose > 1) 
    printf("Fields:\t 1  blocksize\n"
	   "\t 2 write speed\n"
	   "\t 3 read speed\n"
	   "\t 4 copy speed\n"
			    
	   "\t 5 bytes available at begin before opening files\n"
	   "\t 6 bytes used at begin\n"
	   "\t 7 bytes buffered at begin\n"
	   "\t 8 bytes cached at begin\n"
			    
	   "\t 9 bytes used after write test\n"
	   "\t10 bytes available after write test\n"
	   "\t11 bytes buffered after write test\n"
	   "\t12 bytes cached after write test\n"
			
	   "\t13 bytes used after read test\n"
	   "\t14 bytes available after read test\n"
	   "\t15 bytes buffered after read test\n"
	   "\t16 bytes cached after read test\n"
			
	   "\t17 bytes used after copy test\n"
	   "\t18 bytes available after copy test\n"
	   "\t19 bytes buffered after copy test\n"
	   "\t20 bytes cached after copy test\n"
			
	   "\t21 bytes used at the end after closing files\n"
	   "\t22 bytes available at the end\n"
	   "\t23 bytes buffered at the end\n"
	   "\t24 bytes cached at the end\n");

  printf("# %d %f %f %f ", 
	 block_size, w_stat.speed, r_stat.speed, c_stat.speed);

  printf("%lu %lu %lu %lu ",
	 begin[0], begin[1], begin[2], begin[3]);

  printf("%lu %lu %lu %lu ",
	 w_stat.used[0],
	 w_stat.used[1],
	 w_stat.used[2],
	 w_stat.used[3]);

  printf("%lu %lu %lu %lu ",
	 r_stat.used[0],
	 r_stat.used[1],
	 r_stat.used[2],
	 r_stat.used[3]);

  printf("%lu %lu %lu %lu ",
	 c_stat.used[0],
	 c_stat.used[1],
	 c_stat.used[2],
	 c_stat.used[3]);

  printf("%lu %lu %lu %lu\n",
	 end[0], end[1], end[2], end[3]);

  if (latencylogging) {
    latencystats("write latencies in the write test", w_latencies);
    latencystats("write latencies in the copy test",  c_latencies);
  }

  exit(0);
}

/* write test */
int w_test() 
{
  long n_blocks = 0L;
  int f_blocks;
  struct timeval start, end, l0, l1;
  int status;

#ifdef USE_SYNC 
  sync();
  sleep(5); /* to insure the sync */
#endif

  if (verbose)
    fprintf(stderr, "Write test: ");
  signal(SIGALRM,stop_count);
  sigalarm = 0; /* reset alarm flag */
  alarm(seconds);
  gettimeofday(&start, NULL);
  while(!sigalarm) {
      /* On first alarm may break the loop before complete. On
	 further rounds must go through */
      for(f_blocks=0; 
	  f_blocks < max_blocks &&
	    (!sigalarm || n_blocks >= max_blocks);
	  ++f_blocks) {
	  if (latencylogging) {
	    gettimeofday(&l0, NULL);
	    status = write(fd0, buf, block_size);
	    gettimeofday(&l1, NULL);
	    loglatency(w_latencies, &l0, &l1);
	  } else
	    status = write(fd0, buf, block_size);

	  if (status < 0) {
	    if (errno != EINTR) {
	      perror("fstime: write");
	      return(-1);
	    } else stop_count(0);
	  }
	  ++ n_blocks;
	}
      lseek(fd0, SEEK_SET, 0); /* rewind */
    }
  /* stop clock */
  gettimeofday(&end, NULL);
  report(elapsed_seconds(&start, &end), n_blocks, "written",
	 &w_stat);
  return(0);
}

/* read test */
int r_test()
{
  long n_blocks = 0L;
  int n_read;
  extern int errno;
  struct timeval start, end;
  
  if (verbose)
    fprintf(stderr, "Read test: ");

  /* rewind */
#ifdef USE_SYNC
  sync();
  sleep(10+seconds/2);
#endif

  errno = 0;
  lseek(fd0, SEEK_SET, 0);

  signal(SIGALRM,stop_count);
  sigalarm = 0; /* reset alarm flag */
  alarm(seconds);
  gettimeofday(&start, NULL);
  while(!sigalarm) {
    /* read while checking for an error */
    n_read = read(fd0, buf, block_size);
    if (n_read == 0) {
      lseek(fd0, SEEK_SET, 0);  /* rewind at end of file */
      continue;
    } else if (n_read < 0)
      switch(errno) {
      case EINVAL:
	lseek(fd0, SEEK_SET, 0);  /* rewind at end of file */
	continue;
	break;
      case EINTR:
	stop_count(0);
	break;
      default:
	perror("fstime: read");
	return(-1);
	break;
      }
    ++ n_blocks;
  }
  /* stop clock */
  gettimeofday(&end, NULL);
  report(elapsed_seconds(&start, &end), n_blocks, "read", &r_stat);
  return(0);
}


/* copy test */
int c_test() 
{
  long n_blocks = 0L;
  struct timeval start, end, l0, l1;
  int n_read, n_written;

  if (verbose)
    fprintf(stderr, "Copy test: ");
  /* rewind */
#ifdef USE_SYNC
  sync();
  sleep(10+seconds/2); /* to insure the sync */
#endif
  lseek(fd0, SEEK_SET, 0);

  signal(SIGALRM,stop_count);
  sigalarm = 0; /* reset alarm flag */
  alarm(seconds);
  gettimeofday(&start, NULL);
  while(!sigalarm) {
    n_read = read(fd0, buf, block_size);
    if (n_read == 0) { /* EOF */
      lseek(fd0, SEEK_SET, 0);  /* rewind at end of file */
      lseek(fd1, SEEK_SET, 0);  /* rewind the output as well */
      continue;
    } else if (n_read < 0) {
      switch(errno) {
      case 0:
      case EINVAL:
	  lseek(fd0, SEEK_SET, 0);  /* rewind at end of file */
	  lseek(fd1, SEEK_SET, 0);  /* rewind the output as well */
	  continue;
	  break;
      case EINTR:
	stop_count(0);
	break;
      default:
	fprintf(stderr, "fstime: copy read (%d): %s\n",
		errno, strerror(errno));
	return(-1);
	break;
      }
    }

    if (latencylogging) {
      gettimeofday(&l0, NULL);
      n_written = write(fd1, buf, block_size);
      gettimeofday(&l1, NULL);
      loglatency(c_latencies, &l0, &l1);
    } else
      n_written = write(fd1, buf, block_size);
    
    if (n_written < 0) {
      if (errno == ENOSPC) {
	printf("fstime: copy write: %s at block %ld, max blocks %d.\n",
	       strerror(errno), n_blocks, max_blocks);
	system("df .");
	break; /* FS full */
      } else if (errno != EINTR) {
	fprintf(stderr, "fstime: copy write (%d): %s\n",
		errno, strerror(errno));
	return(-1);
      } else stop_count(0);
    } else
      ++ n_blocks;
  }
  /* stop clock */
  gettimeofday(&end, NULL);
  report(elapsed_seconds(&start, &end), n_blocks, "copied", &c_stat);
  return(0);
}

void stop_count(int i)
{
  sigalarm = 1;
}

void clean_up(int prob)
{  
  if (close(fd0) || close(fd1))
    perror("fstime: close");

  if (unlink(FILE0) || unlink(FILE1))
    perror("fstime: unlink");
}
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test-rnd.sh
Type: application/x-sh
Size: 19462 bytes
Desc: not available
Url : http://lists.infradead.org/pipermail/linux-mtd/attachments/20020502/54627a5b/attachment.sh 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test-lin.sh
Type: application/x-sh
Size: 1476 bytes
Desc: not available
Url : http://lists.infradead.org/pipermail/linux-mtd/attachments/20020502/54627a5b/attachment-0001.sh 


More information about the linux-mtd mailing list