[LEDE-DEV] [PATCH] jshn: add functionality to read big JSON

Sat Nov 18 05:55:49 PST 2017

Am Fri, 17 Nov 2017 09:17:39 +0200
schrieb Alexandru Ardelean <ardeleanalex at gmail.com>:

> On Thu, Nov 16, 2017 at 8:52 PM, Christian Beier <dontmind at freeshell.org>
> wrote:
> > The existing read functionality feeds the complete JSON to jshn as a
> > cmdline argument, leading to `-ash: jshn: Argument list too long`
> > errors for JSONs bigger than ca. 100KB.
> >
> > This commit adds the ability to read the JSON directly from a file if
> > wanted, removing this shell-imposed size limit.
> >
> > Tested on x86-64 and ar71xx. An mmap()-based solution was also evaluated,
> > but found to make no performance difference on either platform.
> >
> > Signed-off-by: Christian Beier <dontmind at freeshell.org>
> > ---
> >  jshn.c     | 30 ++++++++++++++++++++++++++++--
> >  sh/jshn.sh |  4 ++++
> >  2 files changed, 32 insertions(+), 2 deletions(-)
> >
> > diff --git a/jshn.c b/jshn.c
> > index 79136dd..a3866a0 100644
> > --- a/jshn.c
> > +++ b/jshn.c
> > @@ -25,6 +25,8 @@
> >  #include <stdbool.h>
> >  #include <ctype.h>
> >  #include <getopt.h>
> > +#include <sys/stat.h>
> > +#include <fcntl.h>
> >  #include "list.h"
> >
> >  #include "avl.h"
> > @@ -300,7 +302,7 @@ out:
> >
> >  static int usage(const char *progname)
> >  {
> > -       fprintf(stderr, "Usage: %s [-n] [-i] -r <message>|-w\n", progname);
> > +       fprintf(stderr, "Usage: %s [-n] [-i] -r <message>|-R <file>|-w\n",
> > progname); return 2;
> >  }
> >
> > @@ -333,6 +335,10 @@ int main(int argc, char **argv)
> >         struct env_var *vars;
> >         int i;
> >         int ch;
> > +       int fd;
> > +       struct stat sb;
> > +       char *fbuf;
> > +       int ret;
> >
> >         avl_init(&env_vars, avl_strcmp_var, false, NULL);
> >         for (i = 0; environ[i]; i++);
> > @@ -354,7 +360,7 @@ int main(int argc, char **argv)
> >                 avl_insert(&env_vars, &vars[i].avl);
> >         }
> >
> > -       while ((ch = getopt(argc, argv, "p:nir:w")) != -1) {
> > +       while ((ch = getopt(argc, argv, "p:nir:R:w")) != -1) {
> >                 switch(ch) {
> >                 case 'p':
> >                         var_prefix = optarg;
> > @@ -362,6 +368,26 @@ int main(int argc, char **argv)
> >                         break;
> >                 case 'r':
> >                         return jshn_parse(optarg);
> > +               case 'R':
> > +                       if((fd = open(optarg, O_RDONLY)) == -1) {
> > +                           fprintf(stderr, "Error opening %s\n", optarg);
> > +                           return 3;
> > +                       }
> > +                       if (fstat(fd, &sb) == -1) {
> > +                           fprintf(stderr, "Error getting size of %s\n",
> > optarg);
> > +                           close(fd);
> > +                           return 3;
> > +                       }
> > +                       if(!(fbuf = malloc(sb.st_size)) || read(fd, fbuf,
> > sb.st_size) != sb.st_size) {
> > +                           fprintf(stderr, "Error reading %s\n", optarg);
> > +                           free(fbuf);
> > +                           close(fd);
> > +                           return 3;
> > +                       }
> > +                       ret = jshn_parse(fbuf);
> > +                       free(fbuf);
> > +                       close(fd);
> > +                       return ret;  
> nitpick/preference: I would move this code into a file, so that the
> case statement is not too loaded
> it would allow for a simpler/cleaner code-path with some goto
> statements [if put into a function]

Yeah, I tried keeping all those error case printf and return blocks as minimal
as possible, though it still boils down to 3. Might be an idea to factor out the
file-opening-and-reading into a separate function to keep the switch statement
cleaner, but then OTOH it'd be a function with only one caller. 

Also, I was unsure if it'd be overkill to assign different exit codes to the
different error conditions, so I simply used the first unused one (3) for all
of them. 

> 
> >                 case 'w':
> >                         return jshn_format(no_newline, indent);
> >                 case 'n':
> > diff --git a/sh/jshn.sh b/sh/jshn.sh
> > index bf76edb..0a146e1 100644
> > --- a/sh/jshn.sh
> > +++ b/sh/jshn.sh
> > @@ -174,6 +174,10 @@ json_load() {
> >         eval "`jshn -r "$1"`"
> >  }
> >
> > +json_load_file() {
> > +       eval "`jshn -R "$1"`"  
> would it be an idea to try to use $JSON_PREFIX here ?
> for cases when you need to change the JSON namespace, the JSON_PREFIX
> var is used if available
> it would definitely complete this function

TBH, I don't really know what JSON_PREFIX is for. I simply used the existing
json_load() function as a blueprint.

> 
> > +}
> > +
> >  json_dump() {
> >         jshn "$@" ${JSON_PREFIX:+-p "$JSON_PREFIX"} -w
> >  }
> > --
> > 2.11.0
> >  
> 
> from my side [as a simple reviewer], this looks pretty neat
> 
> thanks :)

cool, thanks!

> 
> Alex
> 
> >
> > _______________________________________________
> > Lede-dev mailing list
> > Lede-dev at lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/lede-dev  

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 195 bytes
Desc: Digitale Signatur von OpenPGP
URL: <http://lists.infradead.org/pipermail/lede-dev/attachments/20171118/6a112d8c/attachment-0001.sig>