[LEDE-DEV] [PATCH] jshn: add functionality to read big JSON

Alexandru Ardelean ardeleanalex at gmail.com
Sun Nov 19 03:57:46 PST 2017


On Sat, Nov 18, 2017 at 3:55 PM, Christian Beier <dontmind at sdf.org> wrote:
> Am Fri, 17 Nov 2017 09:17:39 +0200
> schrieb Alexandru Ardelean <ardeleanalex at gmail.com>:
>
>> On Thu, Nov 16, 2017 at 8:52 PM, Christian Beier <dontmind at freeshell.org>
>> wrote:
>> > The existing read functionality feeds the complete JSON to jshn as a
>> > cmdline argument, leading to `-ash: jshn: Argument list too long`
>> > errors for JSONs bigger than ca. 100KB.
>> >
>> > This commit adds the ability to read the JSON directly from a file if
>> > wanted, removing this shell-imposed size limit.
>> >
>> > Tested on x86-64 and ar71xx. An mmap()-based solution was also evaluated,
>> > but found to make no performance difference on either platform.
>> >
>> > Signed-off-by: Christian Beier <dontmind at freeshell.org>
>> > ---
>> >  jshn.c     | 30 ++++++++++++++++++++++++++++--
>> >  sh/jshn.sh |  4 ++++
>> >  2 files changed, 32 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/jshn.c b/jshn.c
>> > index 79136dd..a3866a0 100644
>> > --- a/jshn.c
>> > +++ b/jshn.c
>> > @@ -25,6 +25,8 @@
>> >  #include <stdbool.h>
>> >  #include <ctype.h>
>> >  #include <getopt.h>
>> > +#include <sys/stat.h>
>> > +#include <fcntl.h>
>> >  #include "list.h"
>> >
>> >  #include "avl.h"
>> > @@ -300,7 +302,7 @@ out:
>> >
>> >  static int usage(const char *progname)
>> >  {
>> > -       fprintf(stderr, "Usage: %s [-n] [-i] -r <message>|-w\n", progname);
>> > +       fprintf(stderr, "Usage: %s [-n] [-i] -r <message>|-R <file>|-w\n",
>> > progname); return 2;
>> >  }
>> >
>> > @@ -333,6 +335,10 @@ int main(int argc, char **argv)
>> >         struct env_var *vars;
>> >         int i;
>> >         int ch;
>> > +       int fd;
>> > +       struct stat sb;
>> > +       char *fbuf;
>> > +       int ret;
>> >
>> >         avl_init(&env_vars, avl_strcmp_var, false, NULL);
>> >         for (i = 0; environ[i]; i++);
>> > @@ -354,7 +360,7 @@ int main(int argc, char **argv)
>> >                 avl_insert(&env_vars, &vars[i].avl);
>> >         }
>> >
>> > -       while ((ch = getopt(argc, argv, "p:nir:w")) != -1) {
>> > +       while ((ch = getopt(argc, argv, "p:nir:R:w")) != -1) {
>> >                 switch(ch) {
>> >                 case 'p':
>> >                         var_prefix = optarg;
>> > @@ -362,6 +368,26 @@ int main(int argc, char **argv)
>> >                         break;
>> >                 case 'r':
>> >                         return jshn_parse(optarg);
>> > +               case 'R':
>> > +                       if((fd = open(optarg, O_RDONLY)) == -1) {
>> > +                           fprintf(stderr, "Error opening %s\n", optarg);
>> > +                           return 3;
>> > +                       }
>> > +                       if (fstat(fd, &sb) == -1) {
>> > +                           fprintf(stderr, "Error getting size of %s\n",
>> > optarg);
>> > +                           close(fd);
>> > +                           return 3;
>> > +                       }
>> > +                       if(!(fbuf = malloc(sb.st_size)) || read(fd, fbuf,
>> > sb.st_size) != sb.st_size) {
>> > +                           fprintf(stderr, "Error reading %s\n", optarg);
>> > +                           free(fbuf);
>> > +                           close(fd);
>> > +                           return 3;
>> > +                       }
>> > +                       ret = jshn_parse(fbuf);
>> > +                       free(fbuf);
>> > +                       close(fd);
>> > +                       return ret;
>> nitpick/preference: I would move this code into a file, so that the
>> case statement is not too loaded
>> it would allow for a simpler/cleaner code-path with some goto
>> statements [if put into a function]
>
> Yeah, I tried keeping all those error case printf and return blocks as minimal
> as possible, though it still boils down to 3. Might be an idea to factor out the
> file-opening-and-reading into a separate function to keep the switch statement
> cleaner, but then OTOH it'd be a function with only one caller.
>
> Also, I was unsure if it'd be overkill to assign different exit codes to the
> different error conditions, so I simply used the first unused one (3) for all
> of them.
>
>>
>> >                 case 'w':
>> >                         return jshn_format(no_newline, indent);
>> >                 case 'n':
>> > diff --git a/sh/jshn.sh b/sh/jshn.sh
>> > index bf76edb..0a146e1 100644
>> > --- a/sh/jshn.sh
>> > +++ b/sh/jshn.sh
>> > @@ -174,6 +174,10 @@ json_load() {
>> >         eval "`jshn -r "$1"`"
>> >  }
>> >
>> > +json_load_file() {
>> > +       eval "`jshn -R "$1"`"
>> would it be an idea to try to use $JSON_PREFIX here ?
>> for cases when you need to change the JSON namespace, the JSON_PREFIX
>> var is used if available
>> it would definitely complete this function
>
> TBH, I don't really know what JSON_PREFIX is for. I simply used the existing
> json_load() function as a blueprint.

oh right ;
disregard what i was saying ;
i was looking at the json_dump() function

JSON_PREFIX is used for changing the namespace;
that allows to load into shell vars multiple JSON instances, and
switch between them with the json_set_namespace() function ;
it's useful when you need to work with more than 1 JSON objects/instances

>
>>
>> > +}
>> > +
>> >  json_dump() {
>> >         jshn "$@" ${JSON_PREFIX:+-p "$JSON_PREFIX"} -w
>> >  }
>> > --
>> > 2.11.0
>> >
>>
>> from my side [as a simple reviewer], this looks pretty neat
>>
>> thanks :)
>
> cool, thanks!
>
>>
>> Alex
>>
>> >
>> > _______________________________________________
>> > Lede-dev mailing list
>> > Lede-dev at lists.infradead.org
>> > http://lists.infradead.org/mailman/listinfo/lede-dev
>



More information about the Lede-dev mailing list