Shell script to get PIDs from schedules

Charles Johnson cehjohnson at gmail.com
Sun Nov 2 03:34:03 PST 2014


On 02/11/14 08:52, Chris Allison wrote:
> Peter,
>
> some good ideas there, but there is no need to scrape the web pages
> when all the schedule info you could possibly need is available in
> xml, json and yaml files at urls of this form:
>
> www.bbc.co.uk/radio4/programmes/schedules/fm/this_week.json
> www.bbc.co.uk/radio4extra/programmes/schedules/2014/11/1.json
> www.bbc.co.uk/bbcfour/programmes/schedules/last_week.json
>
> etc.
Thanks for that, Chris. That first link got me excited enough to start
experimenting with the JSON parsing utility 'jq'.

A pipeline like the following will produce all the titles, pids and
synopses:

wget -O - http://www.bbc.co.uk/radio4/programmes/schedules/fm/this_week.json |
  jq '.[] | .[] | .[] | .[] | .programme as $P |
      $P.display_titles.title,$P.short_synopsis,$P.pid'

So taking just the last six lines of that output with

wget -q -O - http://www.bbc.co.uk/radio4/programmes/schedules/fm/this_week.json |
  jq '.[] | .[] | .[] | .[] | .programme as $P |
      $P.display_titles.title,$P.short_synopsis,$P.pid' | tail -n 6

will get you the following:

============
"The Film Programme"
"Director Mike Leigh discusses art and movie-making in his latest film Mr Turner."
"b04mgxtq"
"Something Understood"
"Mark Tully debates the cultural benefits of classical music with composer James MacMillan."
"b04n2fmh"
============
============
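
Since the values come out in groups of three (title, synopsis, pid), they
can be folded into one record per programme. What follows is only a rough
sketch, not part of the pipeline above: the function name pids_and_titles
is made up for illustration, and it assumes jq prints each value on a line
of its own. It also strips every double quote, so a title containing an
escaped quote would not survive cleanly; passing -r to jq to get raw
strings would be the tidier route.

```shell
# Hypothetical helper: read title/synopsis/pid triples on stdin and
# print "pid<TAB>title" for each programme.
pids_and_titles() {
    # paste joins every three input lines into one tab-separated record;
    # awk then strips the JSON quotes and reorders the fields.
    paste - - - |
        awk -F '\t' '{ gsub(/"/, "", $1); gsub(/"/, "", $3); print $3 "\t" $1 }'
}

# Sample input copied from the output above.
printf '%s\n' \
    '"The Film Programme"' \
    '"Director Mike Leigh discusses art and movie-making in his latest film Mr Turner."' \
    '"b04mgxtq"' | pids_and_titles
```

In real use the function would simply go on the end of the wget | jq
pipeline in place of the printf sample.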

Regards,

Charles



