Shell script to get PIDs from schedules

Dirk Husemann dirk+getiplayer at d2h.net
Sun Nov 2 10:17:27 PST 2014


This one seems to work:

% curl -v -v http://www.bbc.co.uk/iplayer/js/episode/b04nhkz9

> GET /iplayer/js/episode/b04nhkz9 HTTP/1.1
> User-Agent: curl/7.37.1
> Host: www.bbc.co.uk
> Accept: */*
>
< HTTP/1.1 200 OK
* Server Apache is not blacklisted
< Server: Apache
< Content-Type: application/json
< Etag: "0336180592c132763a48612b843431b3"
< X-PAL-Host: pal120.telhc.bbc.co.uk:80
< X-Ua-Compatible: IE=edge
< Content-Length: 224
< Date: Sun, 02 Nov 2014 18:15:34 GMT
< Connection: keep-alive
< X-Cache-Action: MISS
< X-Cache-Age: 0
< Cache-Control: private, max-age=0, must-revalidate
< Vary: X-CDN,Accept-Language,Accept-Encoding
<
{ [data not shown]
100   224  100   224    0     0    858      0 --:--:-- --:--:-- --:--:--   864
* Connection #0 to host www.bbc.co.uk left intact
{"id":"b04nhkz9","title":"The Apprentice: You're Fired","subtitle":"Series 10: Episode 4","synopsis":"Dara O Briain is joined by Radio 1's Matt Edmondson and comedian Romesh Ranganathan.","tleo":"b007qgcl","versions":["HD"]}

No deprecation warning there: the response carries no X-Aps-Deprecation-Notice header.
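
A minimal sketch of how this check could be scripted rather than read off by eye. The function names has_deprecation_notice and json_field are made up for illustration. The sed-based field extraction is an assumption that only holds for flat, single-line JSON without escaped quotes, like the response shown above; a real JSON parser would be more robust.

```shell
#!/bin/sh

# Reads HTTP response headers on stdin. Exits 0 if an
# X-Aps-Deprecation-Notice header is present, non-zero otherwise.
# The match is case-insensitive because header names are.
has_deprecation_notice() {
    grep -qi '^x-aps-deprecation-notice:'
}

# Reads single-line JSON on stdin and prints the string value of the
# key given as $1. Assumes the value contains no escaped quotes.
json_field() {
    sed -n 's/.*"'"$1"'":"\([^"]*\)".*/\1/p'
}

# Possible usage (needs network access, so it is left commented out):
#   url=http://www.bbc.co.uk/iplayer/js/episode/b04nhkz9
#   # -D - writes the response headers to stdout, -o /dev/null drops the body
#   curl -s -D - -o /dev/null "$url" | has_deprecation_notice \
#       && echo "deprecation notice present" \
#       || echo "no deprecation notice"
#   curl -s "$url" | json_field id
```

Against the response shown above, json_field id should print the PID b04nhkz9, and json_field title the programme title.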


On 2014-11-02 17:24, artisticforge . wrote:
> hello;
>
> JSON, XML and YAML all have the following in the headers sent by the
> server, which most people would never see.
>
> HTTP/1.1 200 OK
> Server: Apache
> Content-Type: application/x-yaml
> Access-Control-Allow-Origin: *
> X-PAL-Host: pal131.telhc.bbc.co.uk:80
> X-UA-Compatible: IE=edge
> X-Aps-Deprecation-Notice: APS is soon to be deprecated. It will first
> of all cease to be supported on a 24/7 basis, and will then cease
> responding entirely. Nitro is the BBC's new API for programme data,
> and can provide all the information previously provided by APS. Go
> here to read more: http://developer.bbc.co.uk/nitro
> Cache-Control: private, max-age=0, no-store
> Content-Length: 495441
> Date: Sun, 02 Nov 2014 16:17:13 GMT
> Connection: keep-alive
> X-Cache-Action: MISS
> X-Cache-Age: 0
> Vary: X-CDN,Accept-Encoding
>
> Basically, JSON, XML and YAML may disappear at any time. We would then be
> left in the same position in which we recently found ourselves.
>
> So one viable long term option is to start parsing the HTML version of
> the programme schedules.
>
> In my opinion it is better to start now than to wait for "the sky is
> falling, the sky is falling, we are doomed".
>
>
> On Sun, Nov 2, 2014 at 9:52 AM, Jeremy Nicoll - ml get_iplayer
> <jn.ml.gti.91 at wingsandbeaks.org.uk> wrote:
>> "Terry L. Ridder" <artisticforge at gmail.com> wrote:
>>
>>> Hello
>>>
>>> I may have missed something, but where is there any mention of the
>>> www.bbc.co.uk website programme schedules going away?
>>
>> You've missed this: if a computer program grabs website pages and 'scrapes'
>> them, which is to say wades through all the rubbish that's there to make the
>> page look pretty, trying to extract only the data that says what the
>> tv/radio programmes are, their pids etc... it's
>>
>>   - complicated
>>   - slow
>>   - unreliable because as soon as the BBC alter how the webpages
>>     work, the scraping programs might need to be altered
>>
>> So instead, programmers are concentrating on finding resources that contain
>> data without frills.  The stuff at:
>>
>>  www.bbc.co.uk/radio4/programmes/schedules/fm/this_week.json
>>
>> and
>>
>>  www.bbc.co.uk/radio4/programmes/schedules/fm/this_week.yaml
>>
>> and
>>
>>  www.bbc.co.uk/radio4/programmes/schedules/fm/this_week.xml
>>
>>
>> (those three URLs are the same except for the last .xxx part) all yield data
>> that's much more immediately useful to programmers.  The first two are nasty
>> for a human to look at, the third is easier on the eye.  But as someone said
>> these simpler-to-use files are going to cease to exist; they're 'deprecated'
>> which is the term programmers use to mean "something that works now but soon
>> won't".
>>
>> --
>> Jeremy Nicoll - my opinions are my own.
>
>
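
The three schedule URLs quoted above differ only in their extension, so a script can generate them rather than hard-code each one. A small sketch under stated assumptions: the function name schedule_urls is hypothetical, and the http:// scheme is assumed because the URLs are quoted without one.

```shell
#!/bin/sh

# Prints the JSON, YAML and XML "this_week" schedule URLs, one per line,
# for a service ($1, e.g. radio4) and an outlet ($2, e.g. fm).
# The http:// scheme is an assumption; the quoted URLs omit it.
schedule_urls() {
    for ext in json yaml xml; do
        printf 'http://www.bbc.co.uk/%s/programmes/schedules/%s/this_week.%s\n' \
            "$1" "$2" "$ext"
    done
}

# Possible usage (needs network access, so it is left commented out):
# report which formats currently send the deprecation header.
#   schedule_urls radio4 fm | while read -r url; do
#       if curl -s -D - -o /dev/null "$url" |
#           grep -qi '^x-aps-deprecation-notice:'; then
#           echo "deprecated: $url"
#       else
#           echo "no notice:  $url"
#       fi
#   done
```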



