Subtitles, Round 2
northmedia1 at the.forthnet.gr
Wed Sep 25 03:09:35 EDT 2013
On Tue Sep 24 20:50:08 BST 2013, dinkypumpkin wrote:
>...I decided to... iron out some subtitles-related issues
>raised in the last couple of days.
>The patch against Git HEAD is here:
>With this patch, subtitles should be formatted as they would be in the
>Flash player on the iPlayer site (incl. explicit line breaks)...
>It would be helpful if anyone interested in getting subtitles sorted
>would have a go at testing this.
>Please do it soon -
>I want to resolve this so I can get another release out.
Well, first of all, I am only a very infrequent TV downloader,
for reasons errr.... probably known to other list members;
I'd say 2-3 shows a month describes my habits, and
this is niche content I can't find elsewhere. So I am not
your ideal tester for this patch...
Having said that, I do sometimes use GiP to fetch subtitles
for BBC content acquired through other means (if I were in
the US, I'd say "I plead the Fifth"...).
I have applied your linked patch to my local copy of the
get_iplayer.pl script, then used the patched version to
re-download the subs for the programme with pid=b03bjpcy
(see my previous mail here:
get_iplayer --pid=b03bjpcy --subtitles-only --force
The above yielded an .srt file sized 60.9 KB and was tested
together with its corresponding video file by using
I have to declare that it is a VAST improvement over the status
quo ante, indeed in this test file the subs are identical to how
they are presented on-line on iPlayer - this also means that
I "got hit" by a very rare "three-liner" in subtitle no. 663:
00:52:54,360 --> 00:52:58,800
that had been deliberately given
but this is how the Beeb made it; not a fault of GiP's.
I fixed it manually to:
00:52:54,360 --> 00:52:58,800
on macaques that had been deliberately
given Parkinson's Disease.
It would be perfect if GiP made sure that no subtitle
exceeds two lines, but I do realise this is
too much to ask...
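
A rough sketch of how the two-line limit could be enforced as a
post-processing step on the .srt file, outside GiP. This is Python rather
than GiP's Perl, and the function names (rewrap_cue_text, rewrap_srt) and
the 42-character starting width are assumptions for illustration only,
not anything that exists in get_iplayer:

```python
import textwrap


def rewrap_cue_text(lines, max_lines=2, width=42):
    """Re-flow one cue's text lines into at most max_lines lines.

    Cues that already fit are returned untouched, so the explicit
    line breaks in ordinary subtitles are preserved.
    """
    if len(lines) <= max_lines:
        return list(lines)
    text = " ".join(part.strip() for part in lines if part.strip())
    current_width = width
    while True:
        wrapped = textwrap.wrap(text, width=current_width)
        if len(wrapped) <= max_lines:
            return wrapped
        # Still too many lines: allow slightly longer lines and retry.
        current_width += 1


def rewrap_srt(srt_text, max_lines=2, width=42):
    """Apply rewrap_cue_text to every cue in an SRT document."""
    blocks = srt_text.replace("\r\n", "\n").strip().split("\n\n")
    out = []
    for block in blocks:
        rows = block.split("\n")
        # Expected layout: index line, timing line, then text lines.
        if len(rows) < 3 or "-->" not in rows[1]:
            out.append(block)  # leave anything unexpected alone
            continue
        out.append("\n".join(rows[:2] + rewrap_cue_text(rows[2:], max_lines, width)))
    return "\n\n".join(out) + "\n"
```

With the default width, a three-line cue carrying the words of subtitle
no. 663 comes out as the same two lines as the manual fix above. The
sketch simply joins and re-wraps the text, so cues that rely on
per-line markers (speaker dashes, colour or italic tags) would need
extra care.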
So, a sterling patch it is then!
>The(re) may be other format variations lurking
>in iPlayer that I would never come across.
>If you check subtitles during playback and find big chunks of dialogue
>missing, first check the raw subtitles files before blaming get_iplayer.
>I have found several programmes with missing dialogue.
In my limited experience with subtitles over the past 2 years,
I would put the Beeb's subs into 2 categories:
1. The ones that in the early days of iPlayer were labelled "prepared";
these are made by a dedicated third-party service, are usually very
accurate and in sync with the audio, and always end with
"Subtitles By Red Bee Media Ltd"
2. The ones that in the early days of iPlayer were labelled "live";
these are machine-generated transcripts of the live (audio) TV feed.
More often than not, they are of low quality, usually out of sync with
the spoken audio, with missing dialogue or repeated subtitle lines;
in general they are beyond easy repair and I avoid them.
Contrary to their "live" label, they are not limited to live shows:
examples are Friday's pre-recorded Jools Holland show (when subs do
come with it) and most content on the BBC News Channel (namely "Click"
and other shows there that I am exposed to via BBC World News Channel).
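
The two categories can be told apart roughly by script. The following is
only a sketch in Python under my own assumptions: the helper names
(cue_texts, looks_prepared, count_repeats) are hypothetical, the check
relies solely on the closing credit mentioned in point 1, and a file
without that credit is not proven to be "live".

```python
def cue_texts(srt_text):
    """Return the text of each cue in an SRT document, one string per cue."""
    blocks = srt_text.replace("\r\n", "\n").strip().split("\n\n")
    texts = []
    for block in blocks:
        rows = block.split("\n")
        # Expected layout: index line, timing line, then text lines.
        if len(rows) >= 3 and "-->" in rows[1]:
            texts.append(" ".join(" ".join(rows[2:]).split()))
    return texts


def looks_prepared(srt_text, credit="Subtitles By Red Bee Media Ltd"):
    """True if the last cue carries the closing credit (case-insensitive)."""
    texts = cue_texts(srt_text)
    return bool(texts) and credit.lower() in texts[-1].lower()


def count_repeats(srt_text):
    """Count cues whose text is identical to the cue just before them."""
    texts = cue_texts(srt_text)
    return sum(1 for prev, cur in zip(texts, texts[1:]) if prev == cur)
```

A file for which looks_prepared() is false and count_repeats() is high is
a likely candidate for the second category, but the raw subtitles are
still worth a look before drawing conclusions.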
Again, cheers dinkypumpkin for this latest fix!