Changed subtitle format
rob.dixon at gmx.com
Mon Sep 23 20:30:15 EDT 2013
On 24/09/2013 00:44, dinkypumpkin wrote:
> On 23/09/2013 23:16, Rob Dixon wrote:
> Unfortunately, that's not how the subtitles arrive from the Beeb. There
> are line breaks within a single speaker's lines, and sometimes no line
> break or other structural change to demarcate the transition between
> speakers in a single subtitle.
That isn't my experience - certainly not across all programmes. Current
editions of QI, for instance, have the text split between voices in
several places. The splits also seem to be at punctuation where
possible, rather than at an arbitrary half-way point.
I believe the subtitles XML file contains the text formatted as it was
transcribed and intended to be viewed.
> The old format has text colour changes to mark transition between
> speakers in a single subtitle, but the newer format doesn't appear to
> use that device.
I suspect that depends on the source of the subtitles. Captions that the
BBC has commissioned itself are uncoloured, but those that come from a
third party (film subtitles are often of this type) can vary a lot in
style and content: for instance, they may be labelled with the speaker's
name, or placed on the appropriate side of the screen.
My vote is with keeping the newlines as they are in the XML when
translating to SRT. (I have a short Perl program that does just that
using XML::Parser::Lite if it is of any interest.)
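
To illustrate what keeping the newlines means in practice, here is a
minimal sketch in Python. It is not the Perl program mentioned above and
not get_iplayer's own code. It assumes the subtitles file is TTML-like:
each caption is a <p> element with begin and end attributes in
HH:MM:SS.fff form, and line breaks within a caption are <br/> elements.
The function names and the sample document are hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def local(tag):
    """Strip any XML namespace prefix from an element tag."""
    return tag.rsplit('}', 1)[-1]


def srt_time(value):
    """Convert 'HH:MM:SS.fff' (fraction optional) to SRT's 'HH:MM:SS,mmm'."""
    hms, _, frac = value.partition('.')
    h, m, s = (int(x) for x in hms.split(':'))
    ms = int((frac + '000')[:3])  # pad or truncate to milliseconds
    return f'{h:02d}:{m:02d}:{s:02d},{ms:03d}'


def squash(text):
    """Collapse whitespace runs (including source newlines) to one space."""
    return re.sub(r'\s+', ' ', text or '')


def text_with_breaks(elem):
    """Flatten an element's text, turning each <br/> into a newline."""
    parts = [squash(elem.text)]
    for child in elem:
        if local(child.tag) == 'br':
            parts.append('\n')
        else:
            parts.append(text_with_breaks(child))  # e.g. nested <span>
        parts.append(squash(child.tail))
    return ''.join(parts)


def ttml_to_srt(xml_text):
    """Return SRT text, keeping the line breaks given in the XML."""
    root = ET.fromstring(xml_text)
    blocks = []
    for p in root.iter():
        # Assumption: only <p> elements carrying both timings are captions.
        if local(p.tag) != 'p' or 'begin' not in p.attrib or 'end' not in p.attrib:
            continue
        lines = [ln.strip() for ln in text_with_breaks(p).split('\n')]
        lines = [ln for ln in lines if ln]
        if not lines:
            continue
        timing = f"{srt_time(p.attrib['begin'])} --> {srt_time(p.attrib['end'])}"
        blocks.append('\n'.join([str(len(blocks) + 1), timing] + lines))
    return '\n\n'.join(blocks) + '\n'


# Made-up sample document, for illustration only.
sample = '''<tt xmlns="http://www.w3.org/ns/ttml">
<body><div>
<p begin="00:00:01.000" end="00:00:03.500">First line<br/>second line</p>
<p begin="00:00:04.00" end="00:00:05.00"><span>One</span> <span>two</span></p>
</div></body></tt>'''

if __name__ == '__main__':
    print(ttml_to_srt(sample), end='')
```

Only <br/> produces a line break in the output; whitespace in the XML
source is collapsed, so the captions are split where the transcriber
put the breaks.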