Exclude category exception?

dinkypumpkin dinkypumpkin at gmail.com
Tue Sep 4 07:34:53 EDT 2012


On 04/09/2012 01:42, Arthur Murray wrote:
> As an example, would it be possible for get_iplayer to
> "excludecategory" programmes with "Sport" category, unless it was
> Golf?  Is there some regex magic that would work?

Not exactly, but it should be good enough to just use --category=golf. 
There may be golf programmes that are not also sport programmes, but I 
imagine that will happen very rarely.

Grisly detail:  Programme categories are concatenated into a 
comma-separated string which is matched against each of the 
comma-separated regular expressions you supply to --category and 
--exclude-category.  The BBC taxonomy has a hierarchical
structure, which leads to a generally consistent ordering of categories
in the concatenated string, with more specific content categories coming
before their more general parent categories.  In this example, Golf will
come before Sport in the concatenated categories list.  However, it 
wouldn't be enough to use --category="golf,sport" because get_iplayer 
will retrieve programmes that match either of those categories.  You 
could do something like --category="golf.*sport" to get only sport 
programmes that are also golf programmes.  And if you're mildly insane,
you could use --category="(^|\x2C)golf(\x2C.*)?\x2Csport(\x2C|$)" to be
absolutely sure that you only retrieve programmes with categories that
exactly match both "golf" and "sport".  The optional (\x2C.*)? group
lets golf and sport sit next to each other in the list or be separated
by other categories.  The \x2C represents an embedded comma since the
regex will be split on commas before running the search.
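
To see how the three --category values behave, here is a minimal Python
sketch of the matching described above.  It is not get_iplayer's own
code (get_iplayer is written in Perl); category_matches() and the sample
category lists are hypothetical, and case-insensitive matching is an
assumption.  The strict pattern is written so that golf and sport may be
adjacent in the concatenated list.

```python
import re


def category_matches(option_value, programme_categories):
    """Return True if any comma-separated regex in option_value matches
    the programme's categories joined into one comma-separated string.

    Hypothetical helper that mimics the behaviour described in the text.
    """
    haystack = ",".join(programme_categories)
    # The option is split on literal commas first, which is why a comma
    # inside a pattern has to be written as \x2C.
    patterns = [p for p in option_value.split(",") if p]
    # Assumption: matching ignores case.
    return any(re.search(p, haystack, re.IGNORECASE) for p in patterns)


# Made-up category lists, specific category before its parent.
golf = ["Golf", "Sport"]
football = ["Football", "Sport"]
minigolf = ["Minigolf", "Sport"]

EITHER = "golf,sport"        # two patterns: golf OR sport
BOTH = "golf.*sport"         # one pattern: golf somewhere before sport
STRICT = r"(^|\x2C)golf(\x2C.*)?\x2Csport(\x2C|$)"  # whole-category match

for name, cats in [("golf", golf), ("football", football), ("minigolf", minigolf)]:
    print(name,
          category_matches(EITHER, cats),
          category_matches(BOTH, cats),
          category_matches(STRICT, cats))
```

In this sketch the "golf,sport" value also accepts the football
programme, "golf.*sport" rejects it but still accepts a category that
merely contains "golf", and only the strict pattern insists on whole
category names.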


