Possible solution to the Windows AtomicParsley crashes?

dinkypumpkin dinkypumpkin at gmail.com
Sat Oct 8 15:36:56 EDT 2011


On 08/10/2011 19:25, David Woodhouse wrote:
> I don't quite understand "utf8 flag" though. It's not a boolean option
> "utf8" vs. "not-utf8". Again, there has to be *some* encoding.

If a scalar has the utf8 flag "on", perl treats the value as a sequence 
of Unicode characters, stored internally as UTF-8.  If the utf8 flag is 
"off", the value is a sequence of bytes, assumed to be ISO-8859-1 if no 
locale is in effect.  You can explicitly encode byte strings in a 
different encoding, though you have to keep track of (or determine) that 
encoding yourself.
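
To make the flag a bit more concrete, here is a minimal sketch using the 
core Encode module.  The sample string and variable names are made up for 
illustration and are not taken from get_iplayer.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Raw bytes as they might arrive from a file or socket:
# "cafe" with an e-acute, encoded as UTF-8 (two bytes for the accent).
my $bytes = "caf\xC3\xA9";

# Decoding turns the bytes into a character string; the utf8 flag goes on.
my $chars = decode('UTF-8', $bytes);

printf "bytes: length=%d, utf8 flag %s\n",
    length($bytes), utf8::is_utf8($bytes) ? 'on' : 'off';
printf "chars: length=%d, utf8 flag %s\n",
    length($chars), utf8::is_utf8($chars) ? 'on' : 'off';

# Encoding goes back to bytes, here deliberately in a different encoding.
# Nothing in $latin1 records that it is ISO-8859-1; that is up to you.
my $latin1 = encode('ISO-8859-1', $chars);
printf "latin1: length=%d, utf8 flag %s\n",
    length($latin1), utf8::is_utf8($latin1) ? 'on' : 'off';
```

The byte string is one element longer than the character string because 
the accented letter takes two bytes in UTF-8 but counts as one character 
once decoded.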

> Right, so we just have to make sure perl joins us in the 21st century
> and allows us to write utf-8 to text files?

It's easy enough to add the necessary output layer for UTF-8.  You just 
have to know you need it.
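
Something along these lines is what I mean by an output layer.  It is only 
a sketch: the temp file and the sample line are placeholders, not the real 
history file or its format.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Placeholder entry containing an e-acute and curly quotes.
my $line = "Caf\x{e9} \x{2018}Society\x{2019}|20111008\n";

# Scratch file standing in for the real output file.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# The :encoding(UTF-8) layer converts characters to UTF-8 bytes on output.
open(my $out, '>:encoding(UTF-8)', $path) or die "Cannot write $path: $!";
print {$out} $line;
close $out or die "Cannot close $path: $!";

# Reading back through the same layer yields characters again.
open(my $in, '<:encoding(UTF-8)', $path) or die "Cannot read $path: $!";
my $read_back = <$in>;
close $in;

print "round trip ok\n" if $read_back eq $line;
```

Without the layer, perl would warn about a wide character in print for the 
curly quotes, so declaring the encoding on the handle is the cleaner route.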

> And file a bug against the perl implementations that don't do this by
> default, perhaps?

I think the default behaviour in Perl is to treat file I/O as binary 
data.  Anything else you have to specify with layers.  You can set default 
layers for all file handles, but I'm not sure that's necessary for 
get_iplayer.  The history file is likely the only place this problem 
arises.  Unicode chars are re-encoded as character entities in the 
metadata file itself.  In theory, I think this glitch might affect 
subtitle files if they contain curly quotes and the like, but I'll leave 
that until somebody hits on a test case.
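
For completeness, default layers can be set with the "open" pragma.  This 
is a sketch of the idea only, with a scratch file and made-up content; as 
I said, I'm not convinced get_iplayer needs it.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Default layer for every open() in this lexical scope that does not
# name its own layers.
use open ':encoding(UTF-8)';

# Scratch file used only for this demonstration.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# No layer given here, so the default from the pragma applies.
open(my $out, '>', $path) or die "Cannot write $path: $!";
print {$out} "\x{201c}quoted\x{201d}\n";
close $out or die "Cannot close $path: $!";

# An explicit :raw layer overrides the default, so we see the bytes on disk.
open(my $raw, '<:raw', $path) or die "Cannot read $path: $!";
my $bytes = do { local $/; <$raw> };
close $raw;

printf "first three bytes: %vX\n", substr($bytes, 0, 3);
```

The pragma is lexically scoped, so it only affects opens in the file or 
block where it appears.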



