Possible solution to the Windows AtomicParsley crashes?
dinkypumpkin
dinkypumpkin at gmail.com
Sat Oct 8 15:36:56 EDT 2011
On 08/10/2011 19:25, David Woodhouse wrote:
> I don't quite understand "utf8 flag" though. It's not a boolean option
> "utf8" vs. "not-utf8". Again, there has to be *some* encoding.
If a scalar has the utf8 flag "on", the value is a sequence of Unicode
characters in UTF-8. If the utf8 flag is "off", the value is a sequence
of bytes, assumed to be ISO-8859-1 if no locale is set. You can explicitly
encode byte strings in a different encoding, though you have to keep
track of (or determine) it yourself.
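
To make the distinction concrete, here is a minimal sketch using the core
Encode module and utf8::is_utf8(). The sample string is only an
illustration, not anything taken from get_iplayer:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Two raw bytes: the UTF-8 encoding of U+00E9 (e with acute accent).
my $bytes = "\xC3\xA9";

# decode() interprets the bytes as UTF-8 and returns a character string.
my $chars = decode('UTF-8', $bytes);

# Explicitly re-encode the characters as bytes in a different encoding.
my $latin1 = encode('ISO-8859-1', $chars);

for my $item (['bytes', $bytes], ['chars', $chars], ['latin1', $latin1]) {
    my ($name, $value) = @$item;
    printf "%-6s length=%d utf8 flag=%s\n",
        $name, length($value), utf8::is_utf8($value) ? 'on' : 'off';
}
```

Perl does not record which encoding a byte string is in, so $latin1 and
$bytes are both just bytes as far as the interpreter is concerned.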
> Right, so we just have to make sure perl joins us in the 21st century
> and allows us to write utf-8 to text files?
It's easy enough to add the necessary output layer for UTF-8. You just
have to know you need it.
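
Something along these lines, as a sketch only. The temporary file is a
hypothetical stand-in for the history file and the sample text is made
up, so this is not the actual get_iplayer code:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for the history file.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh);

# A character string containing e-acute and curly single quotes.
my $line = "Caf\x{E9} \x{2019}quoted\x{2019}\n";

# The :encoding(UTF-8) layer encodes characters to UTF-8 bytes on output.
open(my $out, '>:encoding(UTF-8)', $path) or die "Cannot write $path: $!";
print {$out} $line;
close($out) or die "Cannot close $path: $!";

# The same layer on input decodes the bytes back into characters.
open(my $in, '<:encoding(UTF-8)', $path) or die "Cannot read $path: $!";
my $read = <$in>;
close($in);

print $read eq $line ? "round trip ok\n" : "round trip mismatch\n";
```

Without the layer on the output handle, Perl would warn about a wide
character for the curly quotes.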
> And file a bug against the perl implementations that don't do this by
> default, perhaps?
I think the default behaviour in Perl is to treat file I/O as binary
data. Anything else you have to specify with layers. You can set default
layers for all file handles, but I'm not sure that's necessary for
get_iplayer. The history file is likely the only place this problem
arises. Unicode chars are re-encoded as character entities in the
metadata file itself. In theory, I think this glitch might affect
subtitle files if they contain curly quotes and the like, but I'll leave
that until somebody hits on a test case.
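
For reference, setting default layers would look roughly like the sketch
below, using the "open" pragma. Again, the temporary file and the sample
text are assumptions for illustration, and I'm not suggesting
get_iplayer needs this:

```perl
use strict;
use warnings;
# Default to UTF-8 for handles opened in this lexical scope, and for
# STDIN/STDOUT/STDERR via :std.
use open qw(:std :encoding(UTF-8));
use File::Temp qw(tempfile);

# Hypothetical scratch file.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh);

# No layer given here, so the pragma's default applies.
open(my $out, '>', $path) or die "Cannot write $path: $!";
print {$out} "\x{201C}curly\x{201D}\n";
close($out) or die "Cannot close $path: $!";

# An explicit :raw layer overrides the default, so we see the bytes on disk.
open(my $raw, '<:raw', $path) or die "Cannot read $path: $!";
my $bytes = do { local $/; <$raw> };
close($raw);

printf "%d bytes on disk for %d characters\n", length($bytes), 8;
```

Each curly quote takes three bytes in UTF-8, which is why the byte count
on disk is larger than the character count.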