Possible solution to the Windows AtomicParsley crashes?

dinkypumpkin dinkypumpkin at gmail.com
Sat Oct 8 13:37:15 EDT 2011


On 08/10/2011 16:20, David Woodhouse wrote:
> If you *decode* UTF-8, that implies turning it into actual letters...

A different "decoding" in this case.  What I mean is that
HTML::Entities::decode_entities() returns a "decoded" string with the
utf8 flag set, because the expanded curly quotes are characters above
U+00FF.  The tagging falls over because I didn't check the utf8 flag
before using the metadata.  An unfortunate mistake, I know.
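
For illustration only, here is a minimal sketch of the behaviour described
above.  It is not the actual get_iplayer code: the sample string and the
variable names are made up, and encoding to UTF-8 bytes is just one
possible guard, not necessarily what a given tagging tool expects.

```perl
use strict;
use warnings;
use HTML::Entities qw(decode_entities);
use Encode qw(encode);

# Hypothetical metadata value containing curly-quote entities.
my $raw     = 'Tonight&rsquo;s &ldquo;special&rdquo; episode';
my $decoded = decode_entities($raw);

# The expanded quotes are wide characters, so Perl flags the string as utf8.
my $flagged = utf8::is_utf8($decoded) ? 1 : 0;

# One possible guard: convert flagged strings to UTF-8 bytes before
# handing them to an external program.
my $bytes = $flagged ? encode('UTF-8', $decoded) : $decoded;

printf "utf8 flag: %d, chars: %d, bytes: %d\n",
    $flagged, length($decoded), length($bytes);
```

Each curly quote is one character in $decoded but three bytes in $bytes,
which is why the two lengths differ.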

> That seems wrong. Why do we have a file handle that *isn't* expecting
> normal UTF-8 text written to it? It can't be just "non-Unicode". It has

What I mean here is that the handle for the download history file is
opened without a Unicode-capable I/O layer on it, so a warning is
generated when the string with expanded curly quotes is written to a
history record.  As to why, I can't say.  This probably just never came
up when get_iplayer was developed.  I've never seen curly quotes or
similar entities in iPlayer metadata before, but they are certainly here
now.  Somebody must have started doing cut-and-paste from MS Word.
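
A rough sketch of the handle issue, assuming nothing about how get_iplayer
actually opens its history file: the record format and the temp files are
invented for the example.  It writes the same string to a plain handle
and to one with an :encoding(UTF-8) layer, and collects any warnings.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical history record containing curly quotes (U+201C, U+201D).
my $record = "Episode \x{201C}One\x{201D}|12345\n";

# Collect warnings instead of printing them to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# 1. Handle with no encoding layer: Perl warns about the wide characters.
my ($fh1, $plain) = tempfile(UNLINK => 1);
print {$fh1} $record;
close $fh1;

# 2. Same write through a UTF-8 encoding layer: no warning expected.
my ($fh2, $layered) = tempfile(UNLINK => 1);
binmode($fh2, ':encoding(UTF-8)');
print {$fh2} $record;
close $fh2;

printf "warnings collected: %d\n", scalar @warnings;
print  "first warning: $warnings[0]" if @warnings;
```

Both files should end up with the same bytes on disk; the layer mainly
makes the encoding explicit and keeps the warning out of the output.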
