[libical] string formatting: escaping/dropping illegal characters

Arnout Engelen libical at bzzt.net
Sat Oct 24 08:30:51 PDT 2009


Hi,

I'm a bit confused as to how much input sanitization is done by libical, and how
much of it I must do myself to generate valid iCalendar output.

A call like icalproperty_new_description(char* value) accepts the value as a 
string. 

For the final output, rfc2445 specifies which characters are allowed, and which
are allowed but only when escaped:

     text       = *(TSAFE-CHAR / ":" / DQUOTE / ESCAPED-CHAR)
     ; Folded according to description above

     ESCAPED-CHAR = "\\" / "\;" / "\," / "\N" / "\n")
     ; \\ encodes \, \N or \n encodes newline
     ; \; encodes ;, \, encodes ,

     TSAFE-CHAR = %x20-21 / %x23-2B / %x2D-39 / %x3C-5B
                  %x5D-7E / NON-US-ASCII
     ; Any character except CTLs not needed by the current
     ; character set, DQUOTE, ";", ":", "\", ","

     NON-US-ASCII       = %x80-F8
     ; Use restricted by charset parameter
     ; on outer MIME object (UTF-8 preferred)
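
To make the grammar above concrete, here is a minimal C sketch of a byte-level
check against it. The function names is_tsafe_char and is_text_literal_char are
made up for illustration and are not part of libical; the ranges are copied
from the ABNF quoted above, and line folding is ignored.

```c
#include <stdbool.h>

/* Hypothetical helper (not libical API): true if byte c is a TSAFE-CHAR
 * according to the RFC 2445 ranges quoted above. */
bool is_tsafe_char(unsigned char c)
{
    return (c >= 0x20 && c <= 0x21) ||
           (c >= 0x23 && c <= 0x2B) ||
           (c >= 0x2D && c <= 0x39) ||
           (c >= 0x3C && c <= 0x5B) ||
           (c >= 0x5D && c <= 0x7E) ||
           (c >= 0x80 && c <= 0xF8);   /* NON-US-ASCII */
}

/* Hypothetical helper: true if c may appear in a text value without
 * escaping, i.e. TSAFE-CHAR / ":" / DQUOTE. */
bool is_text_literal_char(unsigned char c)
{
    return is_tsafe_char(c) || c == ':' || c == '"';
}
```

By this check, a tab (0x09) and a vertical tab (0x0B) are not allowed as
literal characters, and they have no ESCAPED-CHAR form either.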

I tried calling icalproperty_new_description with a string containing a 
newline, a tab and a 'vertical tab':

    icalproperty_new_description("backslash\\newline\ntab\tverticaltab\vend");

When output, this resulted in:

DESCRIPTION:backslash\\newline\ntab\tverticaltab
                                     end

In other words:
* the backslash is escaped by libical (making it valid wrt rfc2445)
* the newline is escaped by libical
* the tab is also escaped by libical (making the output invalid wrt rfc2445)
* the vertical tab appears in the output as-is (output invalid wrt rfc2445)

Does this mean the caller of icalproperty_new_description should make sure any
illegal characters have been removed from the value, but any characters that 
have to be escaped appear in the value unescaped?

Or should libical make sure the output is always rfc2445-compliant, dropping
any illegal characters?
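
If it does turn out to be the caller's job, a caller-side filter might look
roughly like the sketch below. This is only an illustration under my own
assumptions: strip_illegal_text_chars is a hypothetical name, not a libical
function, and it assumes that libical escapes backslashes and newlines itself
(as observed above), so those are left untouched. It drops every other control
character, including tab, plus bytes above 0xF8, which fall outside the
NON-US-ASCII range.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not libical API): returns a newly allocated copy of
 * `in` with the bytes removed that can be neither emitted literally nor
 * escaped under the RFC 2445 text grammar. The newline is kept because it
 * has an escaped form. Returns NULL if allocation fails; the caller frees
 * the result. */
char *strip_illegal_text_chars(const char *in)
{
    size_t len = strlen(in);
    char *out = malloc(len + 1);
    size_t j = 0;

    if (out == NULL)
        return NULL;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)in[i];
        /* Control characters other than newline, and DEL. */
        int is_control = (c < 0x20 && c != '\n') || c == 0x7F;
        /* Bytes beyond the NON-US-ASCII range %x80-F8. */
        int out_of_range = c > 0xF8;

        if (!is_control && !out_of_range)
            out[j++] = (char)c;
    }
    out[j] = '\0';
    return out;
}
```

The filtered string would then be passed to icalproperty_new_description in
place of the raw one. Whether silently dropping a tab is preferable to
replacing it with a space is a design choice that this sketch does not settle.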

If dropping illegal characters is the responsibility of the caller, where can
this be documented more prominently? It certainly was not obvious to me,
especially since libical *does* do the escaping for me.

If libical should make sure the output is always rfc2445-compliant, then the
current behaviour is buggy - and I'd be willing to prepare a patch.


Arnout