[PATCH] mtd: nand: Kconfig: drop utf8 characters

Martin Walch walch.martin at web.de
Sun Dec 16 22:27:20 EST 2012


On Monday, 3 December 2012, 15:29:13, Artem Bityutskiy wrote:
> On Mon, 2012-11-26 at 17:18 -0600, Scott Wood wrote:
> > On 11/26/2012 05:07:25 PM, Wolfram Sang wrote:
> > > The Linux Kernel Configuration system (lkc) expects 8 bit characters
> > > only (declared in scripts/kconfig/zconf.l: %option 8bit).
> > 
> > That option contrasts with being limited to 7-bit characters, not with
> > accepting UTF-8.  It may be that kconfig has problems with UTF-8, but I
> > don't think this is why.
> 
> Whatever has problems with UTF-8 - it is better to fix that instead of
> hiding the problem by removing UTF-8 characters.

The kernel configuration system does not support multibyte characters. I have
not found any hint that support for multibyte characters has been specified or
taken into account. Many places throughout the configuration system assume
single-byte characters. In bug #43067

> https://bugzilla.kernel.org/show_bug.cgi?id=43067

I have attached screenshots showing a problem with utf-8 characters in the 
interactive nconfig menu.

Wider use of utf-8 characters could lead to even worse problems: the flex
scanner only allows the characters [A-Za-z0-9_] in symbol names. Any other
input makes the scanner either ignore a character or refuse the input
altogether ("syntax error").
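
To illustrate the symbol name restriction, here is a minimal sketch in C. It is
not taken from scripts/kconfig; is_plain_symbol_name() is a hypothetical helper
that only mirrors the character class [A-Za-z0-9_] mentioned above. Every byte
of a utf-8 multibyte sequence has its high bit set, so none of them can match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the kernel sources): returns true only
 * if name is non-empty and every byte is in the set [A-Za-z0-9_].
 */
bool is_plain_symbol_name(const char *name)
{
    assert(name != NULL);

    if (*name == '\0')
        return false;

    for (const unsigned char *p = (const unsigned char *)name; *p; p++) {
        bool ok = (*p >= 'A' && *p <= 'Z') ||
                  (*p >= 'a' && *p <= 'z') ||
                  (*p >= '0' && *p <= '9') ||
                  *p == '_';
        if (!ok)
            return false;   /* includes every byte >= 0x80 */
    }
    return true;
}
```

With this check, is_plain_symbol_name("MTD_NAND_ECC") is true, while a name
ending in the two bytes 0xC2 0xB5 (a utf-8 encoded micro sign) is rejected.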

The handling of multibyte characters in string values depends on the 
configuration menu in use. menuconfig will not allow any multibyte input. When 
editing a predefined string with multibyte characters in it, things will break. 
nconfig is even worse. xconfig substitutes characters with '?'.

Character counts do not work correctly. When many multibyte characters are
used, odd things happen, such as text lines being cut off.
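
The mismatch can be shown with a small sketch in C. This is only an
illustration, not code from the configuration system: utf8_codepoints() is a
hypothetical helper, and it assumes valid utf-8 input. strlen() counts bytes,
so any layout code that treats its result as a number of characters
overestimates the length of text containing multibyte sequences.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: counts code points in a valid utf-8 string by
 * skipping continuation bytes, which have the bit pattern 10xxxxxx.
 */
size_t utf8_codepoints(const char *s)
{
    size_t n = 0;

    assert(s != NULL);

    for (; *s; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;    /* start of a new code point */
    }
    return n;
}
```

For the six bytes "Gr\xC3\xB6\xC3\x9F" (four characters, two of them encoded
in two bytes each), strlen() returns 6 while utf8_codepoints() returns 4. Even
the code point count is only an approximation of the displayed width, since
wide and combining characters occupy a different number of columns.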

To make a long story short: multibyte characters in Kconfig files lead to
undefined behaviour. This is not an implementation bug. The configuration
system simply has not been designed to process them. So utf-8 support cannot
be achieved with an easy fix, but will need comprehensive changes.

I do not know whether anyone is willing to actually do all the work necessary
to properly support utf-8 in the configuration system. However, I suppose this
will not happen any time soon. Therefore I suggest removing the multibyte
characters for now.

Regards
Martin Walch