--- xml/htdocs/proj/en/glep/glep-0031.txt 2004/10/28 17:00:22 1.1
+++ xml/htdocs/proj/en/glep/glep-0031.txt 2004/11/01 20:24:06 1.2
@@ -1,19 +1,19 @@
 GLEP: 31
 Title: Character Sets for Portage Tree Items
-Version: $Revision: 1.1 $
+Version: $Revision: 1.2 $
 Author: Ciaran McCreesh
-Last-Modified: $Date: 2004/10/28 17:00:22 $
+Last-Modified: $Date: 2004/11/01 20:24:06 $
 Status: Draft
 Type: Standards Track
 Content-Type: text/x-rst
 Created: 27-October-2004
-Post-Date: 28-October-2004
+Post-Date: 28-October-2004, 1-November-2004
 
 Abstract
 ========
 
-A set of rules regarding what characters are permissible in the portage
-tree and how they should be encoded is required.
+A set of guidelines regarding what characters are permissible in the
+portage tree and how they should be encoded is required.
 
 Motivation
 ==========
@@ -23,7 +23,7 @@
 standard 'safe' 0..127 ASCII range. There is no current standard on how
 these should be represented, leading to inconsistency across the tree.
 
-Although the issues involved have been discussed many times informally, no
+Although the issues involved have been discussed informally many times, no
 official decision has been made.
 
 Specification
@@ -51,9 +51,10 @@
 For the same reasons as previously, it is proposed that UTF-8 is used as
 the official encoding for ebuild and eclass files.
 
-However, developers should be warned that any output which is parsed by
-bash (in other words, non-comments), and any output which is echoed to the
-screen (for example, einfo messages) must not use anything outside the
+However, developers should be warned that any code which is parsed by bash
+(in other words, non-comments), and any output which is echoed to the
+screen (for example, einfo messages) or given to portage (for example any
+of the standard global variables) must not use anything outside the
 regular ASCII 0..127 range for compatibility purposes.
 
 files/ Entries Character Sets
@@ -85,8 +86,9 @@
 since the ``echangelog`` tool is generally the correct way to generate
 ChangeLog entries, this should not be a major problem. Generating
 metadata.xml files correctly in these editors could become problematic.
-(The ``vim`` and ``emacs`` editors, which appear to be most widely used,
-are both capable of handling UTF-8 cleanly.)
+The ``vim`` and ``emacs`` editors, which appear to be most widely used,
+are both capable of handling UTF-8 cleanly -- for vim, this could be
+configured automatically via the ``gentoo-syntax`` ([4]_) package.
 
 References
 ==========
@@ -95,6 +97,8 @@
     http://www.ietf.org/rfc/rfc3629.txt
 .. [2] ISO/IEC 10646 (Universal Multiple-Octet Coded Character Set)
 .. [3] ISO/IEC 8859 (8-bit single-byte coded graphic character sets)
+.. [4] The app-vim/gentoo-syntax package,
+    https://developer.berlios.de/projects/gentoo-syntax/
 
 Copyright
 =========
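
Note on the hunk at line 51 (not part of the patch): the rule it states, UTF-8 for the file as a whole but only ASCII 0..127 in anything bash parses, could in principle be checked mechanically. The Python below is a minimal sketch under stated assumptions. The name ``find_violations`` is hypothetical and is not an existing portage or repoman function, and the sketch treats only whole-line ``#`` comments as comments, so it will flag non-ASCII text in trailing comments as well.

```python
def find_violations(data: bytes):
    """Return (line_number, reason) pairs for one ebuild's raw bytes.

    Hypothetical helper, for illustration only.
    """
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as err:
        # The file as a whole must be valid UTF-8; report the failing line.
        line_no = data.count(b"\n", 0, err.start) + 1
        return [(line_no, "not valid UTF-8")]

    violations = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        # Simplification: only whole-line comments count as comments.
        if line.lstrip().startswith("#"):
            continue
        # Anything else is parsed by bash and must stay within ASCII 0..127.
        if not line.isascii():
            violations.append((line_no, "non-ASCII outside a comment"))
    return violations


# A comment may carry UTF-8 text, but a global variable may not.
sample = '# Maintainer: Jos\u00e9\nDESCRIPTION="caf\u00e9 tools"\n'.encode("utf-8")
report = find_violations(sample)
```

Here ``report`` holds a single entry for line 2, the ``DESCRIPTION`` assignment; the comment on line 1 is accepted. ``str.isascii`` needs Python 3.7 or later.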