| 1 | GLEP: 31 |
1 | GLEP: 31 |
| 2 | Title: Character Sets for Portage Tree Items |
2 | Title: Character Sets for Portage Tree Items |
| 3 | Version: $Revision: 1.1 $ |
3 | Version: $Revision: 1.2 $ |
| 4 | Author: Ciaran McCreesh <ciaranm@gentoo.org> |
4 | Author: Ciaran McCreesh <ciaranm@gentoo.org> |
| 5 | Last-Modified: $Date: 2004/10/28 17:00:22 $ |
5 | Last-Modified: $Date: 2004/11/01 20:24:06 $ |
| 6 | Status: Draft |
6 | Status: Draft |
| 7 | Type: Standards Track |
7 | Type: Standards Track |
| 8 | Content-Type: text/x-rst |
8 | Content-Type: text/x-rst |
| 9 | Created: 27-October-2004 |
9 | Created: 27-October-2004 |
| 10 | Post-Date: 28-October-2004 |
10 | Post-Date: 28-October-2004, 1-November-2004 |
| 11 | |
11 | |
| 12 | Abstract |
12 | Abstract |
| 13 | ======== |
13 | ======== |
| 14 | |
14 | |
| 15 | A set of rules regarding what characters are permissible in the portage |
15 | A set of guidelines regarding what characters are permissible in the |
| 16 | tree and how they should be encoded is required. |
16 | portage tree and how they should be encoded is required. |
| 17 | |
17 | |
| 18 | Motivation |
18 | Motivation |
| 19 | ========== |
19 | ========== |
| 20 | |
20 | |
| 21 | At present we have several developers and many more users whose names |
21 | At present we have several developers and many more users whose names |
| 22 | require characters (for example, accents) which are not part of the |
22 | require characters (for example, accents) which are not part of the |
| 23 | standard 'safe' 0..127 ASCII range. There is no current standard on how |
23 | standard 'safe' 0..127 ASCII range. There is no current standard on how |
| 24 | these should be represented, leading to inconsistency across the tree. |
24 | these should be represented, leading to inconsistency across the tree. |
| 25 | |
25 | |
| 26 | Although the issues involved have been discussed many times informally, no |
26 | Although the issues involved have been discussed informally many times, no |
| 27 | official decision has been made. |
27 | official decision has been made. |
| 28 | |
28 | |
| 29 | Specification |
29 | Specification |
| 30 | ============= |
30 | ============= |
| 31 | |
31 | |
| … | |
… | |
| 49 | -------------------------------- |
49 | -------------------------------- |
| 50 | |
50 | |
| 51 | For the same reasons as previously, it is proposed that UTF-8 is used as |
51 | For the same reasons as previously, it is proposed that UTF-8 is used as |
| 52 | the official encoding for ebuild and eclass files. |
52 | the official encoding for ebuild and eclass files. |
| 53 | |
53 | |
| 54 | However, developers should be warned that any output which is parsed by |
54 | However, developers should be warned that any code which is parsed by bash |
| 55 | bash (in other words, non-comments), and any output which is echoed to the |
55 | (in other words, non-comments), and any output which is echoed to the |
| 56 | screen (for example, einfo messages) must not use anything outside the |
56 | screen (for example, einfo messages) or given to portage (for example any |
| | |
57 | of the standard global variables) must not use anything outside the |
| 57 | regular ASCII 0..127 range for compatibility purposes. |
58 | regular ASCII 0..127 range for compatibility purposes. |
| 58 | |
59 | |
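As an illustration of the rule stated in the newer revision of this paragraph, the following is a minimal sketch of how a checker might test an ebuild or eclass. It is not part of the GLEP or of portage. The function name ``check_ebuild_text`` is hypothetical, and the sketch assumes that only whole-line ``#`` comments count as comments, so a trailing comment on a line of code is treated conservatively as code.

```python
from typing import List


def check_ebuild_text(data: bytes) -> List[str]:
    """Return a list of problems found in the raw bytes of an ebuild or eclass.

    Hypothetical helper: the file must decode as UTF-8, and every line that
    is not a whole-line comment must stay within ASCII 0..127.
    """
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # The file is not valid UTF-8 at all, so line checks are meaningless.
        return ["not valid UTF-8: %s at byte %d" % (exc.reason, exc.start)]

    problems = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        # Assumption: only lines whose first non-blank character is '#'
        # are comments; anything else is code that bash will parse.
        if line.lstrip().startswith("#"):
            continue
        if not line.isascii():
            problems.append(
                "line %d: non-ASCII character outside a comment" % lineno
            )
    return problems
```

For example, a file containing an accented name in a comment passes, while the same accent inside a ``DESCRIPTION`` assignment is reported. A real tool would need a proper notion of bash comments and quoting, which this sketch deliberately avoids.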
| 59 | files/ Entries Character Sets |
60 | files/ Entries Character Sets |
| 60 | ----------------------------- |
61 | ----------------------------- |
| 61 | |
62 | |
| … | |
… | |
| 83 | |
84 | |
| 84 | Certain text editors are incapable of handling UTF-8 cleanly. However, |
85 | Certain text editors are incapable of handling UTF-8 cleanly. However, |
| 85 | since the ``echangelog`` tool is generally the correct way to generate |
86 | since the ``echangelog`` tool is generally the correct way to generate |
| 86 | ChangeLog entries, this should not be a major problem. Generating |
87 | ChangeLog entries, this should not be a major problem. Generating |
| 87 | metadata.xml files correctly in these editors could become problematic. |
88 | metadata.xml files correctly in these editors could become problematic. |
| 88 | (The ``vim`` and ``emacs`` editors, which appear to be most widely used, |
89 | The ``vim`` and ``emacs`` editors, which appear to be most widely used, |
| 89 | are both capable of handling UTF-8 cleanly.) |
90 | are both capable of handling UTF-8 cleanly -- for vim, this could be |
| | |
91 | configured automatically via the ``gentoo-syntax`` ([4]_) package. |
| 90 | |
92 | |
| 91 | References |
93 | References |
| 92 | ========== |
94 | ========== |
| 93 | |
95 | |
| 94 | .. [1] RFC 3629: UTF-8, a transformation format of ISO 10646 |
96 | .. [1] RFC 3629: UTF-8, a transformation format of ISO 10646 |
| 95 | http://www.ietf.org/rfc/rfc3629.txt |
97 | http://www.ietf.org/rfc/rfc3629.txt |
| 96 | .. [2] ISO/IEC 10646 (Universal Multiple-Octet Coded Character Set) |
98 | .. [2] ISO/IEC 10646 (Universal Multiple-Octet Coded Character Set) |
| 97 | .. [3] ISO/IEC 8859 (8-bit single-byte coded graphic character sets) |
99 | .. [3] ISO/IEC 8859 (8-bit single-byte coded graphic character sets) |
| | |
100 | .. [4] The app-vim/gentoo-syntax package, |
| | |
101 | https://developer.berlios.de/projects/gentoo-syntax/ |
| 98 | |
102 | |
| 99 | Copyright |
103 | Copyright |
| 100 | ========= |
104 | ========= |
| 101 | |
105 | |
| 102 | This document has been placed in the public domain. |
106 | This document has been placed in the public domain. |