| 1 | GLEP: 31 |
1 | GLEP: 31 |
| 2 | Title: Character Sets for Portage Tree Items |
2 | Title: Character Sets for Portage Tree Items |
| 3 | Version: $Revision: 1.1 $ |
3 | Version: $Revision: 1.4 $ |
| 4 | Author: Ciaran McCreesh <ciaranm@gentoo.org> |
4 | Author: Ciaran McCreesh <ciaranm@gentoo.org> |
| 5 | Last-Modified: $Date: 2004/10/28 17:00:22 $ |
5 | Last-Modified: $Date: 2005/10/30 21:35:50 $ |
| 6 | Status: Draft |
6 | Status: Approved |
| 7 | Type: Standards Track |
7 | Type: Standards Track |
| 8 | Content-Type: text/x-rst |
8 | Content-Type: text/x-rst |
| 9 | Created: 27-October-2004 |
9 | Created: 27-October-2004 |
| 10 | Post-Date: 28-October-2004 |
10 | Post-Date: 28-October-2004, 1-November-2004, 11-November-2004 |
| 11 | |
11 | |
| 12 | Abstract |
12 | Abstract |
| 13 | ======== |
13 | ======== |
| 14 | |
14 | |
| 15 | A set of rules regarding what characters are permissible in the portage |
15 | A set of guidelines regarding what characters are permissible in the |
| 16 | tree and how they should be encoded is required. |
16 | portage tree and how they should be encoded is required. |
17 | |
18 | Status |
19 | ====== |
20 | |
21 | Approved on 8-Nov-2004 assuming that implementation will include |
22 | documentation for correctly encoding files within nano. |
| 17 | |
23 | |
| 18 | Motivation |
24 | Motivation |
| 19 | ========== |
25 | ========== |
| 20 | |
26 | |
| 21 | At present we have several developers and many more users whose names |
27 | At present we have several developers and many more users whose names |
| 22 | require characters (for example, accents) which are not part of the |
28 | require characters (for example, accents) which are not part of the |
| 23 | standard 'safe' 0..127 ASCII range. There is no current standard on how |
29 | standard 'safe' 0..127 ASCII range. There is no current standard on how |
| 24 | these should be represented, leading to inconsistency across the tree. |
30 | these should be represented, leading to inconsistency across the tree. |
| 25 | |
31 | |
| 26 | Although the issues involved have been discussed many times informally, no |
32 | Although the issues involved have been discussed informally many times, no |
| 27 | official decision has been made. |
33 | official decision has been made. |
| 28 | |
34 | |
| 29 | Specification |
35 | Specification |
| 30 | ============= |
36 | ============= |
| 31 | |
37 | |
| … | |
… | |
| 49 | -------------------------------- |
55 | -------------------------------- |
| 50 | |
56 | |
| 51 | For the same reasons as previously, it is proposed that UTF-8 is used as |
57 | For the same reasons as previously, it is proposed that UTF-8 is used as |
| 52 | the official encoding for ebuild and eclass files. |
58 | the official encoding for ebuild and eclass files. |
| 53 | |
59 | |
| 54 | However, developers should be warned that any output which is parsed by |
60 | However, developers should be warned that any code which is parsed by bash |
| 55 | bash (in other words, non-comments), and any output which is echoed to the |
61 | (in other words, non-comments), and any output which is echoed to the |
| 56 | screen (for example, einfo messages) must not use anything outside the |
62 | screen (for example, einfo messages) or given to portage (for example, any |
63 | of the standard global variables) must not use anything outside the |
| 57 | regular ASCII 0..127 range for compatibility purposes. |
64 | regular ASCII 0..127 range for compatibility purposes. |
| 58 | |
65 | |
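The two rules above (the file as a whole is UTF-8, but anything outside comments stays within ASCII 0..127) can be checked mechanically. The following is a minimal sketch, not part of the GLEP or of any Gentoo tool: the function name ``check_ebuild_bytes`` is hypothetical, and it makes the simplifying assumption that only whole-line comments (lines whose first non-blank character is ``#``) count as comments, so a non-ASCII character in a trailing comment would be flagged.

```python
from typing import List, Tuple


def check_ebuild_bytes(data: bytes) -> List[Tuple[int, str]]:
    """Return (line_number, message) pairs for content that breaks the rules.

    Hypothetical helper for illustration only. Line number 0 is used when
    the file as a whole is not valid UTF-8.
    """
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return [(0, f"not valid UTF-8: {exc.reason} at byte {exc.start}")]

    problems: List[Tuple[int, str]] = []
    # Split on "\n" only, so line numbers match what bash and editors show.
    for number, line in enumerate(text.split("\n"), start=1):
        if line.lstrip().startswith("#"):
            # Whole-line comment: non-ASCII UTF-8 is permitted here.
            continue
        # Assumption: trailing comments are not recognised, so this check
        # is stricter than the guideline strictly requires.
        if not line.isascii():
            problems.append((number, "non-ASCII character outside a comment"))
    return problems
```

Under these assumptions, a file with an accented name in a header comment passes, while the same character in a variable assignment or an ``einfo`` message is reported with its line number.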
| 59 | files/ Entries Character Sets |
66 | files/ Entries Character Sets |
| 60 | ----------------------------- |
67 | ----------------------------- |
| 61 | |
68 | |
| … | |
… | |
| 83 | |
90 | |
| 84 | Certain text editors are incapable of handling UTF-8 cleanly. However, |
91 | Certain text editors are incapable of handling UTF-8 cleanly. However, |
| 85 | since the ``echangelog`` tool is generally the correct way to generate |
92 | since the ``echangelog`` tool is generally the correct way to generate |
| 86 | ChangeLog entries, this should not be a major problem. Generating |
93 | ChangeLog entries, this should not be a major problem. Generating |
| 87 | metadata.xml files correctly in these editors could become problematic. |
94 | metadata.xml files correctly in these editors could become problematic. |
| 88 | (The ``vim`` and ``emacs`` editors, which appear to be most widely used, |
95 | The ``vim`` and ``emacs`` editors, which appear to be most widely used, |
| 89 | are both capable of handling UTF-8 cleanly.) |
96 | are both capable of handling UTF-8 cleanly -- for vim, this could be |
97 | configured automatically via the ``gentoo-syntax`` ([4]_) package. |
| 90 | |
98 | |
| 91 | References |
99 | References |
| 92 | ========== |
100 | ========== |
| 93 | |
101 | |
| 94 | .. [1] RFC 3629: UTF-8, a transformation format of ISO 10646 |
102 | .. [1] RFC 3629: UTF-8, a transformation format of ISO 10646 |
| 95 | http://www.ietf.org/rfc/rfc3629.txt |
103 | http://www.ietf.org/rfc/rfc3629.txt |
| 96 | .. [2] ISO/IEC 10646 (Universal Multiple-Octet Coded Character Set) |
104 | .. [2] ISO/IEC 10646 (Universal Multiple-Octet Coded Character Set) |
| 97 | .. [3] ISO/IEC 8859 (8-bit single-byte coded graphic character sets) |
105 | .. [3] ISO/IEC 8859 (8-bit single-byte coded graphic character sets) |
106 | .. [4] The app-vim/gentoo-syntax package, |
107 | https://developer.berlios.de/projects/gentoo-syntax/ |
| 98 | |
108 | |
| 99 | Copyright |
109 | Copyright |
| 100 | ========= |
110 | ========= |
| 101 | |
111 | |
| 102 | This document has been placed in the public domain. |
112 | This document has been placed in the public domain. |
| 103 | |
113 | |
| 104 | vim: set tw=74 fileencoding=utf-8 : |
114 | .. vim: set tw=74 fileencoding=utf-8 : |
| 105 | |
115 | |