Contents of /xml/htdocs/proj/en/glep/glep-0031.html

Revision 1.8
Sun Oct 14 17:00:15 2007 UTC (6 years, 10 months ago) by antarus
Branch: MAIN
CVS Tags: HEAD
Changes since 1.7: +6 -253 lines
File MIME type: text/html
the canary on 53 went well, changing the rest

1 g2boojum 1.1 <?xml version="1.0" encoding="utf-8" ?>
2     <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
3     <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
4 antarus 1.8
5 g2boojum 1.1 <head>
6     <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
7 g2boojum 1.5 <meta name="generator" content="Docutils 0.4: http://docutils.sourceforge.net/" />
8 g2boojum 1.1 <title>GLEP 31 -- Character Sets for Portage Tree Items</title>
9 antarus 1.8 <link rel="stylesheet" href="tools/glep.css" type="text/css" />
10 g2boojum 1.1 </head>
11     <body bgcolor="white">
12     <table class="navigation" cellpadding="0" cellspacing="0"
13     width="100%" border="0">
14     <tr><td class="navicon" width="150" height="35">
15     <a href="http://www.gentoo.org/" title="Gentoo Linux Home Page">
16     <img src="http://www.gentoo.org/images/gentoo-new.gif" alt="[Gentoo]"
17     border="0" width="150" height="35" /></a></td>
18     <td class="textlinks" align="left">
19     [<b><a href="http://www.gentoo.org/">Gentoo Linux Home</a></b>]
20 antarus 1.8 [<b><a href="http://www.gentoo.org/proj/en/glep">GLEP Index</a></b>]
21 g2boojum 1.5 [<b><a href="http://www.gentoo.org/proj/en/glep/glep-0031.txt">GLEP Source</a></b>]
22 g2boojum 1.1 </td></tr></table>
23 ciaranm 1.4 <table class="rfc2822 docutils field-list" frame="void" rules="none">
24 g2boojum 1.1 <col class="field-name" />
25     <col class="field-body" />
26     <tbody valign="top">
27     <tr class="field"><th class="field-name">GLEP:</th><td class="field-body">31</td>
28     </tr>
29     <tr class="field"><th class="field-name">Title:</th><td class="field-body">Character Sets for Portage Tree Items</td>
30     </tr>
31 antarus 1.8 <tr class="field"><th class="field-name">Version:</th><td class="field-body">1.6</td>
32 g2boojum 1.1 </tr>
33     <tr class="field"><th class="field-name">Author:</th><td class="field-body">Ciaran McCreesh &lt;ciaranm&#32;&#97;t&#32;gentoo.org&gt;</td>
34     </tr>
35 antarus 1.8 <tr class="field"><th class="field-name">Last-Modified:</th><td class="field-body"><a class="reference" href="http://www.gentoo.org/cgi-bin/viewcvs.cgi/xml/htdocs/proj/en/glep/glep-0031.txt?cvsroot=gentoo">2007/04/21 03:03:05</a></td>
36 g2boojum 1.1 </tr>
37 antarus 1.7 <tr class="field"><th class="field-name">Status:</th><td class="field-body">Final</td>
38 g2boojum 1.1 </tr>
39     <tr class="field"><th class="field-name">Type:</th><td class="field-body">Standards Track</td>
40     </tr>
41 g2boojum 1.5 <tr class="field"><th class="field-name">Content-Type:</th><td class="field-body"><a class="reference" href="glep-0002.html">text/x-rst</a></td>
42 g2boojum 1.1 </tr>
43 ciaranm 1.4 <tr class="field"><th class="field-name">Created:</th><td class="field-body">27-Oct-2004</td>
44 g2boojum 1.1 </tr>
45 ciaranm 1.4 <tr class="field"><th class="field-name">Post-History:</th><td class="field-body">28-Oct-2004, 1-Nov-2004, 11-Nov-2004</td>
46 g2boojum 1.1 </tr>
47     </tbody>
48     </table>
49     <hr />
50 g2boojum 1.5 <div class="contents topic">
51     <p class="topic-title first"><a id="contents" name="contents">Contents</a></p>
52 g2boojum 1.1 <ul class="simple">
53 g2boojum 1.2 <li><a class="reference" href="#abstract" id="id9" name="id9">Abstract</a></li>
54 g2boojum 1.3 <li><a class="reference" href="#status" id="id10" name="id10">Status</a></li>
55     <li><a class="reference" href="#motivation" id="id11" name="id11">Motivation</a></li>
56     <li><a class="reference" href="#specification" id="id12" name="id12">Specification</a><ul>
57     <li><a class="reference" href="#changelog-and-metadata-character-sets" id="id13" name="id13">ChangeLog and Metadata Character Sets</a></li>
58     <li><a class="reference" href="#ebuild-and-eclass-character-sets" id="id14" name="id14">Ebuild and Eclass Character Sets</a></li>
59     <li><a class="reference" href="#files-entries-character-sets" id="id15" name="id15">files/ Entries Character Sets</a></li>
60     <li><a class="reference" href="#suitable-characters-for-file-and-directory-names" id="id16" name="id16">Suitable Characters for File and Directory Names</a></li>
61 g2boojum 1.1 </ul>
62     </li>
63 g2boojum 1.3 <li><a class="reference" href="#backwards-compatibility" id="id17" name="id17">Backwards Compatibility</a></li>
64     <li><a class="reference" href="#references" id="id18" name="id18">References</a></li>
65     <li><a class="reference" href="#copyright" id="id19" name="id19">Copyright</a></li>
66 g2boojum 1.1 </ul>
67     </div>
68 g2boojum 1.5 <div class="section">
69     <h1><a class="toc-backref" href="#id9" id="abstract" name="abstract">Abstract</a></h1>
70 g2boojum 1.2 <p>This GLEP sets out guidelines on which characters are permissible in the
71     Portage tree and how they should be encoded.</p>
72 g2boojum 1.1 </div>
73 g2boojum 1.5 <div class="section">
74     <h1><a class="toc-backref" href="#id10" id="status" name="status">Status</a></h1>
75     <p>Approved on 8-Nov-2004 on the condition that the implementation includes
76 g2boojum 1.3 documentation on correctly encoding files within nano.</p>
77     </div>
78 g2boojum 1.5 <div class="section">
79     <h1><a class="toc-backref" href="#id11" id="motivation" name="motivation">Motivation</a></h1>
80 g2boojum 1.1 <p>At present we have several developers and many more users whose names
81     require characters (for example, accents) which are not part of the
82     standard 'safe' 0..127 ASCII range. There is no current standard on how
83     these should be represented, leading to inconsistency across the tree.</p>
84 g2boojum 1.2 <p>Although the issues involved have been discussed informally many times, no
85 g2boojum 1.1 official decision has been made.</p>
86     </div>
87 g2boojum 1.5 <div class="section">
88     <h1><a class="toc-backref" href="#id12" id="specification" name="specification">Specification</a></h1>
89     <div class="section">
90     <h2><a class="toc-backref" href="#id13" id="changelog-and-metadata-character-sets" name="changelog-and-metadata-character-sets">ChangeLog and Metadata Character Sets</a></h2>
91 g2boojum 1.2 <p>It is proposed that UTF-8 (<a class="footnote-reference" href="#id5" id="id1" name="id1">[1]</a>) is used for encoding ChangeLog and
92 g2boojum 1.1 metadata.xml files inside the portage tree.</p>
93 g2boojum 1.2 <p>UTF-8 allows the full range of Unicode (<a class="footnote-reference" href="#id6" id="id2" name="id2">[2]</a>) characters to be expressed,
94 g2boojum 1.1 which is necessary given the diversity of the Gentoo developer- and
95     user-base. It is character-compatible with ASCII for the 0..127
96     characters and does not significantly increase the storage requirements
97     for files which consist mainly of ASCII characters. It is
98     widely supported, widely used and an official standard.</p>
99 g2boojum 1.2 <p>The ISO-8859-* character sets (<a class="footnote-reference" href="#id7" id="id3" name="id3">[3]</a>) would <em>not</em> be appropriate since they
100 g2boojum 1.1 cannot express the full range of required characters.</p>
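<p>The properties claimed above can be checked with a short Python sketch. The
sample strings and the helper name <tt class="docutils literal"><span class="pre">fits_in</span></tt> are made up for illustration, and
only two ISO-8859 parts are tried; this is not part of the GLEP itself.</p>

```python
# Illustrative only: the sample strings are invented for this sketch.
ascii_text = "Initial import of the package"
accented_name = "Jos\u00e9 Ni\u00f1o"   # e-acute and n-tilde, 9 characters
mixed_text = "\u00e9 \u0416"             # e-acute plus a Cyrillic letter

# ASCII-only text produces exactly the same bytes in UTF-8 as in ASCII.
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# Each of the two accented letters takes two bytes; the rest take one.
char_count = len(accented_name)
byte_count = len(accented_name.encode("utf-8"))


def fits_in(text: str, codec: str) -> bool:
    """Return True if every character of text can be encoded with codec."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


print(same_bytes, char_count, byte_count)
for codec in ("utf-8", "iso-8859-1", "iso-8859-5"):
    print(codec, fits_in(mixed_text, codec))
```

<p>In this sketch the mixed string fits in UTF-8 but in neither of the two
ISO-8859 parts tried, which is the limitation the paragraph above describes.</p>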
101     </div>
102 g2boojum 1.5 <div class="section">
103     <h2><a class="toc-backref" href="#id14" id="ebuild-and-eclass-character-sets" name="ebuild-and-eclass-character-sets">Ebuild and Eclass Character Sets</a></h2>
104 g2boojum 1.1 <p>For the same reasons as given above, it is proposed that UTF-8 is used as
105     the official encoding for ebuild and eclass files.</p>
106 g2boojum 1.2 <p>However, developers should be warned that any code which is parsed by bash
107     (in other words, non-comments), and any output which is echoed to the
108     screen (for example, einfo messages) or given to portage (for example any
109     of the standard global variables) must not use anything outside the
110 g2boojum 1.1 regular ASCII 0..127 range for compatibility purposes.</p>
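<p>A rough check for the rule above might look like the following Python
sketch. The function name is hypothetical, and the comment detection is a
deliberate simplification: a line counts as a comment only if its first
non-blank character is a hash sign, so trailing comments after code are
checked as if they were code.</p>

```python
from typing import List


def non_ascii_code_lines(ebuild_text: str) -> List[int]:
    """Return 1-based numbers of non-comment lines holding non-ASCII characters.

    Assumption of this sketch: only whole-line comments (first non-blank
    character is '#') are exempt. No real bash parsing is attempted.
    """
    flagged = []
    for number, line in enumerate(ebuild_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # whole-line comment: non-ASCII is tolerated here
        if not line.isascii():
            flagged.append(number)
    return flagged


# Made-up ebuild fragment: line 1 is a comment, line 2 is a global variable.
sample = (
    "# Maintainer: Jos\u00e9\n"
    'DESCRIPTION="A caf\u00e9 tool"\n'
    'einfo "done"\n'
)
print(non_ascii_code_lines(sample))
```

<p>Only the <tt class="docutils literal"><span class="pre">DESCRIPTION</span></tt> line is reported here, since the accented character
on the first line sits inside a comment.</p>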
111     </div>
112 g2boojum 1.5 <div class="section">
113     <h2><a class="toc-backref" href="#id15" id="files-entries-character-sets" name="files-entries-character-sets">files/ Entries Character Sets</a></h2>
114 g2boojum 1.1 <p>Patches must clearly be in the same character set as the file they are
115     patching. For other files/ entries (for example, GNOME desktop files),
116     consistency with the upstream-recommended character set is most sensible.</p>
117     </div>
118 g2boojum 1.5 <div class="section">
119     <h2><a class="toc-backref" href="#id16" id="suitable-characters-for-file-and-directory-names" name="suitable-characters-for-file-and-directory-names">Suitable Characters for File and Directory Names</a></h2>
120 g2boojum 1.1 <p>Characters outside the ASCII 0..127 range cannot safely be used for file
121     or directory names. (Of course, not all characters inside the ASCII 0..127
122     range can be used safely either.)</p>
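<p>The GLEP does not list which ASCII characters are safe. As one possible
reading, the sketch below assumes the POSIX portable filename character set
(letters, digits, period, underscore and hyphen); both that choice and the
function name are assumptions made for illustration.</p>

```python
import re

# Assumption: POSIX portable filename character set. The GLEP itself only
# says that non-ASCII is unsafe and that some ASCII characters are too.
_PORTABLE_NAME = re.compile(r"[A-Za-z0-9._-]+")


def is_portable_name(name: str) -> bool:
    """Return True if name is non-empty and uses only the assumed safe set."""
    return _PORTABLE_NAME.fullmatch(name) is not None


for candidate in ("glep-0031.txt", "caf\u00e9.patch", "my file"):
    print(repr(candidate), is_portable_name(candidate))
```

<p>A stricter or looser set could be substituted by editing the pattern; the
point is only that the check rejects both non-ASCII names and ASCII names
containing characters such as spaces.</p>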
123     </div>
124     </div>
125 g2boojum 1.5 <div class="section">
126     <h1><a class="toc-backref" href="#id17" id="backwards-compatibility" name="backwards-compatibility">Backwards Compatibility</a></h1>
127 g2boojum 1.1 <p>The existing tree uses a mixture of encodings. It would be straightforward
128     to fix existing ChangeLogs and metadata files to use UTF-8.</p>
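<p>One way such a conversion could be approached is sketched below in Python.
It assumes that any file which is not already valid UTF-8 is in a single
known legacy encoding, ISO-8859-1 by default; that assumption, and the
function name, are illustrative rather than anything the GLEP prescribes.</p>

```python
def to_utf8(raw: bytes, legacy_codec: str = "iso-8859-1") -> bytes:
    """Return raw re-encoded as UTF-8.

    Bytes that already decode as UTF-8 are returned unchanged. Anything
    else is assumed (an assumption of this sketch) to be in legacy_codec.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(legacy_codec).encode("utf-8")


# A made-up ChangeLog fragment stored as ISO-8859-1.
legacy_bytes = "Jos\u00e9".encode("iso-8859-1")
print(to_utf8(legacy_bytes))
```

<p>The validity test is a heuristic: a few legacy byte sequences also happen to
be valid UTF-8, so converted files in a mixed-encoding tree would still
deserve a manual look.</p>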
129 ciaranm 1.4 <p>The <tt class="docutils literal"><span class="pre">echangelog</span></tt> tool is character-set agnostic. In order to properly
130 g2boojum 1.1 enter UTF-8, developers would have to switch to a UTF-8 shell session.
131     This only applies if the developer is entering new text which uses 'fancy'
132     characters -- existing characters are not mangled.</p>
133     <p>Certain text editors are incapable of handling UTF-8 cleanly. However,
134 ciaranm 1.4 since the <tt class="docutils literal"><span class="pre">echangelog</span></tt> tool is generally the correct way to generate
135 g2boojum 1.1 ChangeLog entries, this should not be a major problem. Generating
136     metadata.xml files correctly in these editors could become problematic.
137 ciaranm 1.4 The <tt class="docutils literal"><span class="pre">vim</span></tt> and <tt class="docutils literal"><span class="pre">emacs</span></tt> editors, which appear to be most widely used,
138 g2boojum 1.2 are both capable of handling UTF-8 cleanly -- for vim, this could be
139 ciaranm 1.4 configured automatically via the <tt class="docutils literal"><span class="pre">gentoo-syntax</span></tt> (<a class="footnote-reference" href="#id8" id="id4" name="id4">[4]</a>) package.</p>
140 g2boojum 1.1 </div>
141 g2boojum 1.5 <div class="section">
142     <h1><a class="toc-backref" href="#id18" id="references" name="references">References</a></h1>
143 ciaranm 1.4 <table class="docutils footnote" frame="void" id="id5" rules="none">
144 g2boojum 1.1 <colgroup><col class="label" /><col /></colgroup>
145     <tbody valign="top">
146 g2boojum 1.2 <tr><td class="label"><a class="fn-backref" href="#id1" name="id5">[1]</a></td><td><a class="reference" href="http://www.faqs.org/rfcs/rfc3629.html">RFC 3629</a>: UTF-8, a transformation format of ISO 10646
147 g2boojum 1.1 <a class="reference" href="http://www.ietf.org/rfc/rfc3629.txt">http://www.ietf.org/rfc/rfc3629.txt</a></td></tr>
148     </tbody>
149     </table>
150 ciaranm 1.4 <table class="docutils footnote" frame="void" id="id6" rules="none">
151 g2boojum 1.2 <colgroup><col class="label" /><col /></colgroup>
152     <tbody valign="top">
153     <tr><td class="label"><a class="fn-backref" href="#id2" name="id6">[2]</a></td><td>ISO/IEC 10646 (Universal Multiple-Octet Coded Character Set)</td></tr>
154     </tbody>
155     </table>
156 ciaranm 1.4 <table class="docutils footnote" frame="void" id="id7" rules="none">
157 g2boojum 1.1 <colgroup><col class="label" /><col /></colgroup>
158     <tbody valign="top">
159 g2boojum 1.2 <tr><td class="label"><a class="fn-backref" href="#id3" name="id7">[3]</a></td><td>ISO/IEC 8859 (8-bit single-byte coded graphic character sets)</td></tr>
160 g2boojum 1.1 </tbody>
161     </table>
162 ciaranm 1.4 <table class="docutils footnote" frame="void" id="id8" rules="none">
163 g2boojum 1.1 <colgroup><col class="label" /><col /></colgroup>
164     <tbody valign="top">
165 g2boojum 1.2 <tr><td class="label"><a class="fn-backref" href="#id4" name="id8">[4]</a></td><td>The app-vim/gentoo-syntax package,
166     <a class="reference" href="https://developer.berlios.de/projects/gentoo-syntax/">https://developer.berlios.de/projects/gentoo-syntax/</a></td></tr>
167 g2boojum 1.1 </tbody>
168     </table>
169     </div>
170 g2boojum 1.5 <div class="section">
171     <h1><a class="toc-backref" href="#id19" id="copyright" name="copyright">Copyright</a></h1>
172 g2boojum 1.1 <p>This document has been placed in the public domain.</p>
173 ciaranm 1.4 <!-- vim: set tw=74 fileencoding=utf-8 : -->
174 g2boojum 1.1 </div>
175 ciaranm 1.4
176 g2boojum 1.1 </div>
177 ciaranm 1.4 <div class="footer">
178 g2boojum 1.1 <hr class="footer" />
179     <a class="reference" href="glep-0031.txt">View document source</a>.
180 antarus 1.8 Generated on: 2007-10-13 13:39 UTC.
181 g2boojum 1.1 Generated by <a class="reference" href="http://docutils.sourceforge.net/">Docutils</a> from <a class="reference" href="http://docutils.sourceforge.net/rst.html">reStructuredText</a> source.
182 ciaranm 1.4
183 g2boojum 1.1 </div>
184     </body>
185     </html>
