<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<!--
This HTML is auto-generated. DO NOT EDIT THIS FILE! If you are writing a new
PEP, see http://www.python.org/peps/pep-0001.html for instructions and links
to templates. DO NOT USE THIS HTML FILE AS YOUR TEMPLATE!
-->
<!--
Revision 1.1 (branch MAIN), committed Thu Oct 28 17:00:34 2004 UTC by
g2boojum. Log message: new glep
-->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Docutils 0.3.3: http://docutils.sourceforge.net/" />
<title>GLEP 31 -- Character Sets for Portage Tree Items</title>
<link rel="stylesheet" href="tools/glep.css" type="text/css" />
</head>
<body bgcolor="white">
<table class="navigation" cellpadding="0" cellspacing="0"
width="100%" border="0">
<tr><td class="navicon" width="150" height="35">
<a href="http://www.gentoo.org/" title="Gentoo Linux Home Page">
<img src="http://www.gentoo.org/images/gentoo-new.gif" alt="[Gentoo]"
border="0" width="150" height="35" /></a></td>
<td class="textlinks" align="left">
[<b><a href="http://www.gentoo.org/">Gentoo Linux Home</a></b>]
[<b><a href="http://www.gentoo.org/proj/en/glep">GLEP Index</a></b>]
[<b><a href="./glep-0031.txt">GLEP Source</a></b>]
</td></tr></table>
<div class="document">
<table class="rfc2822 field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field"><th class="field-name">GLEP:</th><td class="field-body">31</td>
</tr>
<tr class="field"><th class="field-name">Title:</th><td class="field-body">Character Sets for Portage Tree Items</td>
</tr>
<tr class="field"><th class="field-name">Version:</th><td class="field-body">1.1</td>
</tr>
<tr class="field"><th class="field-name">Author:</th><td class="field-body">Ciaran McCreesh &lt;ciaranm&#32;&#97;t&#32;gentoo.org&gt;</td>
</tr>
<tr class="field"><th class="field-name">Last-Modified:</th><td class="field-body"><a class="reference" href="http://www.gentoo.org/cgi-bin/viewcvs/xml/htdocs/proj/en/glep/glep-0031.txt?cvsroot=gentoo">2004/10/28 17:00:22</a></td>
</tr>
<tr class="field"><th class="field-name">Status:</th><td class="field-body">Draft</td>
</tr>
<tr class="field"><th class="field-name">Type:</th><td class="field-body">Standards Track</td>
</tr>
<tr class="field"><th class="field-name">Content-Type:</th><td class="field-body"><a class="reference" href="glep-0012.html">text/x-rst</a></td>
</tr>
<tr class="field"><th class="field-name">Created:</th><td class="field-body">27-October-2004</td>
</tr>
<tr class="field"><th class="field-name">Post-Date:</th><td class="field-body">28-October-2004</td>
</tr>
</tbody>
</table>
<hr />
<div class="contents topic" id="contents">
<p class="topic-title first"><a name="contents">Contents</a></p>
<ul class="simple">
<li><a class="reference" href="#abstract" id="id7" name="id7">Abstract</a></li>
<li><a class="reference" href="#motivation" id="id8" name="id8">Motivation</a></li>
<li><a class="reference" href="#specification" id="id9" name="id9">Specification</a><ul>
<li><a class="reference" href="#changelog-and-metadata-character-sets" id="id10" name="id10">ChangeLog and Metadata Character Sets</a></li>
<li><a class="reference" href="#ebuild-and-eclass-character-sets" id="id11" name="id11">Ebuild and Eclass Character Sets</a></li>
<li><a class="reference" href="#files-entries-character-sets" id="id12" name="id12">files/ Entries Character Sets</a></li>
<li><a class="reference" href="#suitable-characters-for-file-and-directory-names" id="id13" name="id13">Suitable Characters for File and Directory Names</a></li>
</ul>
</li>
<li><a class="reference" href="#backwards-compatibility" id="id14" name="id14">Backwards Compatibility</a></li>
<li><a class="reference" href="#references" id="id15" name="id15">References</a></li>
<li><a class="reference" href="#copyright" id="id16" name="id16">Copyright</a></li>
</ul>
</div>
<div class="section" id="abstract">
<h1><a class="toc-backref" href="#id7" name="abstract">Abstract</a></h1>
<p>This GLEP proposes a set of rules governing which characters are
permissible in the portage tree and how they should be encoded.</p>
</div>
<div class="section" id="motivation">
<h1><a class="toc-backref" href="#id8" name="motivation">Motivation</a></h1>
<p>At present we have several developers and many more users whose names
require characters (for example, accented letters) which are not part of
the standard 'safe' 0..127 ASCII range. There is currently no standard for
how these should be represented, which leads to inconsistency across the
tree.</p>
<p>Although the issues involved have been discussed informally many times,
no official decision has been made.</p>
</div>
<div class="section" id="specification">
<h1><a class="toc-backref" href="#id9" name="specification">Specification</a></h1>
<div class="section" id="changelog-and-metadata-character-sets">
<h2><a class="toc-backref" href="#id10" name="changelog-and-metadata-character-sets">ChangeLog and Metadata Character Sets</a></h2>
<p>It is proposed that UTF-8 (<a class="footnote-reference" href="#id4" id="id1" name="id1">[1]</a>) be used for encoding ChangeLog and
metadata.xml files inside the portage tree.</p>
<p>UTF-8 allows the full range of Unicode (<a class="footnote-reference" href="#id5" id="id2" name="id2">[2]</a>) characters to be expressed,
which is necessary given the diversity of the Gentoo developer and
user base. It is byte-compatible with ASCII for characters in the 0..127
range, so it does not increase the storage requirements for plain ASCII
text and only slightly increases them for files which consist mainly of
ASCII characters (for example, American English text). It is widely
supported, widely used and an official standard.</p>
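<p>The compatibility and storage claims above can be checked with a short
Python sketch. It is an illustration only and not part of this proposal;
the <tt class="literal"><span class="pre">utf8_size</span></tt> helper and the sample strings are hypothetical.</p>
<pre class="literal-block">
```python
def utf8_size(text):
    """Return the number of bytes needed to store text as UTF-8."""
    return len(text.encode("utf-8"))


ascii_text = "Version bump"
accented_text = "Jos\u00e9"  # four characters, one of them non-ASCII

# Pure ASCII text is byte-for-byte identical in ASCII and in UTF-8.
assert ascii_text.encode("utf-8") == ascii_text.encode("ascii")

print(utf8_size(ascii_text))     # one byte per character
print(utf8_size(accented_text))  # the accented letter needs two bytes
```
</pre>
<p>Only the characters outside the 0..127 range cost extra bytes, which is
why mostly-ASCII files grow very little.</p>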
<p>The ISO-8859-* character sets (<a class="footnote-reference" href="#id6" id="id3" name="id3">[3]</a>) would <em>not</em> be appropriate since
no single one of them can express the full range of required characters.</p>
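<p>To make this limitation concrete, the following Python sketch tries to
encode one string containing both an accented Latin letter and Cyrillic
letters. The <tt class="literal"><span class="pre">can_encode</span></tt> helper and the sample text are hypothetical and
serve only as an illustration.</p>
<pre class="literal-block">
```python
def can_encode(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# Sample data: a Western European name next to a Cyrillic one.
sample = "Jos\u00e9 \u0418\u0432\u0430\u043d"

for charset in ("iso-8859-1", "iso-8859-5", "utf-8"):
    print(charset, can_encode(sample, charset))
```
</pre>
<p>Each ISO-8859 part covers one of the two names but not both, whereas
UTF-8 can encode the whole string.</p>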
</div>
<div class="section" id="ebuild-and-eclass-character-sets">
<h2><a class="toc-backref" href="#id11" name="ebuild-and-eclass-character-sets">Ebuild and Eclass Character Sets</a></h2>
<p>For the same reasons as above, it is proposed that UTF-8 be used as
the official encoding for ebuild and eclass files.</p>
<p>However, developers should be warned that any code which is parsed by
bash (in other words, anything other than comments), and any output which
is echoed to the screen (for example, einfo messages), must not use
anything outside the regular ASCII 0..127 range, for compatibility
purposes.</p>
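<p>A check for this rule could look roughly like the Python sketch below.
It is a hypothetical helper, not an existing tool, and it makes a
simplifying assumption: only whole-line comments are skipped, so a trailing
comment on a line of code is treated as code.</p>
<pre class="literal-block">
```python
def non_ascii_code_lines(ebuild_text):
    """Return (line_number, line) pairs for non-comment lines that
    contain characters outside the ASCII 0..127 range."""
    offenders = []
    for number, line in enumerate(ebuild_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # whole-line comment: non-ASCII UTF-8 is allowed
        if not line.isascii():
            offenders.append((number, line))
    return offenders


# Hypothetical ebuild fragment used as sample input.
sample = (
    "# Maintainer: Jos\u00e9\n"
    "DESCRIPTION=\"An example package\"\n"
    "einfo \"Caf\u00e9 support enabled\"\n"
)
print(non_ascii_code_lines(sample))
```
</pre>
<p>For the sample input, only the einfo line is reported; the accented
character in the comment is accepted.</p>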
</div>
<div class="section" id="files-entries-character-sets">
<h2><a class="toc-backref" href="#id12" name="files-entries-character-sets">files/ Entries Character Sets</a></h2>
<p>Patches must clearly be in the same character set as the file they are
patching. For other files/ entries (for example, GNOME desktop files),
consistency with the upstream-recommended character set is most sensible.</p>
</div>
<div class="section" id="suitable-characters-for-file-and-directory-names">
<h2><a class="toc-backref" href="#id13" name="suitable-characters-for-file-and-directory-names">Suitable Characters for File and Directory Names</a></h2>
<p>Characters outside the ASCII 0..127 range cannot safely be used for file
or directory names. (Of course, not all characters inside the ASCII 0..127
range can be used safely either.)</p>
</div>
</div>
<div class="section" id="backwards-compatibility">
<h1><a class="toc-backref" href="#id14" name="backwards-compatibility">Backwards Compatibility</a></h1>
<p>The existing tree uses a mixture of encodings. It would be straightforward
to fix existing ChangeLogs and metadata files to use UTF-8.</p>
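<p>One possible shape for such a fix is sketched below in Python. It is
not an existing tool. The <tt class="literal"><span class="pre">to_utf8</span></tt> helper is hypothetical, and the
fallback to ISO-8859-1 is an assumption: because the tree uses a mixture of
encodings, the legacy character set of each file would have to be confirmed
before converting it.</p>
<pre class="literal-block">
```python
def to_utf8(raw, legacy_charset="iso-8859-1"):
    """Return the bytes in raw re-encoded as UTF-8.

    Input that is already valid UTF-8 (including pure ASCII) is
    returned unchanged. Anything else is assumed to be encoded in
    legacy_charset, which is a guess and not a property of the file.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(legacy_charset).encode("utf-8")


legacy_bytes = b"Jos\xe9"  # the accented letter as a single Latin-1 byte
print(to_utf8(legacy_bytes))
```
</pre>
<p>Checking for valid UTF-8 first means that files which have already been
converted are not encoded a second time.</p>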
<p>The <tt class="literal"><span class="pre">echangelog</span></tt> tool is character-set agnostic. In order to
enter UTF-8 text properly, developers would have to switch to a UTF-8
shell session. This only applies if the developer is entering new text
which uses 'fancy' (non-ASCII) characters; existing characters are not
mangled.</p>
<p>Certain text editors are incapable of handling UTF-8 cleanly. However,
since the <tt class="literal"><span class="pre">echangelog</span></tt> tool is generally the correct way to generate
ChangeLog entries, this should not be a major problem. Generating
metadata.xml files correctly in these editors could become problematic.
(The <tt class="literal"><span class="pre">vim</span></tt> and <tt class="literal"><span class="pre">emacs</span></tt> editors, which appear to be the most widely used,
are both capable of handling UTF-8 cleanly.)</p>
</div>
<div class="section" id="references">
<h1><a class="toc-backref" href="#id15" name="references">References</a></h1>
<table class="footnote" frame="void" id="id4" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id1" name="id4">[1]</a></td><td><a class="reference" href="http://www.faqs.org/rfcs/rfc3629.html">RFC 3629</a>: UTF-8, a transformation format of ISO 10646
<a class="reference" href="http://www.ietf.org/rfc/rfc3629.txt">http://www.ietf.org/rfc/rfc3629.txt</a></td></tr>
</tbody>
</table>
<table class="footnote" frame="void" id="id5" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id2" name="id5">[2]</a></td><td>ISO/IEC 10646 (Universal Multiple-Octet Coded Character Set)</td></tr>
</tbody>
</table>
<table class="footnote" frame="void" id="id6" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id3" name="id6">[3]</a></td><td>ISO/IEC 8859 (8-bit single-byte coded graphic character sets)</td></tr>
</tbody>
</table>
</div>
<div class="section" id="copyright">
<h1><a class="toc-backref" href="#id16" name="copyright">Copyright</a></h1>
<p>This document has been placed in the public domain.</p>
<blockquote>
vim: set tw=74 fileencoding=utf-8 :</blockquote>
</div>
</div>

<hr class="footer" />
<div class="footer">
<a class="reference" href="glep-0031.txt">View document source</a>.
Generated on: 2004-10-28 16:53 UTC.
Generated by <a class="reference" href="http://docutils.sourceforge.net/">Docutils</a> from <a class="reference" href="http://docutils.sourceforge.net/rst.html">reStructuredText</a> source.
</div>
</body>
</html>
