<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<!--
This HTML is auto-generated. DO NOT EDIT THIS FILE! If you are writing a new
PEP, see http://www.python.org/peps/pep-0001.html for instructions and links
to templates. DO NOT USE THIS HTML FILE AS YOUR TEMPLATE!
-->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Docutils 0.3.3: http://docutils.sourceforge.net/" />
<title>GLEP 31 -- Character Sets for Portage Tree Items</title>
<link rel="stylesheet" href="tools/glep.css" type="text/css" />
</head>
<body bgcolor="white">
<table class="navigation" cellpadding="0" cellspacing="0"
width="100%" border="0">
<tr><td class="navicon" width="150" height="35">
<a href="http://www.gentoo.org/" title="Gentoo Linux Home Page">
<img src="http://www.gentoo.org/images/gentoo-new.gif" alt="[Gentoo]"
border="0" width="150" height="35" /></a></td>
<td class="textlinks" align="left">
[<b><a href="http://www.gentoo.org/">Gentoo Linux Home</a></b>]
[<b><a href="http://www.gentoo.org/proj/en/glep">GLEP Index</a></b>]
[<b><a href="./glep-0031.txt">GLEP Source</a></b>]
</td></tr></table>
<div class="document">
<table class="rfc2822 field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field"><th class="field-name">GLEP:</th><td class="field-body">31</td>
</tr>
<tr class="field"><th class="field-name">Title:</th><td class="field-body">Character Sets for Portage Tree Items</td>
</tr>
<tr class="field"><th class="field-name">Version:</th><td class="field-body">1.3</td>
</tr>
<tr class="field"><th class="field-name">Author:</th><td class="field-body">Ciaran McCreesh &lt;ciaranm&#32;&#97;t&#32;gentoo.org&gt;</td>
</tr>
<tr class="field"><th class="field-name">Last-Modified:</th><td class="field-body"><a class="reference" href="http://www.gentoo.org/cgi-bin/viewcvs/xml/htdocs/proj/en/glep/glep-0031.txt?cvsroot=gentoo">2004/11/11 21:38:05</a></td>
</tr>
<tr class="field"><th class="field-name">Status:</th><td class="field-body">Approved</td>
</tr>
<tr class="field"><th class="field-name">Type:</th><td class="field-body">Standards Track</td>
</tr>
<tr class="field"><th class="field-name">Content-Type:</th><td class="field-body"><a class="reference" href="glep-0012.html">text/x-rst</a></td>
</tr>
<tr class="field"><th class="field-name">Created:</th><td class="field-body">27-October-2004</td>
</tr>
<tr class="field"><th class="field-name">Post-Date:</th><td class="field-body">28-October-2004, 1-November-2004, 11-November-2004</td>
</tr>
</tbody>
</table>
<hr />
<div class="contents topic" id="contents">
<p class="topic-title first"><a name="contents">Contents</a></p>
<ul class="simple">
<li><a class="reference" href="#abstract" id="id9" name="id9">Abstract</a></li>
<li><a class="reference" href="#status" id="id10" name="id10">Status</a></li>
<li><a class="reference" href="#motivation" id="id11" name="id11">Motivation</a></li>
<li><a class="reference" href="#specification" id="id12" name="id12">Specification</a><ul>
<li><a class="reference" href="#changelog-and-metadata-character-sets" id="id13" name="id13">ChangeLog and Metadata Character Sets</a></li>
<li><a class="reference" href="#ebuild-and-eclass-character-sets" id="id14" name="id14">Ebuild and Eclass Character Sets</a></li>
<li><a class="reference" href="#files-entries-character-sets" id="id15" name="id15">files/ Entries Character Sets</a></li>
<li><a class="reference" href="#suitable-characters-for-file-and-directory-names" id="id16" name="id16">Suitable Characters for File and Directory Names</a></li>
</ul>
</li>
<li><a class="reference" href="#backwards-compatibility" id="id17" name="id17">Backwards Compatibility</a></li>
<li><a class="reference" href="#references" id="id18" name="id18">References</a></li>
<li><a class="reference" href="#copyright" id="id19" name="id19">Copyright</a></li>
</ul>
</div>
<div class="section" id="abstract">
<h1><a class="toc-backref" href="#id9" name="abstract">Abstract</a></h1>
<p>A set of guidelines regarding what characters are permissible in the
portage tree and how they should be encoded is required.</p>
</div>
<div class="section" id="status">
<h1><a class="toc-backref" href="#id10" name="status">Status</a></h1>
<p>Approved on 8-Nov-2004 assuming that implementation will include
documentation for correctly encoding files within nano.</p>
</div>
<div class="section" id="motivation">
<h1><a class="toc-backref" href="#id11" name="motivation">Motivation</a></h1>
<p>At present we have several developers and many more users whose names
require characters (for example, accents) which are not part of the
standard 'safe' 0..127 ASCII range. There is no current standard on how
these should be represented, leading to inconsistency across the tree.</p>
<p>Although the issues involved have been discussed informally many times, no
official decision has been made.</p>
</div>
<div class="section" id="specification">
<h1><a class="toc-backref" href="#id12" name="specification">Specification</a></h1>
<div class="section" id="changelog-and-metadata-character-sets">
<h2><a class="toc-backref" href="#id13" name="changelog-and-metadata-character-sets">ChangeLog and Metadata Character Sets</a></h2>
<p>It is proposed that UTF-8 (<a class="footnote-reference" href="#id5" id="id1" name="id1">[1]</a>) is used for encoding ChangeLog and
metadata.xml files inside the portage tree.</p>
<p>UTF-8 allows the full range of Unicode (<a class="footnote-reference" href="#id6" id="id2" name="id2">[2]</a>) characters to be expressed,
which is necessary given the diversity of the Gentoo developer- and
user-base. It is byte-for-byte compatible with ASCII for characters in
the 0..127 range and does not significantly increase the storage
requirements for files which consist mainly of ASCII characters. It is
widely supported, widely used and an official standard.</p>
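<p>The ASCII compatibility and storage claims can be checked with a short
Python sketch. The sample strings are made up for illustration and do not
come from the tree.</p>

```python
# Illustrative only: both sample strings are invented for this sketch.
ascii_text = "Version bump; fix SRC_URI"
accented_name = "Jos\u00e9 Mu\u00f1oz"

# For characters in the 0..127 range, UTF-8 and ASCII produce identical bytes.
assert ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# Each accented Latin letter in this name takes two bytes instead of one.
utf8_bytes = accented_name.encode("utf-8")
extra_bytes = len(utf8_bytes) - len(accented_name)
print(len(accented_name), len(utf8_bytes), extra_bytes)
```

<p>A file that is mostly ASCII therefore grows only by the few extra bytes
needed for its non-ASCII characters.</p>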
<p>The ISO-8859-* character sets (<a class="footnote-reference" href="#id7" id="id3" name="id3">[3]</a>) would <em>not</em> be appropriate since they
cannot express the full range of required characters.</p>
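<p>As a rough illustration of that limitation, the following Python sketch
tries to encode three invented names with several ISO-8859 parts. The helper
name <tt class="literal"><span class="pre">encodable</span></tt> and the sample names are assumptions made for this
example only.</p>

```python
def encodable(text, charset):
    """Return True if every character of text can be encoded in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

# Invented sample names: Western European, Polish and Cyrillic spellings.
names = [
    "Jos\u00e9",
    "\u0141ukasz",
    "\u0414\u043c\u0438\u0442\u0440\u0438\u0439",
]

for charset in ("iso-8859-1", "iso-8859-2", "iso-8859-5", "utf-8"):
    print(charset, [encodable(name, charset) for name in names])
```

<p>None of the three ISO-8859 parts tried here can encode all three names,
whereas UTF-8 can.</p>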
</div>
<div class="section" id="ebuild-and-eclass-character-sets">
<h2><a class="toc-backref" href="#id14" name="ebuild-and-eclass-character-sets">Ebuild and Eclass Character Sets</a></h2>
<p>For the same reasons as given above, it is proposed that UTF-8 is used as
the official encoding for ebuild and eclass files.</p>
<p>However, developers should be warned that any code which is parsed by bash
(in other words, non-comments), and any output which is echoed to the
screen (for example, einfo messages) or given to portage (for example any
of the standard global variables) must not use anything outside the
regular ASCII 0..127 range for compatibility purposes.</p>
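<p>A minimal sketch of how such a rule could be checked, assuming a
simplified notion of "comment": only lines whose first non-blank character
is <tt class="literal"><span class="pre">#</span></tt> are skipped. The function name and the sample ebuild text are
hypothetical, and this is not part of any existing Gentoo tool.</p>

```python
def non_ascii_code_lines(ebuild_text):
    """Return (line_number, line) pairs with non-ASCII text outside comments.

    Simplification: only whole-line comments (first non-blank character
    is '#') count as comments; trailing comments are treated as code.
    """
    offenders = []
    for number, line in enumerate(ebuild_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # full-line comment: UTF-8 is acceptable here
        if not line.isascii():
            offenders.append((number, line))
    return offenders

# Invented sample: a UTF-8 comment, a non-ASCII variable, an ASCII message.
sample = (
    "# Patch by Jos\u00e9\n"
    'DESCRIPTION="A caf\u00e9 locator"\n'
    'einfo "Done"\n'
)
print(non_ascii_code_lines(sample))
```

<p>Only the second line of the sample is reported; the accented character
in the comment on the first line is allowed.</p>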
</div>
<div class="section" id="files-entries-character-sets">
<h2><a class="toc-backref" href="#id15" name="files-entries-character-sets">files/ Entries Character Sets</a></h2>
<p>Patches must clearly be in the same character set as the file they are
patching. For other files/ entries (for example, GNOME desktop files),
consistency with the upstream-recommended character set is most sensible.</p>
</div>
<div class="section" id="suitable-characters-for-file-and-directory-names">
<h2><a class="toc-backref" href="#id16" name="suitable-characters-for-file-and-directory-names">Suitable Characters for File and Directory Names</a></h2>
<p>Characters outside the ASCII 0..127 range cannot safely be used for file
or directory names. (Of course, not all characters inside the ASCII 0..127
range can be used safely either.)</p>
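<p>The following Python sketch shows one way a name could be checked. The
whitelist is an assumption made for this example: this GLEP does not list
which ASCII characters are safe, so <tt class="literal"><span class="pre">ASSUMED_SAFE</span></tt> is deliberately
conservative and hypothetical.</p>

```python
import string

# Assumption for this sketch: a conservative whitelist. The GLEP only says
# that some ASCII characters are unsafe; it does not enumerate them.
ASSUMED_SAFE = set(string.ascii_letters + string.digits + "+_.-")

def is_safe_name(name):
    """Return True if name is non-empty and uses only the assumed-safe set."""
    return bool(name) and all(ch in ASSUMED_SAFE for ch in name)

# Invented sample names: plain ASCII, an accented letter, an embedded space.
for candidate in ("foo-1.2.3-r1.ebuild", "caf\u00e9.patch", "my file.patch"):
    print(candidate, is_safe_name(candidate))
```

<p>The second name fails because of the non-ASCII character, and the third
because a space is inside the ASCII range but outside the assumed-safe
set.</p>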
</div>
</div>
<div class="section" id="backwards-compatibility">
<h1><a class="toc-backref" href="#id17" name="backwards-compatibility">Backwards Compatibility</a></h1>
<p>The existing tree uses a mixture of encodings. It would be straightforward
to fix existing ChangeLogs and metadata files to use UTF-8.</p>
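<p>One possible shape for such a fix is sketched below in Python. It assumes
that any file which is not already valid UTF-8 was written in ISO-8859-1;
that fallback is a guess, since the original encoding of a legacy file cannot
be detected reliably from its bytes alone. The function name <tt class="literal"><span class="pre">to_utf8</span></tt> is
hypothetical.</p>

```python
def to_utf8(raw, assumed_legacy="iso-8859-1"):
    """Return raw re-encoded as UTF-8 bytes.

    Bytes that already decode as UTF-8 are returned unchanged. Anything
    else is decoded with assumed_legacy, which is only a guess.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(assumed_legacy).encode("utf-8")

# The same invented name stored in a legacy encoding and in UTF-8.
latin1_entry = "Jos\u00e9".encode("iso-8859-1")
utf8_entry = "Jos\u00e9".encode("utf-8")
print(to_utf8(latin1_entry) == utf8_entry, to_utf8(utf8_entry) == utf8_entry)
```

<p>Checking for valid UTF-8 first keeps the conversion idempotent, so
running it twice over the tree would not double-encode files that were
already fixed.</p>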
<p>The <tt class="literal"><span class="pre">echangelog</span></tt> tool is character-set agnostic. In order to properly
enter UTF-8, developers would have to switch to a UTF-8 shell session.
This only applies if the developer is entering new text which uses
non-ASCII characters -- existing characters are not mangled.</p>
<p>Certain text editors are incapable of handling UTF-8 cleanly. However,
since the <tt class="literal"><span class="pre">echangelog</span></tt> tool is generally the correct way to generate
ChangeLog entries, this should not be a major problem. Generating
metadata.xml files correctly in these editors could become problematic.
The <tt class="literal"><span class="pre">vim</span></tt> and <tt class="literal"><span class="pre">emacs</span></tt> editors, which appear to be most widely used,
are both capable of handling UTF-8 cleanly -- for vim, this could be
configured automatically via the <tt class="literal"><span class="pre">gentoo-syntax</span></tt> (<a class="footnote-reference" href="#id8" id="id4" name="id4">[4]</a>) package.</p>
</div>
<div class="section" id="references">
<h1><a class="toc-backref" href="#id18" name="references">References</a></h1>
<table class="footnote" frame="void" id="id5" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id1" name="id5">[1]</a></td><td><a class="reference" href="http://www.faqs.org/rfcs/rfc3629.html">RFC 3629</a>: UTF-8, a transformation format of ISO 10646
<a class="reference" href="http://www.ietf.org/rfc/rfc3629.txt">http://www.ietf.org/rfc/rfc3629.txt</a></td></tr>
</tbody>
</table>
<table class="footnote" frame="void" id="id6" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id2" name="id6">[2]</a></td><td>ISO/IEC 10646 (Universal Multiple-Octet Coded Character Set)</td></tr>
</tbody>
</table>
<table class="footnote" frame="void" id="id7" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id3" name="id7">[3]</a></td><td>ISO/IEC 8859 (8-bit single-byte coded graphic character sets)</td></tr>
</tbody>
</table>
<table class="footnote" frame="void" id="id8" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id4" name="id8">[4]</a></td><td>The app-vim/gentoo-syntax package,
<a class="reference" href="https://developer.berlios.de/projects/gentoo-syntax/">https://developer.berlios.de/projects/gentoo-syntax/</a></td></tr>
</tbody>
</table>
</div>
<div class="section" id="copyright">
<h1><a class="toc-backref" href="#id19" name="copyright">Copyright</a></h1>
<p>This document has been placed in the public domain.</p>
<blockquote>
vim: set tw=74 fileencoding=utf-8 :</blockquote>
</div>
</div>

<hr class="footer" />
<div class="footer">
<a class="reference" href="glep-0031.txt">View document source</a>.
Generated on: 2004-11-11 21:31 UTC.
Generated by <a class="reference" href="http://docutils.sourceforge.net/">Docutils</a> from <a class="reference" href="http://docutils.sourceforge.net/rst.html">reStructuredText</a> source.
</div>
</body>
</html>