.. ViewVC revision 1.4 (branch MAIN), Mon Feb 13 21:00:50 2006 UTC, by grobian; log message: "typo"

GLEP: 47
Title: Creating 'safe' environment variables
Version: $Revision: 1.3 $
Last-Modified: $Date: 2006/02/12 19:57:57 $
Author: Diego Pettenò, Fabian Groffen
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 14-Oct-2005
Post-History: 09-Feb-2006


Credits
=======

The text of this GLEP is the result of discussion with and input from
the following persons, in no particular order: Mike Frysinger, Diego
Pettenò, Fabian Groffen and Finn Thain.


Abstract
========

In order for ebuilds and eclasses to be able to make host-specific
decisions, a number of environment variables are needed on which such
decisions can be based.  This GLEP introduces measures that make these
decisions 'safe', by making sure that the variables the decisions are
based on are 'safe'.  A small overlap with GLEP 22 [1]_ is handled in
this GLEP: 2-tuple keywords are kept instead of 4-tuple keywords.
Additionally, the ``ELIBC``, ``KERNEL`` and ``ARCH`` variables are
filled automatically from ``CHOST`` and the 2-tuple keyword, instead of
solely from the 4-tuple keyword as proposed in GLEP 22.

The fate of the ``USERLAND`` variable is outside the scope of this
GLEP.  Depending on its presence in the tree, it may be decided to set
this variable the same way we propose to set ``ELIBC``, ``KERNEL`` and
``ARCH``, or by other means, e.g. via the profiles.


Motivation
==========

The Gentoo/Alt project is emerging and getting ready to serve a
plethora of 'alternative' configurations such as FreeBSD, NetBSD,
DragonflyBSD, GNU/kFreeBSD, Mac OS X, (Open)Darwin, (Open)Solaris and so
on.  As such, the project needs a better grip on the actual host being
built on.  This information on the host environment is necessary to
make proper (automated) decisions on settings that are highly dependent
on the build environment, such as platform or C-library implementation.


Rationale
=========

Gentoo's unique Portage system allows easy installation of applications
from source packages.  Compiling sources is sensitive to many
environment settings and to the availability of certain tools.  Only
recently the Gentoo for FreeBSD project started, as the second Gentoo
project that operates on a foreign host operating system using foreign
(non-GNU) C-libraries and userland utilities.  Such projects suffer
from the current implicit assumption made within Gentoo Portage's
ebuilds that there is a single type of operating system, C-library and
set of system utilities.  In order to enable ebuilds -- and also
eclasses -- to be aware of these environmental differences, information
about them should be supplied.  Since decisions based on this
information can be vital, it is of high importance that this information
can be trusted and that the values can be considered 'safe' and correct.


Backwards Compatibility
=======================

The keywording scheme proposed in this GLEP is fully compatible with the
current state of the Portage tree, in contrast to GLEP 22.  The
variables provided by GLEP 22 can't be extracted from the new keyword,
but since GLEP 22-style keywords aren't in the tree at the moment, that
is not a problem.  The same information can be extracted from the
``CHOST`` variable, if necessary.  No modifications to ebuilds will have
to be made.


Specification
=============

Unlike GLEP 22, this GLEP does not change the currently used keyword
scheme.  Instead of proposing a 4-tuple [2]_ keyword, a 2-tuple keyword
is chosen for archs that require one.  Archs for which a 1-tuple
keyword suffices can keep that keyword.  Since this changes nothing
about the current situation in the tree, it is considered a big
advantage over the 4-tuple keyword from GLEP 22.  This GLEP is an
official specification of the syntax of the keyword.

Keywords consist of two parts separated by a hyphen ('-').  The part up
to the first hyphen from the left of the keyword 2-tuple is the
architecture, such as ppc64, sparc and x86.  Allowed characters for the
architecture name are ``a-z0-9``.  The remaining part to the right of
the first hyphen indicates the operating system or distribution, such
as linux, macos, darwin, obsd, et cetera.  If the right-hand part is
omitted, the operating system/distribution type is implied to be Gentoo
GNU/Linux.  In that case the hyphen is also omitted, and the keyword
consists solely of the architecture.  The operating system or
distribution name can consist of characters in ``a-zA-Z0-9_+:-``.
Please note that the hyphen is an allowed character there, and therefore
the two fields in the keyword can only be separated by scanning for the
first hyphen character from the start of the keyword string.  Examples
of keywords following this specification are ppc-darwin and x86.  This
is fully compatible with the current use of keywords in the tree.
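The split-at-first-hyphen rule can be sketched in bash as follows.  This
is a minimal illustration only: ``split_keyword`` is a hypothetical
helper that is not part of Portage, the keyword ``x86-fbsd-test`` is
made up to show a hyphen inside the second field, and reporting
``linux`` for an omitted second field is an assumption of this sketch::

    #!/bin/bash

    # split_keyword is a hypothetical helper, not part of Portage.
    # It prints "<arch> <os>" for a valid keyword and returns 1 otherwise.
    split_keyword() {
        local LC_ALL=C    # keep the a-z ranges below ASCII-only
        local keyword=$1 arch os
        local arch_re='^[a-z0-9]+$' os_re='^[a-zA-Z0-9_+:-]+$'

        arch=${keyword%%-*}        # everything before the first hyphen
        if [[ ${keyword} == *-* ]] ; then
            os=${keyword#*-}       # everything after the first hyphen
        else
            os=linux               # assumption: omitted part means Gentoo GNU/Linux
        fi

        [[ ${arch} =~ ${arch_re} && ${os} =~ ${os_re} ]] || return 1
        echo "${arch} ${os}"
    }

    split_keyword ppc-darwin
    split_keyword x86
    split_keyword x86-fbsd-test

Under these assumptions the three calls print ``ppc darwin``,
``x86 linux`` and ``x86 fbsd-test``.  A keyword such as ``X86`` is
rejected, because the architecture field allows only ``a-z0-9``.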

The variables ``ELIBC``, ``KERNEL`` and ``ARCH`` are currently set in
the profiles when they differ from their defaults for a GNU/Linux
system.  As such, they can easily be overridden and redefined by the
user.  To prevent this from happening, the variables should be filled
automatically by Portage itself, based on the ``CHOST`` variable.  While
the ``CHOST`` variable can be set by the user as easily as the others,
it is still assumed to be 'safe'.  This assumption is grounded in the
fact that the variable itself is used in various other places with the
same intention, and that an invalid ``CHOST`` will cause major
malfunctioning of the system.  A user who changes ``CHOST`` into
something that is not valid for the system is already warned that this
might render the system unusable.  In conclusion, the 'safeness' of the
``CHOST`` variable is based on externally assumed 'safeness', the
discussion of which falls outside the scope of this GLEP.

The current USE-expansion of the variables is maintained, as this
results in full backward compatibility.  Since the variables themselves
don't change in what they represent, but only in how they are assigned,
there should be no problem in maintaining them.  Using USE-expansion,
conditional code can be written in ebuilds in a way that does not differ
from existing methods at all::

    ...
    RDEPEND="elibc_FreeBSD? ( sys-libs/com_err )"
    ...
    src_compile() {
        ...
        use elibc_FreeBSD && myconf="${myconf} -Dlibc=/usr/lib/libc.a"
        ...
    }

Alternatively, the variables ``ELIBC``, ``KERNEL`` and ``ARCH`` are
available in the ebuild environment, and they can be used instead of
invoking ``use xxx_Xxxx``, or in switch statements where they are
actually necessary.
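A sketch of such a switch statement follows, assuming a hypothetical
helper ``libc_conf`` and reusing the flag from the ``use`` based example
above.  Outside an ebuild ``ELIBC`` is not provided by Portage, so the
sketch assigns it by hand::

    #!/bin/bash

    # libc_conf is a hypothetical helper, not an existing eclass function.
    # It prints extra configure arguments depending on the C library.
    libc_conf() {
        case ${ELIBC} in
            FreeBSD) echo "-Dlibc=/usr/lib/libc.a" ;;  # flag from the use-based example
            *)       echo "" ;;                        # assumption: nothing extra elsewhere
        esac
    }

    ELIBC=FreeBSD    # set by hand here; Portage would provide it in an ebuild
    myconf=""
    myconf="${myconf} $(libc_conf)"
    echo "myconf=${myconf}"

A ``case`` on the variable keeps all C-library specific branches in one
place, which may read better than a chain of ``use`` tests once more
than one value has to be handled.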

A map file can be used to translate the various ``CHOST`` values to the
correct values for these variables.  This change is invisible to ebuilds
and eclasses, but allows them to rely on these variables, as they are
based on a 'safe' value -- the ``CHOST`` variable.  Ebuilds should not
be sensitive to the keyword value, but should use the aforementioned
variables instead.  They allow specific tests for properties.  If this
is undesirable, the full ``CHOST`` variable can be used to match a
complete operating system.


Variable Assignment
-------------------

The ``ELIBC``, ``KERNEL`` and ``ARCH`` variables are filled from a
profile file.  The file can be overlaid, such that the following entries
in the map file (on the left of the arrow) result in the variable
assignments on the right-hand side of the arrow::

    *-*-linux-*      -> KERNEL="linux"
    *-*-*-gnu        -> ELIBC="glibc"
    *-*-kfreebsd-gnu -> KERNEL="FreeBSD" ELIBC="glibc"
    *-*-freebsd*     -> KERNEL="FreeBSD" ELIBC="FreeBSD"
    *-*-darwin*      -> KERNEL="Darwin" ELIBC="Darwin"
    *-*-netbsd*      -> KERNEL="NetBSD" ELIBC="NetBSD"
    *-*-solaris*     -> KERNEL="Solaris" ELIBC="Solaris"

One way to achieve this was proposed by Mike Frysinger, who suggests
having an ``env-map`` file, for instance filled with::

    % cat env-map
    *-linux-*   KERNEL=linux
    *-gnu       ELIBC=glibc
    x86_64-*    ARCH=amd64

The following bash script can then be used to set the variables to
their correct values::

    % cat readmap
    #!/bin/bash

    # Take CHOST from the environment or else from the first argument;
    # CBUILD defaults to CHOST.
    CHOST=${CHOST:-$1}
    CBUILD=${CBUILD:-${CHOST}}
    [[ -z ${CHOST} ]] && { echo "need chost" >&2 ; exit 1 ; }

    unset KERNEL ELIBC ARCH

    # Scan every entry; a later match overrides an earlier one.
    while read -r targ assignments ; do
        # ${targ} is left unquoted so that it is matched as a glob pattern.
        [[ ${CBUILD} == ${targ} ]] && eval "${assignments}"
    done < env-map

    echo ARCH=${ARCH} KERNEL=${KERNEL} ELIBC=${ELIBC}

Given the example ``env-map`` file, this script would result in::

    % ./readmap x86_64-pc-linux-gnu
    ARCH=amd64 KERNEL=linux ELIBC=glibc

The entries in the ``env-map`` file are evaluated in a forward linear
full scan.  A side effect of this exhaustive search is that variables
can be re-assigned if multiple entries match the given ``CHOST``.
Because of this, the order of the entries matters.  Because the
``env-map`` file size is assumed not to exceed the block size of the
file system, the performance penalty of a full scan versus a
'first-hit-stop' technique is assumed to be minimal.
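The effect of entry order can be illustrated with a small bash sketch.
``scan_map`` is a hypothetical helper, and the two map entries are made
up for this illustration; they are not proposed contents of the
``env-map`` file::

    #!/bin/bash

    # scan_map is a hypothetical helper.  The first argument is a CHOST,
    # every further argument is one map entry "<glob> <assignment>".
    # All entries are tried, so a later match overrides an earlier one.
    scan_map() {
        local chost=$1 entry targ assignments ELIBC=
        shift
        for entry in "$@" ; do
            read -r targ assignments <<< "${entry}"
            # ${targ} is unquoted so that it is matched as a glob pattern.
            [[ ${chost} == ${targ} ]] && eval "${assignments}"
        done
        echo "ELIBC=${ELIBC}"
    }

    # General entry first, specific entry last: the specific one wins.
    scan_map i386-pc-linux-uclibc \
        "*-linux-* ELIBC=glibc" "*-linux-uclibc ELIBC=uclibc"

    # Reversed order: the general entry overrides the specific one.
    scan_map i386-pc-linux-uclibc \
        "*-linux-uclibc ELIBC=uclibc" "*-linux-* ELIBC=glibc"

The first call prints ``ELIBC=uclibc`` and the second prints
``ELIBC=glibc``, which suggests that general entries belong before the
more specific ones that refine them.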

It should be noted, however, that the above bash script is a
proof-of-concept implementation.  Since Portage is largely written in
Python, it will be more efficient to write an equivalent of this code in
Python as well.  Coding-wise, this is considered to be a non-issue, but
the format of the ``env-map`` file, and especially its wildcard
characters, might not be the best match for Python.  For this reason,
the format specification of the ``env-map`` file is deferred to the
Python implementation, and only the requirements are given here.

The ``env-map`` file should be capable of encoding a ``key``, ``value``
pair, where ``key`` is a (regular) expression that matches a
chost-string, and ``value`` contains at least one distinct variable
assignment for the variables ``ARCH``, ``KERNEL`` and ``ELIBC``.  The
interpreter of the ``env-map`` file must scan the file linearly and
continue trying to match the ``key``\s and assign variables where
appropriate until the end of the file.
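These requirements can be sketched in bash with regular expressions as
keys.  Everything here is an assumption made for illustration:
``apply_env_map`` is a hypothetical helper, the entry layout
``<regex> VAR=value ...`` is not a proposed file format, and the sketch
does not check that the assignments within one entry are distinct::

    #!/bin/bash

    # apply_env_map is a hypothetical helper.  The first argument is a
    # CHOST, every further argument is one entry "<regex> VAR=value ...".
    # It sets the global variables ARCH, KERNEL and ELIBC.
    apply_env_map() {
        local chost=$1 entry pair
        local -a fields
        shift
        ARCH= KERNEL= ELIBC=
        for entry in "$@" ; do
            read -r -a fields <<< "${entry}"
            # A value needs at least one assignment next to the key.
            (( ${#fields[@]} >= 2 )) || { echo "bad entry: ${entry}" >&2 ; return 1 ; }
            # fields[0] is the key; unquoted, so it is used as a regex.
            [[ ${chost} =~ ${fields[0]} ]] || continue
            for pair in "${fields[@]:1}" ; do
                case ${pair} in
                    ARCH=*|KERNEL=*|ELIBC=*)
                        printf -v "${pair%%=*}" '%s' "${pair#*=}" ;;
                    *)
                        echo "unsupported assignment: ${pair}" >&2
                        return 1 ;;
                esac
            done
        done
    }

    apply_env_map x86_64-pc-linux-gnu \
        '-linux- KERNEL=linux' '-gnu$ ELIBC=glibc' '^x86_64- ARCH=amd64'
    echo "ARCH=${ARCH} KERNEL=${KERNEL} ELIBC=${ELIBC}"

With these three entries the sketch prints
``ARCH=amd64 KERNEL=linux ELIBC=glibc``.  Assigning through
``printf -v`` instead of ``eval`` restricts an entry to the three
variables named in the requirements.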

Since Portage will use the ``env-map`` file, the location of the file is
beyond the scope of this GLEP and up to the Portage implementors.


References
==========

.. [1] GLEP 22, New "keyword" system to incorporate various
   userlands/kernels/archs, Goodyear,
   (http://glep.gentoo.org/glep-0022.html)

.. [2] For the purpose of readability, we will refer to 1, 2 and
   4-tuples, even though tuple in itself suggests a field consisting of
   two values.  For clarity: a 1-tuple describes a single value field,
   while a 4-tuple describes a field consisting of four values.


Copyright
=========

This document has been placed in the public domain.
