Revision 1.2, Sat Feb 11 21:43:14 2006 UTC, by grobian
Many pieces rewritten, reworded and explained in more detail.
- clarified 'safeness' of CHOST variable
- note on USE-expansion variables and use of variables separate +
  example
- defined semantics env-map file interpreter
- defined requirements for env-map file expressiveness

GLEP: 47
Title: Creating 'safe' environment variables
Version: $Revision: 1.1 $
Last-Modified: $Date: 2006/02/09 21:42:57 $
Author: Diego Pettenò, Fabian Groffen
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 14-Oct-2005
Post-History: 09-Feb-2006


Credits
=======

The text of this GLEP is the result of discussion with, and input from,
the following persons, in no particular order: Mike Frysinger, Diego
Pettenò, Fabian Groffen and Finn Thain.


Abstract
========

For ebuilds and eclasses to be able to make host-specific decisions, a
number of environment variables must be available on which to base
those decisions.  This GLEP introduces measures to make these decisions
'safe' by making sure the variables they are based on are 'safe'.  This
GLEP overlaps slightly with GLEP 22 [1]_: it keeps the use of 2-tuple
keywords instead of adopting 4-tuple keywords.  Additionally, the
``ELIBC``, ``KERNEL`` and ``ARCH`` variables are filled automatically
from ``CHOST`` and the 2-tuple keyword, instead of solely from the
4-tuple keyword as proposed in GLEP 22.

The destiny of the ``USERLAND`` variable is out of the scope of this
GLEP.  Depending on its presence in the tree, it may be decided to set
this variable the same way we propose to set ``ELIBC``, ``KERNEL`` and
``ARCH``, or alternatively by other means, e.g. via the profiles.


Motivation
==========

The Gentoo/Alt project is emerging and getting ready to serve a
plethora of 'alternative' configurations such as FreeBSD, NetBSD,
DragonflyBSD, GNU/kFreeBSD, Mac OS X, (Open)Darwin, (Open)Solaris and so
on.  As such, the project needs a better grip on the actual host being
built on.  This information on the host environment is necessary to
make proper (automated) decisions on settings that are highly dependent
on the build environment, such as the platform or C-library
implementation.


Rationale
=========

Gentoo's unique Portage system allows easy installation of applications
from source packages.  Compiling sources is sensitive to many
environmental settings and to the availability of certain tools.  Only
recently the Gentoo for FreeBSD project has started, as the second
Gentoo project that operates on a foreign host operating system using
foreign (non-GNU) C-libraries and userland utilities.  Such projects
suffer from the implicit assumption, currently made within Gentoo
Portage's ebuilds, that there is a single type of operating system,
C-library and set of system utilities.  To enable ebuilds, and also
eclasses, to be aware of these environmental differences, information
about them should be supplied.  Since decisions based on this
information can be vital, it is of high importance that this
information can be trusted and that the values can be considered 'safe'
and correct.


Backwards Compatibility
=======================

The keywording scheme proposed in this GLEP is fully compatible with
the current situation of the portage tree, in contrast to GLEP 22.  The
variables provided by GLEP 22 can't be extracted from the new keyword,
but since GLEP 22-style keywords aren't in the tree at the moment, that
is not a problem.  The same information can be extracted from the
``CHOST`` variable, if necessary.  No modifications to ebuilds will
have to be made.


Specification
=============

Unlike GLEP 22, this GLEP does not change the currently used keyword
scheme.  Instead of proposing a 4-tuple [2]_ keyword, a 2-tuple keyword
is chosen for archs that require one.  Archs for which a 1-tuple
keyword suffices can keep that keyword.  Since this changes nothing in
the current situation in the tree, it is considered to be a big
advantage over the 4-tuple keyword from GLEP 22.  This GLEP is an
official specification of the syntax of the keyword.

Keywords consist of two parts separated by a hyphen ('-').  The
left-hand part of the keyword 2-tuple is the architecture, such as
ppc64, sparc and x86.  The right-hand part indicates the operating
system or distribution, such as linux, macos, darwin, obsd, etc.  If the
right-hand part is omitted, the operating system/distribution type is
implied to be Gentoo GNU/Linux.  In that case the hyphen is also
omitted.  Examples of such keywords are ppc-darwin and x86.  This is
fully compatible with the current use of keywords in the tree.
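
The keyword syntax above can be illustrated with a minimal Python
sketch.  The function name ``split_keyword`` and the choice of
``"linux"`` as the spelled-out form of the implied Gentoo GNU/Linux
part are assumptions made for this illustration only; this GLEP does
not define either.

```python
def split_keyword(keyword, default_os="linux"):
    """Split a 1-tuple or 2-tuple keyword into (arch, os).

    Hypothetical helper: the name and the "linux" default are
    assumptions of this sketch, not part of the GLEP.
    """
    # Split on the first hyphen only: left is the architecture,
    # right is the operating system or distribution.
    arch, hyphen, os_part = keyword.partition("-")
    if not arch or (hyphen and not os_part):
        raise ValueError(f"malformed keyword: {keyword!r}")
    # An omitted right-hand part implies Gentoo GNU/Linux.
    return arch, (os_part or default_os)


print(split_keyword("ppc-darwin"))  # ('ppc', 'darwin')
print(split_keyword("x86"))         # ('x86', 'linux')
```

A trailing hyphen without a right-hand part is rejected here because
the text states that the hyphen is omitted together with that part.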

The variables ``ELIBC``, ``KERNEL`` and ``ARCH`` are currently set in
the profiles when they differ from the defaults for a GNU/Linux system.
As such, they can easily be overridden and defined by the user.  To
prevent this from happening, the variables should be filled
automatically by Portage itself, based on the ``CHOST`` variable.
While the ``CHOST`` variable can be set by the user just as easily as
the others, it is still assumed to be 'safe'.  This assumption is
grounded in the fact that the variable itself is being used in various
other places with the same intention, and that an invalid ``CHOST``
will cause major malfunctioning of the system.  A user who changes
``CHOST`` into something that is not valid for the system is already
warned that this might render the system unusable.  In conclusion, the
'safeness' of the ``CHOST`` variable is based on externally assumed
'safeness', the discussion of which falls outside this GLEP.

The current USE-expansion of the variables is maintained, as this
results in full backward compatibility.  Since the variables themselves
don't change in what they represent, but only in how they are assigned,
there should be no problem in maintaining them.  Using USE-expansion,
conditional code can be written in ebuilds in a way that does not
differ from existing methods at all::

    ...
    RDEPEND="elibc_FreeBSD? ( sys-libs/com_err )"
    ...
    src_compile() {
        ...
        use elibc_FreeBSD && myconf="${myconf} -Dlibc=/usr/lib/libc.a"
        ...
    }

Alternatively, the variables ``ELIBC``, ``KERNEL`` and ``ARCH`` are
available in the ebuild environment.  They can be used directly instead
of testing a USE-expanded flag of the form ``xxx_Xxxx``, or in switch
(``case``) statements where those are actually necessary.

A map file can be used to translate the various ``CHOST`` values to the
correct values for the three variables.  This change is invisible to
ebuilds and eclasses, but allows them to rely on these variables
because they are derived from a 'safe' value, the ``CHOST`` variable.
Ebuilds should not be sensitive to the keyword value, but should use
the aforementioned variables instead, as these allow specific tests for
properties.  If this is undesirable, the full ``CHOST`` variable can be
used to match a complete operating system.


Variable Assignment
-------------------

The ``ELIBC``, ``KERNEL`` and ``ARCH`` variables are filled from a map
file in the profile.  The file can be overlaid.  Each entry in the map
file (on the left of the arrow) results in the variable assignments on
the right-hand side of the arrow::

    *-*-linux-*      -> KERNEL="linux"
    *-*-*-gnu        -> ELIBC="glibc"
    *-*-kfreebsd-gnu -> KERNEL="FreeBSD" ELIBC="glibc"
    *-*-freebsd*     -> KERNEL="FreeBSD" ELIBC="FreeBSD"
    *-*-darwin*      -> KERNEL="Darwin"  ELIBC="Darwin"
    *-*-netbsd*      -> KERNEL="NetBSD"  ELIBC="NetBSD"
    *-*-solaris*     -> KERNEL="Solaris" ELIBC="Solaris"

A way to achieve this was proposed by Mike Frysinger, who suggests
having an ``env-map`` file, for instance filled with::

    % cat env-map
    *-linux-* KERNEL=linux
    *-gnu ELIBC=glibc
    x86_64-* ARCH=amd64

The following bash script can then be used to set the three variables
to their correct values::

    % cat readmap
    #!/bin/bash

    # Match against CBUILD, which defaults to CHOST, which in turn
    # defaults to the first argument.
    CBUILD=${CBUILD:-${CHOST=${CHOST:-$1}}}
    if [[ -z ${CHOST} ]] ; then
        echo "need chost" >&2
        exit 1
    fi

    unset KERNEL ELIBC ARCH

    # Scan every entry; later matches override earlier ones.
    # Reading into variables avoids glob expansion of the pattern.
    while read -r targ assignments ; do
        # The unquoted right-hand side is treated as a glob pattern.
        [[ ${CBUILD} == ${targ} ]] && eval "${assignments}"
    done < env-map

    echo ARCH=${ARCH} KERNEL=${KERNEL} ELIBC=${ELIBC}

Given the example ``env-map`` file, this script would result in::

    % ./readmap x86_64-pc-linux-gnu
    ARCH=amd64 KERNEL=linux ELIBC=glibc

The entries in the ``env-map`` file are evaluated in a forward linear
full scan.  A side effect of this exhaustive search is that a variable
can be re-assigned if multiple entries match the given ``CHOST``.
Because of this, the order of the entries matters: the last matching
entry that assigns a variable determines its value.  Because the
``env-map`` file size is assumed not to exceed the block size of the
file system, the performance penalty of a full scan versus a
'first-hit-stop' technique is assumed to be minimal.
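
The effect of the full scan on entry order can be seen in the
following Python sketch.  It is an illustration under stated
assumptions, not the Portage implementation: the function name
``read_map`` is hypothetical, the map is passed as a string rather
than read from a file, and the standard ``fnmatch`` module stands in
for the glob matching of the bash script.

```python
from fnmatch import fnmatchcase

# The example env-map from this GLEP, inlined as a string.
ENV_MAP = """\
*-linux-* KERNEL=linux
*-gnu ELIBC=glibc
x86_64-* ARCH=amd64
"""


def read_map(chost, env_map_text=ENV_MAP):
    """Return the variables assigned for chost (hypothetical helper)."""
    result = {}
    for line in env_map_text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        pattern, assignments = fields[0], fields[1:]
        # No early exit: every entry is tried, so a later match
        # overwrites the value set by an earlier one.
        if fnmatchcase(chost, pattern):
            for assignment in assignments:
                name, _, value = assignment.partition("=")
                result[name] = value.strip('"')
    return result


print(read_map("x86_64-pc-linux-gnu"))

# Two made-up maps with the same entries in a different order.
general_first = "* KERNEL=linux\n*-freebsd* KERNEL=FreeBSD\n"
specific_first = "*-freebsd* KERNEL=FreeBSD\n* KERNEL=linux\n"
print(read_map("i686-gentoo-freebsd6.0", general_first))
print(read_map("i686-gentoo-freebsd6.0", specific_first))
```

With the general entry first, the FreeBSD host ends up with
``KERNEL=FreeBSD``; with the entries swapped, the catch-all entry is
evaluated last and overrides it with ``KERNEL=linux``.  General entries
therefore belong before the specific ones that refine them.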

It should be noted, however, that the above bash script is a
proof-of-concept implementation.  Since Portage is largely written in
Python, it will be more efficient to write an equivalent of this code
in Python as well.  Coding-wise, this is considered to be a non-issue,
but the format of the ``env-map`` file, and especially its wildcard
characters, might not be the best match with Python.  For this reason,
the format specification of the ``env-map`` file is deferred to the
Python implementation, and only the requirements are given here.

The ``env-map`` file should be capable of encoding a ``key``, ``value``
pair, where ``key`` is a (regular) expression that matches a
chost-string, and ``value`` contains at least one variable assignment,
each to a distinct variable, for the variables ``ARCH``, ``KERNEL`` and
``ELIBC``.  The interpreter of the ``env-map`` file must scan the file
linearly and continue trying to match the ``key``\s, assigning
variables where appropriate, until the end of the file.
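
One way these requirements might be met is sketched below in Python.
Everything about the representation is an assumption of this sketch:
the file format is deliberately left open by this GLEP, so the entries
are shown as already-parsed ``(key, value)`` pairs, the keys use
Python ``re`` syntax, and the names ``validate`` and ``interpret`` are
hypothetical.

```python
import re

# The only variables a value may assign, per the requirements above.
ALLOWED = {"ARCH", "KERNEL", "ELIBC"}

# Hypothetical parsed entries: (regular expression, assignments).
# Using a dict per entry makes the assigned variables distinct.
ENV_MAP = [
    (r".*-linux-.*", {"KERNEL": "linux"}),
    (r".*-gnu", {"ELIBC": "glibc"}),
    (r"x86_64-.*", {"ARCH": "amd64"}),
]


def validate(entries):
    """Check each entry against the stated requirements."""
    for key, value in entries:
        re.compile(key)  # raises re.error if the key is not a valid regex
        if not value:
            raise ValueError(f"entry {key!r} assigns no variable")
        unknown = set(value) - ALLOWED
        if unknown:
            raise ValueError(f"entry {key!r} assigns {sorted(unknown)}")


def interpret(chost, entries):
    """Scan all entries linearly; later matches override earlier ones."""
    validate(entries)
    env = {}
    for key, value in entries:
        # The key must match the whole chost-string.
        if re.fullmatch(key, chost):
            env.update(value)
    return env


print(interpret("x86_64-pc-linux-gnu", ENV_MAP))
```

Whether a key must match the whole chost-string or only part of it is
not stated in the requirements; the sketch uses a full match to stay
close to the behaviour of the bash proof of concept.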

Since Portage will use the ``env-map`` file, the location of the file is
beyond the scope of this GLEP and up to the Portage implementors.


References
==========

.. [1] GLEP 22, New "keyword" system to incorporate various
   userlands/kernels/archs, Goodyear,
   (http://glep.gentoo.org/glep-0022.html)

.. [2] For the purpose of readability, we will refer to 1, 2 and
   4-tuples, even though tuple in itself suggests a field consisting of
   two values.  For clarity: a 1-tuple describes a single value field,
   while a 4-tuple describes a field consisting of four values.


Copyright
=========

This document has been placed in the public domain.
