| 1 | GLEP: 47 |
1 | GLEP: 47 |
| 2 | Title: Creating 'safe' environment variables |
2 | Title: Creating 'safe' environment variables |
| 3 | Version: $Revision: 1.1 $ |
3 | Version: $Revision: 1.3 $ |
| 4 | Last-Modified: $Date: 2006/02/09 21:42:57 $ |
4 | Last-Modified: $Date: 2006/02/12 19:57:57 $ |
| 5 | Author: Diego Pettenò, Fabian Groffen <{flameeyes,grobian}@gentoo.org> |
5 | Author: Diego Pettenò, Fabian Groffen |
| 6 | Status: Active |
6 | Status: Draft |
| 7 | Type: Standards Track |
7 | Type: Standards Track |
| 8 | Content-Type: text/x-rst |
8 | Content-Type: text/x-rst |
| 9 | Created: 14-Oct-2005 |
9 | Created: 14-Oct-2005 |
| 10 | Post-History: |
10 | Post-History: 09-Feb-2006 |
| 11 | |
11 | |
| 12 | |
12 | |
| 13 | Credits |
13 | Credits |
| 14 | ======= |
14 | ======= |
| 15 | |
15 | |
| … | |
… | |
| 45 | plethora of 'alternative' configurations such as FreeBSD, NetBSD, |
45 | plethora of 'alternative' configurations such as FreeBSD, NetBSD, |
| 46 | DragonflyBSD, GNU/kFreeBSD, Mac OS X, (Open)Darwin, (Open)Solaris and so |
46 | DragonflyBSD, GNU/kFreeBSD, Mac OS X, (Open)Darwin, (Open)Solaris and so |
| 47 | on. As such, the project is in need of a better grip on the actual |
47 | on. As such, the project is in need of a better grip on the actual |
| 48 | host being built on. This information on the host environment is |
48 | host being built on. This information on the host environment is |
| 49 | necessary to make proper (automated) decisions on settings that are |
49 | necessary to make proper (automated) decisions on settings that are |
| 50 | highly dependent on the build environment, such as platform or as in |
50 | highly dependent on the build environment, such as platform or C-library |
| 51 | [2]_. |
51 | implementation. |
| 52 | |
52 | |
| 53 | |
53 | |
| 54 | Rationale |
54 | Rationale |
| 55 | ========= |
55 | ========= |
| 56 | |
56 | |
| … | |
… | |
| 82 | |
82 | |
| 83 | |
83 | |
| 84 | Specification |
84 | Specification |
| 85 | ============= |
85 | ============= |
| 86 | |
86 | |
| 87 | Unlike GLEP 22 the current keyword scheme as used in practice is not |
87 | Unlike GLEP 22 the currently used keyword scheme is not changed. |
| 88 | changed. Instead of proposing a 4-tuple [3]_ keyword, a 2-tuple |
88 | Instead of proposing a 4-tuple [2]_ keyword, a 2-tuple keyword is chosen |
| 89 | keyword is chosen for archs that require them. Archs for which a |
89 | for archs that require them. Archs for which a 1-tuple keyword |
| 90 | 1-tuple keyword suffices, keep that keyword. Since this doesn't change |
90 | suffices, can keep that keyword. Since this doesn't change anything in |
| 91 | anything in the current situation in the tree, it is considered to be a |
91 | the current situation in the tree, it is considered to be a big |
| 92 | big advantage over the 4-tuple keyword from GLEP 22. This GLEP is an |
92 | advantage over the 4-tuple keyword from GLEP 22. This GLEP is an |
| 93 | official specification of the syntax of the keyword. |
93 | official specification of the syntax of the keyword. |
| 94 | |
94 | |
| 95 | Keywords will consist of two parts separated by a hyphen ('-'). The |
95 | Keywords will consist of two parts separated by a hyphen ('-'). The |
| 96 | left hand part of the keyword 2-tuple is the architecture, such as |
96 | part up to the first hyphen from the left of the keyword 2-tuple is the |
| 97 | ppc64, sparc and x86. The right hand part indicates the operating |
97 | architecture, such as ppc64, sparc and x86. Allowed characters for the |
|
|
98 | architecture name are in ``a-z0-9``. The remaining part on the right of |
|
|
99 | the first hyphen from the left indicates the operating system or |
| 98 | system or distribution, such as linux, macos, darwin, obsd, etc. If the |
100 | distribution, such as linux, macos, darwin, obsd, et cetera. If the |
| 99 | right hand part is omitted, it implies the operating system/distribution |
101 | right hand part is omitted, it implies the operating system/distribution |
| 100 | type is GNU/Linux. In such a case the hyphen is also omitted. Examples |
102 | type is Gentoo GNU/Linux. In such a case the hyphen is also omitted, and |
| 101 | of such keywords are ppc-darwin and x86. This is fully compatible with |
103 | the keyword consists solely of the architecture. The operating system |
| 102 | the current keywords used in the tree. |
104 | or distribution name can consist of characters in ``a-zA-Z0-9_+:-``. |
|
|
105 | Please note that the hyphen is an allowed character, and therefore the |
|
|
106 | separation of the two fields in the keyword is only determinable by |
|
|
107 | scanning for the first hyphen character from the start of the keyword |
|
|
108 | string. Examples of keywords following this specification are |
|
|
109 | ppc-darwin and x86. This is fully compatible with the current use of |
|
|
110 | keywords in the tree. |
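The first-hyphen rule described above can be sketched in POSIX shell. This is only an illustration under assumptions: the helper name ``split_keyword`` is hypothetical and not part of Portage, rejecting a trailing hyphen is an assumption, and ``x86-foo-bar`` is a made-up keyword used to show that later hyphens stay in the right-hand part.

```shell
# Hypothetical helper, not part of Portage: split a keyword at the FIRST hyphen.
split_keyword() {
    kw=$1
    case ${kw} in
        *-*)
            arch=${kw%%-*}               # text before the first hyphen
            os=${kw#*-}                  # text after the first hyphen
            [ -n "${os}" ] || return 1   # assumption: a trailing hyphen is invalid
            ;;
        *)
            arch=${kw}
            os=                          # omitted: Gentoo GNU/Linux is implied
            ;;
    esac
    # Reject characters outside the sets named in the specification.
    case ${arch} in ''|*[!a-z0-9]*) return 1 ;; esac
    case ${os} in *[!a-zA-Z0-9_+:-]*) return 1 ;; esac
    echo "arch=${arch} os=${os:-implied}"
}

split_keyword ppc-darwin    # prints: arch=ppc os=darwin
split_keyword x86           # prints: arch=x86 os=implied
split_keyword x86-foo-bar   # prints: arch=x86 os=foo-bar (made-up keyword)
```

The ``%%`` expansion strips the longest suffix starting at a hyphen, and ``#`` strips the shortest prefix ending at one, which together give the split at the first hyphen only.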
| 103 | |
111 | |
| 104 | The variables ``ELIBC``, ``KERNEL`` and ``ARCH`` are currently set in |
112 | The variables ``ELIBC``, ``KERNEL`` and ``ARCH`` are currently set in |
| 105 | the profiles when other than their defaults for a GNU/Linux system. |
113 | the profiles when other than their defaults for a GNU/Linux system. |
| 106 | They can as such easily be overridden and defined by the user. To |
114 | They can as such easily be overridden and defined by the user. To |
| 107 | prevent this from happening, the variables should be auto filled by |
115 | prevent this from happening, the variables should be auto filled by |
| 108 | Portage itself, based on the ``CHOST`` variable. |
116 | Portage itself, based on the ``CHOST`` variable. While the ``CHOST`` |
|
|
117 | variable can be set by the user as easily as the others, it is still |
|
|
118 | assumed to be 'safe'. This assumption is grounded in the fact that the |
|
|
119 | variable itself is being used in various other places with the same |
|
|
120 | intention, and that an invalid ``CHOST`` will cause major malfunctioning |
|
|
121 | of the system. A user that changes the ``CHOST`` into something that is |
|
|
122 | not valid for the system, is already warned that this might render the |
|
|
123 | system unusable. Concluding, the 'safeness' of the ``CHOST`` variable |
|
|
124 | is based on externally assumed 'safeness', the discussion of which falls |
|
|
125 | outside this GLEP. |
|
|
126 | |
|
|
127 | Current USE-expansion of the variables is being maintained, as this |
|
|
128 | results in full backward compatibility. Since the variables themselves |
|
|
129 | don't change in what they represent, but only how they are being |
|
|
130 | assigned, there should be no problem in maintaining them. Using |
|
|
131 | USE-expansion, conditional code can be written down in ebuilds, which is |
|
|
132 | not different from any existing methods at all:: |
|
|
133 | |
|
|
134 | ... |
|
|
135 | RDEPEND="elibc_FreeBSD? ( sys-libs/com_err )" |
|
|
136 | ... |
|
|
137 | src_compile() { |
|
|
138 | ... |
|
|
139 | use elibc_FreeBSD && myconf="${myconf} -Dlibc=/usr/lib/libc.a" |
|
|
140 | ... |
|
|
141 | } |
|
|
142 | |
|
|
143 | Alternatively, the variables ``ELIBC``, ``KERNEL`` and ``ARCH`` |
|
|
144 | are available in the ebuild environment and they can be used instead of |
|
|
145 | invoking ``xxx_Xxxx`` or in switch statements where they are actually |
|
|
146 | necessary. |
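A switch statement on the variable itself might look like the sketch below. The FreeBSD option is taken from the ebuild example above; the function name ``elibc_conf`` and the empty default branch are assumptions made for illustration, not existing ebuild or eclass code.

```shell
# Hypothetical helper: map an ELIBC value to extra configure options.
elibc_conf() {
    case $1 in
        FreeBSD)
            echo "-Dlibc=/usr/lib/libc.a"
            ;;
        *)
            # Assumption: other C libraries need no extra option here.
            echo ""
            ;;
    esac
}

# Outside an ebuild ELIBC is unset, so default it to make the sketch runnable.
ELIBC=${ELIBC:-FreeBSD}
myconf="${myconf} $(elibc_conf "${ELIBC}")"
echo "myconf=${myconf}"
```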
| 109 | |
147 | |
| 110 | A map file can be used to translate the various ``CHOST`` values |
148 | A map file can be used to translate the various ``CHOST`` values |
| 111 | to the correct values for the four variables. This change is |
149 | to the correct values for the four variables. This change is |
| 112 | invisible for ebuilds and eclasses, but allows them to rely on these |
150 | invisible for ebuilds and eclasses, but allows them to rely on these |
| 113 | variables as they are based on a 'safe' value -- the ``CHOST`` variable. |
151 | variables as they are based on a 'safe' value -- the ``CHOST`` variable. |
| 114 | Ebuilds should not be sensitive to the keyword value, but use the |
152 | Ebuilds should not be sensitive to the keyword value, but use the |
| 115 | aforementioned four variables instead. They allow specific tests for |
153 | aforementioned four variables instead. They allow specific tests for |
| 116 | properties. If this is undesirable, the full ``CHOST`` variable can be |
154 | properties. If this is undesirable, the full ``CHOST`` variable can be |
| 117 | used to match a complete operating system. |
155 | used to match a complete operating system. |
| 118 | |
156 | |
| 119 | Current USE-expansion is being maintained, for backwards compatibility. |
|
|
| 120 | Since the expansion is based on the variables mentioned above which do |
|
|
| 121 | not change, but only in the way they are generated, there should be no |
|
|
| 122 | problem in maintaining them. |
|
|
| 123 | |
157 | |
| 124 | |
158 | Variable Assignment |
| 125 | Variables |
159 | ------------------- |
| 126 | --------- |
|
|
| 127 | |
160 | |
| 128 | The ``ELIBC``, ``KERNEL``, ``ARCH`` variables are filled from a profile |
161 | The ``ELIBC``, ``KERNEL``, ``ARCH`` variables are filled from a profile |
| 129 | file. The file can be overlaid, such that the following entries in the |
162 | file. The file can be overlaid, such that the following entries in the |
| 130 | map file (on the left of the arrow) will result in the assigned |
163 | map file (on the left of the arrow) will result in the assigned |
| 131 | variables on the right hand side of the arrow: |
164 | variables on the right hand side of the arrow:: |
| 132 | |
|
|
| 133 | :: |
|
|
| 134 | |
165 | |
| 135 | *-*-linux-* -> KERNEL="linux" |
166 | *-*-linux-* -> KERNEL="linux" |
| 136 | *-*-*-gnu -> ELIBC="glibc" |
167 | *-*-*-gnu -> ELIBC="glibc" |
| 137 | *-*-kfreebsd-gnu -> KERNEL="FreeBSD" ELIBC="glibc" |
168 | *-*-kfreebsd-gnu -> KERNEL="FreeBSD" ELIBC="glibc" |
| 138 | *-*-freebsd* -> KERNEL="FreeBSD" ELIBC="FreeBSD" |
169 | *-*-freebsd* -> KERNEL="FreeBSD" ELIBC="FreeBSD" |
| 139 | *-*-darwin* -> KERNEL="Darwin" ELIBC="Darwin" |
170 | *-*-darwin* -> KERNEL="Darwin" ELIBC="Darwin" |
| 140 | *-*-netbsd* -> KERNEL="NetBSD" ELIBC="NetBSD" |
171 | *-*-netbsd* -> KERNEL="NetBSD" ELIBC="NetBSD" |
| 141 | *-*-solaris* -> KERNEL="Solaris" ELIBC="Solaris" |
172 | *-*-solaris* -> KERNEL="Solaris" ELIBC="Solaris" |
| 142 | |
173 | |
| 143 | A way to achieve this is proposed by Mike Frysinger [4]_, who |
174 | A way to achieve this is proposed by Mike Frysinger, who |
| 144 | suggests having an env-map file, for instance filled with: |
175 | suggests having an env-map file, for instance filled with:: |
| 145 | |
|
|
| 146 | :: |
|
|
| 147 | |
176 | |
| 148 | % cat env-map |
177 | % cat env-map |
| 149 | *-linux-* KERNEL=linux |
178 | *-linux-* KERNEL=linux |
| 150 | *-gnu ELIBC=glibc |
179 | *-gnu ELIBC=glibc |
| 151 | x86_64-* ARCH=amd64 |
180 | x86_64-* ARCH=amd64 |
| 152 | |
181 | |
| 153 | then the following bash script can be used to set the four variables to |
182 | then the following bash script can be used to set the four variables to |
| 154 | their correct values: |
183 | their correct values:: |
| 155 | |
|
|
| 156 | :: |
|
|
| 157 | |
184 | |
| 158 | % cat readmap |
185 | % cat readmap |
| 159 | #!/bin/bash |
186 | #!/bin/bash |
| 160 | |
187 | |
| 161 | CBUILD=${CBUILD:-${CHOST=${CHOST:-$1}}} |
188 | CBUILD=${CBUILD:-${CHOST=${CHOST:-$1}}} |
| … | |
… | |
| 170 | [[ ${CBUILD} == ${targ} ]] && eval $@ |
197 | [[ ${CBUILD} == ${targ} ]] && eval $@ |
| 171 | done < env-map |
198 | done < env-map |
| 172 | |
199 | |
| 173 | echo ARCH=${ARCH} KERNEL=${KERNEL} ELIBC=${ELIBC} |
200 | echo ARCH=${ARCH} KERNEL=${KERNEL} ELIBC=${ELIBC} |
| 174 | |
201 | |
| 175 | Given the example env-map file, this script would result in: |
202 | Given the example env-map file, this script would result in:: |
| 176 | |
|
|
| 177 | :: |
|
|
| 178 | |
203 | |
| 179 | % ./readmap x86_64-pc-linux-gnu |
204 | % ./readmap x86_64-pc-linux-gnu |
| 180 | ARCH=amd64 KERNEL=linux ELIBC=glibc |
205 | ARCH=amd64 KERNEL=linux ELIBC=glibc |
| 181 | |
206 | |
|
|
207 | The entries in the ``env-map`` file will be evaluated in a forward |
|
|
208 | linear full scan. A side-effect of this exhaustive search is that the |
|
|
209 | variables can be re-assigned if multiple entries match the given |
|
|
210 | ``CHOST``. Because of this, the order of the entries does matter. |
|
|
211 | Because the ``env-map`` file size is assumed not to exceed the block |
|
|
212 | size of the file system, the performance penalty of a full scan versus |
|
|
213 | 'first-hit-stop technique' is assumed to be minimal. |
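The effect of entry order under a full scan can be illustrated with a small sketch. The two map entries and the ``CHOST`` value are hypothetical, chosen only because both patterns match the same string; the simplified ``pattern value`` line format and the function name ``scan_elibc`` are likewise assumptions, not the format Portage uses.

```shell
# Illustrative only: full forward scan, a later match overrides an earlier one.
scan_elibc() {
    chost=$1
    map=$2
    elibc=
    while read -r pattern value; do
        case ${chost} in
            ${pattern}) elibc=${value} ;;   # no early exit: keep scanning
        esac
    done <<EOF
${map}
EOF
    echo "ELIBC=${elibc}"
}

# Hypothetical entries: one generic pattern, one more specific pattern.
generic_first='*-linux-* glibc
*-linux-uclibc uclibc'
specific_first='*-linux-uclibc uclibc
*-linux-* glibc'

scan_elibc i386-pc-linux-uclibc "${generic_first}"    # prints: ELIBC=uclibc
scan_elibc i386-pc-linux-uclibc "${specific_first}"   # prints: ELIBC=glibc
```

With a full scan, generic entries therefore belong before the specific entries that are meant to override them.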
|
|
214 | |
| 182 | It should be noted, however, that the bash script is a proof of concept |
215 | It should be noted, however, that the above bash script is a proof of |
| 183 | implementation. It cannot be used as Portage will need this |
216 | concept implementation. Since Portage is largely written in Python, it |
| 184 | information, which is written in Python. Hence, an equivalent of this |
217 | will be more efficient to write an equivalent of this code in Python |
| 185 | bash script should be written in Python to be able to use it within |
218 | also. Coding wise, this is considered to be a non-issue, but the format |
| 186 | Portage. This is considered to be a non-issue coding wise. |
219 | of the ``env-map`` file, and especially its wildcard characters, might |
|
|
220 | not be the best match with Python. For this reason, the format |
|
|
221 | specification of the ``env-map`` file is deferred to the Python |
|
|
222 | implementation, and only the requirements are given here. |
|
|
223 | |
|
|
224 | The ``env-map`` file should be capable of encoding a ``key``, ``value`` |
|
|
225 | pair, where ``key`` is a (regular) expression that matches a |
|
|
226 | chost-string, and ``value`` contains at least one, distinct variable |
|
|
227 | assignment for the variables ``ARCH``, ``KERNEL`` and ``ELIBC``. The |
|
|
228 | interpreter of the ``env-map`` file must scan the file linearly and |
|
|
229 | continue trying to match the ``key``\s and assign variables if |
|
|
230 | appropriate until the end of file. |
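The requirements above can be sketched as a small interpreter that avoids ``eval`` and accepts only the three named variables. This is an assumption-laden illustration: the line format (a glob followed by ``VAR=value`` words) is borrowed from the proof-of-concept ``env-map`` shown earlier, whereas this GLEP leaves the final format to the Python implementation. The function name ``apply_env_map`` and the ``#`` comment syntax are hypothetical.

```shell
# Sketch only: scan every line, assign ARCH/KERNEL/ELIBC for each matching key.
apply_env_map() {
    chost=$1
    ARCH= KERNEL= ELIBC=
    while read -r key assignments; do
        case ${key} in ''|'#'*) continue ;; esac   # skip blanks/comments (assumed syntax)
        case ${chost} in
            ${key}) ;;                             # key matches: process assignments
            *) continue ;;
        esac
        for assignment in ${assignments}; do
            name=${assignment%%=*}
            value=${assignment#*=}
            case ${name} in
                ARCH)   ARCH=${value} ;;
                KERNEL) KERNEL=${value} ;;
                ELIBC)  ELIBC=${value} ;;
                *) echo "ignoring unknown variable: ${name}" >&2 ;;
            esac
        done
    done                                           # runs to end of input, never stops early
    echo "ARCH=${ARCH} KERNEL=${KERNEL} ELIBC=${ELIBC}"
}

apply_env_map x86_64-pc-linux-gnu <<'EOF'
*-linux-* KERNEL=linux
*-gnu ELIBC=glibc
x86_64-* ARCH=amd64
EOF
```

Fed the example ``env-map`` entries, the sketch prints ``ARCH=amd64 KERNEL=linux ELIBC=glibc``, the same result as the ``readmap`` proof of concept.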
|
|
231 | |
|
|
232 | Since Portage will use the ``env-map`` file, the location of the file is |
|
|
233 | beyond the scope of this GLEP and up to the Portage implementors. |
| 187 | |
234 | |
| 188 | |
235 | |
| 189 | References |
236 | References |
| 190 | ========== |
237 | ========== |
| 191 | |
238 | |
| 192 | .. [1] GLEP 22, New "keyword" system to incorporate various |
239 | .. [1] GLEP 22, New "keyword" system to incorporate various |
| 193 | userlands/kernels/archs, Goodyear, |
240 | userlands/kernels/archs, Goodyear, |
| 194 | (http://glep.gentoo.org/glep-0022.html) |
241 | (http://glep.gentoo.org/glep-0022.html) |
| 195 | |
242 | |
| 196 | .. [2] For example in the perl ebuild, it is necessary to |
|
|
| 197 | fill in the C-library part, which on a FreeBSD system is other than |
|
|
| 198 | on a Linux system and currently is handled as follows: |
|
|
| 199 | :: |
|
|
| 200 | |
|
|
| 201 | [[ ${ELIBC} == "FreeBSD" ]] && myconf="${myconf} -Dlibc=/usr/lib/libc.a" |
|
|
| 202 | |
|
|
| 203 | .. [3] For the purpose of readability, we will refer to 1, 2 and |
243 | .. [2] For the purpose of readability, we will refer to 1, 2 and |
| 204 | 4-tuples, even though tuple in itself suggests a field consisting of |
244 | 4-tuples, even though tuple in itself suggests a field consisting of |
| 205 | two values. For clarity: a 1-tuple describes a single value field, |
245 | two values. For clarity: a 1-tuple describes a single value field, |
| 206 | while a 4-tuple decribes a field consisting out of four values. |
246 | while a 4-tuple describes a field consisting of four values. |
| 207 | |
|
|
| 208 | .. [4] mailto:vapier [at) gentoo .dot org |
|
|
| 209 | |
|
|
| 210 | |
247 | |
| 211 | |
248 | |
| 212 | Copyright |
249 | Copyright |
| 213 | ========= |
250 | ========= |
| 214 | |
251 | |