--- xml/htdocs/proj/en/glep/glep-0047.html 2006/02/13 21:00:50 1.4 +++ xml/htdocs/proj/en/glep/glep-0047.html 2006/10/10 20:25:14 1.5 @@ -8,9 +8,252 @@ --> - + GLEP 47 -- Creating 'safe' environment variables - + -
- +
@@ -33,9 +275,9 @@ - + - + @@ -43,7 +285,7 @@ - + @@ -52,8 +294,8 @@
Title:Creating 'safe' environment variables
Version:1.3
Version:1.4
Last-Modified:2006/02/12 19:57:57
Last-Modified:2006/02/13 21:00:50
Author:Diego Pettenò, Fabian Groffen
Type:Standards Track
Content-Type:text/x-rst
Content-Type:text/x-rst
Created:14-Oct-2005

-
-

Contents

+
+

Contents

-
-

Credits

+
+

Credits

The text of this GLEP is a result of a discussion and input of the following persons, in no particular order: Mike Frysinger, Diego Pettenò, Fabian Groffen and Finn Thain.

-
-

Abstract

+
+

Abstract

In order for ebuilds and eclasses to be able to make host specific decisions, it is necessary to have a number of environmental variables which allow for such decisions. This GLEP introduces some measures that @@ -83,16 +325,16 @@ variables the decisions are based on are 'safe'. A small overlap with GLEP 22 [1] is being handled in this GLEP where the use of 2-tuple keywords are being kept instead of 4-tuple keywords. Additionally, the -ELIBC, KERNEL and ARCH get auto filled starting from -CHOST and the 2-tuple keyword, instead of solely from they 4-tuple +ELIBC, KERNEL and ARCH get auto filled starting from +CHOST and the 2-tuple keyword, instead of solely from they 4-tuple keyword as proposed in GLEP 22.

-

The destiny of the USERLAND variable is out of the scope of this +

The destiny of the USERLAND variable is out of the scope of this GLEP. Depending on its presence in the tree, it may be decided to set -this variable the same way we propose to set ELIBC, KERNEL and -ARCH, or alternatively, e.g. via the profiles.

+this variable the same way we propose to set ELIBC, KERNEL and +ARCH, or alternatively, e.g. via the profiles.

-
-

Motivation

+
+

Motivation

The Gentoo/Alt project is in an emerging state to get ready to serve a plethora of 'alternative' configurations such as FreeBSD, NetBSD, DragonflyBSD, GNU/kFreeBSD, Mac OS X, (Open)Darwin, (Open)Solaris and so @@ -102,8 +344,8 @@ highly dependant on the build environment, such as platform or C-library implementation.

-
-

Rationale

+
+

Rationale

Gentoo's unique Portage system allows easy installation of applications from source packages. Compiling sources is prone to many environmental settings and availability of certain tools. Only recently the Gentoo @@ -118,8 +360,8 @@ importance that this information can be trusted and the values can be considered 'safe' and correct.

-
-

Backwards Compatibility

+
+

Backwards Compatibility

The proposed keywording scheme in this GLEP is fully compatible with the current situation of the portage tree, this in contrast to GLEP 22. The variables provided by GLEP 22 can't be extracted from the new keyword, @@ -128,8 +370,8 @@ variable, if necessary. No modifications to ebuilds will have to be made.

-
-

Specification

+
+

Specification

Unlike GLEP 22 the currently used keyword scheme is not changed. Instead of proposing a 4-tuple [2] keyword, a 2-tuple keyword is chosen for archs that require them. Archs for which a 1-tuple keyword @@ -140,31 +382,31 @@

Keywords will consist out of two parts separated by a hyphen ('-'). The part up to the first hyphen from the left of the keyword 2-tuple is the architecture, such as ppc64, sparc and x86. Allowed characters for the -architecture name are in a-z0-9. The remaining part on the right of +architecture name are in a-z0-9. The remaining part on the right of the first hyphen from the left indicates the operating system or distribution, such as linux, macos, darwin, obsd, et-cetera. If the right hand part is omitted, it implies the operating system/distribution type is Gentoo GNU/Linux. In such case the hyphen is also omitted, and the keyword consists of solely the architecture. The operating system -or distribution name can consist out of characters in a-zA-Z0-9_+:-. +or distribution name can consist out of characters in a-zA-Z0-9_+:-. Please note that the hyphen is an allowed character, and therefore the separation of the two fields in the keyword is only determinable by scanning for the first hyphen character from the start of the keyword string. Examples of keywords following this specification are ppc-darwin and x86. This is fully compatible with the current use of keywords in the tree.

-

The variables ELIBC, KERNEL and ARCH are currently set in +

The variables ELIBC, KERNEL and ARCH are currently set in the profiles when other than their defaults for a GNU/Linux system. They can as such easily be overridden and defined by the user. To prevent this from happening, the variables should be auto filled by -Portage itself, based on the CHOST variable. While the CHOST +Portage itself, based on the CHOST variable. While the CHOST variable can be as easy as the others set by the user, it still is assumed to be 'safe'. This assumption is grounded in the fact that the variable itself is being used in various other places with the same -intention, and that an invalid CHOST will cause major malfunctioning -of the system. A user that changes the CHOST into something that is +intention, and that an invalid CHOST will cause major malfunctioning +of the system. A user that changes the CHOST into something that is not valid for the system, is already warned that this might render the -system unusable. Concluding, the 'safeness' of the CHOST variable +system unusable. Concluding, the 'safeness' of the CHOST variable is based on externally assumed 'safeness', which's discussion falls outside this GLEP.

Current USE-expansion of the variables is being maintained, as this @@ -183,26 +425,26 @@ ... } -

Alternatively, the variables ELIBC, KERNEL and ARCH +

Alternatively, the variables ELIBC, KERNEL and ARCH are available in the ebuild evironment and they can be used instead of -invoking xxx_Xxxx or in switch statements where they are actually +invoking xxx_Xxxx or in switch statements where they are actually necessary.

-

A map file can be used to have the various CHOST values being +

A map file can be used to have the various CHOST values being translated to the correct values for the four variables. This change is invisible for ebuilds and eclasses, but allows to rely on these -variables as they are based on a 'safe' value -- the CHOST variable. +variables as they are based on a 'safe' value -- the CHOST variable. Ebuilds should not be sensitive to the keyword value, but use the aforementioned four variables instead. They allow specific tests for -properties. If this is undesirable, the full CHOST variable can be +properties. If this is undesirable, the full CHOST variable can be used to match a complete operating system.

-
-

Variable Assignment

-

The ELIBC, KERNEL, ARCH variables are filled from a profile +

+

Variable Assignment

+

The ELIBC, KERNEL, ARCH variables are filled from a profile file. The file can be overlaid, such that the following entries in the map file (on the left of the arrow) will result in the assigned variables on the right hand side of the arrow:

-*-*-linux-*      -> KERNEL="linux"  
+*-*-linux-*      -> KERNEL="linux"
 *-*-*-gnu        -> ELIBC="glibc"
 *-*-kfreebsd-gnu -> KERNEL="FreeBSD" ELIBC="glibc"
 *-*-freebsd*     -> KERNEL="FreeBSD" ELIBC="FreeBSD"
@@ -221,57 +463,57 @@
 

then the following bash script can be used to set the four variables to their correct values:

-% cat readmap 
-#!/bin/bash 
+% cat readmap
+#!/bin/bash
 
-CBUILD=${CBUILD:-${CHOST=${CHOST:-$1}}} 
-[[ -z ${CHOST} ]] && echo need chost 
+CBUILD=${CBUILD:-${CHOST=${CHOST:-$1}}}
+[[ -z ${CHOST} ]] && echo need chost
 
-unset KERNEL ELIBC ARCH 
+unset KERNEL ELIBC ARCH
 
-while read LINE ; do 
-    set -- ${LINE} 
-    targ=$1 
-    shift 
-    [[ ${CBUILD} == ${targ} ]] && eval $@ 
-done < env-map 
+while read LINE ; do
+    set -- ${LINE}
+    targ=$1
+    shift
+    [[ ${CBUILD} == ${targ} ]] && eval $@
+done < env-map
 
 echo ARCH=${ARCH} KERNEL=${KERNEL} ELIBC=${ELIBC}
 

Given the example env-map file, this script would result in:

-% ./readmap x86_64-pc-linux-gnu 
+% ./readmap x86_64-pc-linux-gnu
 ARCH=amd64 KERNEL=linux ELIBC=glibc
 
-

The entries in the env-map file will be evaluated in a forward +

The entries in the env-map file will be evaluated in a forward linear full scan. A side-effect of this exhaustive search is that the variables can be re-assigned if multiple entries match the given -CHOST. Because of this, the order of the entries does matter. -Because the env-map file size is assumed not to exceed the block +CHOST. Because of this, the order of the entries does matter. +Because the env-map file size is assumed not to exceed the block size of the file system, the performance penalty of a full scan versus 'first-hit-stop technique' is assumed to be minimal.

It should be noted, however, that the above bash script is a proof of concept implementation. Since Portage is largerly written in Python, it will be more efficient to write an equivalent of this code in Python also. Coding wise, this is considered to be a non-issue, but the format -of the env-map file, and especially its wildcard characters, might +of the env-map file, and especially its wildcard characters, might not be the best match with Python. For this purpose, the format -specification of the env-map file is deferred to the Python +specification of the env-map file is deferred to the Python implementation, and only the requirements are given here.

-

The env-map file should be capable of encoding a key, value -pair, where key is a (regular) expression that matches a -chost-string, and value contains at least one, distinct variable -assignment for the variables ARCH, KERNEL and ELIBC. The -interpreter of the env-map file must scan the file linearly and -continue trying to match the keys and assign variables if +

The env-map file should be capable of encoding a key, value +pair, where key is a (regular) expression that matches a +chost-string, and value contains at least one, distinct variable +assignment for the variables ARCH, KERNEL and ELIBC. The +interpreter of the env-map file must scan the file linearly and +continue trying to match the keys and assign variables if appropriate until the end of file.

-

Since Portage will use the env-map file, the location of the file is +

Since Portage will use the env-map file, the location of the file is beyond the scope of this GLEP and up to the Portage implementors.

-
-

References

- +
[1]GLEP 22, New "keyword" system to incorporate various @@ -279,7 +521,7 @@ (http://glep.gentoo.org/glep-0022.html)
- +
[2]For the purpose of readability, we will refer to 1, 2 and @@ -289,17 +531,18 @@
- - +