.. Revision 1.2, Sun Mar 6 20:33:20 2005 UTC, by g2boojum (changes since 1.1: +133 -101 lines)

GLEP: 33
Title: Eclass Restructure/Redesign
Version: $Revision: 1.0 $
Last-Modified: $Date: 2005/02/15 00:00:00 $
Author: Brian Harring <ferringb@gentoo.org>, John Mylchreest <johnm@gentoo.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 29-Jan-2005
Post-History: 29-Jan-2005


Abstract
========

For any design, the transition from theoretical to applied exposes inadequacies
in the original design. This document is intended to document, and propose a
revision of, the current eclass setup to address current eclass inadequacies.

This document proposes several things: the creation of ebuild libraries, 'elibs',
a narrowing of the focus of eclasses, a move of eclasses within the tree, the
addition of changelogs, and a way to allow for simple eclass gpg signing.
In general, it is a large scale restructuring of what eclasses are and how they're
implemented. Essentially, version two of the eclass setup.


Terminology
===========

From this point on, the proposed eclass setup will be called 'new eclasses', and the
existing crop (as of this writing) will be referenced as 'old eclasses'. The
distinction is elaborated on within this document.


Motivation and Rationale
========================

Eclasses within the tree currently are a bit of a mess: they're forced to
maintain backwards compatibility with all previous functionality. In effect,
their api is constant, and can only be added to, never changing the existing
functionality. This obviously is quite limiting, and leads to cruft accruing in
eclasses as an eclass's design is refined. This needs to be dealt with prior to
eclass code reaching a critical mass where it becomes unmanageable/fragile
(recent pushes for eclass versioning could be interpreted as proof of this).

Beyond that, eclasses were originally intended as a method to allow for ebuilds
to use a pre-existing block of code, rather than having to duplicate the code in
each ebuild. This is a good thing, but there are ill effects that result from
the current design. Eclasses inherit other eclasses to get a single function, and in
doing so modify the exported 'template' (default src_compile, default
src_unpack, various vars, etc). All the eclass designer was after was reusing a
function, not making their eclass sensitive to changes in the template of the
eclass it's inheriting. The eclass designer -should- be aware of changes in the
function they're using, but shouldn't have to worry about their default src_*
and pkg_* functions being overwritten, let alone the env changes.

Addressing up front why a collection of eclass refinements is being rolled into
a single set of changes: parts of this proposal -could- be split into multiple
phases. Why do it though? It's simpler for developers to know that the first
eclass specification was this, and that the second specification is that,
rather than requiring them to be aware of what phase of eclass changes is in
progress.

By rolling all changes into one large change, a line is intentionally drawn in
the sand. Old eclasses allowed for this, behaved this way. New eclasses allow
for that, and behave this way. This should reduce misconceptions about what is
allowed/possible with eclasses, thus reducing bugs that result from said
misconceptions.

A few words on elibs: think of them as a clear separation between the behavioral
functionality of an eclass and the library functionality. Eclasses modify
template data, and are the basis for other ebuilds. Elibs, however, are *just*
common bash functionality.

Consider the majority of the portage bin/* scripts: these are all candidates for
being added to the tree as elibs, as is the bulk of eutils.


Specification.
==============

The various parts of this proposal are broken down into a set of changes and
elaborations on why a proposed change is preferable. The reader is advised to
read this serially, rather than jumping around.


Ebuild Libraries (elibs for short)
----------------------------------

As briefly touched upon in Motivation and Rationale, the original eclass design
allowed for the eclass to modify the metadata of an ebuild, metadata being the
DEPEND, RDEPEND, SRC_URI, IUSE, etc, vars that are required to be constant,
and used by portage for dep resolution, fetching, etc. Using the earlier
example, if you're after a single function from an eclass (say epatch from
eutils), you -don't- want the metadata modifications the eclass you're
inheriting might do. You want to treat the eclass you're pulling from as a
library, pure and simple.

A new directory named elib should be added to the top level of the tree to serve
as a repository of ebuild function libraries. Rather than relying on using the
source command, an 'elib' function should be added to portage to import that
library's functionality. The reason for the indirection via the function is
mostly related to portage internals, but it does serve as an abstraction such
that (for example) zsh compatibility hacks could be hidden in the elib function.

Elibs will be collections of bash functions. They're not allowed to do anything
in the global scope aside from function definition, and any -minimal-
initialization of the library that is absolutely needed. Additionally, they
cannot modify any ebuild template functions, such as src_compile and src_unpack. Since they are
required to not modify the metadata keys, nor in any way affect the ebuild aside
from providing functionality, they can be conditionally pulled in. They also
are allowed to pull in other elibs, but strictly just elibs: no eclasses, just
other elibs. A real world candidate for an elib would be the eutils eclass.

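As an illustration only, an elib and the importing function might look roughly
like the sketch below. The ELIB_DIR variable, the '.elib' extension, the
util/strings name, and the loader body are all assumptions made for this
example. This document only specifies that an 'elib' function exists and that
elibs contain function definitions.

```shell
# Hypothetical sketch, not portage's implementation.
ELIB_DIR=$(mktemp -d)              # stand-in for the tree's elib directory
mkdir -p "${ELIB_DIR}/util"

# An example elib: function definitions only, nothing else in global scope.
cat > "${ELIB_DIR}/util/strings.elib" <<'EOF'
to_upper() { printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'; }
EOF

_ELIBS_LOADED=" "                  # space separated list of loaded elibs

elib() {
    for _lib in "$@"; do
        # Skip anything already loaded: elibs are sourced once only.
        case "${_ELIBS_LOADED}" in
            *" ${_lib} "*) continue ;;
        esac
        if [ ! -f "${ELIB_DIR}/${_lib}.elib" ]; then
            echo "elib: ${_lib} not found" >&2
            return 1
        fi
        . "${ELIB_DIR}/${_lib}.elib" || return 1
        _ELIBS_LOADED="${_ELIBS_LOADED}${_lib} "
    done
}

elib util/strings
to_upper "epatch"
```

Calling 'elib util/strings' a second time does nothing, which matches the
load-once behavior of elibs described in this section.
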
Portage, since the elibs don't modify metadata, isn't required to track elibs
as it tracks eclasses. Thus a change in an elib doesn't result in half the tree
being forced to be regenerated/marked stale (this is more of an infra
benefit, although regens that take too long due to eclass changes have been
known to cause rsync issues due to missing timestamps).

Elibs will not be available in the global scope of an eclass or ebuild, nor during the
depends phase (basically a phase that sources the ebuild to get its metadata). Elib
calls in the global scope will be tracked, but the elib will not be loaded till just before
the setup phase (pkg_setup). There are two reasons for this. First, it ensures elibs are
completely incapable of modifying metadata. There is no room for confusion: late loading
of elibs gives you the functionality for all phases except depends, depends being the
only phase that is capable of specifying metadata. Second, as an added bonus, late
loading reduces the amount of bash sourced for a regen, so regens are faster. This however is minor,
and is an ancillary benefit of the first reason.

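The tracking and late loading described above can be sketched as follows. This
is a hypothetical outline, not portage's implementation: the queue variable,
the elib_flush_queue hook, and the _elib_load stand-in are invented for
illustration.

```shell
# Hypothetical sketch of "track in global scope, load before pkg_setup".
_ELIB_QUEUE=""
_ELIB_GLOBAL_SCOPE=1

# Stand-in for the step that would actually source the elib.
_elib_load() { echo "loading $1"; }

elib() {
    for _lib in "$@"; do
        if [ "${_ELIB_GLOBAL_SCOPE}" = 1 ]; then
            # Global scope: only record the request, source nothing.
            _ELIB_QUEUE="${_ELIB_QUEUE} ${_lib}"
        else
            _elib_load "${_lib}"
        fi
    done
}

# Would be called once by the phase runner, just before pkg_setup.
elib_flush_queue() {
    _ELIB_GLOBAL_SCOPE=0
    for _lib in ${_ELIB_QUEUE}; do
        _elib_load "${_lib}"
    done
    _ELIB_QUEUE=""
}

# Global scope of an ebuild: the calls are tracked, nothing is loaded.
elib util/strings net/fetch

# The depends phase would finish here with no elib code sourced.

# Just before pkg_setup the queued elibs are loaded.
elib_flush_queue
```

Because nothing is sourced until the flush, an elib has no opportunity to touch
metadata while the depends phase runs.
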
There are a few further restrictions with elibs. Mainly, elibs to load can only be specified
in either global scope, or in the setup, unpack, compile, test, and install phases. You can
not load elibs in prerm, postrm, preinst, and postinst. The reason being, for *rm phases,
installed pkgs will have to look to the tree for the elib, which allows for api drift to cause
breakage. For *inst phases, same thing, except the culprit is binpkgs.

There is a final restriction: elibs cannot change their exported api dependent on the
environment (as some eclasses do, for example). The reason mainly being that elibs are loaded once, not
multiple times as eclasses are.

To clarify, for example this is invalid.
::

    if [[ -n ${SOME_VAR} ]]; then
        x() { echo "I'm accessible only via tweaking some var"; }
    else
        x() { echo "this is invalid, do not do it."; }
    fi


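A form that stays within this restriction defines the function unconditionally
and branches when it is called. A minimal sketch, reusing the hypothetical
SOME_VAR and x names from the invalid example:

```shell
# The function always exists, so the exported api does not depend on the
# environment at load time. Only its behavior varies, at call time.
x() {
    if [ -n "${SOME_VAR:-}" ]; then
        echo "SOME_VAR is set"
    else
        echo "SOME_VAR is unset or empty"
    fi
}

SOME_VAR=
x
SOME_VAR=1
x
```
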
Regarding maintainability of elibs, it should be less of a load than old
eclasses. One of the major issues with old eclasses is that their functions are
quite incestuous: they're bound tightly to the env they're defined in. This
makes eclass functions a bit fragile. The restrictions on what can, and cannot,
be done in elibs will address this, making functionality less fragile (thus a
bit more maintainable).

There is no need for backwards compatibility with elibs: they just must work
against the current tree. Thus elibs can be removed when the tree no longer
needs them. The reasons for this are explained below.

Structuring of the elibs directory will be exactly the same as that of the new
eclass directory (detailed below), aside from a different extension.

As to why there are so many restrictions, the answer is simple: the definition of
what elibs are, what they are capable of, and how to use them is nailed down as much as
possible to avoid *any* ambiguity related to them. The intention is to make it clear,
so that no bug-inducing misconceptions occur.

The reduced role of Eclasses, and a clarification of existing Eclass requirements
---------------------------------------------------------------------------------

Since elibs are now intended to hold common bash functionality, the focus of
eclasses should be on defining an appropriate template for ebuilds. For example,
defining common DEPEND, RDEPEND, src_compile functions, src_unpack, etc.
Additionally, eclasses should pull in any elibs they need for functionality.

Eclass functionality that isn't directly related to the metadata, or src_* and
pkg_* funcs, should be shifted into elibs to allow for maximal code reuse. This
however isn't a hard requirement, merely a strongly worded suggestion.

Previously, it was 'strongly' suggested by developers to avoid having any code
executed in the global scope that wasn't required. This suggestion is now a
requirement. Execute only what must be executed in the global scope. Any code
executed in the global scope that is related to configuring/building the package
must be moved into pkg_setup. Metadata keys (already a rule, but now stated as
an absolute requirement to clarify it) *must* be constant. The results of
metadata keys exported from an ebuild on system A must be *exactly* the same as
the keys exported on system B.

If an eclass (or ebuild for that matter) violates this constant requirement, it
leads to portage doing the wrong thing for rsync users: for example, wrong deps
pulled in, leading to compilation failure, or dud deps.

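To illustrate the constant requirement, compare a host-dependent assignment
with a constant one. The package name dev-libs/foo, the header path, and the
'foo' USE flag are hypothetical and exist only for this example.

```shell
# Invalid (left commented out): the value depends on the machine that sources
# the ebuild, so system A and system B can export different metadata.
# [ -e /usr/include/foo.h ] && DEPEND="dev-libs/foo"

# Valid: the exported strings are identical on every system. The condition is
# expressed in the metadata itself and evaluated by portage, not by bash.
IUSE="foo"
DEPEND="foo? ( dev-libs/foo )"
RDEPEND="${DEPEND}"
```
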
If the existing metadata isn't flexible enough for what is required for a
package, the parsing of the metadata is changed to address that. Cases where
the constant requirement is violated are known, and a select few are allowed;
these are exceptions to the rule that are required due to inadequacies in
portage. In any case where it's determined the constant requirement may need to be
violated, the dev must make the majority of devs, along with the portage
devs, aware of it. This should be done prior to committing.

It's quite likely there is a way to allow what you're attempting. If you just go
and do it, the rsync users (our user base) suffer the results of compilation
failures and unneeded deps being pulled in.

After that stern reminder, back to new eclasses. Defining INHERITED and ECLASS
within the eclass is no longer required. Portage already handles those vars if
they aren't defined.

As with elibs, it's no longer required that backwards compatibility be maintained
indefinitely. Compatibility must be maintained against the current tree, but
just that. As such, new eclasses (the true distinction of new vs old is
elaborated in the next section) can be removed from the tree once they're no
longer in use.


The end of backwards compatibility...
-------------------------------------

With current eclasses, once the eclass is in use, its api can no longer be
changed, nor can the eclass ever be removed from the tree. This is why we still
have *ancient* eclasses that are completely unused sitting in the tree, for
example inherit.eclass. The reason for this, not surprisingly, is a portage
deficiency: on unmerging an installed ebuild, portage uses the eclass from the
current tree.

For a real world example of this, if you merged a glibc 2 years back, whatever
eclasses it used must still be compatible, or you may not be able to unmerge the
older glibc version during an upgrade to a newer version. So the glibc
maintainer is left with the option of either leaving people using ancient versions out
in the rain, or maintaining an ever increasing load of backwards compatibility
cruft in any used eclasses.

Binpkgs suffer a similar fate. Merging of a binpkg pulls needed eclasses from
the tree, so you may not be able to even merge a binpkg if the eclass's api has
changed. If the eclass was removed, you can't merge the binpkg, period.

The next major release of portage will address this. The environment that the
ebuild was built in already contains the eclasses' functions; as such the env can
be re-used rather than relying on the eclass. In other words, binpkgs and
installed ebuilds will no longer go and pull needed eclasses from the tree,
they'll use the 'saved' version of the eclass they were built/merged with.

So the backwards compatibility requirement no longer applies for users of the
next major portage version (and beyond). All the cruft can be dropped.

The problem is that there will be users using older versions of portage that don't
support this functionality. These older installations *cannot* use the
new eclasses, due to the fact that their portage version is incapable of
properly relying on the env. In other words, the varying api of the eclass will
result in user-visible failures during unmerging.

So we're able to do a clean break of all old eclasses, and api cruft, but we need
a means to basically disallow access to the new eclasses for all portage versions
incapable of properly handling the env requirements.

Unfortunately, we cannot just rely on a different grouping/naming convention within
the old eclass directory. The new eclasses must be inaccessible, and portage throws
a snag into this: the existing inherit function that is used to handle existing
eclasses. Basically, whatever it's passed (inherit kernel or inherit
kernel/kernel) it will pull in (kernel.eclass, and kernel/kernel.eclass
respectively). So even if the new eclasses were implemented within a
subdirectory of the eclass dir in the tree, all current portage versions would
still be able to access them.

In other words, these new eclasses would, in effect, be old eclasses since older
portage versions could still access them.

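The lookup behavior described above can be sketched as follows. The function
name and the /usr/portage location are assumptions for illustration, and the
real inherit function does considerably more than build a path.

```shell
# Hypothetical stand-in for the path lookup done by the old inherit function.
ECLASS_DIR="/usr/portage/eclass"    # assumed location of the old eclass dir

old_inherit_path() {
    # Whatever name is passed is appended as-is, subdirectories included.
    echo "${ECLASS_DIR}/$1.eclass"
}

old_inherit_path kernel           # /usr/portage/eclass/kernel.eclass
old_inherit_path kernel/kernel    # /usr/portage/eclass/kernel/kernel.eclass
```

Since any name passed in resolves to a path under the eclass directory, a
subdirectory of that directory offers no protection from older portage versions.
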

Tree restructuring.
-------------------

There are only two ways to block the existing (as of this writing) inherit
functionality from accessing the new eclasses: either change the extension of
eclasses to something other than 'eclass', or store them in a subdirectory of
the tree other than eclass.

The latter is preferable, and the proposed solution. The reasons are as follows. The current
eclass directory is already overgrown. Structuring of the new eclass dir
(clarified below) will allow for easier signing, ChangeLogs, and grouping of
eclasses. New eclasses allow for something akin to a clean break and have new
capabilities/requirements, thus it's advisable to start with a clean directory,
devoid of all cruft from the old eclass implementation.

If it's unclear as to why the old inherit function *cannot* access the new
eclasses, please reread the previous section. It's unfortunately a requirement
to take advantage of all that the next major portage release will allow.

The proposed directory structure is ${PORTDIR}/include/{eclass,elib}.
Something like ${PORTDIR}/new-eclass, or ${PORTDIR}/eclass-ng could be used
(although many would cringe at the -ng), but such a name is unwise. Consider the
possibility (likely a fact) that new eclasses someday may be found lacking, and
refined further (version three as it were). Or perhaps we want to add yet more
functionality with direct relation to sourcing new files, and we would then need
to further populate ${PORTDIR}.

The new eclass directory will be (at least) 2 levels deep, for example:

::

    kernel/
    kernel/linux-info.eclass
    kernel/linux-mod.eclass
    kernel/kernel-2.6.eclass
    kernel/kernel-2.4.eclass
    kernel/ChangeLog
    kernel/Manifest

No eclasses will be allowed in the base directory. Grouping of new eclasses will
be required to help keep things tidy, and for the following reasons. Grouping
of eclasses allows for the addition of ChangeLogs that are specific to that
group of eclasses, grouping of files/patches as needed, and allows for
saner/easier signing of eclasses: you can just stick a signed
Manifest file within that grouping, thus providing the information portage needs
to ensure no files are missing, and that nothing has been tainted.

The elib directory will be structured in the same way, for the same reasons.

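As a rough sketch of the 'nothing missing, nothing tainted' check, the example
below records a checksum for each eclass in a group and compares it later. It
is a simplification built on assumptions: it uses cksum and an ad hoc Manifest
format rather than portage's Manifest format, it omits the gpg signature
entirely, and the function names are invented for illustration.

```shell
# Hypothetical per-group integrity check.
group=$(mktemp -d)                 # stand-in for a group such as kernel/
echo 'linux_info() { :; }' > "${group}/linux-info.eclass"
echo 'linux_mod() { :; }'  > "${group}/linux-mod.eclass"

# Record one checksum line per eclass in the group.
make_manifest() {
    ( cd "$1" && cksum *.eclass > Manifest )
}

# Succeed only if the current checksums match the recorded ones, meaning no
# listed file is missing or altered and no unlisted eclass has appeared.
check_manifest() {
    ( cd "$1" && [ "$(cksum *.eclass)" = "$(cat Manifest)" ] )
}

make_manifest "${group}"
check_manifest "${group}" && echo "group intact"

echo '# tampered' >> "${group}/linux-mod.eclass"
check_manifest "${group}" || echo "group tainted"
```

Keeping one Manifest per group means a single signature could cover every file
in that group, which is the convenience the grouping is meant to provide.
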
Repoman will have to be extended to work within new eclass and elib groups, and
to handle signing and committing. This is intentional, and a good thing. This
gives repoman the possibility of doing sanity checks on elibs/new eclasses.

Note these checks will not prevent developers from doing dumb things with eclasses.
These checks would only be capable of doing basic sanity checks, such as syntax checks.
There is no way to prevent people from doing dumb things (excepting perhaps repeated
applications of a cattle prod). These are strictly automatic checks, akin to repoman's
dependency checks.


The start of a different phase of backwards compatibility
---------------------------------------------------------

As clarified above, new eclasses will exist in a separate directory that will be
intentionally inaccessible to the inherit function. As such, users of older
portage versions *will* have to upgrade to merge any ebuild that uses elibs/new
eclasses. A dependency on the next major portage version would transparently handle
this for rsync users.

There still is the issue of users who haven't upgraded to the required portage
version. This is a minor concern frankly: portage releases include new
functionality, and bug fixes. If they won't upgrade, it's assumed they have
their reasons and are big boys, thus able to handle the complications themselves.

The real issue is broken envs, whether in binpkgs, or for installed packages.
Two options exist: either the old eclasses are left in the tree indefinitely, or
they're left for N months, then shifted out of the tree, and into a tarball that
can be merged.

Shifting them out of the tree is advisable for several reasons: less cruft in
the tree, but more importantly the fact that they are not signed (thus an angle
for attack). Note that the proposed method of eclass signing doesn't even try
to address them. Frankly, it's not worth the effort supporting two variations
of eclass signing, when the old eclass setup isn't designed to allow for easy
signing.

If this approach is taken, then the old eclasses would have to be merged either
to an overlay directory's eclass directory (ugly), or to a safe location that
portage's inherit function knows to look in (less ugly).

For users who do not upgrade within the window of N months while the old
eclasses are in the tree, as stated, it's assumed they know what they are doing.
If they specifically block the new portage version, as the ebuilds in the tree
migrate to the new eclasses, they will have fewer and fewer ebuilds available to
them. If they tried injecting the new portage version (lying to portage,
essentially), portage would bail since it cannot find the new eclass.
For ebuilds that use the new eclasses, there really isn't any way to sidestep
the portage version requirement, same as it has been for other portage features.

What is a bit more annoying is that once the old eclasses are out of the tree,
if a user has not upgraded to a portage version supporting env processing, they
will lose the ability to unmerge any installed ebuild that used an old
eclass. Same cause, different symptom: they will also lose the ability to merge
any tbz2 that uses old eclasses.

There is one additional case that is a rarity, but should be noted: a user
who has suffered significant corruption of their installed package database (vdb). Ignoring
the question of whether the vdb is even usable at this point, the possibility
exists for the saved envs to be unusable due to being either A) missing, or B) corrupted.
In such a case, even with the new portage capabilities, they would need
the old eclass compat ebuild.

Note for this to happen requires either rather... unwise uses of root, or significant
fs corruption. Regardless of the cause, it's quite likely that for this to even become an
issue, the system's vdb must already be completely unusable. It's a moot issue at that point.
If you lose your vdb, or it gets seriously damaged, it's akin to lobotomizing portage:
it doesn't know what's installed, it doesn't know of its own files, and in general,
a rebuilding of the system is about the only sane course of action. The missing env is
truly the least of the user's concerns in such a case.

Continuing with the more likely scenario, users unwilling to upgrade portage will
*not* be left out in the rain. Merging the old eclass compat ebuild will provide
the missing eclasses, thus providing that lost functionality.

Note the intention isn't to force them to upgrade, hence the ability to restore the
lost functionality. The intention is to clean up the existing mess, and allow us
to move forward. The saying "you've got to break a few eggs to make an omelet"
is apt, excepting the fact we're providing a way to make the eggs whole again
(the king's men would've loved such an option).


Migrating to the new setup
--------------------------

As has been done in the past whenever a change in the tree results in ebuilds
requiring a specific version of portage, as ebuilds migrate to the new eclasses,
they should depend on a version of portage that supports them. From the user's
viewpoint, this transparently handles the migration.

This isn't so transparent for devs or a particular infrastructure server however.
Devs, due to them using cvs for their tree, lack the pregenerated cache rsync
users have. Devs will have to be early adopters of the new portage. Older
portage versions won't be able to access the new eclasses, thus the local cache
generation for that ebuild will fail, ergo the depend on a newer portage
version won't transparently handle it for them.

Additionally, prior to any ebuilds in the tree using the new eclasses, the
infrastructure server that generates the cache for rsync users will have to
either be upgraded to a version of portage supporting new eclasses, or patched.
The former is much preferable to the latter for the portage devs.

Beyond that, an appropriate window for old eclasses to exist in the tree must be
determined, and prior to that window passing an ebuild must be added to the tree
so users can get the old eclasses if needed.

For eclass devs to migrate from old to new, it is possible for them to just
transfer the old eclass into an appropriate grouping in the new eclass directory,
although it's advisable they cleanse all cruft out of the eclass. You can
migrate ebuilds gradually over to the new eclass, and don't have to worry about
having to support ebuilds from X years back.

Essentially, you have a chance to nail the design perfectly/cleanly, and have a
window in which to redesign it. It's humbly suggested eclass devs take
advantage of it. :)


Backwards Compatibility
=======================

All backwards compatibility issues are addressed in line, but a recap is offered.
It's suggested that if a particular compatibility issue is
questioned/worried over, the reader read the relevant section. There should be
a more in depth discussion of the issue, along with a more extensive explanation
of the potential solutions, and reasons for the chosen solution.

To recap:
::

    New eclasses and elib functionality will be tied to a specific portage
    version. A DEPEND on said portage version should address this for rsync
    users. Users who refuse to upgrade to a portage version that supports the new
    eclasses/elibs will gradually be unable to merge ebuilds that use said
    functionality. It is their choice to upgrade; as such, the gradual
    'thinning' of available ebuilds should they block the portage upgrade is
    their responsibility.

    Old eclasses at some point in the future should be removed from the tree,
    and released in a tarball/ebuild. This will cause installed ebuilds that
    rely on the old eclass to be unable to unmerge, with the same applying for
    merging of binpkgs, subject to the following paragraph.

    The old eclass-compat is only required for users who do not upgrade their
    portage installation, with one further exception: if the user has somehow
    corrupted/destroyed their installed pkgs database (/var/db/pkg currently),
    in the process they've lost their saved environments. The eclass-compat
    ebuild would be required for ebuilds that required older eclasses in such a
    case. Note, this case is rare also. As clarified above, it's mentioned
    strictly to be complete; it's not much of a real world scenario.


Copyright
=========

This document has been placed in the public domain.
