
Diff of /xml/htdocs/doc/en/gcc-optimization.xml


Revision 1.18 Revision 1.20
1<?xml version='1.0' encoding='UTF-8'?> 1<?xml version='1.0' encoding='UTF-8'?>
2<!DOCTYPE guide SYSTEM "/dtd/guide.dtd"> 2<!DOCTYPE guide SYSTEM "/dtd/guide.dtd">
3<!-- $Header: /var/cvsroot/gentoo/xml/htdocs/doc/en/gcc-optimization.xml,v 1.18 2010/07/27 00:24:29 nightmorph Exp $ --> 3<!-- $Header: /var/cvsroot/gentoo/xml/htdocs/doc/en/gcc-optimization.xml,v 1.20 2012/07/24 12:12:51 swift Exp $ -->
4 4
5<guide> 5<guide>
6<title>Compilation Optimization Guide</title> 6<title>Compilation Optimization Guide</title>
7 7
8<author title="Author"> 8<author title="Author">
9 <mail link="nightmorph"/> 9 <mail link="nightmorph"/>
10</author> 10</author>
11 11
12<abstract> 12<abstract>
13This guide provides an introduction to optimizing compiled code using safe, sane 13This guide provides an introduction to optimizing compiled code using safe, sane
14CFLAGS and CXXFLAGS. It also describes the theory behind optimizing in 14CFLAGS and CXXFLAGS. It also describes the theory behind optimizing in
15general. 15general.
16</abstract> 16</abstract>
17 17
18<!-- The content of this document is licensed under the CC-BY-SA license --> 18<!-- The content of this document is licensed under the CC-BY-SA license -->
19<!-- See http://creativecommons.org/licenses/by-sa/2.5 --> 19<!-- See http://creativecommons.org/licenses/by-sa/2.5 -->
20<license/> 20<license/>
21 21
22<version>2</version> 22<version>4</version>
23<date>2010-07-26</date> 23<date>2012-04-27</date>
24 24
25<chapter> 25<chapter>
26<title>Introduction</title> 26<title>Introduction</title>
27<section> 27<section>
28<title>What are CFLAGS and CXXFLAGS?</title> 28<title>What are CFLAGS and CXXFLAGS?</title>
29<body> 29<body>
30 30
31<p> 31<p>
32CFLAGS and CXXFLAGS are environment variables that are used to tell the GNU 32CFLAGS and CXXFLAGS are environment variables that are used to tell the GNU
33Compiler Collection, <c>gcc</c>, what kinds of switches to use when compiling 33Compiler Collection, <c>gcc</c>, what kinds of switches to use when compiling
34source code. CFLAGS are for code written in C, while CXXFLAGS are for code 34source code. CFLAGS are for code written in C, while CXXFLAGS are for code
35written in C++. 35written in C++.
36</p> 36</p>
37 37
38<p> 38<p>
45</p> 45</p>
46 46
47</body> 47</body>
48</section> 48</section>
49<section> 49<section>
50<title>How are they used?</title> 50<title>How are they used?</title>
51<body> 51<body>
52 52
53<p> 53<p>
54CFLAGS and CXXFLAGS can be used in two ways. First, they can be used 54CFLAGS and CXXFLAGS can be used in two ways. First, they can be used
55per-program with Makefiles generated by automake. 55per-program with Makefiles generated by automake.
56</p> 56</p>
57 57
58<p> 58<p>
59However, this should not be done when installing packages found in the Portage 59However, this should not be done when installing packages found in the Portage
60tree. Instead, set your CFLAGS and CXXFLAGS in <path>/etc/make.conf</path>. This 60tree. Instead, set your CFLAGS and CXXFLAGS in <path>/etc/portage/make.conf</path>. This
61way all packages will be compiled using the options you specify. 61way all packages will be compiled using the options you specify.
62</p> 62</p>
63 63
64<pre caption="CFLAGS in /etc/make.conf"> 64<pre caption="CFLAGS in /etc/portage/make.conf">
65CFLAGS="-march=athlon64 -O2 -pipe" 65CFLAGS="-march=athlon64 -O2 -pipe"
66CXXFLAGS="${CFLAGS}" 66CXXFLAGS="${CFLAGS}"
67</pre> 67</pre>
68 68
69<p> 69<p>
70As you can see, CXXFLAGS is set to use all the options present in CFLAGS. This 70As you can see, CXXFLAGS is set to use all the options present in CFLAGS. This
71is what you'll want almost without fail. You shouldn't ever need to specify 71is what you'll want almost without fail. You shouldn't ever need to specify
72additional options in CXXFLAGS. 72additional options in CXXFLAGS.
73</p> 73</p>
74 74
75</body> 75</body>
76</section> 76</section>
77<section> 77<section>
78<title>Misconceptions</title> 78<title>Misconceptions</title>
79<body> 79<body>
150<title>-march</title> 150<title>-march</title>
151<body> 151<body>
152 152
153<p> 153<p>
154The first and most important option is <c>-march</c>. This tells the compiler 154The first and most important option is <c>-march</c>. This tells the compiler
155what code it should produce for your processor <uri 155what code it should produce for your processor <uri
156link="http://en.wikipedia.org/wiki/Microarchitecture">architecture</uri> (or 156link="http://en.wikipedia.org/wiki/Microarchitecture">architecture</uri> (or
157<e>arch</e>); it says that it should produce code for a certain kind of CPU. 157<e>arch</e>); it says that it should produce code for a certain kind of CPU.
158Different CPUs have different capabilities, support different instruction sets, 158Different CPUs have different capabilities, support different instruction sets,
159and have different ways of executing code. The <c>-march</c> flag will instruct 159and have different ways of executing code. The <c>-march</c> flag will instruct
160the compiler to produce code specifically for your CPU, with all its 160the compiler to produce code specifically for your CPU, with all its
161capabilities, features, instruction sets, quirks, and so on. 161capabilities, features, instruction sets, quirks, and so on.
162</p> 162</p>
163 163
164<p> 164<p>
165Even though the CHOST variable in <path>/etc/make.conf</path> specifies the 165Even though the CHOST variable in <path>/etc/portage/make.conf</path> specifies the
166general architecture used, <c>-march</c> should still be used so that programs 166general architecture used, <c>-march</c> should still be used so that programs
167can be optimized for your specific processor. x86 and x86-64 CPUs (among others) 167can be optimized for your specific processor. x86 and x86-64 CPUs (among others)
168should make use of the <c>-march</c> flag. 168should make use of the <c>-march</c> flag.
169</p> 169</p>
170 170
171<p> 171<p>
172What kind of CPU do you have? To find out, run the following command: 172What kind of CPU do you have? To find out, run the following command:
173</p> 173</p>
174 174
175<pre caption="Examining CPU information"> 175<pre caption="Examining CPU information">
176$ <i>cat /proc/cpuinfo</i> 176$ <i>cat /proc/cpuinfo</i>
177</pre> 177</pre>
178 178
179<p> 179<p>
180Now let's see <c>-march</c> in action. This example is for an older Pentium III 180Now let's see <c>-march</c> in action. This example is for an older Pentium III
181chip: 181chip:
182</p> 182</p>
183 183
184<pre caption="/etc/make.conf: Pentium III"> 184<pre caption="/etc/portage/make.conf: Pentium III">
185CFLAGS="-march=pentium3" 185CFLAGS="-march=pentium3"
186CXXFLAGS="${CFLAGS}" 186CXXFLAGS="${CFLAGS}"
187</pre> 187</pre>
188 188
189<p> 189<p>
190Here's another one for a 64-bit AMD CPU: 190Here's another one for a 64-bit AMD CPU:
191</p> 191</p>
192 192
193<pre caption="/etc/make.conf: AMD64"> 193<pre caption="/etc/portage/make.conf: AMD64">
194CFLAGS="-march=athlon64" 194CFLAGS="-march=athlon64"
195CXXFLAGS="${CFLAGS}" 195CXXFLAGS="${CFLAGS}"
196</pre> 196</pre>
197 197
198<p> 198<p>
199If you still aren't sure what kind of CPU you have, you may just want to use 199If you still aren't sure what kind of CPU you have, you may just want to use
200<c>-march=native</c>. When this flag is used, GCC will detect your processor and 200<c>-march=native</c>. When this flag is used, GCC will detect your processor and
201automatically set appropriate flags for it. <brite>However, this should not be 201automatically set appropriate flags for it. <brite>However, this should not be
202used if you intend to compile packages for a different CPU!</brite> 202used if you intend to compile packages for a different CPU!</brite>
203</p> 203</p>
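<p>
As a minimal sketch, a <path>/etc/portage/make.conf</path> that relies on
<c>-march=native</c> might look like the fragment below. The <c>-O2 -pipe</c>
options are carried over from the earlier example and are an assumption, not
a recommendation for every system.
</p>

```shell
# /etc/portage/make.conf: let GCC detect the CPU of the build machine.
# Only suitable when packages run on the same machine that builds them.
CFLAGS="-march=native -O2 -pipe"
CXXFLAGS="${CFLAGS}"
```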
204 204
205<p> 205<p>
206So if you're compiling packages on one computer, but intend to run them on a 206So if you're compiling packages on one computer, but intend to run them on a
207different computer (such as when using a fast computer to build for an older, 207different computer (such as when using a fast computer to build for an older,
208slower machine), then <e>do not</e> use <c>-march=native</c>. "Native" means 208slower machine), then <e>do not</e> use <c>-march=native</c>. "Native" means
256</body> 256</body>
257</section> 257</section>
258<section> 258<section>
259<title>-O</title> 259<title>-O</title>
260<body> 260<body>
261 261
262<p> 262<p>
263Next up is the <c>-O</c> flag. This controls the overall level of 263Next up is the <c>-O</c> flag. This controls the overall level of
264optimization. Optimizing makes compilation take somewhat more time, and can 264optimization. Optimizing makes compilation take somewhat more time, and can
265take up much more memory, especially as you increase the level of optimization. 265take up much more memory, especially as you increase the level of optimization.
266</p> 266</p>
267 267
268<p> 268<p>
269There are five <c>-O</c> settings: <c>-O0</c>, <c>-O1</c>, <c>-O2</c>, 269There are five <c>-O</c> settings: <c>-O0</c>, <c>-O1</c>, <c>-O2</c>,
270<c>-O3</c>, and <c>-Os</c>. You should use only one of them in 270<c>-O3</c>, and <c>-Os</c>. You should use only one of them in
271<path>/etc/make.conf</path>. 271<path>/etc/portage/make.conf</path>.
272</p> 272</p>
273 273
274<p> 274<p>
275With the exception of <c>-O0</c>, the <c>-O</c> settings each activate several 275With the exception of <c>-O0</c>, the <c>-O</c> settings each activate several
276additional flags, so be sure to read the gcc manual's chapter on <uri 276additional flags, so be sure to read the gcc manual's chapter on <uri
277link="http://gcc.gnu.org/onlinedocs/gcc-4.4.4/gcc/Optimize-Options.html#Optimize-Options">optimization 277link="http://gcc.gnu.org/onlinedocs/gcc-4.4.4/gcc/Optimize-Options.html#Optimize-Options">optimization
278options</uri> to learn which flags are activated at each <c>-O</c> level, as 278options</uri> to learn which flags are activated at each <c>-O</c> level, as
279well as some explanations as to what they do. 279well as some explanations as to what they do.
280</p> 280</p>
281 281
282<p> 282<p>
283Let's examine each optimization level: 283Let's examine each optimization level:
284</p> 284</p>
285 285
286<ul> 286<ul>
398link="http://en.wikipedia.org/wiki/MMX">MMX</uri>, and <uri 398link="http://en.wikipedia.org/wiki/MMX">MMX</uri>, and <uri
399link="http://en.wikipedia.org/wiki/3dnow">3DNow!</uri> instruction sets for x86 399link="http://en.wikipedia.org/wiki/3dnow">3DNow!</uri> instruction sets for x86
400and x86-64 architectures. These are useful primarily in multimedia, gaming, and 400and x86-64 architectures. These are useful primarily in multimedia, gaming, and
401other floating point-intensive computing tasks, though they also contain several 401other floating point-intensive computing tasks, though they also contain several
402other mathematical enhancements. These instruction sets are found in more modern 402other mathematical enhancements. These instruction sets are found in more modern
403CPUs. 403CPUs.
404</p> 404</p>
405 405
406<impo> 406<impo>
407Be sure to check if your CPU supports these by running <c>cat /proc/cpuinfo</c>. 407Be sure to check if your CPU supports these by running <c>cat /proc/cpuinfo</c>.
408The output will include any supported additional instruction sets. Note that 408The output will include any supported additional instruction sets. Note that
409<b>pni</b> is just a different name for SSE3. 409<b>pni</b> is just a different name for SSE3.
410</impo> 410</impo>
411 411
412<p> 412<p>
413You normally don't need to add any of these flags to <path>/etc/make.conf</path> 413You normally don't need to add any of these flags to <path>/etc/portage/make.conf</path>
414as long as you are using the correct <c>-march</c> (for example, 414as long as you are using the correct <c>-march</c> (for example,
415<c>-march=nocona</c> implies <c>-msse3</c>). Some notable exceptions are newer 415<c>-march=nocona</c> implies <c>-msse3</c>). Some notable exceptions are newer
416VIA and AMD64 CPUs that support instructions not implied by <c>-march</c> (such 416VIA and AMD64 CPUs that support instructions not implied by <c>-march</c> (such
417as SSE3). For CPUs like these you'll need to enable additional flags where 417as SSE3). For CPUs like these you'll need to enable additional flags where
418appropriate after checking the output of <c>cat /proc/cpuinfo</c>. 418appropriate after checking the output of <c>cat /proc/cpuinfo</c>.
419</p> 419</p>
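<p>
Reading the <c>flags</c> line of <path>/proc/cpuinfo</path> by eye is
error-prone, so a small helper can do the lookup. The sketch below is only an
illustration: the function name <c>has_cpu_flag</c> is hypothetical, and it
assumes a Linux system whose <path>/proc/cpuinfo</path> contains a line starting
with <c>flags</c>. Note that SSE3 is looked up as <c>pni</c>.
</p>

```shell
# Report whether a CPU feature flag is listed in a cpuinfo-style file.
# Usage: has_cpu_flag FLAG [FILE]   (FILE defaults to /proc/cpuinfo)
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    # Take the first "flags" line, drop everything up to the colon, split the
    # remainder into one word per line, and look for an exact match.
    grep -m1 '^flags' "$file" 2>/dev/null \
        | cut -d: -f2 \
        | tr -s ' \t' '\n' \
        | grep -qx -- "$flag"
}

# Check the instruction sets discussed above (pni is the cpuinfo name for SSE3).
for f in mmx sse sse2 pni 3dnow; do
    if has_cpu_flag "$f"; then
        echo "$f: listed"
    else
        echo "$f: not listed"
    fi
done
```

<p>
A flag reported as listed still only needs to be added by hand if your
<c>-march</c> setting does not already imply it.
</p>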
420 420
421<note> 421<note>
422You should check the <uri 422You should check the <uri
423link="http://gcc.gnu.org/onlinedocs/gcc-4.4.4/gcc/i386-and-x86_002d64-Options.html#i386-and-x86_002d64-Options">list</uri> 423link="http://gcc.gnu.org/onlinedocs/gcc-4.4.4/gcc/i386-and-x86_002d64-Options.html#i386-and-x86_002d64-Options">list</uri>
424of x86 and x86-64-specific flags to see which of these instruction sets are 424of x86 and x86-64-specific flags to see which of these instruction sets are
425activated by the proper CPU type flag. If an instruction set is listed, then you 425activated by the proper CPU type flag. If an instruction set is listed, then you
426don't need to specify it; it will be turned on by using the proper <c>-march</c> 426don't need to specify it; it will be turned on by using the proper <c>-march</c>
427setting. 427setting.
428</note> 428</note>
499 } 499 }
500</pre> 500</pre>
501 501
502<p> 502<p>
503As you can see, any value higher than 3 is treated as just <c>-O3</c>. 503As you can see, any value higher than 3 is treated as just <c>-O3</c>.
504</p> 504</p>
505 505
506</body> 506</body>
507</section> 507</section>
508<section> 508<section>
509<title>What about redundant flags?</title> 509<title>What about redundant flags?</title>
510<body> 510<body>
511 511
512<p> 512<p>
513Oftentimes flags that are already turned on at various <c>-O</c> levels 513Oftentimes flags that are already turned on at various <c>-O</c> levels
514are specified redundantly in <path>/etc/make.conf</path>. Sometimes this is done 514are specified redundantly in <path>/etc/portage/make.conf</path>. Sometimes this is done
515out of ignorance, but it is also done to avoid flag filtering or flag replacing. 515out of ignorance, but it is also done to avoid flag filtering or flag replacing.
516</p> 516</p>
517 517
518<p> 518<p>
519Flag filtering/replacing is done in many of the ebuilds in the Portage tree. It 519Flag filtering/replacing is done in many of the ebuilds in the Portage tree. It
520is usually done because packages fail to compile at certain <c>-O</c> levels, or 520is usually done because packages fail to compile at certain <c>-O</c> levels, or
521when the source code is too sensitive for any additional flags to be used. The 521when the source code is too sensitive for any additional flags to be used. The
522ebuild will either filter out some or all CFLAGS and CXXFLAGS, or it may replace 522ebuild will either filter out some or all CFLAGS and CXXFLAGS, or it may replace
523<c>-O</c> with a different level. 523<c>-O</c> with a different level.
524</p> 524</p>
525 525
526<p> 526<p>
527The <uri 527The <uri
528link="http://devmanual.gentoo.org/ebuild-writing/functions/src_compile/build-environment/index.html">Gentoo 528link="http://devmanual.gentoo.org/ebuild-writing/functions/src_compile/build-environment/index.html">Gentoo
529Developer Manual</uri> outlines where and how flag filtering/replacing works. 529Developer Manual</uri> outlines where and how flag filtering/replacing works.
540 540
541<p> 541<p>
542However, <brite>this is not a smart thing to do</brite>. CFLAGS are filtered for 542However, <brite>this is not a smart thing to do</brite>. CFLAGS are filtered for
543a reason! When flags are filtered, it means that it is unsafe to build a package 543a reason! When flags are filtered, it means that it is unsafe to build a package
544with those flags. Clearly, it is <e>not</e> safe to compile your whole system 544with those flags. Clearly, it is <e>not</e> safe to compile your whole system
545with <c>-O3</c> if some of the flags turned on by that level will cause problems 545with <c>-O3</c> if some of the flags turned on by that level will cause problems
546with certain packages. Therefore, you shouldn't try to "outsmart" the developers 546with certain packages. Therefore, you shouldn't try to "outsmart" the developers
547who maintain those packages. <e>Trust the developers</e>. Flag filtering and 547who maintain those packages. <e>Trust the developers</e>. Flag filtering and
548replacing is done for your benefit! If an ebuild specifies alternative flags, 548replacing is done for your benefit! If an ebuild specifies alternative flags,
549then don't try to get around it. 549then don't try to get around it.
550</p> 550</p>
551 551
552<p> 552<p>
553You will most likely continue to run into problems when you build a package with 553You will most likely continue to run into problems when you build a package with
554unacceptable flags. When you report your troubles on Bugzilla, the flags you use 554unacceptable flags. When you report your troubles on Bugzilla, the flags you use
555in <path>/etc/make.conf</path> will be readily visible and you will be told to 555in <path>/etc/portage/make.conf</path> will be readily visible and you will be told to
556recompile without those flags. Save yourself the trouble of recompiling by not 556recompile without those flags. Save yourself the trouble of recompiling by not
557using redundant flags in the first place! Don't just automatically assume that 557using redundant flags in the first place! Don't just automatically assume that
558you know better than the developers. 558you know better than the developers.
559</p> 559</p>
560 560
561</body> 561</body>
562</section> 562</section>
563<section> 563<section>
564<title>What about LDFLAGS?</title> 564<title>What about LDFLAGS?</title>
565<body> 565<body>
566 566
567<p> 567<p>
568The Gentoo developers have already set basic, safe LDFLAGS in the base profiles, 568The Gentoo developers have already set basic, safe LDFLAGS in the base profiles,
569so you don't need to change them. 569so you don't need to change them.
570</p> 570</p>
571 571
572</body> 572</body>
573</section> 573</section>
574<section> 574<section>
575<title>Can I use per-package flags?</title> 575<title>Can I use per-package flags?</title>
576<body> 576<body>
577 577
578<p>
579There is no supported method for using CFLAGS or other variables on a
580per-package basis, though there are a few <uri
581link="http://forums.gentoo.org/viewtopic-p-3832057.html#3832057">rather
582abusive</uri> ways of trying to force Portage to do so.
583</p>
584
585<warn> 578<warn>
586You <e>should not</e> try to force Portage to use per-package flags, as it is 579Using per-package flags complicates debugging and support. If you use this
587not in any way supported and will greatly complicate bug reports. Just set your 580feature, mention it in your bug reports and describe the changes you
588flags in <path>/etc/make.conf</path> to be used on a system-wide basis. 581made.
589</warn> 582</warn>
583
584<p>
585How to use per-package environment variables (including CFLAGS)
586is described in the <uri
587link="/doc/en/handbook/handbook-amd64.xml?part=3&amp;chap=6#doc_chap2">Gentoo
588Handbook, "Per-Package Environment Variables"</uri>.
589</p>
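<p>
As a rough sketch of what such a setup can look like, assuming the mechanism in
question is Portage's <path>/etc/portage/env/</path> directory together with
<path>/etc/portage/package.env</path>: the file name <path>lower-opt.conf</path>
and the package <c>app-misc/example</c> below are hypothetical, and the flags
are only an illustration. Consult the Handbook for the authoritative
description.
</p>

```shell
# /etc/portage/env/lower-opt.conf (hypothetical file name)
# Overrides the system-wide flags for any package mapped to this file.
CFLAGS="-march=athlon64 -O1 -pipe"
CXXFLAGS="${CFLAGS}"

# /etc/portage/package.env would then map a package to that file with a
# line such as (hypothetical package name):
#   app-misc/example  lower-opt.conf
```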
590 590
591</body> 591</body>
592</section> 592</section>
593</chapter> 593</chapter>
594 594
595<chapter> 595<chapter>
596<title>Resources</title> 596<title>Resources</title>
597<section> 597<section>
598<body> 598<body>
599 599
600<p> 600<p>
601The following resources are of some help in further understanding optimization: 601The following resources are of some help in further understanding optimization:
602</p> 602</p>
603 603
604<ul> 604<ul>

