Diff of /xml/htdocs/doc/en/utf-8.xml

Revision 1.13 Revision 1.14
1<?xml version='1.0' encoding="UTF-8"?> 1<?xml version='1.0' encoding="UTF-8"?>
2<!-- $Header: /var/cvsroot/gentoo/xml/htdocs/doc/en/utf-8.xml,v 1.13 2005/04/24 16:13:48 swift Exp $ --> 2<!-- $Header: /var/cvsroot/gentoo/xml/htdocs/doc/en/utf-8.xml,v 1.14 2005/05/09 14:05:46 swift Exp $ -->
3<!DOCTYPE guide SYSTEM "/dtd/guide.dtd"> 3<!DOCTYPE guide SYSTEM "/dtd/guide.dtd">
4 4
5<guide link="/doc/en/utf-8.xml"> 5<guide link="/doc/en/utf-8.xml">
6<title>Using UTF-8 with Gentoo</title> 6<title>Using UTF-8 with Gentoo</title>
7 7
8<author title="Author"> 8<author title="Author">
9 <mail link="slarti@gentoo.org">Thomas Martin</mail> 9 <mail link="slarti@gentoo.org">Thomas Martin</mail>
10</author> 10</author>
11<author title="Contributor"> 11<author title="Contributor">
12 <mail link="devil@gentoo.org.ua">Alexander Simonov</mail> 12 <mail link="devil@gentoo.org.ua">Alexander Simonov</mail>
13</author> 13</author>
14 14
15<abstract> 15<abstract>
16This guide shows you how to set up and use the UTF-8 Unicode character set with 16This guide shows you how to set up and use the UTF-8 Unicode character set with
17your Gentoo Linux system, after explaining the benefits of Unicode and more 17your Gentoo Linux system, after explaining the benefits of Unicode and more
18specifically UTF-8. 18specifically UTF-8.
19</abstract> 19</abstract>
20 20
21<license /> 21<license />
22 22
23<version>1.9</version> 23<version>2.0</version>
24<date>2005-04-24</date> 24<date>2005-05-08</date>
25 25
26<chapter> 26<chapter>
27<title>Character Encodings</title> 27<title>Character Encodings</title>
28<section> 28<section>
29<title>What is a Character Encoding?</title> 29<title>What is a Character Encoding?</title>
30<body> 30<body>
31 31
32<p> 32<p>
33Computers do not understand text themselves. Instead, every character is 33Computers do not understand text themselves. Instead, every character is
34represented by a number. Traditionally, each set of numbers used to represent 34represented by a number. Traditionally, each set of numbers used to represent
35alphabets and characters (known as a coding system, encoding or character set) 35alphabets and characters (known as a coding system, encoding or character set)
36was limited in size due to limitations in computer hardware. 36was limited in size due to limitations in computer hardware.
37</p> 37</p>
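<p>
As a small illustration of characters being stored as numbers, the sketch below
prints the bytes behind two characters. It assumes a POSIX shell with
<c>od</c> and <c>tr</c> available; the variable names are only for
illustration. The second character, U+00E9, is written as octal escapes so the
result does not depend on the encoding of the file holding the script.
</p>

```shell
# 'A' (U+0041) is stored as a single byte in ASCII and in UTF-8.
ascii_bytes=$(printf 'A' | od -An -tx1 | tr -d ' \n')

# U+00E9 (e with acute accent) takes two bytes in UTF-8: 0xC3 0xA9.
# The bytes are given as octal escapes (\303 \251) for portability.
utf8_bytes=$(printf '\303\251' | od -An -tx1 | tr -d ' \n')

echo "U+0041 is stored as: $ascii_bytes"
echo "U+00E9 is stored as: $utf8_bytes"
```

<p>
The same character can map to different bytes under a different encoding, which
is why both ends need to agree on the character set in use.
</p>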
38 38
39</body> 39</body>
587<pre caption="~/.muttrc for UTF-8"> 587<pre caption="~/.muttrc for UTF-8">
588set send_charset="utf8" <comment>(outgoing character set)</comment> 588set send_charset="utf8" <comment>(outgoing character set)</comment>
589set charset="utf8" <comment>(display character set)</comment> 589set charset="utf8" <comment>(display character set)</comment>
590</pre> 590</pre>
591 591
592<note> 592<note>
593You may still see '?' in mail you read with Mutt. This is a result of people 593You may still see '?' in mail you read with Mutt. This is a result of people
594using a mail client which does not indicate the charset used. You can do little 594using a mail client which does not indicate the charset used. You can do little
595about this other than ask them to configure their client correctly. 595about this other than ask them to configure their client correctly.
596</note> 596</note>
597 597
598<p> 598<p>
599Further information is available from the <uri 599Further information is available from the <uri
600link="http://wiki.mutt.org/index.cgi?MuttFaq/Charset"> Mutt WikiWiki</uri>. 600link="http://wiki.mutt.org/index.cgi?MuttFaq/Charset"> Mutt WikiWiki</uri>.
601</p> 601</p>
602
603</body>
604</section>
605<section>
606<title>Less</title>
607<body>
608
609<p>
610Many of us pipe command output through <c>more</c> or <c>less</c> with
611<c>|</c> to read it comfortably, for example
612<c>dmesg | less</c>. While <c>more</c> only needs the shell to be UTF-8 aware,
613<c>less</c> also needs the environment variable <c>LESSCHARSET</c> set to
614ensure that Unicode characters are rendered correctly. This can be set in
615<path>/etc/profile</path> or <path>~/.bash_profile</path>. Fire up the editor
616of your choice and add the following line to one of the files mentioned
617above.
618</p>
619
620<pre caption="Setting up the Environment variable for less">
621LESSCHARSET=utf-8
622</pre>
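<p>
A variable set in a profile file only reaches <c>less</c> if the shell exports
it to child processes. The sketch below is one way to check this, assuming a
POSIX shell; <c>child_view</c> is a hypothetical name used only here, and
<c>sh -c</c> stands in for any child process such as <c>less</c>.
</p>

```shell
# Set the variable and export it so child processes inherit it.
LESSCHARSET=utf-8
export LESSCHARSET

# Ask a child shell what value it sees; less would see the same value.
child_view=$(sh -c 'printf %s "$LESSCHARSET"')
echo "A child process sees LESSCHARSET=$child_view"
```

<p>
If the child process prints an empty value on your system, the variable is set
but not exported.
</p>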
623
624</body>
625</section>
626<section>
627<title>Man</title>
628<body>
629
630<p>
631Man pages are an integral part of any Linux machine. To ensure that any
632Unicode characters in your man pages render correctly, edit
633<path>/etc/man.conf</path> and replace the <c>NROFF</c> line as shown below.
634</p>
635
636<pre caption="man.conf changes for Unicode support">
637<comment>(This is the old line)</comment>
638NROFF /usr/bin/nroff -Tascii -c -mandoc
639<comment>(Replace the one above with this)</comment>
640NROFF /usr/bin/nroff -mandoc -c
641</pre>
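<p>
The replacement can also be tried out non-interactively. The sketch below is
only an illustration: it works on a scratch file created with <c>mktemp</c>
rather than on <path>/etc/man.conf</path> itself, and the <c>sed</c> pattern
assumes the old line looks exactly like the one shown above.
</p>

```shell
# Create a scratch file holding the old NROFF line from the listing above.
conf=$(mktemp)
printf '%s\n' 'NROFF /usr/bin/nroff -Tascii -c -mandoc' > "$conf"

# Rewrite any NROFF line that forces -Tascii; output goes to a second file
# so the sketch does not rely on the non-portable "sed -i".
new_conf=$(mktemp)
sed 's|^NROFF.*-Tascii.*$|NROFF /usr/bin/nroff -mandoc -c|' "$conf" > "$new_conf"

result=$(cat "$new_conf")
echo "$result"

# Clean up the scratch files.
rm -f "$conf" "$new_conf"
```

<p>
Review the output before applying the same substitution to the real
configuration file, and keep a backup of the original.
</p>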
602 642
603</body> 643</body>
604</section> 644</section>
605<section> 645<section>
606<title>Testing it all out</title> 646<title>Testing it all out</title>
607<body> 647<body>
608 648
609<p> 649<p>
610There are numerous UTF-8 test websites around. <c>net-www/w3m</c>, 650There are numerous UTF-8 test websites around. <c>net-www/w3m</c>,
611<c>net-www/links</c>, <c>net-www/elinks</c>, <c>net-www/lynx</c> and all 651<c>net-www/links</c>, <c>net-www/elinks</c>, <c>net-www/lynx</c> and all
612Mozilla based browsers (including Firefox) support UTF-8. Konqueror and Opera 652Mozilla based browsers (including Firefox) support UTF-8. Konqueror and Opera
613have full UTF-8 support too. 653have full UTF-8 support too.
614</p> 654</p>
615 655
616<p> 656<p>
