
Diff of /xml/htdocs/doc/en/articles/l-sed2.xml

Revision 1.7 Revision 1.8
1<?xml version='1.0' encoding="UTF-8"?> 1<?xml version='1.0' encoding="UTF-8"?>
2<!-- $Header: /var/cvsroot/gentoo/xml/htdocs/doc/en/articles/l-sed2.xml,v 1.7 2011/09/04 17:53:41 swift Exp $ --> 2<!-- $Header: /var/cvsroot/gentoo/xml/htdocs/doc/en/articles/l-sed2.xml,v 1.8 2012/06/29 16:03:34 swift Exp $ -->
3<!DOCTYPE guide SYSTEM "/dtd/guide.dtd"> 3<!DOCTYPE guide SYSTEM "/dtd/guide.dtd">
4 4
5<guide disclaimer="articles"> 5<guide disclaimer="articles">
6<title>Sed by example, Part 2</title> 6<title>Sed by example, Part 2</title>
7 7
8<author title="Author"> 8<author title="Author">
9 <mail link="drobbins@gentoo.org">Daniel Robbins</mail> 9 <mail link="drobbins@gentoo.org">Daniel Robbins</mail>
10</author> 10</author>
11 11
12<abstract> 12<abstract>
13Sed is a very powerful and compact text stream editor. In this article, the 13Sed is a very powerful and compact text stream editor. In this article, the
14second in the series, Daniel shows you how to use sed to perform string 14second in the series, Daniel shows you how to use sed to perform string
15substitution; create larger sed scripts; and use sed's append, insert, and 15substitution; create larger sed scripts; and use sed's append, insert, and
16change line commands. 16change line commands.
17</abstract> 17</abstract>
18 18
19<!-- The original version of this article was published on IBM developerWorks, 19<!-- The original version of this article was published on IBM developerWorks,
20and is property of Westtech Information Services. This document is an updated 20and is property of Westtech Information Services. This document is an updated
21version of the original article, and contains various improvements made by the 21version of the original article, and contains various improvements made by the
22Gentoo Linux Documentation team --> 22Gentoo Linux Documentation team -->
23 23
24<version>1.2</version> 24<version>2</version>
25<date>2005-10-09</date> 25<date>2005-10-09</date>
26 26
27<chapter> 27<chapter>
28<title>How to further take advantage of the UNIX text editor</title> 28<title>How to further take advantage of the UNIX text editor</title>
29<section> 29<section>
30<title>Substitution!</title> 30<title>Substitution!</title>
31<body> 31<body>
32 32
33<p> 33<p>
34Let's look at one of sed's most useful commands, the substitution command. 34Let's look at one of sed's most useful commands, the substitution command.
35Using it, we can replace a particular string or matched regular expression with 35Using it, we can replace a particular string or matched regular expression with
36another string. Here's an example of the most basic use of this command: 36another string. Here's an example of the most basic use of this command:
37</p> 37</p>
38 38
39<pre caption="Most basic use of substitution command"> 39<pre caption="Most basic use of substitution command">
40$ <i>sed -e 's/foo/bar/' myfile.txt</i> 40$ <i>sed -e 's/foo/bar/' myfile.txt</i>
41</pre> 41</pre>
42 42
43<p> 43<p>
44The above command will output the contents of myfile.txt to stdout, with the 44The above command will output the contents of myfile.txt to stdout, with the
45first occurrence of 'foo' (if any) on each line replaced with the string 'bar'. 45first occurrence of 'foo' (if any) on each line replaced with the string 'bar'.
46Please note that I said first occurrence on each line, though this is normally 46Please note that I said first occurrence on each line, though this is normally
47not what you want. Normally, when I do a string replacement, I want to perform 47not what you want. Normally, when I do a string replacement, I want to perform
48it globally. That is, I want to replace all occurrences on every line, as 48it globally. That is, I want to replace all occurrences on every line, as
49follows: 49follows:
50</p> 50</p>
51 51
52<pre caption="Replacing all the occurences on every line"> 52<pre caption="Replacing all the occurrences on every line">
53$ <i>sed -e 's/foo/bar/g' myfile.txt</i> 53$ <i>sed -e 's/foo/bar/g' myfile.txt</i>
54</pre> 54</pre>
55 55
56<p> 56<p>
57The additional 'g' option after the last slash tells sed to perform a global 57The additional 'g' option after the last slash tells sed to perform a global
58replace. 58replace.
59</p> 59</p>
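<p>
To see the difference the 'g' option makes, here is a minimal sketch that runs
both forms on the same line. The sample text and the variable names are
assumptions made up for this illustration; they are not part of myfile.txt.
</p>

```shell
# Hypothetical sample line, invented for this illustration.
line='foo foo foo'

# Without 'g': only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed -e 's/foo/bar/')

# With 'g': every match on the line is replaced.
all_matches=$(printf '%s\n' "$line" | sed -e 's/foo/bar/g')

printf '%s\n' "$first_only"    # bar foo foo
printf '%s\n' "$all_matches"   # bar bar bar
```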
60 60
61<p> 61<p>
62Here are a few other things you should know about the <c>s///</c> substitution 62Here are a few other things you should know about the <c>s///</c> substitution
63command. First, it is a command, and a command only; there are no addresses 63command. First, it is a command, and a command only; there are no addresses
64specified in any of the above examples. This means that the <c>s///</c> command 64specified in any of the above examples. This means that the <c>s///</c> command
65can also be used with addresses to control what lines it will be applied to, as 65can also be used with addresses to control what lines it will be applied to, as
66follows: 66follows:
67</p> 67</p>
83<p> 83<p>
84This example will swap 'hills' for 'mountains', but only on blocks of text 84This example will swap 'hills' for 'mountains', but only on blocks of text
85beginning with a blank line, and ending with a line beginning with the three 85beginning with a blank line, and ending with a line beginning with the three
86characters 'END', inclusive. 86characters 'END', inclusive.
87</p> 87</p>
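<p>
As a rough sketch of a range-addressed substitution like the one described
above, the following assumes the address pair <c>/^$/,/^END/</c> and a small
made-up input. Both are assumptions for illustration only and may differ from
the exact command used in the article.
</p>

```shell
# Hypothetical five-line input: one line outside the block, a blank line that
# opens the block, a line inside it, the END line that closes it, and one
# line after it.
sample=$(printf '%s\n' 'hills outside' '' 'hills inside' 'END of block' 'hills after')

# The substitution only runs on lines from the first blank line through the
# first following line that begins with END, inclusive.
result=$(printf '%s\n' "$sample" | sed -e '/^$/,/^END/s/hills/mountains/g')

printf '%s\n' "$result"
```

<p>
Only the line inside the block should come back with 'mountains'; the lines
before the blank line and after the END line are left alone.
</p>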
88 88
89<p> 89<p>
90Another nice thing about the <c>s///</c> command is that we have a lot of 90Another nice thing about the <c>s///</c> command is that we have a lot of
91options when it comes to those <c>/</c> separators. If we're performing string 91options when it comes to those <c>/</c> separators. If we're performing string
92substitution and the regular expression or replacement string has a lot of 92substitution and the regular expression or replacement string has a lot of
93slashes in it, we can change the separator by specifying a different character 93slashes in it, we can change the separator by specifying a different character
94after the 's'. For example, this will replace all occurrences of 94after the 's'. For example, this will replace all occurrences of
95<path>/usr/local</path> with <path>/usr</path>: 95<path>/usr/local</path> with <path>/usr</path>:
96</p> 96</p>
97 97
98<pre caption="Replacing all the occurences of one string with another one"> 98<pre caption="Replacing all the occurrences of one string with another one">
99$ <i>sed -e 's:/usr/local:/usr:g' mylist.txt</i> 99$ <i>sed -e 's:/usr/local:/usr:g' mylist.txt</i>
100</pre> 100</pre>
101 101
102<note> 102<note>
103In this example, we're using the colon as a separator. If you ever need to 103In this example, we're using the colon as a separator. If you ever need to
104specify the separator character in the regular expression, put a backslash 104specify the separator character in the regular expression, put a backslash
105before it. 105before it.
106</note> 106</note>
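<p>
Here is a small sketch of both points: the same replacement written with
escaped slashes and with a colon separator, followed by a case where the colon
itself has to be matched and is therefore escaped. The sample line and variable
names are invented for this illustration.
</p>

```shell
# Hypothetical PATH-like line, invented for this illustration.
path_line='/usr/local/bin:/usr/local/sbin'

# Default separator: every slash in the pattern and replacement is escaped.
with_slash=$(printf '%s\n' "$path_line" | sed -e 's/\/usr\/local/\/usr/g')

# Colon separator: the same substitution, with nothing to escape.
with_colon=$(printf '%s\n' "$path_line" | sed -e 's:/usr/local:/usr:g')

# Colon separator while matching a literal colon: escape it with a backslash.
colon_escaped=$(printf '%s\n' "$path_line" | sed -e 's:\::;:g')

printf '%s\n' "$with_slash"      # /usr/bin:/usr/sbin
printf '%s\n' "$with_colon"      # /usr/bin:/usr/sbin
printf '%s\n' "$colon_escaped"   # /usr/local/bin;/usr/local/sbin
```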
107 107
108</body> 108</body>
109</section> 109</section>
110<section> 110<section>
111<title>Regexp snafus</title> 111<title>Regexp snafus</title>
112<body> 112<body>
113 113
114<p> 114<p>
115Up until now, we've only performed simple string substitution. While this is 115Up until now, we've only performed simple string substitution. While this is
116handy, we can also match a regular expression. For example, the following sed 116handy, we can also match a regular expression. For example, the following sed
117command will match a phrase beginning with '&lt;' and ending with '&gt;', and 117command will match a phrase beginning with '&lt;' and ending with '&gt;', and
118containing any number of characters inbetween. This phrase will be deleted 118containing any number of characters in-between. This phrase will be deleted
119(replaced with an empty string): 119(replaced with an empty string):
120</p> 120</p>
121 121
122<pre caption="Deleting specified phrase"> 122<pre caption="Deleting specified phrase">
123$ <i>sed -e 's/&lt;.*&gt;//g' myfile.html</i> 123$ <i>sed -e 's/&lt;.*&gt;//g' myfile.html</i>
124</pre> 124</pre>
125 125
126<p> 126<p>
127This is a good first attempt at a sed script that will remove HTML tags from a 127This is a good first attempt at a sed script that will remove HTML tags from a
128file, but it won't work well, due to a regular expression quirk. The reason? 128file, but it won't work well, due to a regular expression quirk. The reason?
129When sed tries to match the regular expression on a line, it finds the longest 129When sed tries to match the regular expression on a line, it finds the longest
130match on the line. This wasn't an issue in my previous sed article, because we 130match on the line. This wasn't an issue in my previous sed article, because we
131were using the <c>d</c> and <c>p</c> commands, which would delete or print the 131were using the <c>d</c> and <c>p</c> commands, which would delete or print the
132entire line anyway. But when we use the <c>s///</c> command, it definitely makes 132entire line anyway. But when we use the <c>s///</c> command, it definitely makes
133a big difference, because the entire portion that the regular expression matches 133a big difference, because the entire portion that the regular expression matches
176their results. 176their results.
177</p> 177</p>
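<p>
The longest-match behaviour discussed in this section is easy to reproduce. The
sketch below uses a made-up HTML line and compares <c>.*</c> with a negated
bracket expression, <c>[^&gt;]*</c>, which is one common way to keep a match
from running past the first '&gt;'. Treat the input and the variable names as
assumptions for illustration.
</p>

```shell
# Hypothetical HTML fragment, invented for this illustration.
html='<b>This</b> is what <b>I</b> meant.'

# Greedy: the match runs from the first '<' to the LAST '>' on the line,
# so the text between the tags is deleted as well.
greedy=$(printf '%s\n' "$html" | sed -e 's/<.*>//g')

# Bounded: '[^>]*' cannot cross a '>', so each tag is matched on its own.
bounded=$(printf '%s\n' "$html" | sed -e 's/<[^>]*>//g')

printf '[%s]\n' "$greedy"    # [ meant.]
printf '[%s]\n' "$bounded"   # [This is what I meant.]
```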
178 178
179</body> 179</body>
180</section> 180</section>
181<section> 181<section>
182<title>More character matching</title> 182<title>More character matching</title>
183<body> 183<body>
184 184
185<p> 185<p>
186The '[ ]' regular expression syntax has some more additional options. To specify 186The '[ ]' regular expression syntax has some more additional options. To specify
187a range of characters, you can use a '-' as long as it isn't in the first or 187a range of characters, you can use a '-' as long as it isn't in the first or
188last position, as follows: 188last position, as follows:
189</p> 189</p>
190 190
191<pre caption="Specifying a rangle of characters"> 191<pre caption="Specifying a range of characters">
192'[a-x]*' 192'[a-x]*'
193</pre> 193</pre>
194 194
195<p> 195<p>
196This will match zero or more characters, as long as all of them are 196This will match zero or more characters, as long as all of them are
197'a','b','c'...'v','w','x'. In addition, the '[:space:]' character class is 197'a','b','c'...'v','w','x'. In addition, the '[:space:]' character class is
198available for matching whitespace. Here's a fairly complete list of available 198available for matching whitespace. Here's a fairly complete list of available
199character classes: 199character classes:
200</p> 200</p>
201 201
202 202
203<table> 203<table>
204 <tr> 204 <tr>
205 <th>Character class</th> 205 <th>Character class</th>
206 <th>Description</th> 206 <th>Description</th>
245 <ti>[:space:]</ti> 245 <ti>[:space:]</ti>
246 <ti>Whitespace</ti> 246 <ti>Whitespace</ti>
247 </tr> 247 </tr>
248 <tr> 248 <tr>
249 <ti>[:upper:]</ti> 249 <ti>[:upper:]</ti>
250 <ti>Upper-case [A-Z]</ti> 250 <ti>Upper-case [A-Z]</ti>
251 </tr> 251 </tr>
252 <tr> 252 <tr>
253 <ti>[:xdigit:]</ti> 253 <ti>[:xdigit:]</ti>
254 <ti>hex digits [0-9 a-f A-F]</ti> 254 <ti>hex digits [0-9 a-f A-F]</ti>
255 </tr> 255 </tr>
256</table> 256</table>
257 257
258<p> 258<p>
259It's advantageous to use character classes whenever possible, because they adapt 259It's advantageous to use character classes whenever possible, because they adapt
260better to nonEnglish speaking locales (including accented characters when 260better to non-English speaking locales (including accented characters when
261necessary, etc.). 261necessary, etc.).
262</p> 262</p>
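<p>
A short sketch of character classes in use follows. Note that a class such as
[:space:] goes inside a bracket expression, so it is written
<c>[[:space:]]</c> in practice. The sample strings and variable names are
assumptions made up for this illustration.
</p>

```shell
# Hypothetical input containing a mix of spaces and tabs.
messy=$(printf 'alpha \t  beta\tgamma')

# Collapse every run of whitespace (spaces or tabs) into a single space.
squeezed=$(printf '%s\n' "$messy" | sed -e 's/[[:space:]][[:space:]]*/ /g')

# Replace every upper-case letter with an underscore.
masked=$(printf '%s\n' 'Hello World' | sed -e 's/[[:upper:]]/_/g')

printf '%s\n' "$squeezed"   # alpha beta gamma
printf '%s\n' "$masked"     # _ello _orld
```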
263 263
264</body> 264</body>
265</section> 265</section>
266<section> 266<section>
267<title>Advanced substitution stuff</title> 267<title>Advanced substitution stuff</title>
268<body> 268<body>
269 269
270<p> 270<p>
271We've looked at how to perform simple and even reasonably complex straight 271We've looked at how to perform simple and even reasonably complex straight
272substitutions, but sed can do even more. We can actually refer to either parts 272substitutions, but sed can do even more. We can actually refer to either parts
273of or the entire matched regular expression, and use these parts to construct 273of or the entire matched regular expression, and use these parts to construct
274the replacement string. As an example, let's say you were replying to a message. 274the replacement string. As an example, let's say you were replying to a message.
275The following example would prefix each line with the phrase "ralph said: ": 275The following example would prefix each line with the phrase "ralph said: ":
