Text Manipulation
From Colwiki.org
In this module you will learn how to manipulate text in Linux. The learner will cover the following:
cat the Swiss Army Knife
cat the editor
The cat utility can be used as a rudimentary text editor, as shown below:
cat > short-message
we are curious
to meet
penguins in Prague
Ctrl+D
Notice the use of Ctrl+D. This key combination signals end of input (EOF) and ends the interactive session.
cat the reader
More commonly, cat is used only to write text to stdout. The most common options are:
-n number each line of output
-b number only non-blank output lines
-A show non-printing characters, including tabs (^I), carriage returns (^M) and line ends ($)
Example
cat /etc/resolv.conf
search mydomain.org
nameserver 127.0.0.1
tac reads back-to-front
This command is the same as cat except that the text is read from the last line to the first.
tac short-message
penguins in Prague
to meet
we are curious
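For scripted use, the same file can be created without an interactive session. The sketch below assumes GNU coreutils (tac is not available on every system) and writes into a temporary directory; the variable names are illustrative only.

```shell
# Work in a throwaway directory (assumption: mktemp is available).
workdir=$(mktemp -d)
msg="$workdir/short-message"

# Create the file non-interactively; the here-document ends at the EOF line.
cat > "$msg" << 'EOF'
we are curious
to meet
penguins in Prague
EOF

numbered=$(cat -n "$msg")   # same text with line numbers
reversed=$(tac "$msg")      # same text, last line first

printf '%s\n' "$numbered"
printf '%s\n' "$reversed"
```

The here-document plays the role of the interactive input, and its closing EOF line replaces Ctrl+D.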
Simple Tools
using head or tail
The utilities head and tail are often used to analyse logfiles. By default they output 10 lines of text. Here are the main usages.
List the first 20 lines of /var/log/messages:
head -n 20 /var/log/messages
head -20 /var/log/messages
List the last 20 lines of /etc/aliases:
tail -20 /etc/aliases
The tail utility has an added option that lists the end of a text starting at a given line.
List text starting at line 25 in /var/log/messages:
tail -n +25 /var/log/messages
The older form tail +25 is obsolete and may not be accepted by current versions of tail.
Exercise: If a text has 90 lines, how would you use tail and head to list lines 50 to 65? Is there only one way to do this?
Finally, tail can continuously read a file using the -f option. This is most useful when you expect a file to be modified in real time.
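The head and tail usages above can be tried safely on a small generated file rather than on a real log. This is a minimal sketch that assumes seq and mktemp are available; it uses the -n +N spelling to start output at a given line, and the file and variable names are made up for the illustration.

```shell
# Build a 30-line sample file containing the numbers 1 to 30.
sample=$(mktemp)
seq 1 30 > "$sample"

first3=$(head -n 3 "$sample")     # first three lines
last3=$(tail -n 3 "$sample")      # last three lines
from28=$(tail -n +28 "$sample")   # everything from line 28 onwards

printf '%s\n' "$first3" "$last3" "$from28"
```

Because each line of the sample holds its own line number, it is easy to check by eye which lines a given command selected.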
Counting lines, words and bytes
The wc utility counts the number of bytes, words, and lines in files. Several options allow you to control wc's output.
Options for wc
-l count number of lines
-w count number of words
-c count number of bytes
-m count number of characters
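As a quick check of these options, the sketch below counts a fixed two-line text fed through a pipe. The sample text and variable names are only illustrative.

```shell
# Two lines, three words, fourteen bytes (including both newlines).
text='one two
three'

lines=$(printf '%s\n' "$text" | wc -l)
words=$(printf '%s\n' "$text" | wc -w)
bytes=$(printf '%s\n' "$text" | wc -c)

# Unquoted on purpose: some wc versions pad the number with spaces.
echo lines=$lines words=$words bytes=$bytes
```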
Remarks: With no file argument, wc counts what is read from stdin.
numbering lines
By default, the nl utility numbers only non-blank lines, like cat -b.
Number all lines including blanks
nl -ba /etc/lilo.conf
Number only lines with text
nl -bt /etc/lilo.conf
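The difference between the two modes is easiest to see on a short text that contains a blank line. A small sketch, using a made-up three-line input instead of /etc/lilo.conf:

```shell
# Input: "alpha", an empty line, "beta".
all=$(printf 'alpha\n\nbeta\n' | nl -ba)        # every line gets a number
text_only=$(printf 'alpha\n\nbeta\n' | nl -bt)  # the blank line is skipped

printf '%s\n' "$all"
printf '%s\n' "$text_only"
```

With -ba the last line is numbered 3, while with -bt it is numbered 2 because the blank line does not consume a number.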
replacing tabs with spaces
The expand command is used to replace TABs with spaces. One can also use unexpand for the reverse operation.
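A short sketch of both directions follows. It relies on the -t option of expand to set the tab stop width and on the default behaviour of unexpand, which converts leading blanks only; the sample strings are made up.

```shell
# A TAB after "a" with tab stops every 4 columns becomes 3 spaces.
expanded=$(printf 'a\tb\n' | expand -t 4)

# Eight leading spaces collapse into one TAB (default tab stops every 8 columns).
restored=$(printf '        x\n' | unexpand)

printf '%s\n' "$expanded" "$restored"
```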
viewing binary files
There are a number of tools available for this. The most common ones are od (octal dump) and hexdump.
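As an illustration, od can show the individual bytes behind a short text. The sketch below uses only od, since hexdump is not installed on every system; the -c, -An and -tx1 options select character output, no address column and one-byte hexadecimal output respectively.

```shell
# Show the three bytes of "AB" plus newline as characters.
printf 'AB\n' | od -c

# The same bytes in hexadecimal, without the address column.
hex=$(printf 'AB\n' | od -An -tx1)
echo $hex
```

The hexadecimal form makes invisible characters such as the trailing newline (0a) easy to spot.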
Manipulating Text
The following tools modify text layouts.
choosing fields and characters with cut
The cut utility can extract a range of characters or fields from each line of a text.
The -c option is used to manipulate characters.
Syntax: cut -c {range1,range2}
Example
cut -c5-10,15- /etc/passwd
The example above outputs characters 5 to 10 and 15 to end of line for each line in /etc/passwd.
One can also specify the field delimiter (a space, a comma, etc.) of a file as well as the fields to output. These options are set with the -d and -f flags respectively.
Syntax:
cut -d {delimiter} -f {fields}
Example
cut -d: -f 1,7 --output-delimiter=" " /etc/passwd
This outputs the 1st and 7th fields of /etc/passwd, delimited with a space. The default output delimiter is the same as the original input delimiter. The --output-delimiter option allows you to change this.
joining and pasting text
The easiest utility is paste, which prints the corresponding lines of two files side by side, separated by a TAB.
Syntax:
paste text1 text2
With join you can further specify which fields you are considering.
Syntax:
join -j1 {field_num} -j2 {field_num} text1 text2
or
join -1 {field_num} -2 {field_num} text1 text2
The -j1 and -j2 spellings are an older form; -1 and -2 are the current options.
Text is sent to stdout only if the specified fields match. Both files are expected to be sorted on the join field: comparison is done one line at a time, so when the input is not sorted, matches that occur later in a file can be missed.
sorting output
By default, sort will arrange a text in alphabetical order. To perform a numerical sort use the -n option.
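Sorting also matters for join, which works reliably only on input ordered by the join field. The sketch below sorts a deliberately unsorted file before joining it, then contrasts alphabetical and numerical sorting. The file names and contents are invented for the illustration, and LC_ALL=C is set as an assumption so that sort and join use the same ordering.

```shell
export LC_ALL=C   # assumption: fixed collation so sort and join agree
dir=$(mktemp -d)

printf 'pear 12\napple 5\n' > "$dir/stock"        # deliberately unsorted
printf 'apple 0.50\npear 0.75\n' > "$dir/price"   # already sorted

# Sort on the first field, then join both files on that field (the default).
sort "$dir/stock" > "$dir/stock.sorted"
joined=$(join "$dir/stock.sorted" "$dir/price")

# Alphabetical versus numerical order for the same numbers.
alpha=$(printf '10\n9\n100\n' | sort)
numeric=$(printf '10\n9\n100\n' | sort -n)

printf '%s\n' "$joined" "$alpha" "$numeric"
```

Each joined line holds the common field followed by the remaining fields of the first file and then of the second.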
formatting output
You can modify the number of characters per line of output using fmt. By default fmt will join lines and output lines of up to 75 characters.
fmt options
-w number of characters per line
-s split long lines but do not refill
-u place one space between each word and two spaces at the end of a sentence
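A small sketch of the -w and -u options on made-up input follows. The exact places where fmt breaks lines can vary, so the example relies only on the maximum width.

```shell
# Re-wrap a 27-character line so that no output line exceeds 10 characters.
wrapped=$(printf 'one two three four five six\n' | fmt -w 10)

# Squeeze runs of spaces between words down to a single space.
uniform=$(printf 'too    many   spaces\n' | fmt -u)

printf '%s\n' "$wrapped" "$uniform"
```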
translating characters
The tr utility translates one set of characters into another.
Example: changing uppercase letters into lowercase
tr 'A-Z' 'a-z' < file.txt
Replacing delimiters in /etc/passwd:
tr ':' ' ' < /etc/passwd
Notice: tr takes only the two character sets as arguments! The file is not an argument; tr always reads from stdin.
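Since tr reads only from stdin, it is convenient to try it on text fed through a pipe. A minimal sketch with made-up input, translating the full range A-Z to a-z and then replacing a delimiter:

```shell
# Uppercase to lowercase over the whole alphabet.
lower=$(printf 'Hello World\n' | tr 'A-Z' 'a-z')

# Replace every colon with a space, as in the /etc/passwd example.
spaced=$(printf 'root:x:0:0\n' | tr ':' ' ')

printf '%s\n' "$lower" "$spaced"
```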
The Vi Editor
In most Linux distributions vi is the text editor of choice. It is considered an essential admin tool, like grep or cat, and is therefore found in the /bin directory.
The vi Modes
In order to perform complex operations such as copy/paste, vi can operate in different modes.
- Command mode: This is the editing and navigation mode. Commands are often just a letter. For example use j to jump to the next line. As a rule of thumb if you want to perform an operation several times you can precede the command by a number. For example 10j will jump 10 lines.
- Last Line (or colon) Mode: You enter this mode from the command mode by typing a colon. The colon will appear at the bottom left corner of the screen. In this mode you can perform a simple search operation, save, quit or run a shell command.
- Insert Mode: The easiest way to enter this mode while in command mode is to use i or a. This mode is the most intuitive and is mainly used to interactively enter text into a document.
The Esc key will exit the insert mode and return to command mode.
Text Items
Items such as words and paragraphs are defined in command mode to allow editing commands to be applied to text documents without using a mouse.
Word, sentences and paragraphs
e / b Move to the end/beginning of the current word
( / ) Move to the beginning/end of the current sentence
{ / } Move to the beginning/end of the current paragraph
w Move to the beginning of the next word
Beginning and End
^ Beginning of line
$ End of line
1G Beginning of file
G End of file
All these text items can be used to navigate through the text one word (w) or paragraph (}) at a time, go to the beginning of a line (^), the end of the file (G), etc. One can also use these text items to execute commands such as deleting and copying.
Inserting Text
When in command mode, typing i will allow you to enter text in the document interactively. As with all other features in vi, there are many other ways of doing this. The table below lists the main insert commands.
Insert Commands
a Append text after the cursor
A Append text at the end of the line
i Insert text at the current position
o Insert text on a new line below
O Insert text on a new line above
s Delete the current letter and insert
S Delete current line and insert
Deleting Text
If you want to delete a single character while in command mode you would use x, while dd deletes the current line.
Remark: Nearly all vi commands can be repeated by specifying a number in front of the command. You can also apply the command to a text item (such as a word, sentence, paragraph ...) by placing the item after the command.
w single word
l single character
Examples:
Delete a word:
dw
Delete text from here to the end of the current line:
d$
Delete text from here to the end of the current paragraph:
d}
One can simultaneously delete an item and switch to insert mode with the c command. As usual you can use this command together with a text item such as w or {.
Copy Pasting
The copy action in vi is the command y (for yank), and the paste action is p. If an entire line is yanked the pasted text will be inserted on the next line below the cursor.
The text selection is made with the familiar text items w, l, }, $ etc. There are a few exceptions, such as the whole-line commands in the last two examples below.
Examples:
Copy the text from here to the end of the current line
y$
Copy the entire current line
yy
Copy 3 lines
3yy
The latest deleted item is always buffered and can be pasted with the p command. This is equivalent to a cut-and-paste operation.
Searching
Since searching involves pattern matching we find ourselves once again dealing with regular expressions (regex). Like many UNIX text manipulation tools such as grep or sed, vi recognises regular expressions. The / (forward slash) command searches forward and the ? command searches backwards; both are typed from command mode. Search and replace operations are performed in colon mode, with a syntax similar to sed.
Example: Search for words beginning with ‘comp’ in all the text
/\<comp
Search for lines starting with the letter z
/^z
Search in the whole text for the keyword ‘VAR’ and replace every occurrence by ‘var’
:%s/VAR/var/g
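Because the substitution syntax mirrors sed, the same patterns can be tried outside the editor. The sketch below is not vi itself: it runs the equivalent sed substitution and a grep search on a made-up sample text, with the g flag so that every occurrence on a line is replaced.

```shell
sample='VAR=1 and VAR=2
zebra
compile this'

# Replace every VAR on every line, like :%s/VAR/var/g in vi.
replaced=$(printf '%s\n' "$sample" | sed 's/VAR/var/g')

# Count the lines starting with the letter z, like the /^z search.
z_lines=$(printf '%s\n' "$sample" | grep -c '^z')

printf '%s\n' "$replaced"
echo "lines starting with z: $z_lines"
```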
Undoing
At this stage it is worth mentioning that one can undo changes (while in command mode) with the u command. How far back you can go depends on the implementation: classic vi undoes only the most recent change, while Vim keeps a multi-level undo history.
Saving
The command for saving is w. By default the complete document is saved. One can also specify an alternative name for the file. Portions of the text can be saved to another file, and other files can be read and inserted into the current document. The following examples illustrate this.
Examples:
Save the current document as ‘newfile’
:w newfile
Save lines 15 to 24 in a file called ‘extract’
:15,24w extract
Read from file ‘extract’. The text will be inserted below the current line
:r extract
Warning: In the colon mode context we have the following:
. is the current line
$ is the last line of the document
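The start,end line addressing used in colon mode can also be explored from the shell. As an analogy only (sed is not vi, and it has no equivalent of . for the current line), the sketch below extracts lines 15 to 24 and the last line of a generated file; the file and variable names are made up.

```shell
# Build a 30-line sample file containing the numbers 1 to 30.
file=$(mktemp)
seq 1 30 > "$file"

extract=$(sed -n '15,24p' "$file")   # lines 15 to 24, like :15,24w extract
last=$(sed -n '$p' "$file")          # $ addresses the last line

printf '%s\n' "$extract"
echo "last line: $last"
```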
In this module you learned how to manipulate text in the Linux command interface. As a systems administrator, you will work more with configuration files and you should be able to adequately use the existing editors to modify the operations of running processes.
You may be wondering why we studied the vi editor. This is because it is the editor most commonly found in any Unix-like operating system, including Linux.
cat >> message
line 1
^D
Do the same but use the keyword STOP instead of the predefined eof control (^D).
cat >> message << STOP
line 2
STOP
Next, append text to message using echo.
echo line 3 >> message
e.g. 001 Using_Linux
Create a second file pricing with two fields REFERENCE and PRICE separated by a space
e.g. 001 9.99
Use join to display the reference, title and prices fields.
Do the same using cut.
network interface eth0.
mkdir /tmp/files
Create 50 files in that directory:
We want to change all the txt extensions to dat extensions. For this we need to type the following on the command line:
This work is licenced under a Creative Commons - By Attribution Licence - Share Alike License.

