Text Manipulation

From Colwiki.org


Outcomes

In this module you will learn how to manipulate text in Linux. The learner will cover the following:
  1. The vi Editor: Modes, Inserting, Deleting, Copying, Searching, Saving and Undoing
  2. Regular Expressions
  3. The grep family and the sed Stream Editor
  4. Basic Shell Scripting and the Use of the Shell Environment


cat the Swiss Army Knife

cat the editor

The cat utility can be used as a rudimentary text editor, as shown in the example below:

    cat > short-message
    we are curious
    to meet
    penguins in Prague
    Ctrl+D

Notice the use of Ctrl+D. This key combination signals end of file (EOF) and is used for ending interactive input.

cat the reader

More commonly cat is used only to send text to stdout. The most common options are:

    -n    number each line of output
    -b    number only non-blank output lines
    -A    show non-printing characters such as TABs (^I), carriage returns (^M) and line ends ($)

Example

    cat /etc/resolv.conf
    search    mydomain.org
    nameserver   127.0.0.1
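The effect of these options is easiest to see on a small file. The following is a minimal sketch, assuming GNU cat (the -A option is a GNU extension); the temporary file and its contents are made up for illustration.

```shell
# Create a throwaway sample file containing a blank line and a TAB.
sample=$(mktemp)
printf 'first line\n\nthird\tline\n' > "$sample"

cat -n "$sample"   # numbers all three lines, including the blank one
cat -b "$sample"   # numbers only the two non-blank lines
cat -A "$sample"   # shows the TAB as ^I and marks each line end with $
```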

tac reads back-to-front

This command is the same as cat except that the text is read from the last line to the first.

    tac short-message
     penguins in Prague
     to meet
     we are curious 

Simple Tools

using head or tail

The utilities head and tail are often used to analyse logfiles. By default they output 10 lines of text. Here are the main usages.

List the first 20 lines of /var/log/messages:

    head -n 20 /var/log/messages
    head -20  /var/log/messages

List the last 20 lines of /etc/aliases:

    tail -20  /etc/aliases

The tail utility has an added option that allows one to list the end of a text starting at a given line.

List text starting at line 25 in /var/log/messages:

    tail -n +25 /var/log/messages
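To try these forms without touching a real log file, the sketch below uses seq (assumed to be available, as on most GNU/Linux systems) to generate ten numbered lines that stand in for a file.

```shell
# seq prints the numbers 1 to 10, one per line, standing in for a 10-line file.
seq 1 10 | head -n 3     # first three lines: 1 2 3
seq 1 10 | tail -n 3     # last three lines: 8 9 10
seq 1 10 | tail -n +8    # everything from line 8 onwards: 8 9 10
```

Note that tail -n 3 counts from the end of the text while tail -n +8 counts from the start; the two only agree here because the input has exactly ten lines.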

Exercise: If a text has 90 lines, how would you use tail and head to list lines 50 to 65? Is there only one way to do this?

Finally, tail can continuously read a file using the -f option. This is most useful when you expect a file to be modified in real time.


Counting lines, words and bytes

The wc utility counts the number of bytes, words, and lines in files. Several options allow you to control wc's output.

Options for wc

    -l    count the number of lines
    -w    count the number of words
    -c    count the number of bytes
    -m    count the number of characters
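As a quick illustration, the counts can be checked on a short piece of made-up text piped into wc. This is only a sketch; the sample text is an assumption chosen so that the numbers are easy to verify by hand.

```shell
# Two lines, five words, 23 bytes (each line ends with a newline character).
printf 'we are curious\nto meet\n' | wc -l   # lines
printf 'we are curious\nto meet\n' | wc -w   # words
printf 'we are curious\nto meet\n' | wc -c   # bytes
printf 'we are curious\nto meet\n' | wc -m   # characters; same as the byte count for plain ASCII
```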

Remarks: With no file argument, wc counts what it reads from stdin.

numbering lines

By default the nl utility has the same output as cat -b.

Number all lines including blanks

    nl -ba /etc/lilo.conf

Number only lines with text

    nl -bt  /etc/lilo.conf
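The difference between the two body-numbering styles shows up as soon as the input contains an empty line. A small sketch on made-up input:

```shell
# Three lines of input; the middle one is empty.
printf 'alpha\n\nbeta\n' | nl -ba   # numbers 1, 2, 3: the blank line is counted
printf 'alpha\n\nbeta\n' | nl -bt   # numbers 1, 2: the blank line is left unnumbered
```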

replacing tabs with spaces

The expand command is used to replace TABs with spaces. One can also use unexpand for the reverse operation.
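A minimal sketch of both directions, using made-up input. The -t option sets the tab stop width for expand; od -c is used only to make the resulting TAB visible.

```shell
# expand: the TAB becomes spaces up to the next tab stop (every 4 columns here).
printf 'a\tb\n' | expand -t 4

# unexpand: by default only leading blanks are converted, so the eight
# leading spaces become a single TAB (tab stops every 8 columns).
printf '        indented\n' | unexpand | od -c
```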

viewing binary files

There are a number of tools available for this. The most common ones are od (octal dump) and hexdump.
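For example, od can display the same bytes either as characters or as hexadecimal values. The sketch below uses a three-byte made-up input rather than a real binary file.

```shell
# The three bytes "H", "i" and a newline, shown as characters and as hex.
printf 'Hi\n' | od -c          # characters, with the newline printed as \n
printf 'Hi\n' | od -An -tx1    # hex bytes without the offset column: 48 69 0a
```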

Manipulating Text

The following tools modify text layouts.

choosing fields and characters with cut

The cut utility can extract a range of characters or fields from each line of a text.

The -c option is used to select characters.

Syntax: cut -c {range1,range2}

Example

    cut -c5-10,15- /etc/passwd

The example above outputs characters 5 to 10 and 15 to the end of the line for each line in /etc/passwd.

One can also specify the field delimiter (a space, a comma, etc.) of a file as well as the fields to output. These options are set with the -d and -f flags respectively.

Syntax:

cut -d {delimiter} -f {fields}

Example

    cut -d: -f 1,7 --output-delimiter=" " /etc/passwd
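To see the effect of these options without reading the real /etc/passwd, here is a minimal sketch on a made-up passwd-style line. The user "alice" is hypothetical, and --output-delimiter is assumed to be available (it is a GNU cut option).

```shell
# A made-up passwd-style record; the user does not need to exist.
line='alice:x:1000:1000:Alice Example:/home/alice:/bin/bash'

printf '%s\n' "$line" | cut -c1-5                               # characters 1 to 5: alice
printf '%s\n' "$line" | cut -d: -f 1,7                          # alice:/bin/bash
printf '%s\n' "$line" | cut -d: -f 1,7 --output-delimiter=' '   # alice /bin/bash
```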

The /etc/passwd example outputs the 1st and 7th fields of each line, delimited with a space. The default output delimiter is the same as the original input delimiter. The --output-delimiter option allows you to change this.

joining and pasting text

The easiest utility is paste, which concatenates two files next to each other.

Syntax:

    paste   text1   text2

With join you can further specify which fields you are considering.

Syntax:

    join  -j1 {field_num}  -j2 {field_num}  text1  text2               (older syntax) or
    join   -1 {field_num}   -2 {field_num}  text1  text2

Text is sent to stdout only if the specified fields match. Both files are expected to be sorted on the join field: lines are compared in order, so if the input is not sorted, matches that occur later in a file can be missed.

sorting output

By default, sort will arrange a text in alphabetical order. To perform a numerical sort use the -n option.
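The three tools can be compared on a pair of small files. This is a sketch only: the file names and contents are made up, and both files are already sorted on the first field, which join expects.

```shell
# Two made-up files sharing a key in field 1, already sorted on that key.
dir=$(mktemp -d)
printf 'a1 apple\na2 pear\n' > "$dir/names"
printf 'a1 red\na2 green\n'  > "$dir/colours"

paste "$dir/names" "$dir/colours"            # lines side by side, separated by a TAB
join -1 1 -2 1 "$dir/names" "$dir/colours"   # a1 apple red / a2 pear green

printf '10\n9\n100\n' | sort       # alphabetical order: 10, 100, 9
printf '10\n9\n100\n' | sort -n    # numerical order:    9, 10, 100
```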

formatting output

You can modify the number of characters per line of output using fmt. By default fmt will join short lines and output lines of up to 75 characters.

fmt options

-w number of characters per line

-s split long lines but do not refill

-u place one space between each word and two spaces at the end of a sentence
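A short sketch of the joining and width-limiting behaviour, on made-up input. The exact places where fmt breaks lines may vary between implementations, so only the maximum width is stated.

```shell
# Short lines belonging to one paragraph are joined into a single line.
printf 'one\ntwo\nthree\n' | fmt

# -w limits the line width: no output line is longer than 12 characters.
printf 'one two three four five six\n' | fmt -w 12
```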

translating characters

The tr utility translates one set of characters into another.

Example: changing uppercase letters into lowercase

    tr 'A-Z' 'a-z'  < file.txt

Replacing delimiters in /etc/passwd:

    tr  ':'  ' ' < /etc/passwd

Notice: tr has only two arguments! The file is not an argument.
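Because tr reads only from stdin, it is easy to try on a pipe. A minimal sketch with made-up input:

```shell
# Ranges are written without brackets: A-Z maps onto a-z.
printf 'Hello World\n' | tr 'A-Z' 'a-z'    # hello world

# Replace every colon with a space, as in the /etc/passwd example.
printf 'root:x:0:0\n' | tr ':' ' '         # root x 0 0
```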

The Vi Editor

In most Linux distributions vi is the text editor of choice. Like grep or cat, it is considered an essential admin tool and is therefore traditionally found in the /bin directory.

The vi Modes

In order to perform complex operations such as copy/paste, vi can operate in different modes.

  1. Command mode: This is the editing and navigation mode. Commands are often just a letter. For example use j to jump to the next line. As a rule of thumb if you want to perform an operation several times you can precede the command by a number. For example 10j will jump 10 lines.
  2. Last Line (or colon) Mode: You enter this mode from the command mode by typing a colon. The colon will appear at the bottom left corner of the screen. In this mode you can perform a simple search operation, save, quit or run a shell command.
  3. Insert Mode: The easiest way to enter this mode while in command mode is to use i or a. This mode is the most intuitive and is mainly used to interactively enter text into a document.

The Esc key will exit insert mode and return to command mode.

Text Items

Items such as words and paragraphs are defined in command mode so that editing commands can be applied to text documents without using a mouse.

Words, sentences and paragraphs

    e resp. b    Move to the end/beginning of the current word
    ( resp. )    Move to the beginning/end of the current sentence
    { resp. }    Move to the beginning/end of the current paragraph
    w            Similar to e but includes the space after the word

Beginning and End

    ^     Beginning of line
    $     End of line
    1G    Beginning of file
    G     End of file

All these text items can be used to navigate through the text one word (w) or paragraph (}) at a time, go to the beginning of a line (^), the end of the file (G), etc. One can also use these text items to execute commands such as deleting and copying.

Inserting Text

When in command mode, typing i will allow you to enter text in the document interactively. As with all other features in vi there are many other ways of doing this. The table below lists the most common insert commands.

Insert Commands

    a    Append text after the cursor
    A    Append text at the end of the line
    i    Insert text at the current position
    o    Insert text on a new line below
    O    Insert text on a new line above
    s    Delete the current letter and insert
    S    Delete the current line and insert

Deleting Text

If you want to delete a single character while in command mode you would use x, while dd deletes the current line.

Remark: Nearly all vi commands can be repeated by specifying a number in front of the command. You can also apply the command to a text item (such as a word, sentence or paragraph) by placing the item after the command.

    w    single word
    l    single character

Examples:

    Delete a word
    dw
    Delete text from here to the end of the current line
    d$
    Delete text from here to the end of the current paragraph
    d}

One can simultaneously delete an item and switch to insert mode with the c command. As usual you can use this command together with a text item such as w or {.

Copy Pasting

The copy action in vi is the command y (for yank), and the paste action is p. If an entire line is yanked the pasted text will be inserted on the next line below the cursor.

The text selection is made with the familiar text items w, l, }, $ etc ... There are a few exceptions such as the last example.

Examples:

    Copy the text from here to the end of the current line
    y$
    Copy the entire current line
    yy
    Copy 3 lines
    3yy

The most recently deleted item is always buffered and can be pasted with the p command. This is equivalent to a cut-and-paste operation.

Searching

Since searching involves pattern matching we find ourselves once again dealing with regular expressions (regex). Like many UNIX text manipulation tools such as grep or sed, vi recognises regular expressions too. A search is started from command mode: the / (forward slash) command searches forward and the ? command searches backwards. One can also perform search and replace operations from colon mode. The syntax is similar to sed.


Example: Search for words beginning with ‘comp’ in all the text

     /\<comp

Search for lines starting with the letter z

     /^z

Search in the whole text for the keyword ‘VAR’ and replace every occurrence by ‘var’ (without the trailing g only the first occurrence on each line is replaced)

     :%s/VAR/var/g
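Because the substitute syntax is close to that of sed, these patterns can also be tried outside the editor. The following is a minimal sketch, assuming GNU grep and sed (the \< word-start anchor is a GNU extension there) and made-up input text.

```shell
# Without the trailing g only the first match on each line is replaced.
printf 'VAR=1 VAR=2\n' | sed 's/VAR/var/'     # var=1 VAR=2
printf 'VAR=1 VAR=2\n' | sed 's/VAR/var/g'    # var=1 var=2

# Lines starting with z, and lines containing a word that begins with comp.
printf 'zebra\nazure\n' | grep '^z'                  # zebra
printf 'my computer\ndecompose\n' | grep '\<comp'    # my computer
```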

Undoing

At this stage it is worth mentioning that one can undo the last change (while in command mode) with the u command.

Saving

The command for saving is :w. By default the complete document is saved. One can also specify an alternative name for the file. Portions of the text can be saved to another file, while other files can be read and inserted into the current document. Here are some examples which illustrate this.

Examples:

    Save the current document as ‘newfile’
    :w newfile
    Save lines 15 to 24 in a file called ‘extract’
    :15,24w extract
    Read from file ‘extract’. The text will be inserted below the current line
    :r extract

Warning: In the colon mode context, line addresses have the following meaning:

    .    the current line
    $    the last line of the document


Summary

In this module you learned how to manipulate text in the Linux command-line interface. As a systems administrator you will often work with configuration files, and you should be able to use the existing editors to modify the behaviour of running processes.

You may be wondering why we studied the vi editor. This is because it is the editor most commonly found on Unix-like operating systems, including Linux.




Assignment

  • Use cat to enter text into a file called message.

    cat >> message
    line 1
    ^D

Do the same but use the keyword STOP instead of the predefined eof control (^D).

    cat >> message << STOP
    line 2
    STOP

Next, append text to message using echo.

    echo line 3 >> message

  • Create a file called index with two fields REFERENCE and TITLE separated by a space.
       e.g	001	Using_Linux

Create a second file pricing with two fields REFERENCE and PRICE separated by a space

       e.g	001	9.99

Use join to display the reference, title and prices fields.

  • Using tr replace all colons by semi-colons in /etc/passwd.
      Do the same using cut.
  • Use head and tail to list lines 70 to 85 of /var/log/messages.
  • Use the cut utility together with grep and ifconfig to print out only the IP address of the first
     network interface eth0.
  • In /tmp make a directory called files
     mkdir /tmp/files 

Create 50 files in that directory:

We want to change all the txt extensions to dat extensions. For this we need to type the following on the command line:



This work is licenced under a Creative Commons - By Attribution Licence - Share Alike License.
