In this post, we will look at the UNIX awk command and its common usage. awk is a text-processing tool on Linux that can filter lines, extract specific fields, format output, and more. It is also a programming language in its own right, with many constructs designed for sophisticated text-processing tasks. This article covers some of its most important features.
Sample input file:
For the following examples, we will use the below input file.
$ cat input.txt
This is line one
This is line two
This is line three
This is line four

1. awk: Print lines matching a pattern (similar to grep)

$ awk '/one/ {print;}' input.txt
This is line one
$ awk '/one/ {print $0;}' input.txt
This is line one

2. awk: Print file contents with line numbers

$ awk '{print NR, $0}' input.txt
1 This is line one
2 This is line two
3 This is line three
4 This is line four

3. awk: The ORS, OFS, FS built-in variables
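As a quick orientation before the individual variables, the sketch below prints the default values awk assigns to FS, OFS and ORS. It is only an illustration: the shell variable name `defaults`, the helper variable `nl` and the bracket formatting are assumptions made for this example and are not part of the original commands.

```shell
# Print awk's default separators; brackets make the single space visible.
defaults=$(awk 'BEGIN {
  nl = (ORS == "\n") ? "yes" : "no"   # ORS cannot be shown directly, so test it
  printf "FS=[%s] OFS=[%s] ORS_is_newline=%s\n", FS, OFS, nl
}')
echo "$defaults"
```

A BEGIN-only program like this one reads no input, so it can be run without any file argument.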
3.1 Using ORS (Output Record Separator)
The ORS built-in variable controls what awk writes after each printed record. It defaults to "\n" (newline), which is why the output of every print statement appears on its own line. To print the lines in a single row separated by a semicolon (;), use the command below.

$ awk 'BEGIN {ORS=";"} {print;} ' input.txt
This is line one;This is line two;This is line three;This is line four;

Note: The BEGIN block executes once, before any input is read, so initializations go here.
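Because ORS is written after every record, the output above also ends with a semicolon and has no final newline. If that trailing separator is unwanted, one possible approach is to print the separator before every record except the first. This is a minimal sketch rather than the only way to do it; the sample lines are supplied inline with printf, and the variable name `joined` is made up for this example.

```shell
# Join input lines with ";" but without a trailing separator.
joined=$(printf 'This is line one\nThis is line two\nThis is line three\n' |
  awk 'NR > 1 { printf ";" }      # separator goes before records 2, 3, ...
       { printf "%s", $0 }        # the record itself, with no newline
       END { print "" }')         # finish the row with a single newline
echo "$joined"
```

Here printf is used instead of print so that ORS never comes into play while the row is being built.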
3.2 Using OFS (Output Field Separator)
The OFS variable sets the delimiter that print places between its comma-separated arguments (a single space by default). For example, to print the fields of a matching line separated by a dash (-), use the following awk command.

$ awk 'BEGIN {OFS="-"} /two/ {print $3,$4}' input.txt
line-two

3.3 Combining OFS (Output Field Separator) and ORS (Output Record Separator)
To print all lines separated by a space, with the fields within each line separated by a dash (-), use the following awk command.

$ awk 'BEGIN { OFS = "-"; ORS= " " } {print $3, $4}' input.txt
line-one line-two line-three line-four

3.4 Using FS (Input Field Separator)
The FS built-in variable sets the delimiter used to split each input line into fields. By default it is a single space, which awk treats as "split on runs of whitespace". For example, to split input fields on a comma (,) and print them, use the awk command below.

$ echo "1,2,3,4,5" | awk 'BEGIN {FS = ","} {print $1,$2,$3,$4,$5}'
1 2 3 4 5
$ echo "1,2,3,4,5" | awk 'BEGIN {OFS="-";FS = ","} {print $1,$2,$3,$4,$5}'
1-2-3-4-5

In our sample input, we can easily replace the word 'line' with 'number' using the same trick: split each line on 'line' and join the pieces with 'number'.

$ awk 'BEGIN {OFS = "number";FS = "line"} {print $1, $2}' input.txt
This is number one
This is number two
This is number three
This is number four

4. awk: Match only the first and last occurrence of a pattern in a file
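Both tasks in this section can also be handled in a single pass over the input. The sketch below is an illustrative combination, not one of the commands from the subsections that follow: it remembers the first matching line, keeps overwriting the last one, and reports both in the END block. The variable names `seen`, `first`, `last` and `summary`, the output labels, and the inline sample data (including the non-matching line) are all assumptions made for this example.

```shell
# Report the first and last lines that match /line/ in one pass.
summary=$(printf 'This is line one\nno match here\nThis is line two\nThis is line three\n' |
  awk '/line/ {
         if (!seen) { first = $0; seen = 1 }   # only set on the first match
         last = $0                             # overwritten on every match
       }
       END { if (seen) { print "first: " first; print "last: " last } }')
echo "$summary"
```

Unlike the exit-based approach, this version always reads the whole input, which is the price of also knowing the last match.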
4.1 Match first occurrence of a pattern and exit

$ awk '/line/ {print;exit}' input.txt
This is line one

4.2 Match last occurrence of a pattern and exit
The idea here is to keep track of the latest line matching the pattern and print it from the END block (which executes after all lines in the file have been processed).

$ awk '/line/ {line_content=$0} END { print line_content }' input.txt
This is line four

5. awk: Pattern matching with regular expressions
We can use regular expressions for pattern matching as well.
5.1 To match all lines ending with the letter 'e' use the following awk command.

$ awk '/e$/ {print;}' input.txt
This is line one
This is line three

5.2 To match all lines ending with the letter 'e' or 'r' use the below awk command.

$ awk '/e$|r$/ {print;}' input.txt
This is line one
This is line three
This is line four
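The alternation above can also be written as a bracket expression, and a match can be restricted to a single field with the ~ operator instead of being tested against the whole line. A minimal sketch, assuming the same four sample lines (supplied inline with printf so the snippet is self-contained); the variable name `matches` and the "line number: word" output format are choices made for this example.

```shell
# Match on the last field only ($NF) and print the line number with that field.
matches=$(printf 'This is line one\nThis is line two\nThis is line three\nThis is line four\n' |
  awk '$NF ~ /[er]$/ { print NR ": " $NF }')   # [er]$ is equivalent to e$|r$
echo "$matches"
```

For this input, testing $NF and testing the whole line select the same records, since the last field is also the end of the line. The field form becomes useful when the pattern should apply to one column only.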



