AWK Hints
Text processing hints with AWK
awk is a powerful and versatile tool at your disposal. It allows you to perform complex operations on text files with ease. In this blog post, we’ll explore 10 common awk examples, complete with inputs and expected outputs, to help you unlock its full potential.
1. Extracting Columns from a CSV File
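Imagine you have a CSV file and want to extract specific columns. The fields in this sample are separated by a comma followed by a space, so the field separator is set to ', ' (a plain ',' would leave a leading space on every field after the first):
Input (file.csv):
Name, Age, City
John, 25, New York
Alice, 30, Los Angeles
Command:
awk -F', ' '{print $1, $3}' file.csv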
Output:
Name City
John New York
Alice Los Angeles
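If the extracted columns should stay comma-separated, you can set the output field separator as well. The following is a minimal sketch, assuming the fields are separated by a comma plus one space as in the sample; the inline printf input and the result variable are only there for illustration:

```shell
# Sample rows fed through a pipe (assumed layout: comma plus one space between fields)
result=$(printf '%s\n' 'Name, Age, City' 'John, 25, New York' 'Alice, 30, Los Angeles' |
  # -F', ' splits on comma+space, so no field keeps a leading space;
  # OFS=',' joins the printed fields with a comma instead of a space
  awk -F', ' -v OFS=',' '{print $1, $3}')
echo "$result"
```

This only handles simple files; quoted fields that contain commas need a real CSV parser.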
2. Summing Values in a Column
Summing up values in a column is a common task. Let’s say you have a file with numbers, and you want to find the total:
Input (data.txt):
10
20
30
Command:
awk '{sum+=$1} END {print sum}' data.txt
Output:
60
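Real files often carry a header or blank lines. One way to keep those out of the total is to add a pattern in front of the action. This is a sketch under the assumption that valid values are non-negative integers; the header word "value" is a made-up example:

```shell
# Input with a header line and a blank line mixed in (hypothetical sample data)
result=$(printf '%s\n' 'value' '10' '' '20' '30' |
  # Only lines whose first field is all digits are added;
  # "sum + 0" prints 0 instead of an empty line when nothing matched
  awk '$1 ~ /^[0-9]+$/ {sum += $1} END {print sum + 0}')
echo "$result"
```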
3. Searching for Specific Lines
Searching for lines that match a specific pattern can be handy. Suppose you have a file with various words, and you want to find lines containing the letter ‘a’:
Input (file.txt):
apple
banana
cherry
date
Command:
awk '/a/' file.txt
Output:
apple
banana
date
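A bare /a/ pattern is tested against the whole line. To test a single field instead, you can use the ~ operator. A small sketch with made-up two-column input:

```shell
# Keep only lines whose second field contains the letter a
result=$(printf '%s\n' 'apple red' 'cherry dark' 'plum black' |
  awk '$2 ~ /a/')
echo "$result"
```

Here "apple red" is dropped even though the line contains an a, because the match is limited to the second field.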
4. Calculating Column Averages
You can calculate the average of values in a column using awk. Suppose you have a file with grades, and you want to find the average:
Input (grades.txt):
90
85
78
92
Command:
awk '{sum+=$1} END {print "Average:", sum/NR}' grades.txt
Output:
Average: 86.25
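With an empty file, NR is 0 and the division fails. A hedged variant that checks NR first might look like this; the "No data" message and the variable names are illustrative choices:

```shell
# Same averaging logic, with a guard so empty input does not divide by zero
prog='{ sum += $1 } END { if (NR > 0) { print "Average:", sum / NR } else { print "No data" } }'

with_data=$(printf '%s\n' 90 85 78 92 | awk "$prog")
no_data=$(awk "$prog" </dev/null)

echo "$with_data"
echo "$no_data"
```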
5. Filtering Lines by Length
Sometimes, you need to filter lines based on their length. Let’s say you have a file with text, and you want lines longer than 20 characters:
Input (text.txt):
A short line.
This is a really long line that exceeds 20 characters.
Command:
awk 'length($0) > 20' text.txt
Output:
This is a really long line that exceeds 20 characters.
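When a length filter behaves unexpectedly, it can help to print each line’s length next to the line and check it against the threshold. A quick sketch with two made-up ASCII lines:

```shell
# Prefix every line with its character count
result=$(printf '%s\n' 'A short line.' 'Exactly twenty chars' |
  awk '{print length($0), $0}')
echo "$result"
```

The second line is exactly 20 characters long, so a "> 20" test would not print it.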
6. Formatting and Aligning Columns
Formatting columns for better readability is crucial. If you have a file with names and ages, you can align them like this:
Input (data.txt):
Alice 25
Bob 30
Carol 22
Command:
awk '{printf "%-10s %s\n", $1, $2}' data.txt
Output:
Alice      25
Bob        30
Carol      22
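The same printf approach can add a header row and right-align the numbers. A minimal sketch, assuming the ages are integers; the "Name" and "Age" header labels are not part of the original file:

```shell
# BEGIN prints a header once; %-10s left-aligns names, %3d right-aligns ages
result=$(printf '%s\n' 'Alice 25' 'Bob 30' 'Carol 22' |
  awk 'BEGIN {printf "%-10s %3s\n", "Name", "Age"}
       {printf "%-10s %3d\n", $1, $2}')
echo "$result"
```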
7. Counting Lines in a File
Counting lines in a file is a straightforward task. Let’s say you have a file with multiple lines:
Input (file.txt):
Line 1
Line 2
Line 3
Command:
awk 'END {print NR}' file.txt
Output:
3
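If blank lines should not count, you can use NF as the pattern, since it is 0 for empty or whitespace-only lines. A sketch with made-up input that mixes in two such lines:

```shell
# Count only lines that contain at least one field
result=$(printf '%s\n' 'Line 1' '' 'Line 2' '   ' 'Line 3' |
  awk 'NF {n++} END {print n + 0}')
echo "$result"
```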
8. Find and Replace Text
Searching and replacing text is a common operation. If you have a file with the text “Hello, World!” and you want to change “World” to “Universe,” you can do it like this:
Input (text.txt):
Hello, World!
Command:
awk '{gsub("World", "Universe")}1' text.txt
Output:
Hello, Universe!
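gsub also accepts a third argument that limits the replacement to one field. A small sketch, assuming whitespace-separated input; the sample sentence is made up:

```shell
# Replace World only in the second field; other fields are left alone
result=$(printf '%s\n' 'Hello World, goodbye World' |
  awk '{ gsub(/World/, "Universe", $2) } 1')
echo "$result"
```

Note that assigning to a field makes awk rebuild the line with single spaces between fields.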
9. Removing Duplicate Lines
Removing duplicate lines from a file can help clean up your data. Suppose you have a file with duplicate words:
Input (data.txt):
apple
banana
apple
cherry
Command:
awk '!seen[$0]++' data.txt
Output:
apple
banana
cherry
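The same idiom works with a single field as the key, which keeps the first line seen for each key. A sketch with hypothetical two-column data:

```shell
# Keep the first line for each distinct value of the first field
result=$(printf '%s\n' 'apple 1' 'banana 2' 'apple 3' |
  awk '!seen[$1]++')
echo "$result"
```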
10. Calculating Column-Wise Sum and Average
Lastly, you can calculate the sum and average of each column in a multi-column file. Suppose you have a file with two columns:
Input (numbers.txt):
10 20
5 15
8 25
Command:
awk '{for(i=1; i<=NF; i++) {sum[i]+=$i}} END {for(i=1; i<=NF; i++) {print "Column", i, "Sum:", sum[i], "Average:", sum[i]/NR}}' numbers.txt
Output:
Column 1 Sum: 23 Average: 7.66667
Column 2 Sum: 60 Average: 20
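The command above takes the column count from the last line and divides by the total number of lines, which assumes every row has the same number of fields. A possible variant for ragged input tracks the widest row and a per-column count; the cols and count names are illustrative:

```shell
# The second row has only one field, so column 2 is averaged over two values
result=$(printf '%s\n' '10 20' '5' '8 25' |
  awk '{
         if (NF > cols) cols = NF          # remember the widest row
         for (i = 1; i <= NF; i++) {
           sum[i] += $i                    # running total per column
           count[i]++                      # number of values seen per column
         }
       }
       END {
         for (i = 1; i <= cols; i++)
           print "Column", i, "Sum:", sum[i], "Average:", sum[i] / count[i]
       }')
echo "$result"
```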
These awk examples showcase its flexibility and power in text processing and data analysis. Whether you need to manipulate data, extract specific information, or perform calculations, awk is an invaluable tool in your shell toolkit.