
I/O, Pipes & Redirection

The pipe | is one of Unix's most powerful ideas: chain small tools together to build pipelines that would take far more code in a general-purpose language, with no intermediate files and very little overhead.

Learning Objectives

  • Understand the three standard streams: stdin, stdout, stderr
  • Redirect output to files with >, >>, and 2>
  • Connect commands with pipes |
  • Use tee to split output to a file and stdout simultaneously
  • Use process substitution <(command) for advanced pipelines

Standard Streams

Every process has three default I/O streams:

Stream   Number   Default
stdin    0        keyboard
stdout   1        terminal
stderr   2        terminal

echo "This goes to stdout"
echo "This goes to stderr" >&2
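
To see that the two output streams really are separate, the following sketch captures each one in its own file. The helper name emit and the temporary files are illustrative assumptions, not part of the lesson.

```shell
# Hypothetical helper: writes one line to each output stream.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

out_file=$(mktemp)
err_file=$(mktemp)

# Send fd 1 and fd 2 to different files.
emit > "$out_file" 2> "$err_file"

captured_out=$(cat "$out_file")
captured_err=$(cat "$err_file")
rm -f "$out_file" "$err_file"

echo "stdout captured: $captured_out"
echo "stderr captured: $captured_err"
```

Each file ends up holding only the line written to its stream, which is why a command's errors can be logged or discarded without touching its normal output.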

Redirection

Redirecting stdout

ls > filelist.txt           # overwrite (creates or truncates)
ls >> filelist.txt          # append
echo "hello" > /dev/null    # discard output

Redirecting stderr

ls /nonexistent 2> errors.txt        # redirect stderr to file
ls /nonexistent 2>> errors.txt       # append stderr
ls /nonexistent 2>/dev/null          # discard stderr

Redirecting Both stdout and stderr

command > output.txt 2>&1            # both to same file
command &> output.txt                # bash shorthand (same effect)
command > output.txt 2>/dev/null     # stdout to file, discard stderr

Order matters with 2>&1

Redirections are processed left to right. command > file 2>&1 is correct: it redirects stdout to the file, then points stderr at wherever stdout now goes (the file). command 2>&1 > file does not merge the streams: it first points stderr at the current stdout (the terminal), then redirects only stdout to the file, so errors still appear on the terminal.
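
The effect of the ordering can be checked by counting how many lines reach the file in each case. This is a minimal sketch; the helper both and the temporary log file are assumptions made for the demonstration.

```shell
# Hypothetical helper that writes one line to each stream.
both() {
    echo "out"
    echo "err" >&2
}

log=$(mktemp)

# stdout -> file first, then stderr -> same place as stdout.
both > "$log" 2>&1
merged_lines=$(wc -l < "$log")

# stderr is duplicated from the *old* stdout (the terminal),
# and only afterwards is stdout moved to the file.
both 2>&1 > "$log"
reversed_lines=$(wc -l < "$log")

rm -f "$log"
echo "correct order: $merged_lines line(s) in file"
echo "reversed order: $reversed_lines line(s) in file"
```

With the reversed order, the "err" line shows up on the terminal instead of in the file.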

Redirecting stdin

sort < names.txt             # sort reads from file instead of keyboard
while IFS= read -r line; do
    echo "$line"
done < data.txt
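
One way to see what stdin redirection changes is to compare it with passing a filename as an argument. The sketch below uses a throwaway file with made-up names; with < the shell opens the file, so the command only sees an anonymous stream.

```shell
names=$(mktemp)
printf '%s\n' carol alice bob > "$names"

# With a filename argument, wc opens the file itself and prints its name.
wc -l "$names"

# With stdin redirection, wc never learns the filename,
# so it prints only the count.
count=$(wc -l < "$names")
echo "lines: $count"

# sort reads the same file through stdin.
sorted=$(sort < "$names")
echo "$sorted"

rm -f "$names"
```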

Pipes

A pipe | connects the stdout of one command to the stdin of the next:

ls -la | grep '\.sh$'        # list files, show only names ending in .sh
cat /etc/passwd | wc -l      # count lines in passwd
ps aux | grep '[n]ginx'      # find nginx processes ([n] stops grep matching itself)

Building Pipelines

# Top 5 most-used commands in your history
history | awk '{print $2}' | sort | uniq -c | sort -rn | head -5
    142 git
     98 ls
     74 cd
     51 vim
     43 cat

# Find all running processes owned by your user, sorted by memory
ps aux | grep "^$USER" | sort -k4 -rn | head -10

Pipes create subshells

Each command in a pipeline runs in a subshell. Variables set inside a pipeline are not visible in the parent shell. If you need to use a value computed in a pipeline, capture it: result=$(command1 | command2).
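
The subshell behaviour is easiest to see with a counter. This sketch assumes bash in its default configuration (the lastpipe option is off); other shells such as zsh may behave differently.

```shell
# Pipeline: the while loop runs in a subshell, so its counter is lost.
count=0
printf 'a\nb\nc\n' | while IFS= read -r line; do
    count=$((count + 1))
done
piped_count=$count          # still 0 in the parent shell

# Redirection: the loop runs in the current shell, so the counter survives.
count=0
while IFS= read -r line; do
    count=$((count + 1))
done < <(printf 'a\nb\nc\n')
redirected_count=$count     # 3

echo "pipeline: $piped_count, redirection: $redirected_count"
```

Feeding the loop with done < <(command) keeps the data flowing from a command while avoiding the subshell around the loop body.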


tee — Write to File and stdout Simultaneously

command | tee output.txt             # write to file AND show on terminal
command | tee -a output.txt          # append to file, also show on terminal
command | tee output.txt | wc -l     # write to file, count lines in pipeline
# Log script output while still seeing it on the terminal
./deploy.sh | tee "deploy_$(date +%Y%m%d).log"
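
One thing to watch when logging through tee: by default a pipeline reports the exit status of its last command, so a failing script can look successful. The sketch below uses a hypothetical deploy function as a stand-in for ./deploy.sh and relies on bash's PIPESTATUS array.

```shell
# Hypothetical stand-in for a deploy script that fails.
deploy() {
    echo "step 1 ok"
    echo "step 2 failed" >&2
    return 3
}

log=$(mktemp)

# Without pipefail, $? is tee's status (0), which hides the failure.
deploy 2>&1 | tee "$log"
pipeline_status=$?

# PIPESTATUS (bash) records the exit status of every stage.
deploy 2>&1 | tee "$log" > /dev/null
deploy_status=${PIPESTATUS[0]}

logged_lines=$(wc -l < "$log")
rm -f "$log"

echo "pipeline status: $pipeline_status, deploy status: $deploy_status"
```

Adding 2>&1 before the pipe also puts error messages in the log. Running with set -o pipefail is another option: the pipeline then fails if any stage fails.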

Here Documents

Pass multi-line input to a command:

cat << EOF
This is line 1
This is line 2
Variables expand: $HOME
EOF
This is line 1
This is line 2
Variables expand: /home/user

# Single quotes on the delimiter prevent expansion
cat << 'EOF'
This is $literal
No $(expansion) here
EOF
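
A here document only supplies stdin, so it combines with output redirection like any other command. As a small sketch, the following writes a generated config file; the file contents and the port variable are made up for illustration.

```shell
conf=$(mktemp)
port=8080            # illustrative value

# stdin comes from the here document, stdout goes to the file.
cat > "$conf" << EOF
# generated on $(date +%Y-%m-%d)
listen_port=$port
EOF

port_line=$(grep '^listen_port=' "$conf")
rm -f "$conf"

echo "$port_line"
```

Because the delimiter is unquoted, both $port and $(date ...) are expanded before cat sees the text.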

Process Substitution

<(command) runs a command and presents its output as a file-like input:

diff <(sort file1.txt) <(sort file2.txt)    # compare sorted versions
comm <(sort file1.txt) <(sort file2.txt)    # find common/unique lines
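
Here is a self-contained version of the comparison above, using two small throwaway files (their contents are invented for the example). Process substitution requires bash, zsh, or ksh; it is not available in plain POSIX sh.

```shell
old=$(mktemp)
new=$(mktemp)
printf '%s\n' pear apple fig > "$old"
printf '%s\n' fig kiwi apple > "$new"

# Lines only in the first file (-23 suppresses columns 2 and 3).
only_old=$(comm -23 <(sort "$old") <(sort "$new"))

# Lines present in both files (-12 suppresses columns 1 and 2).
in_both=$(comm -12 <(sort "$old") <(sort "$new"))

# The shell replaces <(...) with a readable path, typically under /dev/fd.
echo <(true)

rm -f "$old" "$new"
echo "only in old: $only_old"
echo "in both: $in_both"
```

No sorted temporary copies are needed: each <(sort ...) is handed to comm as if it were a file name.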

Common Mistakes

Useless use of cat

cat file | command is equivalent to command < file. The cat is unnecessary and slightly slower. Use command < file or command file when possible.

Swallowing stderr in pipelines

In command1 | command2, only command1's stdout goes to command2. Errors from command1 still go to the terminal. To capture both: command1 2>&1 | command2.
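
The difference can be measured by counting the lines that reach the second command. This is a sketch with a hypothetical noisy function that writes one line to each stream.

```shell
# Hypothetical command that writes one line to each stream.
noisy() {
    echo "data"
    echo "warning" >&2
}

# Only stdout enters the pipe; "warning" still goes to the terminal.
stdout_only=$(noisy | wc -l)

# 2>&1 before the pipe merges stderr into the piped stream.
merged=$(noisy 2>&1 | wc -l)

echo "without 2>&1: $stdout_only line(s) piped"
echo "with 2>&1: $merged line(s) piped"
```

In bash 4 and later, command1 |& command2 is shorthand for command1 2>&1 | command2.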


Practice Exercises

Warm-Up (run and observe)

  1. Run echo "hello" > /tmp/test.txt then cat /tmp/test.txt. Run it again — does the file grow or stay the same size?
  2. Run ls /nonexistent 2>/dev/null. What do you see? Where did the error message go?
  3. Run cat /etc/passwd | head -5 then head -5 /etc/passwd. Are they equivalent? Which is more efficient?

Main (write a short script)

Create ~/scripts/monitor.sh that logs system metrics to a file and the terminal:

#!/usr/bin/env bash
set -euo pipefail

LOGFILE="/tmp/monitor_$(date +%Y%m%d_%H%M%S).log"

{
    echo "=== System Report: $(date) ==="
    echo "Hostname: $(hostname)"
    echo "Uptime: $(uptime -p)"
    echo ""
    echo "=== Disk Usage ==="
    df -h
    echo ""
    echo "=== Memory ==="
    free -h
    echo ""
    echo "=== Top 5 Processes by CPU ==="
    ps aux --sort=-%cpu | head -6
} | tee "$LOGFILE"

echo ""
echo "Report saved to: $LOGFILE"

Stretch

  1. Use process substitution to find files that are in dir1 but not in dir2: comm -23 <(ls dir1 | sort) <(ls dir2 | sort).
  2. Write a pipeline that finds the 3 largest files in /var/log, showing only filename and size.
  3. What does exec > logfile.txt 2>&1 do at the start of a script? How would you restore stdout after using it?

Interview Questions

  1. What is the difference between > and >>?

> overwrites — it truncates the file to zero length before writing. >> appends — it adds new content at the end of the existing file. If the file does not exist, both create it.

  2. Why is a variable assigned inside a pipeline, as in command | while read -r line; do var=$line; done, not visible after the pipeline finishes?

Each command in a pipeline runs in a subshell, so an assignment made there is lost when the subshell exits. Capture the pipeline's output in the parent shell instead: var=$(command | grep pattern) works because the assignment happens in the parent shell and receives the final stdout of the pipeline.

  3. What does 2>&1 mean?

"Redirect file descriptor 2 (stderr) to wherever file descriptor 1 (stdout) is currently pointing." The &1 means "the current destination of fd 1," not the literal string 1. It is used to merge stderr into stdout so both go to the same place.

