Shell Data Processing - Sed (Stream editor)


About

sed stands for stream editor.

It is a filter program used for filtering and transforming text in a stream.

It is part of the GNU utilities (GNU sed).

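Sed is line-based: it reads its input one line at a time and strips the trailing newline before applying the script. It is therefore hard to match newlines or to manipulate end-of-line (EOL) characters with it.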

To convert end-of-line characters, use a dedicated utility such as dos2unix.
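
The line-based behavior can be seen with a small sketch. The two-line sample input is made up for illustration. Because the newline is stripped before the script runs, a substitution on \n finds nothing to replace and the input comes out unchanged:

```shell
# Each line reaches the script without its trailing newline,
# so s/\n/,/ never matches and both lines are printed as they came in.
unchanged=$(printf 'a\nb\n' | sed 's/\n/,/')
printf '%s\n' "$unchanged"
```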

Syntax

# Default
sed 'expression1;...;expressionN' inputFileName > outputFileName
# In place editing - No outputFileName needed
sed -i 'expression1;...;expressionN' inputFileName

where:

  • expression has the form command/regularExpression/modifier
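
As a minimal sketch of the syntax above, the following chains two expressions with a semicolon. The sample text and the replacement words are invented for the example:

```shell
# Two expressions separated by ';' are applied in order to every line.
result=$(printf 'red apple\n' | sed 's/red/green/;s/apple/pear/')
printf '%s\n' "$result"
```

The second expression sees the output of the first, so the order of the expressions matters.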

Script

Using a script file avoids problems with shell escaping or substitutions.

Example script.sed: a sed script file with one command per line and a shebang

#!/bin/sed -f
sedExpression1
sedExpression...
sedExpressionN

Run it:

  • with the f option
sed -f script.sed inputFileName > outputFileName
  • or directly, after making the script executable (the shebang calls sed -f)
chmod u+x script.sed
./script.sed inputFileName > outputFileName
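
A self-contained sketch of the script approach follows. The file names, the temporary directory, and the two substitutions are assumptions made for the example. It only uses the -f form, so it does not depend on sed being installed at /bin/sed:

```shell
# Work in a throwaway directory so no existing file is touched.
workdir=$(mktemp -d)

# Sample input (hypothetical content).
printf 'hello world\n' > "$workdir/input.txt"

# One sed command per line; the shebang line is read as a comment by sed -f.
cat > "$workdir/script.sed" <<'EOF'
#!/bin/sed -f
s/hello/goodbye/
s/world/moon/
EOF

# Run the script with the f option.
sed -f "$workdir/script.sed" "$workdir/input.txt" > "$workdir/output.txt"
cat "$workdir/output.txt"
```

Because the commands live in a file, they need no shell quoting at all.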

Command

Substitution

The substitution command (s) replaces a string.

# First occurrence on each line (default)
sed 's/searchString/replacementString/' inputFileName > outputFileName
# All occurrences, thanks to the g at the end
sed 's/searchString/replacementString/g' inputFileName > outputFileName
# In place editing - No outputFileName needed
sed -i 's/searchString/replacementString/g' inputFileName
# With backslash characters: replace each tab with an arrow and mark each end of line with a pilcrow (reverse p)
sed $'s/\t/→/g;s/$/¶/' inputFileName

where, in the expression 's/searchString/replacementString/':

  • s stands for “substitution”.
  • searchString: the search string, the text to find (a regular expression).
  • replacementString: the replacement string.
  • g stands for global (i.e. replace all occurrences on each line).
  • -i is an option to edit the file directly. No outputFileName is needed (a temporary output file is created in the background).
  • $'...' is the bash quoting format in which backslash characters such as \t are expanded by the shell.
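
The difference between the default behavior and the g flag can be sketched on a single line of made-up text:

```shell
line='one fish, two fish'

# Without g, only the first match on the line is replaced.
first=$(printf '%s\n' "$line" | sed 's/fish/cat/')

# With g, every match on the line is replaced.
all=$(printf '%s\n' "$line" | sed 's/fish/cat/g')

printf '%s\n%s\n' "$first" "$all"
```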

Delete

The d (delete) command deletes whole lines. To delete a word, substitute it with nothing.

# line
sed '/regularExpression/d' inputFileName
# word (substitute it with nothing)
sed 's/yourword//g' inputFileName

Example:

  • delete lines that are either blank or only contain spaces
sed '/^ *$/d' inputFileName
  • delete a word (i.e. substitute it with an empty string)
sed 's/yourword//g' inputFileName
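
Both deletions can be tried on inline sample data. The input lines and the word being removed are invented for the example:

```shell
# Delete lines that are empty or contain only spaces.
kept=$(printf 'first\n\n   \nsecond\n' | sed '/^ *$/d')
printf '%s\n' "$kept"

# Delete a word (and the space after it) by substituting it with nothing.
shortened=$(printf 'a very very long day\n' | sed 's/very //g')
printf '%s\n' "$shortened"
```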

Others

  • N appends the next line of input to the pattern space;
  • P prints the first line of the pattern space (up to the first newline);
  • D deletes the first line of the pattern space (up to the first newline) and runs the script again on what remains.
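
A well-known one-liner that combines the three commands collapses consecutive duplicate lines. It is shown here as a sketch on made-up input, not as the only way to do this:

```shell
# Collapse consecutive duplicate lines (similar in spirit to uniq).
#   $!N : unless on the last line, append the next input line
#   P   : print the first line, but only when the two lines differ
#   D   : delete the first line and rerun the script on the remainder
deduped=$(printf 'a\na\nb\nb\nb\nc\n' | sed '$!N; /^\(.*\)\n\1$/!P; D')
printf '%s\n' "$deduped"
```

The pattern space holds two lines at a time here, which is what lets the regular expression compare a line with the one that follows it.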

Flow

Flow of control can be managed by:

  • the use of a label (a colon followed by a string)
  • and the branch instruction b.

An instruction b followed by a valid label name will move processing to the block following that label.
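
A common use of a label and a branch is a loop that gathers the whole input into the pattern space. The sketch below joins all lines with a comma. The label name a and the sample input are arbitrary choices:

```shell
# :a   defines the label a
# N    appends the next input line to the pattern space
# $!ba branches back to a unless the last line has been read
# s    then replaces every embedded newline with a comma
joined=$(printf 'a\nb\nc\n' | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g')
printf '%s\n' "$joined"
```

Each command is passed with its own -e option so that the label name ends unambiguously.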
