This article is part of the article series "Bash One-Liners Explained."

This is the second part of the Bash One-Liners Explained article series. In this part I'll show you how to do various string manipulations with bash. I'll use only the best bash practices, various bash idioms and tricks. I want to illustrate how to get various tasks done with just bash built-in commands and bash programming language constructs.

See the first part of the series for introduction. After I'm done with the series I'll release an ebook (similar to my ebooks on awk, sed, and perl), and also bash1line.txt (similar to my perl1line.txt).

Also see my other articles about working fast in bash from 2007 and 2008.

Let's start.

Part II: Working With Strings

1. Generate the alphabet from a-z

$ echo {a..z}

This one-liner uses brace expansion. Brace expansion is a mechanism for generating arbitrary strings. This one-liner uses a sequence expression of the form {x..y}, where x and y are single characters. The sequence expression expands to each character lexicographically between x and y, inclusive.

If you run it, you get all the letters from a-z:

$ echo {a..z}
a b c d e f g h i j k l m n o p q r s t u v w x y z
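Sequence expressions are not limited to a-z. Here is a small sketch of a few other ranges. The variable names are made up for illustration, and the step form {x..y..incr} assumes bash 4 or later. Brace expansion doesn't happen inside a plain assignment, so the results are captured through command substitution:

```shell
# Sequence expressions also work with uppercase letters and integers.
upper=$(echo {A..E})         # ascending letters
countdown=$(echo {5..1})     # a descending numeric range
stepped=$(echo {1..10..3})   # bash 4+: every third number

echo "$upper"       # A B C D E
echo "$countdown"   # 5 4 3 2 1
echo "$stepped"     # 1 4 7 10
```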

2. Generate the alphabet from a-z without spaces between characters

$ printf "%c" {a..z}

This is an awesome bash trick that 99.99% of bash users don't know about. If you supply more arguments to printf than the format consumes, it reapplies the format until all the arguments are used up! printf as a loop! There is nothing more awesome than that!

In this one-liner the printf format is "%c", which means "a character", and the arguments are all the letters from a to z. So printf goes over the list, outputting one character after another, until it runs out of letters.

Here is the output if you run it:

abcdefghijklmnopqrstuvwxyz

This output is without a terminating newline because the format string was "%c" and it doesn't include \n. To have it newline terminated, just add $'\n' to the list of chars to print:

$ printf "%c" {a..z} $'\n'

$'\n' is the idiomatic bash way to represent a newline character. printf then just prints the chars a to z, followed by the newline character.

Another way to add a trailing newline character is to echo the output of printf:

$ echo $(printf "%c" {a..z})

This one-liner uses command substitution, which runs printf "%c" {a..z} and replaces the command with its output. Then echo prints this output and adds a newline itself.

Want to output all letters in a column instead? Add a newline after each character!

$ printf "%c\n" {a..z}

Output:

a
b
...
z

Want to put the output from printf in a variable quickly? Use the -v argument:

$ printf -v alphabet "%c" {a..z}

This puts abcdefghijklmnopqrstuvwxyz in the $alphabet variable.
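The looping behaviour also works when the format has more than one specifier: printf then consumes the arguments in groups. A minimal sketch, where the variable name pairs is just an example:

```shell
# The format has two %s specifiers, so each pass consumes two arguments.
# -v stores the result in a variable instead of printing it.
printf -v pairs "%s=%s;" a 1 b 2 c 3
echo "$pairs"   # a=1;b=2;c=3;
```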

Similarly you can generate a list of numbers. Let's say from 1 to 100:

$ echo {1..100}

Output:

1 2 3 ... 100

Alternatively, if you forget this method, you can use the external seq utility to generate a sequence of numbers:

$ seq 1 100

3. Pad numbers 0 to 9 with a leading zero

$ printf "%02d " {0..9}

Here we use the looping abilities of printf again. This time the format is "%02d ", which means "zero pad the integer up to two positions", and the items to loop through are the numbers 0-9, generated by the brace expansion (as explained in the previous one-liner).

Output:

00 01 02 03 04 05 06 07 08 09

If you use bash 4, you can do the same with the new feature of brace expansion:

$ echo {00..09}

Older bashes don't have this feature.
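If you only need to pad a single number, printf -v does it without a loop. One thing to watch out for, sketched below with made-up variable names: bash arithmetic treats a number with a leading zero as octal, so force base 10 with the 10# prefix before doing math on a padded value.

```shell
n=7
printf -v padded "%03d" "$n"   # zero pad to three positions
echo "$padded"                 # 007

# A leading zero means octal in arithmetic (08 and 09 would be errors),
# so prefix the value with 10# to force base 10.
next=$((10#$padded + 1))
echo "$next"                   # 8
```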

4. Produce 30 English words

$ echo {w,t,}h{e{n{,ce{,forth}},re{,in,fore,with{,al}}},ither,at}

This is an abuse of brace expansion. Just look at what this produces:

when whence whenceforth where wherein wherefore wherewith wherewithal whither what then thence thenceforth there therein therefore therewith therewithal thither that hen hence henceforth here herein herefore herewith herewithal hither hat

Crazy awesome!

Here is how it works - you can produce all combinations of words/symbols with brace expansion. For example, if you do this,

$ echo {a,b,c}{1,2,3}

It will produce the result a1 a2 a3 b1 b2 b3 c1 c2 c3. It takes the first a, and combines it with {1,2,3}, producing a1 a2 a3. Then it takes b and combines it with {1,2,3}, and then it does the same for c.

So this one-liner is just a smart combination of braces that when expanded produce all these English words!
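A more everyday use of the same mechanism is generating file names. This is only a sketch and the names are invented for the example:

```shell
# Two brace lists side by side give every combination, left list first.
names=$(echo file{1,2}.{txt,log})
echo "$names"    # file1.txt file1.log file2.txt file2.log

# Braces can nest: the inner list expands in place of its element.
nested=$(echo {a,b{1,2}}x)
echo "$nested"   # ax b1x b2x
```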

5. Produce 10 copies of the same string

$ echo foo{,,,,,,,,,}

This one-liner uses the brace expansion again. The nine commas inside the braces separate 10 empty strings, and foo gets combined with each of them, so the output is 10 copies of foo:

foo foo foo foo foo foo foo foo foo foo
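Counting commas by hand is error-prone (N commas give N+1 copies), so here is an alternative sketch that combines the printf loop from one-liner #2 with a numeric range. It relies on "%.0s", which consumes an argument but prints zero characters of it. The variable name copies is just for this example:

```shell
# One pass of the format per argument; each pass prints the literal "foo ".
printf -v copies 'foo %.0s' {1..10}
copies=${copies% }   # drop the trailing space
echo "$copies"
```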

6. Join two strings

$ echo "$x$y"

This one-liner simply concatenates two variables together. If the variable x contains foo and y contains bar then the result is foobar.

Notice that "$x$y" is quoted. Without the quotes, bash performs word splitting on the expanded value before echo ever sees it. If the result splits into several words and the first one looks like an option, echo treats it as an option rather than as text to print. Here $x$y expands to -n foo, which is split into the two arguments -n and foo, so echo takes -n as its "no trailing newline" switch:

x=-n
y=" foo"
echo $x$y

Output:

foo

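Versus the correct way, where echo receives the single argument "-n foo", which is not a valid option and is therefore printed as-is: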

x=-n
y=" foo"
echo "$x$y"

Output:

-n foo

If you need to put the two joined strings in a variable, you can omit the quotes:

var=$x$y
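To see what the quotes change, you can count the arguments a command actually receives. In this sketch count_args is a hypothetical helper written just for the demonstration, and the default IFS is assumed:

```shell
count_args() { echo "$#"; }   # hypothetical helper: prints its argument count

x=-n
y=" foo"
unquoted=$(count_args $x$y)     # word splitting produces two words: -n and foo
quoted=$(count_args "$x$y")     # quoting keeps it as one word: "-n foo"
echo "unquoted: $unquoted, quoted: $quoted"

# With printf the data comes after the format, so it is never parsed as an option.
printf -v safe '%s' "$x$y"
echo "[$safe]"
```

If the string can be completely arbitrary (for example just -n on its own), printf '%s\n' is the safer way to print it.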

7. Split a string on a given character

Let's say you have a string foo-bar-baz in the variable $str and you wish to split it on the dash and iterate over it. You can simply combine IFS with read to do it:

$ IFS=- read -r x y z <<< "$str"

Here we use the read command, which reads a line from stdin and puts the data in the variables x, y and z. We set IFS to - because read uses this variable for field splitting. Since the assignment is written as a prefix to the command, IFS is changed only for this one read. If multiple variable names are given to read, IFS is used to split the line of input so that each variable gets a single field of the input.

In this one-liner $x gets foo, $y gets bar, $z gets baz.

Also notice the use of <<< operator. This is the here-string operator that allows strings to be passed to stdin of commands easily. In this case string $str is passed as stdin to read.

You can also put the split fields in an array:

$ IFS=- read -ra parts <<< "foo-bar-baz"

The -a argument to read makes it put the split words in the given array, in this case parts. You can access the array elements through ${parts[0]}, ${parts[1]}, and ${parts[2]}, or all of them at once through ${parts[@]}.
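Since the goal was to iterate over the fields, here is a short sketch of the array version with a loop. The variable names are the same example names used above:

```shell
str="foo-bar-baz"
IFS=- read -ra parts <<< "$str"   # IFS is changed for this read only

echo "number of fields: ${#parts[@]}"
for part in "${parts[@]}"; do     # quote the expansion to keep fields intact
    echo "field: $part"
done
```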

8. Process a string character by character

$ while IFS= read -rn1 c; do
    # do something with $c
done <<< "$str"

Here we use the -n1 argument to the read command to make it read the input one character at a time. Similarly, we can use -n2 to read two chars at a time, etc.
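One caveat with the loop above: the here-string appends a newline to $str, so as far as I can tell the loop body runs one extra time with an empty $c at the end. If that matters, an alternative sketch is to index into the string with substring expansion (covered in one-liner #13). The array chars is only there to collect the result for the example:

```shell
str="abc"
chars=()
for ((i = 0; i < ${#str}; i++)); do
    c=${str:i:1}      # one character starting at index i
    chars+=("$c")     # do something with $c
done
echo "${#chars[@]} characters: ${chars[*]}"
```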

9. Replace "foo" with "bar" in a string

$ echo ${str/foo/bar}

This one-liner uses parameter expansion of the form ${var/find/replace}. It finds the first occurrence of the pattern find in var and replaces it with replace. Really simple!

To replace all occurrences of "foo" with "bar", use the ${var//find/replace} form:

$ echo ${str//foo/bar}
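Here is a side-by-side sketch of the two forms, together with the anchored variants ${var/#find/replace} (match only at the start) and ${var/%find/replace} (match only at the end). The variable names are just for the example:

```shell
str="foo foo foo"
first=${str/foo/bar}    # first occurrence only
all=${str//foo/bar}     # every occurrence
start=${str/#foo/bar}   # only if the string starts with foo
end=${str/%foo/bar}     # only if the string ends with foo

echo "$first"   # bar foo foo
echo "$all"     # bar bar bar
echo "$start"   # bar foo foo
echo "$end"     # foo foo bar
```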

10. Check if a string matches a pattern

$ if [[ $file = *.zip ]]; then
    # do something
fi

Here the one-liner does something if $file matches *.zip. This is simple glob pattern matching, and you can use the symbols *, ? and [...] in the pattern. The * matches any string (including the empty string), ? matches any single char, and [...] matches any one of the characters listed in ... or in a character class.

Here is another example that matches if $answer starts with Y or y:

$ if [[ $answer = [Yy]* ]]; then
    # do something
fi
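When you need to test one string against several patterns, the case statement uses the same glob syntax. A minimal sketch, where classify is a hypothetical function invented for this example:

```shell
classify() {   # hypothetical helper: describes its first argument
    case $1 in
        *.zip)            echo "zip archive" ;;
        *.tar.gz | *.tgz) echo "gzipped tarball" ;;
        [Yy]*)            echo "yes-like answer" ;;
        *)                echo "something else" ;;
    esac
}

classify backup.zip
classify Yes
classify notes.txt
```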

11. Check if a string matches a regular expression

$ if [[ $str =~ [0-9]+\.[0-9]+ ]]; then
    # do something
fi

This one-liner tests if the string $str matches the regex [0-9]+\.[0-9]+, which means a number, followed by a dot, followed by another number. The regex is not anchored, so it is enough for it to match somewhere inside $str. The format for regular expressions is described in man 7 regex.
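If the whole string has to be a number of that shape, anchor the regex with ^ and $. The sketch below also uses the BASH_REMATCH array, which holds the full match at index 0 and the parenthesized groups after it. Keeping the regex in a variable (unquoted inside [[ ]]) avoids escaping headaches. The variable names are just for the example:

```shell
re='^([0-9]+)\.([0-9]+)$'
str="3.14"
if [[ $str =~ $re ]]; then
    whole=${BASH_REMATCH[1]}   # first group: digits before the dot
    frac=${BASH_REMATCH[2]}    # second group: digits after the dot
    echo "integer part: $whole, fractional part: $frac"
fi

# Without anchors the regex matches anywhere inside the string.
[[ "version 3.14 beta" =~ [0-9]+\.[0-9]+ ]] && found=${BASH_REMATCH[0]}
echo "$found"   # 3.14
```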

12. Find the length of the string

$ echo ${#str}

Here we use parameter expansion ${#str} which returns the length of the string in variable str. Really simple.
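The same # prefix works on arrays, where it is easy to mix up the two meanings. A short sketch with example names:

```shell
str="hello world"
len=${#str}             # length of the string
arr=(apple kiwi fig)
count=${#arr[@]}        # number of elements in the array
first_len=${#arr[0]}    # length of the first element

echo "$len $count $first_len"   # 11 3 5
```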

13. Extract a substring from a string

$ str="hello world"
$ echo ${str:6}

This one-liner extracts world from hello world. It uses the substring expansion. In general substring expansion looks like ${var:offset:length}, and it extracts length characters from var starting at index offset (counting from 0). In our one-liner we omit the length, which makes it extract all characters from offset 6 to the end of the string.

Here is another example:

$ echo ${str:7:2}

Output:

or
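The offset can also be negative, which counts from the end of the string. A small sketch with example variable names. Note the space before the minus sign: without it, ${str:-5} would be parsed as the "use a default value" expansion instead.

```shell
str="hello world"
head=${str:0:5}   # first five characters
tail=${str: -5}   # last five characters (the space before - is required)

echo "$head"   # hello
echo "$tail"   # world
```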

14. Uppercase a string

$ declare -u var
$ var="foo bar"

The declare command in bash declares variables and/or gives them attributes. In this case we give the variable var attribute -u, which upper-cases its content whenever it gets assigned something. Now if you echo it, the contents will be upper-cased:

$ echo $var
FOO BAR

Note that the -u argument was introduced in bash 4. Similarly you can use another feature of bash 4, which is the ${var^^} parameter expansion that upper-cases the string in var:

$ str="zoo raw"
$ echo ${str^^}

Output:

ZOO RAW
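A single ^ upper-cases only the first character. Applied to an array with [*] or [@], the expansion is performed on each element in turn, which gives a quick way to capitalize every word. A sketch assuming bash 4, with example variable names:

```shell
str="zoo raw"
capitalized=${str^}      # first character only
echo "$capitalized"      # Zoo raw

words=($str)             # split on whitespace into an array
title="${words[*]^}"     # ^ is applied to each element, then joined with spaces
echo "$title"            # Zoo Raw
```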

15. Lowercase a string

$ declare -l var
$ var="FOO BAR"

Similar to the previous one-liner, the -l argument to declare sets the lower-case attribute on var, which makes its content always lower-case:

$ echo $var
foo bar

The -l argument is also available only in bash 4 and later.

Another way to lowercase a string is to use ${var,,} parameter expansion:

$ str="ZOO RAW"
$ echo ${str,,}

Output:

zoo raw
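A single comma lower-cases only the first character. And if you're stuck on a bash older than 4, the external tr utility can stand in for ${var,,}. A sketch with example variable names:

```shell
str="ZOO RAW"
first_lower=${str,}    # bash 4+: first character only
echo "$first_lower"    # zOO RAW

# Fallback for older bashes, using the external tr utility.
lower=$(printf '%s' "$str" | tr '[:upper:]' '[:lower:]')
echo "$lower"          # zoo raw
```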

Enjoy!

Enjoy the article and let me know in the comments what you think about it! If you think that I forgot some interesting bash one-liners related to string operations, let me know in the comments also!


Comments

Simon Permalink
July 03, 2012, 04:03

Didn't know about the stuff in #10 and #11 (i.e. the [[...]] syntax). That could be useful...

Gaurav Permalink
July 03, 2012, 05:48

Hi,

its all are very useful and interesting. I would like to add two more things..... the cmnd#3 the same o/p can be produced by

seq -w 1 10 it will give 01 02 03 04 05 06 07 08 09 10, but it

will not work if we write seq -w 1 9

one more method to check whether a number is negative or not

there is one nice one liner.....

v=-76.67 ; if [ "${v/${v#?}}" = "-" ]; then echo "number is negative"; else echo "number is positive"; fi

PANDEESWARAN Permalink
July 03, 2012, 13:21

Hi,
What's the usage of the operator "<<<".
i have never come across this in bash context.

PANDEESWARAN Permalink
July 03, 2012, 13:42

i have checked the usage as "here-string operator"..

thanks

Edgars Permalink
July 06, 2012, 22:03

Regarding to uppercase/lowercase, you can also use tr, like:
echo 'UPPERCASESTRING' |tr '[:upper:]' '[:lower:]'

and vice-versa. tr is actually a quite powerful tool when it comes to string manipulations.

Mikael Auno Permalink
July 12, 2012, 12:42

Your explanation as to why echo does not interpret -n as a flag/an option is incorrect.

Notice that "$x$y" were quoted. If we didn't quote it, echo would interpret the $x$y as regular arguments, and would first try to parse them to see if they contain command line switches. So if $x contains something beginning with -, it would be a command line argument rather than an argument to echo:

The quoting only ensures that -n foo is treated as one string and not split on the space. echo will still see the - and test whether it is a valid flag/options. It just happens to be that "n foo" is not a valid flag/option to echo, and so it is treated as a regular string.

The quoting is only known to Bash, which is in charge of parsing the command line into strings and doing variable substitution. Bash (generally, echo just happens to be a built in function in bash) knows nothing about what is regular argument to a command and what is a flag/an option.

Hopefully, the following examples will illustrate the differences:

$ ls
bar  bas  foo

$ ls --reverse
foo  bas  bar

$ ls "--reverse"
foo  bas  bar

$ ls --rev erse 
ls: cannot access erse: No such file or directory

$ ls "--rev erse"
ls: unrecognized option '--rev erse'
Try `ls --help' for more information.

Alyn R. Tiedtke Permalink
August 02, 2012, 10:08

Will we see a book of this series soon ?? Would make an excellent companion to your awk, sed and Perl books ?

August 13, 2012, 20:19

Definitely. 2-3 more months and this book will be out!

August 19, 2013, 20:43

Hi Peter, i am waiting for your book.

AltairIV Permalink
January 14, 2013, 09:47

The regex feature can also extract substrings with the BASH_REMATCH array, bash's implementation of backreferencing.

It's also usually better to store the pattern in a separate variable first, then you don't have to struggle to escape everything. Just be sure NOT to quote it inside the test brackets, or it will be treated as a literal string.

$ str='foo12345bar67890'
$ re='[^0-9]+([0-9]+)[^0-9]+([0-9]+)'

$ [[ $str =~ $re ]] && x=${BASH_REMATCH[1]} y=${BASH_REMATCH[2]}

$ echo "$x/$y"
12345/67890

AltairIV Permalink
January 14, 2013, 10:27

Oh, yeah. There's also another, poorly documented, parameter expansion that reverses the case, whatever it is.

$ str=FOObar
$ echo "${str~} ${str~~}"
fOObar fooBAR

Just like the others, a single tilde reverses only the first character, while two of them applies to the whole string.

Finally, all of the case change expansions can include an optional character or bracket list, which will restrict it to applying only if the characters in the string match.

$ echo "${str~F} ${str~[a-f]} ${str~~[bBfF]} ${str~~[^a-f]}"
fOObar FOObar fOOBar foobaR

centurian Permalink
February 07, 2013, 14:53

At #9 "replace a part of a string" there is a pitfall for the unaware: if we use variables instead of literals in place of the search string and their values contain the special character * then, we have results that depend of the setting of nullglob! See my answer there: http://stackoverflow.com/questions/525592/find-and-replace-inside-a-text-file-from-a-bash-command/14753895#14753895

Thanks.

scavenger Permalink
February 18, 2013, 09:26

thank you bash for running a subshell when piping, now we cannot read anymore multiple variables at the same time !

grep -w regexp file | read var1 var2 var3

there is no solution to replace this KSH functionality. The 'read <<<$(command)' solution is bourne and korn shell incompatible.

September 03, 2013, 14:07

Thanks for sharing useful information.

September 20, 2013, 16:19

Hi bloger, thanks for sharing useful information. May i know your book is released?

October 29, 2013, 09:43

In these article is explained how to do the easy way of working simple tricks..........

October 29, 2013, 13:12

Thanks for sharing great post. I am waiting for your book.

March 20, 2014, 02:26

Nice posts,...I feel really happy to have seen your webpage and look forward to so many more entertaining times reading here brokersforex :)

John Permalink
November 05, 2014, 21:15

Hi Peteris,

Thank you very much for all the effort you put into this awesome writeups!
Bash (4) is just amazing but how do I remember all those neat tricks? :)

Just two little things...

You're saying that:
The format for regular expressions is described in man 3 regex.
but it's in "man 7". :)

And I also found a little typo:
You can access array elements through ${parts[0]}, ${parts[1]}, and ${parts[0]}.
Should read
You can access array elements through ${parts[0]}, ${parts[1]}, and ${parts[2]}.

Thank you!
John
