Lately I’ve been working with a lot of static HTML files that contain many repeating text structures. In the past I’ve talked about editing multiple files with Emacs. That approach works very well when the number of files and of matches per file is manageable, or when you need to make sure every replacement is correct, since you have to confirm each match in each file by pressing y.
In other cases, like the one I had to solve, you can have 84,000 text files, each with more than 5 matches. In that case, doing it with Emacs wouldn’t save much time. It also helped that the pattern I was looking for was consistent, so I didn’t need to check every match.
So to do a quick recursive search and replace across multiple files, another “old school” tool comes in very handy.
GNU Sed
Quoting from the GNU Sed project page:
Sed (streams editor) isn’t really a true text editor or text processor. Instead, it is used to filter text, i.e., it takes text input and performs some operation (or set of operations) on it and outputs the modified text. Sed is typically used for extracting part of a file using pattern matching or substituting multiple occurrences of a string within a file.
To tell sed to do a search and replace on some given text, the syntax is the following:
sed -i -e 's/regex/text/g' filename
The -i switch makes sed edit the file in place: instead of printing its results to standard output, it overwrites the file with them. (The -n switch, which is easy to confuse with it, only suppresses sed’s automatic printing and never modifies the file.) The -e switch specifies that the following string is a command to perform on the file. The regex part is the regular expression to search for in your text. The text part is what each match is replaced with, and the trailing g makes the replacement apply to every match on a line rather than only the first.
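To see the substitution at work without risking real files, here is a minimal sketch on a throwaway file. The file name and its contents are made up for illustration, and -i without a backup suffix assumes GNU sed (BSD/macOS sed expects an argument after -i).

```shell
# Create a scratch directory so nothing real is touched (hypothetical sample data).
workdir=$(mktemp -d)
printf '<p>old text</p>\n<p>old text again, old text</p>\n' > "$workdir/sample.html"

# Replace every occurrence of "old text" with "new text", editing the file in place.
sed -i -e 's/old text/new text/g' "$workdir/sample.html"

# Show the modified file.
cat "$workdir/sample.html"
```

Without the trailing g, only the first match on each line would be replaced, so the second line would still contain one "old text".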
So sed receives a stream of text as input, performs some operations on it and outputs the results. Seen this way, the natural way to use it is from bash, combined with other tools through pipes.
The find tool will help us get a list of all the files that we need to hand to sed. In the same way we used find from within Emacs, we can call it from bash:
$ find path/to/folder -iname "filenamepattern"
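It can be worth checking what find actually matches before doing anything with the files. The sketch below uses a scratch directory with made-up file names; the names and the count variable are only illustrative.

```shell
# Scratch directory with hypothetical files; nothing real is touched.
folder=$(mktemp -d)
touch "$folder/index.html" "$folder/ABOUT.HTML" "$folder/notes.txt"

# -iname matches case-insensitively, so both .html and .HTML are listed.
find "$folder" -iname "*.html"

# Count the matches to sanity-check the pattern before editing anything.
count=$(find "$folder" -iname "*.html" | wc -l)
echo "matching files: $count"
```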
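So find and sed can be combined in the following way. Note that piping find straight into sed would make sed process the list of file names rather than the files themselves, so xargs sits in between and passes the names to sed as arguments. The -i switch makes sed edit each file in place, and -print0 together with -0 keeps file names containing spaces intact:
$ find myprojectfolder -iname "*.html" -print0 | xargs -0 sed -i -e 's/searchregex/replacementtext/g'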
As easy as that, and you have edited 84,000 files with one single line of bash.
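Before running something like this across tens of thousands of files, it may be worth rehearsing on a scratch copy. The sketch below assumes GNU find, xargs, grep and sed. The directory layout, the file names, the search text and the .bak backup suffix are all illustrative choices, not part of the original project.

```shell
# Build a small scratch tree (hypothetical files standing in for the real project).
project=$(mktemp -d)
mkdir -p "$project/sub dir"
printf '<h1>Old Title</h1>\n' > "$project/index.html"
printf '<h1>Old Title</h1>\n<p>Old Title</p>\n' > "$project/sub dir/page one.html"
printf 'Old Title\n' > "$project/notes.txt"

# Preview: list the .html files that contain the pattern before changing anything.
grep -rl --include='*.html' 'Old Title' "$project"

# Replace in every .html file; -i.bak keeps a backup copy next to each edited file.
# -print0 / -0 keep file names with spaces intact.
find "$project" -iname '*.html' -print0 | xargs -0 sed -i.bak -e 's/Old Title/New Title/g'

# Count the .html files that still contain the old text.
remaining=$(grep -rl --include='*.html' 'Old Title' "$project" | wc -l)
echo "html files still matching: $remaining"
```

The backups make it easy to diff a few files and confirm the pattern did what you meant. Once you are satisfied, they can be removed with another find call.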
I hope it’s useful to someone. It has been very useful to me. If you have other methods or other sed tips, I’d like to hear about them in the comments.