Search and replace recursively in multiple files

Lately I’ve been working with a lot of static HTML files with lots of repeating text structures. In the past I’ve talked about editing multiple files with Emacs. That approach works very well when the number of files and of text matches in each file is manageable, since you need to confirm every match in each file by pressing “y”.

In other cases, like the one I had to solve, you can have 84,000 text files where each file can have more than 5 matches. In that case, doing it with Emacs wouldn’t save much time. For this kind of case, an “old” tool is very handy.

GNU Sed

Quoting from the GNU Sed project page, sed is:

Sed (streams editor) isn’t really a true text editor or text processor. Instead, it is used to filter text, i.e., it takes text input and performs some operation (or set of operations) on it and outputs the modified text. Sed is typically used for extracting part of a file using pattern matching or substituting multiple occurrences of a string within a file.

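To tell sed to do a search and replace on the text of a file, the syntax is the following: sed -i -e 's/regex/text/g' filename

The -i switch makes Sed edit the file in place: instead of writing its results to the standard output, it overwrites the file with them. (The -n switch, which is easy to confuse with it, only suppresses the automatic printing of each line and never writes anything back to the file.) The -e switch specifies that the following string is a command to perform on the file. The regex part is the regular expression to use for searching in your text. The text part is the text you want to replace each match with, and the trailing g makes Sed replace every match on a line instead of only the first one.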
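To see the difference between printing the result and rewriting the file, here is a small sketch you can run safely. It assumes GNU sed (on BSD or macOS the -i switch takes its argument differently), and the directory, file name and sample text are made up for the demonstration.

```shell
# Work in a throwaway directory so nothing real is touched.
demo_dir=$(mktemp -d)
printf 'foo bar foo\n' > "$demo_dir/sample.txt"

# Without -i, sed prints the modified text and leaves the file alone.
preview=$(sed -e 's/foo/baz/g' "$demo_dir/sample.txt")
echo "preview: $preview"

# With -i (GNU sed), the file itself is rewritten with the result.
sed -i -e 's/foo/baz/g' "$demo_dir/sample.txt"
after=$(cat "$demo_dir/sample.txt")
echo "file now contains: $after"
```

Running the command without -i first is a cheap way to check a regular expression before letting it touch any file.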

So Sed receives streams of text as input, performs some operations on them and outputs the results. Seen this way, it becomes obvious that the natural way to use it is from bash, combined with other commands.
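As a minimal illustration of Sed working as a filter, the sketch below feeds it two lines through a pipe and captures what comes out. The sample text and the variable name are invented for the example.

```shell
# sed reads from standard input when no file name is given,
# applies the substitution to each line, and prints the result.
result=$(printf 'hello world\nhello again\n' | sed -e 's/hello/goodbye/g')
echo "$result"
```

Nothing is written to disk here; the text only flows through the pipe.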

The find tool will help us get a list of all the files that we need to hand to sed. In the same way we used find from within Emacs, we can call it from bash: find path/to/folder -iname "filenamepattern"
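Here is a small sketch of what find returns with -iname, which matches the file name without regard to case and descends into subfolders. The folder layout and file names are hypothetical, created only for the demonstration, and -type f is added so that only regular files are listed.

```shell
# Build a tiny directory tree to search.
find_dir=$(mktemp -d)
mkdir -p "$find_dir/sub"
touch "$find_dir/index.html" "$find_dir/sub/ABOUT.HTML" "$find_dir/notes.txt"

# List every regular file whose name ends in .html, in any letter case.
find "$find_dir" -type f -iname "*.html" | sort

# Count the matches (tr strips the padding some wc versions add).
match_count=$(find "$find_dir" -type f -iname "*.html" | wc -l | tr -d ' ')
echo "matched files: $match_count"
```

Both index.html and sub/ABOUT.HTML are listed, while notes.txt is left out.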

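So a combination of find with sed can be used in the following way: find myprojectfolder -iname "*.html" -exec sed -i -e 's/searchregex/replacementtext/g' {} +

Note that find passes the file names to sed as arguments through -exec. Piping the output of find straight into sed would make sed process the list of file names as its text, not the contents of the files, and without -i nothing would be written back to them.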

As easy as that, and you have edited 84,000 files with one single line of bash.
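Before running a one-liner like this on thousands of files, it may be worth rehearsing it on a copy. The following is only a sketch under a few assumptions: GNU sed, a made-up project folder, and "old text" / "new text" as placeholder search and replacement strings. It first lists the files that would be affected, then does the replacement while keeping a .bak copy of each file.

```shell
# Build a small fake project to practice on.
proj=$(mktemp -d)
mkdir -p "$proj/a/b"
printf '<p>old text</p>\n<p>old text</p>\n' > "$proj/index.html"
printf '<h1>old text</h1>\n' > "$proj/a/b/page.HTML"
printf 'old text\n' > "$proj/readme.txt"

# Dry run: only list the HTML files that contain the pattern.
find "$proj" -type f -iname "*.html" -exec grep -l 'old text' {} +

# Real run: -i.bak edits in place and keeps the original as <name>.bak.
find "$proj" -type f -iname "*.html" \
    -exec sed -i.bak -e 's/old text/new text/g' {} +

# Check the outcome: no matches left in HTML files, other files untouched.
html_left=$(cat "$proj/index.html" "$proj/a/b/page.HTML" | grep -c 'old text' || true)
txt_untouched=$(cat "$proj/readme.txt")
echo "matches left in HTML files: $html_left"
echo "readme.txt still says: $txt_untouched"
```

Once the result looks right, the .bak files can be deleted with another find call, or the suffix can be dropped from -i to skip the backups altogether.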

I hope it’s useful to someone. It has been very useful to me. If you have other methods or other sed tips, I’d like to hear about them in the comments.

About Gabriel Saldaña

Web developer and free software advocate.