Categories
GNU/Linux Free Software & Open Source Tutorials & Tips

Quick search and replace recursively in multiple files

Lately I’ve been working with a lot of static HTML files with lots of repeating text structures. In the past I’ve talked about editing multiple files with Emacs. That approach works very well when the number of files and of matches in each file is manageable, or when you need to make sure every replacement is correct, since you have to confirm each match by pressing y in every file.

In other cases, like the one I had to solve, you can have 84,000 text files, each with more than 5 matches. In this case, doing it with Emacs wouldn’t save much time. It also helped that the pattern I was looking for was consistent, so I didn’t need to check every match.

So to do a quick search and replace recursively in multiple files, another “old school” tool comes in very handy.

GNU Sed

Quoting from the GNU Sed project page:

Sed (streams editor) isn’t really a true text editor or text processor. Instead, it is used to filter text, i.e., it takes text input and performs some operation (or set of operations) on it and outputs the modified text. Sed is typically used for extracting part of a file using pattern matching or substituting multiple occurrences of a string within a file.

To tell Sed to do a search and replace on a given file, the syntax is the following:

sed -i -e 's/regex/text/g' filename

The -i switch makes Sed edit the file in place, overwriting it with the results instead of printing them to the standard output. (The -n switch, by contrast, only suppresses Sed’s automatic printing; it does not write anything back to the file.) The -e switch specifies that the following string is a command to perform on the file. The regex part is the regular expression to search for in your text. The text part is what each match is replaced with, and the trailing g replaces every match on a line rather than only the first one.
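To see the difference between printing the result and editing in place, here is a minimal sketch on a throwaway file. It assumes GNU Sed (where -i takes no mandatory suffix); the file name and its contents are made up for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so nothing real gets modified.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
printf '%s\n' '<p>Copyright 2009</p>' '<p>2009 was a good year, 2009!</p>' > "$workdir/sample.html"

# Without -i, sed only prints the result; the file itself is untouched.
preview=$(sed -e 's/2009/2010/g' "$workdir/sample.html")
printf '%s\n' "$preview"

# With -i, sed rewrites the file in place.
sed -i -e 's/2009/2010/g' "$workdir/sample.html"
cat "$workdir/sample.html"
```

Running the command without -i first is a cheap way to check the pattern before committing to the change.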

So Sed receives a stream of text as input, performs some operations on it and outputs the results. Seen this way, it is clear that the natural way to use it is from bash, using pipes.
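As a quick illustration of the stream idea, this sketch feeds two made-up lines to Sed through a pipe. No file is involved at all: Sed reads the standard input and writes to the standard output.

```shell
#!/bin/sh
# With no file argument, sed reads standard input and prints the edited text.
result=$(printf '%s\n' 'hello world' 'goodbye world' | sed -e 's/world/sed/g')
printf '%s\n' "$result"
```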

The find tool will help us get the list of all the files that Sed needs to edit. In the same way we used find from within Emacs, we can call it from bash:

$ find path/to/folder -iname "filenamepattern"
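Here is a small sketch of what find returns, using a scratch directory with invented file names. The -type f test is my addition to skip directories; -iname matches the pattern case-insensitively and find descends into subfolders on its own.

```shell
#!/bin/sh
# Build a tiny scratch tree with mixed-case names and a non-HTML file.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir -p "$workdir/site/blog"
touch "$workdir/site/index.html" "$workdir/site/blog/POST.HTML" "$workdir/site/style.css"

# -iname ignores case, so POST.HTML matches too; style.css does not.
matches=$(find "$workdir/site" -type f -iname "*.html" | sort)
printf '%s\n' "$matches"
```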

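So find can be combined with Sed in the following way. Since find prints file names, not file contents, xargs is used to pass those names to Sed as arguments; -print0 and -0 keep names containing spaces intact:

$ find myprojectfolder -iname "*.html" -print0 | xargs -0 sed -i -e 's/searchregex/replacementtext/g'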

As easy as that, and you have edited 84,000 files with one single line of bash.
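With that many files, it can be worth checking first and keeping backups. The following is only a sketch under my own assumptions: the folder, file names and the 'Old Title' text are invented, and the .bak suffix given to -i asks Sed to keep a copy of each original file.

```shell
#!/bin/sh
# Scratch tree: two HTML files with matches and one text file that must stay untouched.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir -p "$workdir/site/sub"
printf '%s\n' '<h1>Old Title</h1>' > "$workdir/site/a.html"
printf '%s\n' '<p>Old Title again, Old Title</p>' > "$workdir/site/sub/b.html"
printf '%s\n' 'Old Title' > "$workdir/site/notes.txt"

# Count the HTML files that contain the pattern before changing anything.
to_change=$(find "$workdir/site" -type f -iname "*.html" -exec grep -l 'Old Title' {} + | wc -l | tr -d ' ')
echo "files to change: $to_change"

# Replace in place, leaving a .bak copy next to every edited file.
find "$workdir/site" -type f -iname "*.html" -print0 |
  xargs -0 sed -i.bak -e 's/Old Title/New Title/g'

cat "$workdir/site/a.html" "$workdir/site/sub/b.html"
```

Once the results look right, the backups can be removed with another find call matching "*.bak".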

Hope it’s useful to someone. It has been very useful to me. If you have other methods or other Sed tips, I’d like to hear about them in the comments.

Categories
Events GNU/Linux Free Software & Open Source personal

Going to DrupalCon 2010

Drupalcon SF 2010

So thanks to Justia, I’ve spent the last week in Mountain View, in what is known as the Silicon Valley area. Next week I’ll be attending the Drupal Conference, Drupalcon 2010, in San Francisco, California.

It’s been great working and hanging out in this area, especially when things in Mexico are not as easy these days. It’s good to have peace and tranquility in a nice safe neighborhood for a while.

I’m really looking forward to all the Drupal sessions. I’ve been away from Drupal development for a little more than 6 months now, so it’s going to be interesting to get back into the mindset. I’ve been doing a lot of scripting lately, and that has been good too because it took me back to the basics. I’ve also made things more interesting, or challenging, by going back to old tools like GNU Sed, GNU Awk and the like. I say they are old because nowadays I don’t see many people talking about using them.

Sometimes developers forget about all these great tools, or new programmers never learn them. A task that nowadays someone would normally write a whole script for has taken me one line of bash. This is a huge productivity boost, so I’ll be posting some of these old but often forgotten tools and tips I’ve picked up.

So if anyone else is going to attend DrupalCon in San Francisco, I’d be glad to say hi, maybe have a beer or two and hang around the town. I’ve heard it’s fascinating.