Categories
GNU/Linux Free Software & Open Source Programming & Web Development

Command line tools for web developers


Many people are afraid of the terminal. Yes, it might look scary to some and retro to others, but for the practical, busy programmer, the terminal is the best tool you can have.

Lately for my day job, I’ve been required to work with lots of static web pages, as I’ve mentioned in several of my previous posts. So for my daily tasks, I’ve been using a lot of command line tools that make my work a lot faster.

Here are some of the tools that I’ve been using and how I’ve used them:

  • find

    Helps me list and filter certain types of files for processing. For example: find . -name '*.html' This will give me a list of all files with the .html extension under the current directory and its subdirectories. The quotes keep the shell from expanding the pattern before find gets to see it.

  • sed

    GNU sed is very handy for all kinds of text manipulation without having to write a whole script for it. One common task is to search and replace a text or regular expression pattern in a file. Example: sed -e "s/My Search/My replace/g" myfile.html This prints the modified text to the standard output; with GNU sed, adding the -i switch edits the file in place instead.

  • xargs

    This is a ‘piping’ command: it takes the output of one tool and places it as arguments for the next tool in the line. Example: find . -name '*.temp' -print0 | xargs -0 sed -i -e "s/Hello/Goodbye/g" This will find all .temp files, then in each of them it will search for the word “Hello” and replace it with the word “Goodbye”. The -print0 and -0 switches go together so that file names containing spaces are handled correctly, and -i (GNU sed) makes sed write the changes back to each file.

  • tidy

    When you have a bunch of legacy HTML code or “messy” (X)HTML documents you must parse, a good idea is to first clean up the code before working with it. Tidy is a command line tool that will help you with the task of cleaning, reformatting and indenting any messy (X)HTML document. It even does a good job cleaning MS Word generated HTML files!

  • GNU make

    This is an “old school” tool for those who grew up with web development and away from C/C++ development. Make is used to automate tasks in a given order, checking for dependencies. In the web development process, I use make to automate repetitive tasks, such as deploying files to the testing server, making a tag in my version control system, publishing the site on the production server, cleaning up temporary files, and so on. So I write a Makefile with these tasks, and every time I have to upload my code to the testing server I only execute something like make upload and it will do the task. For example, cleaning up all temporary files in my project would involve me running: find . -name '*.temp' -print0 | xargs -0 rm -rf. I can create a Makefile with the following (the command line under the target must start with a tab character, not spaces):

    clean:
    	find . -name '*.temp' -print0 | xargs -0 rm -rf

    Then every time I need to clean up my codebase, I simply run make clean. Hope you get the idea 😉

  • git

    My preferred version control system for the past 4 years has been Git. It’s a distributed version control system that is very simple and very practical to use because it’s extremely fast and doesn’t get in your way while programming. It has lots of features and tools for everyday tasks, and it’s a very good practice to version control *all* your projects, even if you’re the only developer on them. Since it’s distributed, you don’t need to set up a server for it, and you can replicate your repository on any media and with as many copies as you like. Version controlling your projects will save you from troubles like accidentally deleting files, and by using local code branches you can easily experiment with new features without affecting the main “stable” version of your code. There’s a lot to say about version control and Git, and I guess I haven’t written about it before (strange, since it’s a big topic for me), so I’ll leave the details for another post. Just take my advice: use git and version control all your projects. You’ll thank me later.

  • rsync

    Rsync is a great tool to synchronize files and directories from one location to another. This can be on the same machine or on different (remote) machines. The typical use of rsync is for automated backups. You can use it that way, or you can use it to mirror your website to another folder or machine. I use it to deploy my files to the testing and/or production servers. This way I don’t have to worry about forgetting to upload a file: the whole project can be synchronized to multiple machines with one single command. You can configure rsync to connect through ssh (more on this below) to move your files around in a secure, encrypted file transfer.

  • ssh & scp

    You definitely don’t want your files to be going through the network in plain sight. I know, some might say: “who cares?” but really, it’s better to be paranoid and careful about your data. You never know. So the best way to transport your files from one machine to another is through a secure encrypted channel. This is what SSH does for you. With ssh you can connect securely to your server’s command line to execute commands there, or you can securely copy files from one machine to another using scp.
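To make the find and xargs entries from the list above concrete, here is a minimal sketch that runs on a throwaway directory. The folder and file names are invented for the demo, and it assumes a find and xargs that support -print0 and -0 (the GNU and BSD versions do).

```shell
#!/bin/sh
# Build a throwaway project tree (all names here are made up for the demo).
demo=$(mktemp -d)
mkdir -p "$demo/site/about"
touch "$demo/site/index.html" "$demo/site/about/team.html"
touch "$demo/site/cache.temp" "$demo/site/about/old draft.temp"

# Quote the pattern so the shell hands it to find untouched.
html_files=$(find "$demo" -type f -name '*.html' | sort)
echo "$html_files"

# -print0 and -0 separate names with NUL bytes, so "old draft.temp"
# reaches rm as one argument rather than two.
find "$demo" -type f -name '*.temp' -print0 | xargs -0 rm -f

# Count what is left to confirm the cleanup worked.
remaining_temp=$(find "$demo" -type f -name '*.temp' | wc -l | tr -d ' ')
echo "temp files left: $remaining_temp"
```

Before deleting anything for real, it can help to run the find part on its own first and read the list it prints.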

There might be several other tools that I use daily, but these are the ones most present in my mind, as I’ve been using them constantly.

What command line tools do you use for your web development tasks? Do you have other ideas on how the tools listed here can be used? Send me your comments; this might get interesting and useful for all of us.

Categories
GNU/Linux Free Software & Open Source Tutorials & Tips

Quick search and replace recursively in multiple files

Lately I’ve been working with a lot of static HTML files with lots of repeating text structures. In the past I’ve talked about editing multiple files with Emacs. This approach works very well when the number of files and text matches in each file is manageable, or when you need to make sure every match to replace is correct, since you have to confirm by pressing y on every text match in each file.

In other cases, like the one I had to solve, you can have 84,000 text files where each file can have more than 5 matches. In this case, doing it with Emacs wouldn’t save much time. It also helped that the pattern I was looking for was consistent, so I didn’t need to check every match.

So to do a quick search and replace recursively in multiple files, another “old school” tool comes in very handy.

GNU Sed

Quoting from the GNU Sed project page:

Sed (streams editor) isn’t really a true text editor or text processor. Instead, it is used to filter text, i.e., it takes text input and performs some operation (or set of operations) on it and outputs the modified text. Sed is typically used for extracting part of a file using pattern matching or substituting multiple occurrences of a string within a file.

To tell sed to do a search and replace on a given file, the syntax is the following:

sed -i -e 's/regex/text/g' filename

GNU sed’s -i switch makes Sed edit the file in place, overwriting it with the results instead of printing them to the standard output. The -n switch, by contrast, only suppresses the automatic printing: on its own it prints nothing and leaves the file untouched, so it is not what we want here. The -e switch specifies that the following string is a command to perform on the file. The regex part is the regular expression to use for searching in your text. The text part is the text you want to replace your search with.
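A quick way to check what each switch does is to try it on a scratch file. The sketch below uses a made-up page.html in a temporary directory, and it writes -i.bak rather than plain -i because the attached-suffix form keeps a backup and is accepted by both GNU and BSD sed.

```shell
#!/bin/sh
# Hypothetical sample file, created only for this demo.
demo=$(mktemp -d)
printf 'Hello world\nHello again\n' > "$demo/page.html"

# Default: the edited text goes to standard output, the file is untouched.
printed=$(sed -e 's/Hello/Goodbye/g' "$demo/page.html")
echo "$printed"

# With -n and no p flag, nothing is printed and nothing is written.
silent=$(sed -n -e 's/Hello/Goodbye/g' "$demo/page.html")

# -n plus the p flag prints only the lines where a substitution happened.
matched=$(printf 'Hello world\nno match here\n' | sed -n -e 's/Hello/Goodbye/gp')
echo "$matched"

# -i.bak edits the file in place and keeps the original as page.html.bak.
sed -i.bak -e 's/Hello/Goodbye/g' "$demo/page.html"
edited=$(cat "$demo/page.html")
backup=$(cat "$demo/page.html.bak")
```

Keeping the .bak copy costs a little disk space, but it gives you something to diff against if the regular expression matched more than you expected.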

So Sed receives streams of text as input, performs some operations on them and outputs the results. Seen this way, it is clear that the natural way to use it is from bash, combined with other tools through pipes.
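As a small illustration of sed acting as a filter in a pipe, the sketch below feeds it text from printf and from grep. The sample strings are invented for the demo.

```shell
#!/bin/sh
# sed reads standard input when no file name is given, so it slots into a pipe.
greeting=$(printf 'Hello web\nHello terminal\n' | sed -e 's/Hello/Goodbye/')
echo "$greeting"

# Filters chain: keep the lines mentioning "href", then strip the tags.
links=$(printf '<a href="a.html">A</a>\n<p>text</p>\n' | grep 'href' | sed -e 's/<[^>]*>//g')
echo "$links"
```

Note that in both cases sed edits the text that flows through the pipe, not any file on disk.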

The find tool will help us get a list of all the files that sed needs to process. In the same way we used find from within Emacs, we can call it from bash:

$ find path/to/folder -iname "filenamepattern"

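Piping that list straight into sed would only edit the file names themselves as text, so xargs is needed to turn the names into arguments for sed. A combination of find, xargs and sed can be used in the following way:

$ find myprojectfolder -iname "*.html" -print0 | xargs -0 sed -i -e 's/searchregex/replacementtext/g'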

As easy as that, and you have edited 84,000 files with one single line of bash.
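With that many files, a dry run and a backup are cheap insurance. The following is a cautious sketch of the same idea on a throwaway tree; the folder layout, the file names and the "Copyright 2011" pattern are all hypothetical, so swap in your own.

```shell
#!/bin/sh
# Hypothetical project tree for the demo; replace with your own folder and pattern.
project=$(mktemp -d)
mkdir -p "$project/docs/sub dir"
printf '<p>Copyright 2011</p>\n' > "$project/docs/index.html"
printf '<p>Copyright 2011</p>\n' > "$project/docs/sub dir/about.HTML"
printf 'Copyright 2011\n' > "$project/docs/notes.txt"

# Dry run: count the files that contain the pattern, without changing them.
to_change=$(find "$project" -type f -iname '*.html' -print0 |
  xargs -0 grep -l 'Copyright 2011' | wc -l | tr -d ' ')
echo "files to change: $to_change"

# Real run: edit in place, keeping each original next to it with a .bak suffix.
find "$project" -type f -iname '*.html' -print0 |
  xargs -0 sed -i.bak -e 's/Copyright 2011/Copyright 2012/g'

# Once the result looks right, the backups can be removed, for example:
# find "$project" -type f -name '*.bak' -exec rm -f {} +
```

The -iname test matches both .html and .HTML, while notes.txt is left alone even though it contains the same text.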

Hope it’s useful for someone. It has been very useful to me. If you have other methods or other sed tips, I’d like to know in the comments.