Categories
GNU/Linux Free Software & Open Source Tutorials & Tips

Quick search and replace recursively in multiple files

Lately I’ve been working with a lot of static HTML files with lots of repeating text structures. In the past I’ve talked about editing multiple files with Emacs. That approach works very well when the number of files and of matches in each file is manageable, or when you need to make sure every replacement is correct, since you have to confirm each match in each file by pressing y.

In other cases, like the one I had to solve, you can have 84,000 text files, each with more than 5 matches. In this case, doing it with Emacs wouldn’t save much time. It also helps that the pattern I was looking for was consistent, so I didn’t need to check every match.

So to do a quick search and replace recursively in multiple files, another “old school” tool comes in very handy.

GNU Sed

Quoting from the GNU Sed project page:

Sed (streams editor) isn’t really a true text editor or text processor. Instead, it is used to filter text, i.e., it takes text input and performs some operation (or set of operations) on it and outputs the modified text. Sed is typically used for extracting part of a file using pattern matching or substituting multiple occurrences of a string within a file.

The syntax for telling sed to do a search and replace on a given file is the following:

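sed -i -e 's/regex/text/g' filename

The -i switch makes GNU Sed edit the file in place: instead of printing its results to the standard output, it overwrites the file with them. (The -n switch, by contrast, only suppresses the automatic printing and leaves the file untouched.) The -e switch specifies that the following string is a command to perform on the file. The regex part is the regular expression to use for searching in your text. The text part is the text you want to replace each match with, and the trailing g makes sed replace every match on a line rather than only the first one.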

So Sed receives streams of text as input, performs some operations on them and outputs the results. Seen this way, it becomes obvious that the natural way to use it is through bash calls using pipes.
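As a minimal sketch of that filter behaviour, assuming GNU sed and a made-up sample line (the sample text and the old/new words are only illustrations), a substitution can be run on a stream coming through a pipe. Nothing is written to disk here; sed only outputs the modified text:

```shell
# Feed one sample line into sed through a pipe and capture what comes out.
# The trailing "g" replaces every match on the line, not just the first one.
result=$(printf '%s\n' '<b>old</b> and old again' | sed -e 's/old/new/g')

# Print the filtered text.
printf '%s\n' "$result"
```

Trying a pattern on a small sample like this is a cheap way to check it before pointing sed at real files.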

The find tool will help us get a list of all the files that we need to hand over to sed. In the same way we used find from within Emacs, we can call it from bash:

$ find path/to/folder -iname "filenamepattern"

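So find and sed can be combined in the following way. Note that the file names have to reach sed as arguments (here through xargs) rather than as piped text, which sed would treat as the content to edit, and that -i is the switch that makes GNU Sed edit the files in place:

$ find myprojectfolder -iname "*.html" -print0 | xargs -0 sed -i -e 's/searchregex/replacementtext/g'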

As easy as that, and you have edited 84,000 files with one single line of bash.
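Since an in-place edit of thousands of files is hard to undo, it may be worth rehearsing it first. The following is only a sketch under some assumptions: GNU sed (for -i without a suffix), a throwaway directory created with mktemp, and invented file names and patterns (2009 and 2010) that are not part of my real project:

```shell
# Work in a throwaway directory so no real files are touched.
demo=$(mktemp -d)
mkdir -p "$demo/site/sub"
printf '%s\n' '<p>Copyright 2009</p>' > "$demo/site/index.html"
printf '%s\n' '<p>Copyright 2009</p>' '<p>Since 2009</p>' > "$demo/site/sub/page.html"
printf '%s\n' 'Copyright 2009' > "$demo/site/notes.txt"

# Dry run: list the HTML files that contain the pattern before changing anything.
find "$demo/site" -iname "*.html" -print0 | xargs -0 grep -l '2009'

# Real run: GNU sed's -i rewrites each HTML file in place.
find "$demo/site" -iname "*.html" -print0 | xargs -0 sed -i -e 's/2009/2010/g'

# Count the lines that now carry the replacement in the HTML files.
changed=$(cat "$demo/site/index.html" "$demo/site/sub/page.html" | grep -c '2010')
echo "lines changed: $changed"
```

The -print0 and -0 pair keeps file names with spaces intact, and notes.txt is left alone because it does not match the "*.html" pattern.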

I hope it’s useful to someone. It has been very useful to me. If you have other methods or other sed tips, I’d like to hear about them in the comments.

Categories
Events GNU/Linux Free Software & Open Source personal

Going to DrupalCon 2010

So thanks to Justia, I’ve spent the last week in Mountain View, in what is known as the Silicon Valley area. Next week I’ll be attending the Drupal Conference, Drupalcon 2010, in San Francisco, California.

It’s been great working and hanging out in this area, especially since things in Mexico are not so easy these days. It’s good to have peace and tranquility in a nice, safe neighborhood for a while.

I’m really looking forward to all the Drupal sessions. I’ve been away from Drupal development for a little more than 6 months now, so it’s going to be interesting to get back into the mindset. I’ve been doing a lot of scripting lately, and that has also been good because I went back to the basics. I’ve made things even more interesting and challenging by going back to old tools like GNU Sed, GNU Awk and the like. I say they are old because nowadays I don’t see many people talking about using them.

Sometimes developers forget about all these great tools, or new programmers never learn them. A task for which someone would nowadays normally write a whole script has taken me a single line of bash. This is a huge productivity boost, so I’ll be posting some of these old but often forgotten tools and the tips I’ve picked up.

So if someone else is going to attend the DrupalCon in San Francisco, I’d be glad to say hi and maybe have a beer or two and hang around the town. I’ve heard it’s fascinating.

Categories
Emacs GNU/Linux Free Software & Open Source

3 methods to back up your Emacs file

The Emacs personalization file (dotemacs) is a very important resource for every Emacs user. Typically found at ~/.emacs, this file contains the elisp code with all the personalization that adapts Emacs to each user. It’s so important that it basically represents your Emacs “personality”.

Losing your .emacs file can mean losing a lot of hours of tweaking and personalizing GNU Emacs through a bunch of snippets collected over time. So, since it is a very valuable asset, having a good method to back it up is a must.

Here are 3 common methods people use to keep their Emacs file safe:

Simple backup

The simplest thing to do is to manually make copies of the file in a different directory, another partition on the same hard drive, an external hard drive, or a USB key. This also works well when you have multiple computers and copy the same .emacs file to each of them. Using rsync to back it up periodically is a good idea, and it can also back up all the other elisp code for the modes you use (typically at ~/.emacs.d/).
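A manual copy can be as small as the sketch below. It is only an illustration: the two scratch directories stand in for your home folder and your backup location, the sample .emacs content is made up, and the dated file name is just one possible convention:

```shell
# Stand-ins for the real locations: a scratch "home" and a scratch backup drive.
home_dir=$(mktemp -d)
backup_dir=$(mktemp -d)
printf '%s\n' '(setq inhibit-startup-message t)' > "$home_dir/.emacs"

# Copy the file under a dated name, keeping its timestamps and permissions.
stamp=$(date +%Y-%m-%d)
cp -p "$home_dir/.emacs" "$backup_dir/dotemacs-$stamp"

# Check that the copy is byte-for-byte identical to the original.
cmp "$home_dir/.emacs" "$backup_dir/dotemacs-$stamp" && echo "backup ok"
```

For the whole elisp directory, an rsync call along the lines of rsync -a ~/.emacs.d/ /path/to/backup/emacs.d/ does the same job, with the destination path being whatever location you pick.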

A good option would be to back it up to an online storage service like Drop.io or even Amazon S3.

Version control

The standard and most common way to store your Emacs customizations is to save them in a file named .emacs placed in your home folder. But this is difficult to set up in a version control system, since version control systems track whole directories. So you would be version controlling your whole home folder, which wouldn’t be a bad idea in some cases but in others would be a mess to maintain.

Fortunately there’s another way: at startup, when the typical ~/.emacs file is not found, Emacs also looks for a file called init.el in a hidden folder named .emacs.d/ in your home folder. This way, you can easily set up your preferred version control system to track changes in that folder. This has the advantage that any other Emacs modes or code you have can be stored and tracked too, so whenever you have a clean install, your Emacs setup and modes are just a checkout away.

On some setups, tracking changes on the whole ~/.emacs.d/ directory may not be a good option. To track changes on only your init file, move init.el to a folder inside the home elisp directory, so that it ends up at ~/.emacs.d/dotemacs/init.el, and make a symbolic link to it in ~/.emacs.d/. This way I can version control the “dotemacs” directory very easily.
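The move and the link can be sketched as follows. A scratch directory takes the place of the real home folder here, and the one-line init.el is a made-up sample, so nothing in your actual setup is touched:

```shell
# Use a scratch directory in place of the real home folder.
home_dir=$(mktemp -d)
mkdir -p "$home_dir/.emacs.d"
printf '%s\n' ';; my Emacs settings' > "$home_dir/.emacs.d/init.el"

# Move init.el into its own folder, the one that will be version controlled.
mkdir -p "$home_dir/.emacs.d/dotemacs"
mv "$home_dir/.emacs.d/init.el" "$home_dir/.emacs.d/dotemacs/init.el"

# Link it back to where Emacs looks for it. The relative target keeps the
# link valid even if the home folder itself is moved.
ln -s dotemacs/init.el "$home_dir/.emacs.d/init.el"

# Reading through the link returns the contents of the moved file.
cat "$home_dir/.emacs.d/init.el"
```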

Distributed version control

Many people use SVN as their preferred version control system, which stores your data in one central location. But a distributed version control system like Git, Mercurial or Bazaar is a better option. DVCSs let you set up multiple locations to back up your code repository, so you don’t have a single point of failure. You can version control your dotemacs file and keep its complete change history in many places: Github, Gitorious, Launchpad or any other code hosting service, plus other remote locations like multiple machines, a NAS or external drives.
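With Git, for instance, several backup locations are just several remotes. The sketch below is only an illustration and makes a few assumptions: git is installed, two local bare repositories stand in for a NAS and an external drive, and the remote names (nas, usbdrive), the commit identity and the sample init.el are all invented for the example:

```shell
# Scratch area: one working repository plus two bare repositories that stand
# in for, say, a NAS and an external drive.
work=$(mktemp -d)
git init -q "$work/dotemacs"
git init -q --bare "$work/nas.git"
git init -q --bare "$work/usbdrive.git"

# Commit a sample init.el. The identity is passed inline so the sketch does
# not depend on any global git configuration.
printf '%s\n' ';; my Emacs settings' > "$work/dotemacs/init.el"
git -C "$work/dotemacs" add init.el
git -C "$work/dotemacs" -c user.name="Example" -c user.email="example@example.invalid" \
    commit -q -m "Initial dotemacs"

# Register both backup locations and push the full history to each of them.
git -C "$work/dotemacs" remote add nas "$work/nas.git"
git -C "$work/dotemacs" remote add usbdrive "$work/usbdrive.git"
git -C "$work/dotemacs" push -q nas HEAD
git -C "$work/dotemacs" push -q usbdrive HEAD

# List the configured remotes.
git -C "$work/dotemacs" remote
```

A hosted service would simply be one more remote, added with its own URL in place of the local paths used here.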

Do you know other methods? How do you keep from losing your dotemacs file?

Data dump image by swanksalot on Flickr
Categories
GNU/Linux Free Software & Open Source personal

Goodbye ACM Crossroads, Hello GNU

I’ve been the web editor for the ACM Crossroads student magazine for the past 4 years. Since I’m no longer a student, the time has come to step out of that position and let someone else take the job.

So since issue 16.2 the ACM Crossroads website has been in the hands of Malay Bhattacharyya of the Indian Statistical Institute, Kolkata and Srinwantu Dey of the University of Florida. I wish them the best, since being an ACM Crossroads editor was a great experience. I met a lot of good friends and interesting people and learned a lot about online publishing and workflows. I’ll continue to support the ACM with any help they need, but I’m no longer in charge of anything.

As one thing ends, another has to begin. So at LibrePlanet 2009 I met Rob Myers and asked to volunteer as a GNU.org webmaster. My contributions so far have been small but constant: maintaining some minor things here and there and helping clean up the spam. I hope to get more involved with GNU this year and make great contributions.

Categories
GNU/Linux Free Software & Open Source Interesting random stuff Programming & Web Development

Random links from my bookmarks

I’d like to share my bookmarks from time to time. I think sometimes random browsing can be very fruitful and sometimes even productive.

This week, from my Delicious bookmarks, I’d like to share:

I hope you find these links as interesting or useful as they’ve been for me.

Categories
Emacs GNU/Linux Free Software & Open Source personal

Mac OS X from a GNU/Linux User

The Mac OS X slogan I’ve heard from several Mac fanboys is “it just works”. Well, as someone who has been a GNU/Linux user for quite some time and is now coming to OS X, that is not the case for me. There are a lot of little things that “just don’t work” for my particular usage.

Recently I was given a 17″ Macbook Pro for use at my job. My first impression was “wow, nice solid hardware” and that has turned out to be very true. But after a while of fiddling with the operating system and doing actual work the way I’m used to, lots of little things started to annoy me.

Developer tools

First, I’ve been told that OS X is the best platform for developers. Well, to begin with, basic development tools are not installed by default. You have to install all the Xcode tools (about 3.1 GB) off the CD just to get gcc, make and related basic tools, plus a bunch of other unknown things. The installer doesn’t give much detail on what it’s installing.

Getting and updating software

Then, there is no repository support by default. You have to install Macports or Fink, or download each of your software packages by hand, so upgrading your apps depends entirely on each provider, except for the Apple applications. This tells me the software update program is exclusively for Apple apps and no third-party software can use it. It would be a good idea for the software updater to have an API or something similar that other software vendors could use to announce upgrades.

A curious thing for me is that lately, when the software updater updates the Safari web browser and other trivial applications, it asks for a full system restart. I don’t know why OS X, a BSD Unix based system, needs a restart when you upgrade an application as non-critical as the browser, but it reminds me a lot of Windows asking to restart for every single piece of software installed.

PHP and extensions

At my job we use PHP 5.2.8 and a bunch of extensions. Although OS X comes with Apache and PHP already, there’s no easy way to install all the extensions we use, so we have to compile the damn thing and all its dependencies. It took us a whole day just to set this up, and some co-workers quit trying and opted for developing on a virtual machine with GNU/Linux. Some even cried. I got it all up and running, but when I upgraded to Snow Leopard, all my settings were reverted, so I had to start again.

Emacs

For most of my tasks I use Emacs. There are a bunch of choices and versions to install, but none is very consistent. If you install Emacs from Fink, you don’t get Finder actions to open files in Emacs. If you don’t install it from Fink, then other Fink packages, like AUCTeX, will pull in Emacs from Fink anyway, and you end up with two installations. So the solution here is to install your elisp files manually in your elisp folder.

The Ctrl, Meta and Alt keys are messed up, which matters to any Emacs user and also to a command line power user.

Other software

Basic office apps, like the typical word processor, spreadsheet and presentation programs, are not available by default, which is something you take for granted on most GNU/Linux distributions.

No GPG, wget, latex and other basic tools you take for granted on any GNU/Linux or BSD system.

Finder annoyances

Hidden folders (those starting with a dot) are not easy to browse on the file navigator (Finder). To view hidden files in Finder, you need a hack. There’s no easy menu option for it.

You cannot overwrite a hidden folder like ~/.emacs.d/ by dragging and dropping if it already exists. It first asks you for the administrator password, then it tells you it will not change an “invisible” folder. The only way I could get around it was by using the terminal.

Also, Finder has no “one directory up” button, so to move one directory up you must enable the navigation bar that appears at the bottom of every window, which is not very intuitive to do. And if you are in a file chooser dialog, this bottom navigation does not appear, so there’s no way “up”.

Finder always puts a .DS_Store and a ._MacOSX file and folder on everything you browse, be it an external hard drive, a USB drive or anything else, and you can’t disable that behavior. So I typically end up with my thumb drives and backup drives filled with these files. Also, if you compress (zip) a directory using the Finder menu option, the resulting zip contains these files too.

Finder cannot be used as an FTP or SCP client like Konqueror or Nautilus via the location bar. Although you can use the “connect to server…” option.

Conclusions

Well… not much to conclude here. I guess I just have to get used to “the mac way” of things until I get back home to my nice Debian system.

Have you migrated from GNU/Linux to OS X? I’d like to know your experiences and recommendations.

Snow Leopard photo is Creative Commons by Captain Chickenpants
Wildbeet photo is Creative Commons by Arno & Louise