Retrieving Google’s Cache for a Whole Website

Some time ago, as some of you noticed, the web server that hosts my blog went down. Unfortunately, some of the sites had no proper backup, so something had to be done in case the hard disk couldn’t be recovered. My efforts turned to Google’s cache. Google keeps a copy of the text of each web page in its cache, which is usually useful when a website is temporarily unavailable. The basic idea is to retrieve a copy of every page of a given site that Google has cached.

Generating URL List from Access Log (access_log)

I had to parse the access_log of a website in order to generate a sitemap or, more precisely, a list of all the URLs in the site. After playing around, I found a solution using sed, grep, sort and uniq. The good thing is that each of these tools is available by default on most Linux distributions.
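
As an illustration of the idea, here is a minimal sketch of such a pipeline. It is not necessarily the solution from the full post: the function name list_urls is hypothetical, and it assumes the log is in Apache's common or combined format, where each request is logged as "GET /path HTTP/1.1" followed by the status code. Only GET requests answered with status 200 are kept.

```shell
#!/bin/sh
# list_urls (hypothetical name): print the unique paths that were
# requested with GET and answered with status 200.
# Assumes Apache common/combined log format.
list_urls() {
    grep -E '"GET [^ ]+ HTTP/[0-9.]+" 200 ' "$1" \
        | sed -E 's/.*"GET ([^ ]+) HTTP[^"]*".*/\1/' \
        | LC_ALL=C sort \
        | uniq
}
```

Calling list_urls access_log prints one path per line. LC_ALL=C only pins the sort order so the output does not depend on the system locale.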

NVRM: not using NVAGP, kernel was compiled with GART_IOMMU support

For the past several weeks I had a strange problem. Sometimes when I booted my computer, it would refuse to start the X server and would give the following error in dmesg:

NVRM: not using NVAGP, kernel was compiled with GART_IOMMU support!!
NVRM: failed to allocate stack!

The weird thing is that if I rebooted the computer, it would usually magically work again. So this error only showed up once in a while and seemed to disappear at will. Today it happened again, so I decided to fix it.

Understanding load average – A Practitioner Guide

The term “load average” is used in many Linux/UNIX utilities. Everybody knows that the numbers it refers to, usually three of them, somehow represent the load on the system’s CPU. In this post I’ll try to make these three numbers clearer and more understandable.
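
To see the three numbers for yourself, here is a small sketch that labels them. It assumes a Linux system, where /proc/loadavg starts with the 1-, 5- and 15-minute load averages; the function name show_load is hypothetical. On other UNIX systems, uptime reports the same three values.

```shell
#!/bin/sh
# show_load (hypothetical name): label the first three fields of a
# /proc/loadavg-style line.
show_load() {
    # Fields 1-3 are the 1-, 5- and 15-minute load averages.
    echo "$1" | awk '{ printf "1 min: %s\n5 min: %s\n15 min: %s\n", $1, $2, $3 }'
}

# On Linux, print the live values when the file is readable.
if [ -r /proc/loadavg ]; then
    show_load "$(cat /proc/loadavg)"
fi
```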

rzip vs. bzip2 – A short comparison

I decided to benchmark rzip against bzip2 for my backup needs. The benchmark was performed on an 89M tar archive of a directory which I regularly back up using my Amazon S3 backup script. The directory contains mostly LaTeX, PDF and Open Office files, so the results may be very different from what you would get if you tested other kinds of files.

usb 1-4: device descriptor read/64, error -71

When I try to connect my Sansa Clip MP3 player to my Linux box, I see the following error in dmesg:

usb 1-4: device descriptor read/64, error -71

and the device recognition fails. The player’s battery gets charged, but I can’t mount it and transfer songs.

Start Trac on Startup – Init.d Script for tracd

As part of a server move, I went on to reinstall Trac. I tried to install it as FastCGI, but I failed to configure the clean URLs properly. The clean URLs worked when a user accessed them directly, but Trac insisted on adding trac.fcgi to the beginning of every link it generated. So I decided to use the Trac standalone server, tracd.

The next problem I faced was how to start Trac automatically upon startup. The solution was to use an init.d script for starting Trac. After some searching, I didn’t find an init.d script for tracd that was satisfactory (most were poorly written). So I went on and wrote my own init.d script for tracd.
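
To give a rough idea of what such a script involves, here is a minimal sketch. It is not the script from the full post. The paths, the port and the tracd_* function names are hypothetical, and it assumes tracd accepts the --daemonize, --pidfile and --port options (check tracd --help on your installation).

```shell
#!/bin/sh
# Hypothetical settings; adjust them for your installation.
TRACD=${TRACD:-/usr/bin/tracd}
TRAC_ENV=${TRAC_ENV:-/var/trac/myproject}
TRAC_PORT=${TRAC_PORT:-8000}
PIDFILE=${PIDFILE:-/var/run/tracd.pid}

tracd_running() {
    # True if the pidfile exists and names a live process.
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

tracd_start() {
    if tracd_running; then
        echo "tracd is already running"
        return 0
    fi
    # Assumed tracd options: daemonize and record the pid for later.
    "$TRACD" --daemonize --pidfile="$PIDFILE" --port="$TRAC_PORT" "$TRAC_ENV"
}

tracd_stop() {
    if tracd_running; then
        kill "$(cat "$PIDFILE")" && rm -f "$PIDFILE"
    else
        echo "tracd is not running"
    fi
}

tracd_init() {
    case "$1" in
        start)   tracd_start ;;
        stop)    tracd_stop ;;
        restart) tracd_stop; tracd_start ;;
        status)
            if tracd_running; then
                echo "tracd is running"
            else
                echo "tracd is stopped"
                return 3
            fi
            ;;
        *)
            echo "Usage: $0 {start|stop|restart|status}"
            return 1
            ;;
    esac
}

# Dispatch only when an action was given on the command line.
if [ "$#" -gt 0 ]; then
    tracd_init "$@"
fi
```

Saved as /etc/init.d/tracd and made executable, it would be invoked as /etc/init.d/tracd start. How it gets registered to run at boot depends on the distribution.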

Clean URLs (Permalinks) for WordPress on Lighttpd

Over the last few days I’ve moved my blog, as well as some other sites I own, to a new, bigger dedicated server. After doing some benchmarks (I plan to post those soon), I decided to switch to Lighttpd. While the exact migration notes are a topic for another post, I can say that I’m fairly satisfied with the move.

After setting up the server, I started moving the blog. Importing the files and the database was pretty straightforward. But when I thought everything was ready and transferred the domain to the new server, I found out that none of my inner pages were accessible. The reason, as it turned out pretty quickly, is that WordPress depends on Apache’s mod_rewrite to create the clean URLs (the so-called permalinks). This actually posed two problems:

  1. WordPress depends on Apache’s mod_rewrite.
  2. WordPress uses .htaccess files for the clean URLs configuration.


radio.py-0.5 – An Easy Interface for Listening to Radio under Linux

This new release of radio.py brings more predefined stations and the much-wanted recording feature. radio.py is a Python wrapper for mplayer, designed to provide an easy-to-use interface for listening to radio from the command line. And indeed, using radio.py is very easy: just pass the station name.

radio.py Classic FM

To read more about radio.py and the existing features go to radio.py – a Wrapper Script for Listening to Radio in Linux.

New stations in this release include Ram FM, Classic FM and Radio Caroline, along with updates to all the radioIO stations. Overall, this version of radio.py comes with 81 predefined stations. To see the full list of recognized stations, run radio.py --list. If your favorite station is still missing, you can add it via configuration files, as described here. If you send a comment with the name of the station and its website, I’ll add it to the next release.

The other important new feature is the ability to record radio streams to mp3 directly from radio.py. This is done using the --record command-line switch. For example, the following

radio.py CNN --record cnn.mp3

will record the radio stream of CNN to a file called cnn.mp3. To stop recording, just press ‘q’. This option can also be used with the --sleep and --wake-up switches to time your recordings. For example, if you want to record a show that starts in 30 minutes and is 60 minutes long, you should run

radio.py BBC1 --record bbc1.mp3 --wake-up 30 --sleep 60

You can download the new version from here. Installation is pretty straightforward: just untar the archive and put radio.py somewhere in your path (e.g. /usr/local/bin/), and the package is installed.

As always if you want new stations added to the next release, send a comment with the station details (at least name and website).

UPDATE 14/12/2008: I’ve changed the download link to point to radio.py’s SourceForge project page.

WordPress Backup Script

This is a small script I’ve written to automate my server-side backups of my blogs. It creates a backup of both the database and the actual WordPress files.

#!/bin/bash

# (C) 2008 Guy Rutenberg - http://www.guyrutenberg.com
# This is a script that creates backups of blogs.

DB_NAME=
DB_USER=
DB_PASS=
DB_HOST=

#no trailing slash
BLOG_DIR=
BACKUP_DIR=


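# Without pipefail, $? below would hold only bzip2's exit status and a
# mysqldump failure would go unnoticed.
set -o pipefail

echo -n "dumping database... "
mysqldump --user="${DB_USER}" --password="${DB_PASS}" --host="${DB_HOST}" "${DB_NAME}" \
 | bzip2 -c > "${BACKUP_DIR}/${DB_NAME}-$(date +%Y%m%d).sql.bz2"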
if [ "$?" -ne "0" ]; then
	echo -e "\nmysqldump failed!"
	exit 1
fi
echo "done"


echo -n "Creating tarball... "
# Quote the paths so that directories containing spaces are handled correctly.
tar -cjf "${BACKUP_DIR}/${BLOG_DIR##*/}-$(date +%Y%m%d).tar.bz2" "${BLOG_DIR}"
if [ "$?" -ne "0" ]; then
	echo -e "\ntarball creation failed!"
	exit 1
fi
echo "done"
