Make an Offline Mirror of a Site using `wget`

Sometimes you want to create an offline copy of a site that you can take and view even without internet access. Using wget, you can make such a copy easily:

wget --mirror --convert-links --adjust-extension --page-requisites \
--no-parent http://example.org

Explanation of the various flags:

  • --mirror – Makes (among other things) the download recursive.
  • --convert-links – Converts all the links (also to stuff like CSS stylesheets) to relative ones, so they are suitable for offline viewing.
  • --adjust-extension – Adds suitable extensions to filenames (html or css) depending on their content-type.
  • --page-requisites – Downloads things like CSS stylesheets and images required to properly display the page offline.
  • --no-parent – When recursing, do not ascend to the parent directory. It is useful for restricting the download to only a portion of the site.

Alternatively, the command above may be shortened:

wget -mkEpnp http://example.org

Note that the last p is part of np (--no-parent), hence you see p twice in the flags.
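
For example, to mirror only part of a site, say a hypothetical http://example.org/docs/ section, point wget at that URL and --no-parent will keep the crawl from wandering up into the rest of the site:

wget -mkEpnp http://example.org/docs/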

Bootstrap: Combining input-append and input-block-level

If you have a button appended to an input control in Bootstrap and you want it to fill the entire width, it’s not sufficient to add the input-block-level class to the input itself; the class also needs to be added to the surrounding .input-append div. For example:

<div class="input-append input-block-level">
	<input type="text" class="search-query input-block-level" name="q" placeholder="Search">
	<button type="submit" class="btn btn-primary">Search</button>
</div>

Applying .input-block-level to only one of the elements (either the div or the input) just doesn’t work.

Restricting SSH Access to rsync

Passphrase-less SSH keys allow one to automate remote tasks by not requiring user intervention to enter a passphrase to decrypt the key. While this is convenient, it poses a security risk, as the plain key can be used by anyone who gets hold of it to access the remote server. To this end, the developers of SSH made it possible to restrict, via .ssh/authorized_keys, the commands that can be executed with specific keys. This works great for simple commands, but since rsync requires executing remote commands with different arguments on the remote end, depending on the invocation on the local machine, it gets quite complicated to properly restrict it via .ssh/authorized_keys.

Luckily, the developers of rsync foresaw this problem and wrote a script called rrsync (for restricted rsync) specifically to ease restricting keys to rsync-only use via .ssh/authorized_keys. If you have rsync installed, rrsync should have been distributed alongside it. On Debian/Ubuntu machines it can be found under /usr/share/doc/rsync/scripts/rrsync.gz. If you can’t find it there, you can download the script directly from here. On the remote machine, copy the script, unpacking if needed, and make it executable:

user@remote:~$ gunzip -c /usr/share/doc/rsync/scripts/rrsync.gz > ~/bin/rrsync
user@remote:~$ chmod +x ~/bin/rrsync

On the local machine, create a new SSH key and leave the passphrase empty (this will allow you to automate the rsync via cron), then copy the public key to the remote server:

user@local:~$ ssh-keygen -f ~/.ssh/id_remote_backup -C "Automated remote backup"
user@local:~$ scp ~/.ssh/id_remote_backup.pub user@remote:~/

Once the public key is on the remote server, edit ~/.ssh/authorized_keys and append the public key.

user@remote:~$ vim ~/.ssh/authorized_keys

(Vim tip: use :r! cat id_remote_backup.pub to directly insert the contents of id_remote_backup.pub into a new line.) Now prepend to the newly added line:

command="$HOME/bin/rrsync -ro ~/backups/",no-agent-forwarding,no-port-forwarding,no-pty,no-user-rc,no-X11-forwarding

The command="..." part restricts the use of that public key: whenever someone connects with it, the given command is executed instead of whatever command the client requested. All the other no-* options further restrict what can be done with that particular public key. As the SSH daemon will not start the default shell when accessing the server using this public key, the $PATH environment variable will be pretty empty (similar to cron), hence you should specify the full path to the rrsync script. The two arguments to rrsync are -ro, which makes the directory read-only (drop it if you want to upload stuff to the remote directory), and the path to the directory you want to enable remote access to (in my example ~/backups/).

The result should look something like:

command="$HOME/bin/rrsync -ro ~/backups/",no-agent-forwarding,no-port-forwarding,no-pty,no-user-rc,no-X11-forwarding ssh-rsa AAA...vp Automated remote backup

After saving the file, you should be able to rsync files from the remote server to the local machine without being prompted for a password.

user@local:~$ rsync -e "ssh -i $HOME/.ssh/id_remote_backup" -av user@remote: etc2/

Two things should be noted:

  1. You need to specify the passphrase-less key in the rsync command (the -e "ssh -i $HOME/.ssh/id_remote_backup" part).
  2. The remote directory is always relative to the directory given to rrsync in the ~/.ssh/authorized_keys file.
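
Since the key has no passphrase, automating the backup is now straightforward. A minimal sketch of a crontab entry on the local machine (the schedule and the destination directory /home/user/remote-backups/ are just examples):

# Pull the remote backups every night at 03:00 (edit with crontab -e on the local machine)
0 3 * * * rsync -e "ssh -i $HOME/.ssh/id_remote_backup" -a user@remote: /home/user/remote-backups/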

Enabling C++11 (C++0x) in CMake

Going over some CMakeLists.txt files I’ve written, I came across the following snippet:

include(CheckCXXCompilerFlag)
CHECK_CXX_COMPILER_FLAG("-std=c++11" COMPILER_SUPPORTS_CXX11)
CHECK_CXX_COMPILER_FLAG("-std=c++0x" COMPILER_SUPPORTS_CXX0X)
if(COMPILER_SUPPORTS_CXX11)
	set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11")
elseif(COMPILER_SUPPORTS_CXX0X)
	set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++0x")
else()
	message(STATUS "The compiler ${CMAKE_CXX_COMPILER} has no C++11 support. Please use a different C++ compiler.")
endif()

Various compiler versions of gcc and clang use different flags to specify C++11 support: older ones accept -std=c++0x and newer ones -std=c++11. The above snippet detects which one is right for the compiler being used and adds the flag to CMAKE_CXX_FLAGS.
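
As a side note, if you can require CMake 3.1 or newer, the same can be expressed declaratively via CMAKE_CXX_STANDARD instead of fiddling with compiler flags:

cmake_minimum_required(VERSION 3.1)
# Ask for C++11, and fail at configure time if the compiler can't provide it
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_CXX_STANDARD_REQUIRED ON)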

View man Pages Properly in gVim

Vim’s ability to display man pages easily using the K mapping often comes in handy. It has been bothering me for a while that the same thing doesn’t work properly in gVim, which I use more. The reason is that Vim’s ability to display man pages depends on having a terminal emulator, which just isn’t true for gVim; hence the garbled display one sees when trying to view a man page in gVim.

Today, I found a way around this limitation. It turns out Vim comes with support for displaying man pages in a split window, and does it perfectly – colors, links and all the necessary stuff. The first line below enables this feature, which includes by default the K mapping to open the man page in a new split. The second part, which I find very convenient, makes the regular K do the same in gVim. And unlike the original mapping, it also accepts a count, so pressing 3K will look up the keyword under the cursor in section 3 of the manual.

" Properly display man pages
" ==========================
runtime ftplugin/man.vim
if has("gui_running")
	nnoremap K :<C-U>exe "Man" v:count "<C-R><C-W>"<CR>
endif

Creating Self-Signed ECDSA SSL Certificate using OpenSSL

Before generating a private key, you’ll need to decide which elliptic curve to use. To list the supported curves run:

openssl ecparam -list_curves
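
The output looks something like this (heavily truncated; the exact set of curves depends on your OpenSSL build):

  secp384r1 : NIST/SECG curve over a 384 bit prime field
  secp521r1 : NIST/SECG curve over a 521 bit prime field
  prime256v1: X9.62/SECG curve over a 256 bit prime field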

The list is quite long, and unless you know what you’re doing you’ll be better off choosing one of the sect* or secp* curves. For this tutorial I chose secp521r1 (a curve over a 521-bit prime field).

Generating the certificate is done in two steps: First we create the private key, and then we create the self-signed X509 certificate:

openssl ecparam -name secp521r1 -genkey -param_enc explicit -out private-key.pem
openssl req -new -x509 -key private-key.pem -out server.pem -days 730

The newly created server.pem and private-key.pem are the certificate and the private key, respectively. The -param_enc explicit tells openssl to embed the full parameters of the curve in the key, as opposed to just its name. This allows clients that are not aware of the specific curve name to work with it, at the cost of slightly increasing the size of the key (and the certificate).

You can examine the key and the certificate using:

openssl ecparam -in private-key.pem -text -noout
openssl x509 -in server.pem -text -noout

Most webservers expect the private key to be concatenated with the certificate in the same file. So run:

cat private-key.pem server.pem > server-private.pem

And install server-private.pem as your certificate. If you don’t concatenate the private key to the certificate, at least Lighttpd will complain with the following error:

SSL: Private key does not match the certificate public key, reason: error:0906D06C:PEM routines:PEM_read_bio:no start line 
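
If you ever need to verify that a private key and a certificate actually belong together, you can compare the public key each of them contains; the two commands below should print identical public key blocks:

openssl ec -in private-key.pem -pubout
openssl x509 -in server.pem -pubkey -noout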

Preventing Directory Traversal in Python

Consider the following use case:

import os

PREFIX = '/home/user/files/'
# filepath is user-controlled, e.g. taken from a request parameter
full_path = os.path.join(PREFIX, filepath)
f = open(full_path, 'rb')
...

Assuming that filepath is user-controlled, a malicious user might attempt a directory traversal (like setting filepath to ../../../etc/passwd). How can we make sure that filepath cannot traverse “above” our prefix? There are of course numerous solutions to sanitizing input against directory traversal. The easiest way (that I came up with) to do so in Python is:

filepath = os.path.normpath('/' + filepath).lstrip('/')

It works because it turns the path into an absolute path, normalizes it, and makes it relative again. As one cannot traverse above /, this effectively ensures that filepath cannot point outside of PREFIX.
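
A quick way to convince yourself this works (a throwaway sketch; os.path.normpath behaves differently on Windows, so this assumes a POSIX system):

import os

def sanitize(filepath):
    return os.path.normpath('/' + filepath).lstrip('/')

# Traversal attempts are clamped at the virtual root:
assert sanitize('../../../etc/passwd') == 'etc/passwd'
assert sanitize('a/../../b') == 'b'
# Well-behaved paths pass through unchanged:
assert sanitize('docs/report.txt') == 'docs/report.txt'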

Setting Up RAID using mdadm on Existing Drive

After experiencing a hard-disk failure (luckily nothing important was lost, just some backups), I decided to set up a RAID1 array on my existing Ubuntu 12.04 installation. The important thing was to migrate my existing installation to the new RAID array while retaining all the data. The easy solution would have been to set up the array on two new drives and then copy my data over. However, I did not have a spare drive (apart from the new one) to hold my data while creating the RAID array, so I had to take the trickier route.

I mainly followed François Marier’s excellent tutorial. As I went through it I realized I had to adjust a few things either to make it work on Ubuntu 12.04 or because I preferred another way to do stuff.

I’ve checked the steps below using Ubuntu 12.04 on both a physical and a virtual machine (albeit in the dumb order – first I risked my data and then decided to perfect the process on a VM :-)). I think the same steps should apply to other Debian derivatives and more recent Ubuntu versions as well.

Outline

Before diving into action, I want to outline the whole process. In the first step we will create a degraded RAID1 array (that is, a RAID1 array with one of the drives missing) using only the new drive. Next we will configure the system to be able to boot from the new degraded RAID1 array and copy the data from the old drive to the RAID1 array on the new drive. Afterwards, we will reboot the system using the degraded array and add the old drive to the array, thus making it no longer degraded. At this point, we will update some configurations again to make things permanent, and finally we will test the setup.

Make sure you have backups of your important stuff before proceeding. Most likely you won’t need them, like I didn’t, but just in case.

Partitioning the Drive

For the rest of the tutorial, I’ll assume the old disk, the one with the existing data, is /dev/sda and the new one is /dev/sdb. I’ll also assume /dev/sda1 is the root partition and /dev/sda2 is the swap partition. If you have more partitions or your layout is different, just make sure you adjust the instructions accordingly.

The first step is to create partitions on the new disk that match the size of the partitions we would like to mirror on the old disk. This can be done using fdisk, parted or using GUI tools such as Ubuntu’s Disk utility or gparted.

If both disks are the same size and you want to mirror all the partitions, the easiest way to do so is to copy the partition table using sfdisk:

# sfdisk -d /dev/sda > partition_table
# sfdisk /dev/sdb < partition_table

This will only work if your partition table is MBR (as sfdisk doesn’t understand GPT). Before running the second command, take a look at partition_table to make sure everything seems normal. If you’re using GPT drives larger than 2 TB, use sgdisk instead, as suggested in Asif’s comment.
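
For GPT disks, a rough equivalent using sgdisk (from the gdisk package) would be the following; I haven’t tested this exact invocation myself, so check the man page first. The -G step randomizes the GUIDs on the copy so the two disks don’t end up with identical identifiers:

# sgdisk --replicate=/dev/sdb /dev/sda
# sgdisk -G /dev/sdb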

You don’t need to bother setting the “raid” flag on your partitions like some people suggest. mdadm will scan all of your partitions regardless of that flag. Likewise, the “boot” flag isn’t needed on any of the partitions.

Creating the RAID Array

If you haven’t installed mdadm yet, do so now:

# apt-get install mdadm

We now create a degraded RAID1 array containing only the new drive. Usually a degraded RAID array is the result of a malfunction, but here we do it intentionally, because it gives us an operational RAID array into which we can copy our data; afterwards we add the old drive to the array and let it sync.

# mdadm --create /dev/md/root --level=1 --raid-devices=2 missing /dev/sdb1
# mdadm --create /dev/md/swap --level=1 --raid-devices=2 missing /dev/sdb2

These commands instruct mdadm to create a RAID1 array with two drives, where one of the drives is missing. A separate array is created for the root and swap partitions. As you can see, I decided to have my swap on RAID as well. There are different opinions on the matter. The main advantage is that your system will be able to survive one of the disks failing while the system is running. The disadvantage is that it wastes space. Performance-wise, RAID1 isn’t better, as might be expected, because Linux already supports striping (like RAID0) when it has swap partitions on two disks. In my case, I have plenty of RAM available and the swap space is mostly unused, so I figured I’m better off using RAID1 for the swap as well.

You may encounter the following warning when creating the arrays:

mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
    --metadata=0.90
Continue creating array?

Grub 1.99, which is the default bootloader in recent Ubuntu distributions, supports booting from partitions with the 1.2 format metadata, so it’s safe to type “y” here.

Next, we need to create filesystems on the newly created RAID arrays:

# mkfs.ext4 /dev/md/root
# mkswap /dev/md/swap

The following will record your newly created MD arrays in mdadm.conf:

# /usr/share/mdadm/mkconf > /etc/mdadm/mdadm.conf

Preparing to Boot the Array

In this step we shall prepare the system to boot from the newly created RAID array. Of course, we won’t actually do that before copying our data into it.

Start by editing /etc/grub.d/40_custom and adding a new entry to boot the raid array. The easiest way is to copy the latest boot stanza from /boot/grub/grub.cfg and modify it. The boot stanza looks something like this:

menuentry 'Ubuntu, with Linux 3.2.0-56-generic' --class ubuntu --class gnu-linux --class gnu --class os {
        recordfail
        gfxmode $linux_gfx_mode
        insmod gzio
        insmod part_msdos
        insmod ext2
        set root='(hd0,msdos1)'
        search --no-floppy --fs-uuid --set=root 19939b0e-4272-40e0-846b-8bbe49e4a02c
        linux   /boot/vmlinuz-3.2.0-56-generic root=UUID=19939b0e-4272-40e0-846b-8bbe49e4a02c ro   quiet splash $vt_handoff
        initrd  /boot/initrd.img-3.2.0-56-generic
}

First we need to add

insmod raid
insmod mdraid1x

just after the rest of the insmod lines. This will load the necessary GRUB modules to detect your RAID array during the boot process. If you decided to go for 0.90 metadata earlier (despite my recommendation…) you will need to load mdraid09 instead of mdraid1x. Next we need to modify the root partition. This is done by modifying the UUID (those random-looking hex-and-hyphens strings) arguments in the lines starting with search and linux. To find out the UUID of your root partition run:

# blkid /dev/md/root

which will give something like:

/dev/md/root: UUID="49b6f295-2fe3-48bb-bfb5-27171e015497" TYPE="ext4"

The set root line can be removed as the search line overrides it.

Last but not least add bootdegraded=true to the kernel parameters, which will allow you to boot the degraded array without any hassles. The result should look something like this:

menuentry 'Ubuntu, with Linux 3.2.0-56-generic (Raid)' --class ubuntu --class gnu-linux --class gnu --class os {
        recordfail
        gfxmode $linux_gfx_mode
        insmod gzio
        insmod part_msdos
        insmod ext2
        insmod raid
        insmod mdraid1x
        search --no-floppy --fs-uuid --set=root e9a36848-756c-414c-a20f-2053a17aba0f
        linux   /boot/vmlinuz-3.2.0-56-generic root=UUID=e9a36848-756c-414c-a20f-2053a17aba0f ro   quiet splash bootdegraded=true $vt_handoff
        initrd  /boot/initrd.img-3.2.0-56-generic
}

Now run update-grub as root so it actually updates the /boot/grub/grub.cfg file. Afterwards, run

# update-initramfs -u -k all

This will make sure that the updated mdadm.conf is put into the initramfs. If you don’t do so, the names of your new RAID arrays will be a mess after reboot.

Copying the Data

Before booting the new (degraded) array, we need to copy our data into it. First mount /dev/md/root somewhere, say /mnt/root, and then copy the old data into it.
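
Mounting it could look like this (the mount point is arbitrary):

# mkdir -p /mnt/root
# mount /dev/md/root /mnt/root

With the array mounted, the copy itself: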

# rsync -auxHAX --exclude=/proc/* --exclude=/sys/* --exclude=/tmp/* / /mnt/root

Next you need to update /mnt/root/etc/fstab with the UUIDs of the new partitions (which you can get using blkid). If you have encrypted swap, you should also update /mnt/root/etc/crypttab.
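
For example, the root entry in /mnt/root/etc/fstab should end up looking something like the following; the UUID here is the one blkid reported for /dev/md/root earlier, and yours will differ (the swap UUID is left as a placeholder):

UUID=49b6f295-2fe3-48bb-bfb5-27171e015497  /     ext4  errors=remount-ro  0  1
UUID=<uuid-of-/dev/md/swap>                none  swap  sw                 0  0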

The last thing to do before the reboot is to re-install the bootloader on both drives:

# grub-install /dev/sda
# grub-install /dev/sdb

Reboot the computer. Hold the “Shift” key while booting to force the Grub menu to appear, and select the new menu entry you have just added (it should be last on the list). After the system finishes booting, verify that you’re indeed running from the RAID device by running mount, which should show a line like this:

/dev/md127 on / type ext4 (rw,errors=remount-ro)

The number after /dev/md doesn’t matter, as long as it’s a /dev/md device and not /dev/sda or some other real disk device.

Completing the RAID Array

If you have made it this far, you have a running system with all your data, on a degraded RAID array consisting of just the new drive. The next step is to add the old disk to the RAID array. This will delete any existing data on it, so take a few minutes to make sure that you’re not missing any files (this should be fine, as we rsync‘ed the data). Adding the old disk back to the RAID array is done by:

# mdadm /dev/md/root -a /dev/sda1
# mdadm /dev/md/swap -a /dev/sda2

Make sure you are adding the right partitions to the right arrays. These commands instruct mdadm to add the old disk to the new arrays. It might take some time for the drives to finish syncing. You can track the progress of building the RAID array using:

$ watch cat /proc/mdstat

When it’s done, it means that your RAID arrays are up and running and are no longer degraded.

Remove the boot stanza we’ve added to /etc/grub.d/40_custom and edit /etc/default/grub to add bootdegraded=true to the GRUB_CMDLINE_LINUX_DEFAULT configuration variable. This will cause your system to boot up even if the RAID array gets degraded, which prevents the bug outlined in Ubuntu Freezes When Booting with Degraded Raid.

Finally update Grub and re-install it:

# update-grub
# grub-install /dev/sda
# grub-install /dev/sdb

We are done! Your RAID array should be up and running.

Testing the Setup

Just getting the RAID array to work is good, but not enough. As you probably wanted the RAID array as a contingency plan, you should test it to make sure it works as intended.

We want to make sure that the system is able to keep working in case one of the drives fails. Shut down the system and disconnect one of the drives, say sda. The system should boot fine thanks to the RAID array, but cat /proc/mdstat should show one of the drives as missing.
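
If you’d rather not open the computer case, you can simulate a failure in software instead: mdadm can mark a member as faulty and remove it from the array (shown here for the sda partitions):

# mdadm /dev/md/root --fail /dev/sda1 --remove /dev/sda1
# mdadm /dev/md/swap --fail /dev/sda2 --remove /dev/sda2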

To restore normal operation, shut down the system and reconnect the drive before booting it back up. Then re-add the drive to the RAID arrays:

# mdadm /dev/md/root -a /dev/sda1
# mdadm /dev/md/swap -a /dev/sda2

Again, this might take some time. You can view the progress using watch cat /proc/mdstat.