SQL Dump for MS Access databases (.mdb files) on Linux

I recently had to work with some data that came in a huge Microsoft Access database. Because I like SQLite (and despise Access), I decided to export the data to an SQLite file. The first thing I needed to do was to somehow get all the data out of the db. Being a Linux user complicates things a bit, but thanks to mdb-tools it’s possible to process .mdb files without resorting to Windows and buying Access. Using mdb-tools directly can be tedious if you want to export a large db with multiple tables, so when I looked for a way to automate it, I came across Liberating data from Microsoft Access “.mdb” files. That post shows a nice script that dumps every table in a .mdb file to a separate CSV file.

While that script is useful, I wanted something that I could easily import into SQLite, so I modified it to generate an SQL dump of the db. Given a db file, it writes to stdout SQL statements describing the schema of the db, followed by INSERTs for each table. Because mdb-tools doesn’t support SQLite as a backend, the dump actually uses a MySQL dialect, but it should be fine with SQLite as well (SQLite will mostly ignore the parts it can’t process, such as COMMENTs). The easiest way to use the script is:

$ python AccessDump.py access.mdb | sqlite3 new.db

If the original db contains non-ASCII characters and isn’t encoded in UTF-8, you should set the MDB_JET3_CHARSET environment variable to the correct charset. The dump itself will be UTF-8 encoded.

$ MDB_JET3_CHARSET="cp1255" python AccessDump.py access.mdb | sqlite3 new.db
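
For readers who only want the general shape of such a dump script, here is a minimal sketch. It is not the actual AccessDump.py: the function names are hypothetical, the BEGIN/COMMIT wrapper is my own addition to speed up the import, and the flags assume a reasonably recent mdb-tools where mdb-export accepts `-I <backend>` (check `mdb-export --help` on your system).

```python
import subprocess


def schema_command(db_path, dialect="mysql"):
    """Command that prints the CREATE TABLE statements for the whole db."""
    return ["mdb-schema", db_path, dialect]


def tables_command(db_path):
    """Command that lists the table names, one per line."""
    return ["mdb-tables", "-1", db_path]


def export_command(db_path, table, dialect="mysql"):
    """Command that prints one table's rows as INSERT statements."""
    return ["mdb-export", "-I", dialect, db_path, table]


def dump(db_path, out):
    """Write the schema, then the INSERTs for every table, to `out`."""
    out.write(subprocess.check_output(schema_command(db_path), text=True))
    names = subprocess.check_output(tables_command(db_path), text=True)
    # Assumption: one transaction around all INSERTs, so the import is faster.
    out.write("BEGIN;\n")
    for table in filter(None, (name.strip() for name in names.splitlines())):
        out.write(subprocess.check_output(export_command(db_path, table), text=True))
    out.write("COMMIT;\n")
```

Calling `dump("access.mdb", sys.stdout)` (after `import sys`) and piping the result into sqlite3 would mirror the command lines above.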

Continue reading SQL Dump for MS Access databases (.mdb files) on Linux

Conditional Compilation in Autoconf and Automake

While working on my audio-based random password generator (you can view the source on GitHub), I wanted to do some conditional compilation: compiling certain parts of the program only if some option is passed to the configure script. As usually happens with GNU’s autotools, it’s kind of hell to do. Documentation is spread across dozens of sources, each of which covers only a specific part of what to do. I’m writing it down here in the blog, in the hope that I’ll never have to search for how to do it again.
Continue reading Conditional Compilation in Autoconf and Automake

Some Thoughts About Android’s Full Disk Encryption

One of the new features touted by ICS is full-disk encryption (it was actually first available in Android 3). At first glance it looks promising. The Android developers went with dm-crypt as the underlying transparent disk encryption subsystem, which is the de facto way to perform full-disk encryption on Linux nowadays. This ensures both portability of the encrypted file systems and a tried-and-tested implementation. The cipher itself is 128-bit AES in CBC mode with ESSIV. The key derived from the password using PBKDF2 does not encrypt the disk directly: it encrypts the actual disk encryption key, which allows fast password changes. So where do I think it went wrong?
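
As an aside, the key-wrapping idea mentioned above can be sketched in a few lines. This is only an illustration, not Android’s code: the function names are hypothetical, the hash, iteration count, and salt size are assumptions picked for the example, and XOR is a toy stand-in for the real AES wrapping step (Python’s standard library has no AES).

```python
import hashlib
import os


def derive_kek(password, salt, iterations=2000, length=16):
    """Derive a key-encryption key (KEK) from the password with PBKDF2."""
    return hashlib.pbkdf2_hmac("sha1", password.encode("utf-8"), salt,
                               iterations, dklen=length)


def xor_wrap(key, kek):
    """Toy stand-in for AES: XOR is its own inverse, so it wraps and unwraps."""
    return bytes(a ^ b for a, b in zip(key, kek))


master_key = os.urandom(16)   # the key that actually encrypts the disk
salt = os.urandom(16)
wrapped = xor_wrap(master_key, derive_kek("old password", salt))

# Password change: unwrap with the old KEK, re-wrap with the new one.
# The master key, and therefore the data on disk, stays untouched.
recovered = xor_wrap(wrapped, derive_kek("old password", salt))
new_salt = os.urandom(16)
rewrapped = xor_wrap(recovered, derive_kek("new password", new_salt))
```

Only the small wrapped blob changes on a password change, which is why there is no need to re-encrypt the whole disk.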

Enabling the full disk encryption.

Continue reading Some Thoughts About Android’s Full Disk Encryption

A Note About Open Sound System (OSS)

A while ago I wrote about creating random numbers out of noise gathered from an audio device, and also created a password generator based on the idea. The implementation was based on Open Sound System (commonly known as OSS). OSS was the de facto way to access audio devices until a couple of years ago, when it hit licensing issues and was subsequently replaced by ALSA. As Ubuntu no longer supports OSS (and even the ALSA wrapper for it is in Universe), I’ve decided to rewrite the code using a more modern alternative.
Continue reading A Note About Open Sound System (OSS)

Fixing virtualenv after Upgrading Your Distribution/Python

After you upgrade your Python or your distribution (specifically, this happened to me after upgrading from Ubuntu 11.10 to 12.04), your existing virtualenv environments may stop working. This manifests itself as errors about missing modules. For example, when I tried to open a Django shell, it complained that urandom was missing from the os module. I guess almost any module can break this way.

Apparently, the solution is dead simple. Just re-create the virtualenv environment:

virtualenv /PATH/TO/EXISTING/ENVIRONMENT

or

virtualenv --system-site-packages /PATH/TO/EXISTING/ENVIRONMENT

(depending on how you originally created it), using the same path as the existing environment. All the modules you’ve already installed should keep working as before (at least it was that way for me).
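
If you have several environments and want to find out which ones need this treatment, a small check along the following lines may help. It is only a sketch: the function name is hypothetical, and it tests just the one symptom described above (os.urandom being unavailable) by running the environment’s own interpreter.

```python
import subprocess


def interpreter_is_healthy(python_path):
    """Return True if the interpreter at python_path can call os.urandom."""
    check = "import os; os.urandom(1)"
    try:
        result = subprocess.run([python_path, "-c", check],
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
    except OSError:
        # The interpreter is missing or cannot be executed at all.
        return False
    return result.returncode == 0
```

For example, `interpreter_is_healthy("/PATH/TO/EXISTING/ENVIRONMENT/bin/python")` returning False suggests that the environment should be re-created.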

Debugging File Type (MIME) Associations

I have less and less time to blog and write lately, so it’s a good opportunity to catch up on old things I did. Back in the happy days when I used Gentoo, one of the irritating issues I faced was messed-up file type associations. The MIME type of some files was recognized incorrectly, and as a result KDE offered to open files with unsuitable applications. To debug this, I wrote a small Python script that shows how KDE applications are associated with MIME types and what MIME type is inferred from each file.

The script does so by querying KMimeType and KMimeTypeTrader. It does three things:

  • Given a MIME type, show its hierarchy and a list of applications associated with it.
  • Given an application, list all the MIME types it’s associated with.
  • Given a file, show its MIME type (and also the accuracy, which allows one to know why that MIME type was selected, although I admit that in the two years since I wrote it, I forgot how it works :))

The script is pasted below. I hope someone who still fiddles with less-than-standard installations will find it helpful.
Continue reading Debugging File Type (MIME) Associations

Installing culmus-latex on Ubuntu 11.10

After someone complained to me that he couldn’t install culmus-latex on Ubuntu 11.10, I decided to look into the issue. Apparently culmus-latex can’t be installed as-is on Ubuntu 11.10 (and probably other new versions of Debian and Ubuntu). The problem has been reported in a few places such as Whatsup, but as I haven’t frequented the forum lately, I wasn’t aware of it. Skip to the workaround below if that’s all you’re interested in.

Technical Details

The problem manifests itself as:

sudo make install
... snipped for brevity ...
mktexlsr: Done.
updmap-sys --enable Map=culmus.map
updmap: This is updmap, version $Id: updmap 14402 2009-07-23 17:09:15Z karl $
updmap: using transcript file `/var/lib/texmf/web2c/updmap.log'
updmap: initial config file is `/var/lib/texmf/web2c/updmap.cfg'
make: *** [install] Error 2

But if you look at updmap’s manpage, there is no documentation for the return codes. There is also no explicit place in the code where it exits with return code 2. After some strace-ing I found the culprit: the combination of the set -e at the top of /usr/bin/updmap and the function pickLocalFile in /usr/share/tex-common/debianize-updmap, which overrides certain behaviors in updmap. pickLocalFile uses the following lines

localfile=""
localfile="`ls $debDirname/*local*cfg 2>/dev/null`"
if [ -n "$localfile" ]; then

to check whether there is a local configuration file under /etc/texmf/updmap.d. If no such file exists, ls fails with a non-zero exit status (GNU ls returns 2 when it cannot access an argument), and instead of creating the file (as the maintainers of debianize-updmap intended) the script aborts because of the set -e in /usr/bin/updmap. Thus updmap exits with error code 2 instead of completing the installation.

Until the bug is fixed, there is a simple workaround.

Workaround

Before installing, execute

sudo touch /etc/texmf/updmap.d/10local.cfg

And now the regular sudo make install installation should finish successfully.
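
The failure mode and the workaround can be reproduced outside of updmap. The following is a minimal sketch, assuming a POSIX sh and ls are available: it runs the same shell lines under set -e against a temporary directory that stands in for /etc/texmf/updmap.d. The names run_check and workdir are mine, not part of updmap.

```python
import os
import subprocess
import tempfile

# The same lines as in pickLocalFile, run under `set -e` like /usr/bin/updmap.
SCRIPT = r'''
set -e
debDirname="$1"
localfile=""
localfile="`ls $debDirname/*local*cfg 2>/dev/null`"
echo "reached: $localfile"
'''


def run_check(directory):
    """Run the shell fragment on `directory`; return (exit status, stdout)."""
    result = subprocess.run(["sh", "-c", SCRIPT, "sh", directory],
                            stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL, text=True)
    return result.returncode, result.stdout


workdir = tempfile.mkdtemp()   # stands in for /etc/texmf/updmap.d
status_before, out_before = run_check(workdir)

# The workaround: create an empty local config file, as `touch` does.
open(os.path.join(workdir, "10local.cfg"), "w").close()
status_after, out_after = run_check(workdir)
```

Before the file exists, the shell stops at the ls line with a non-zero status and never reaches the echo. Once 10local.cfg is there, the fragment runs to completion.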

As the problem is the result of a Debian bug, I don’t expect to release a new version of culmus-latex. Instead, I’ll report the bug to the Debian team.

mechanize – Writing Bots in Python Made Simple

I’ve been using Python to write various bots and crawlers for a long time. A few days ago I needed to write a simple bot to remove some 400+ spam pages in Sikumuna, so I took an old script of mine (from 2006) in order to modify it. The script used ClientForm, a Python module that allows you to easily parse and fill HTML forms. I quickly found that ClientForm is now deprecated in favor of mechanize. At first I was somewhat put off by the change, as ClientForm was pretty easy to use and mechanize’s documentation could use some improvement. However, I quickly changed my mind about mechanize. The basic interface of mechanize is a simple browser object that literally allows you to browse using Python. It takes care of handling cookies and such, and it has form-filling abilities similar to ClientForm’s, but this time they are integrated into the browser object.

For my own future reference, and as another code example to supplement mechanize’s sparse documentation, I’m giving below the gist of the simple bot I wrote:

Continue reading mechanize – Writing Bots in Python Made Simple

Bye Bye OmniCppComplete, Hello Clang Complete

For years OmniCppComplete has been the de facto standard for C++ completion in Vim. But as time progressed, I got more and more annoyed by its shortcomings. OmniCppComplete is based on the tokenizing provided by ctags. The ctags parsing of C++ code is problematic: you can’t even run it on the libstdc++ headers (you need to download modified headers). You want to use an external library? You’ll need to run ctags separately on each library. Not to mention its inability to deduce the type of anything more than trivial. The core of the problem is that OmniCppComplete isn’t a compiler, and you can’t expect something that isn’t a compiler to fully understand code. This is what makes Visual Studio’s IntelliSense so great: it uses the Visual C++ compiler for parsing, so it isn’t making wild guesses at types or at what the current scope is. It knows.
Continue reading Bye Bye OmniCppComplete, Hello Clang Complete