Extracting Data from Akonadi (Kontact)

In older versions of KDE, Kontact kept its data in portable formats: iCalendar files for KOrganizer and vCard files for KAddressBook. But some time ago Kontact moved to Akonadi, a more sophisticated storage backend. By default (at least on my machine) Akonadi uses MySQL (with InnoDB) as its persistent storage. I didn’t consider this thoroughly when moving my data to Gnome, and I got stuck with the data.

To make things worse, somewhere along the update to KDE 4.6, some of my data got moved to ~/.akonadi.old. Being stuck with the InnoDB tables, I tried the following solutions without much success:

  1. Loading the InnoDB tables into a MySQL server. It didn’t go well: MySQL complained about weird stuff, and I gave up in search of a simpler solution.
  2. Booting an openSUSE virtual machine with KDE and trying to load my old data. Apparently, my ~/.akonadi folder contained nothing interesting, and SUSE’s KDE 4.6 refused to load the data from ~/.akonadi.old after I renamed it.

So, being upset with Akonadi, I did some grepping and found strings from my contacts and todo lists in the following files:

Binary file .local/share/akonadi.old/db_data/ibdata1 matches
Binary file .local/share/akonadi.old/db_data/akonadi/parttable.ibd matches
Binary file .local/share/akonadi.old/db_data/ib_logfile0 matches
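
For anyone without grep at hand, the same kind of search can be sketched in Python. This is only an illustration, not what I actually ran: files_containing is a hypothetical helper name, and it reads each file whole, which assumes the files fit in memory.

```python
import os


def files_containing(root, needle):
    """Return sorted paths of regular files under `root` whose bytes contain `needle`."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                # Skip sockets, FIFOs and broken symlinks.
                continue
            try:
                with open(path, "rb") as f:
                    if needle in f.read():
                        matches.append(path)
            except OSError:
                # Unreadable files (e.g. permission problems) are skipped.
                continue
    return sorted(matches)
```

Calling files_containing(".local/share/akonadi.old", b"BEGIN:VCARD") would list every file under that directory that holds at least one vCard header.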

I opened the files with vim and found that they contained vCard and iCalendar blobs. So instead of being stored directly on the file system, where they would be easily accessible, the blobs are stored inside the DB files. I figured it would be easiest to just extract the data from the binary files. I’ve used the following script:

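import sys

START_DELIM = b"BEGIN:VCALENDAR"
END_DELIM = b"END:VCALENDAR"


def main():
    # Work on raw bytes: sys.stdin.buffer exists on Python 3,
    # while on Python 2 sys.stdin itself already yields bytes.
    stdin = getattr(sys.stdin, "buffer", sys.stdin)
    stdout = getattr(sys.stdout, "buffer", sys.stdout)
    bin_data = stdin.read()
    blobs = []

    start = bin_data.find(START_DELIM)
    while start > -1:
        end = bin_data.find(END_DELIM, start + 1)
        if end == -1:
            # Unterminated blob: stop here instead of looping forever.
            break
        blobs.append(bin_data[start:end + len(END_DELIM)])
        start = bin_data.find(START_DELIM, end + 1)

    stdout.write(b"\n".join(blobs) + b"\n")


if __name__ == "__main__":
    main()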

It reads a binary file from stdin and outputs the iCalendar data embedded in it. If you change START_DELIM and END_DELIM to use VCARD instead of VCALENDAR, it will extract the contacts’ data.
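
Rather than editing the delimiters by hand, the blob type could be passed as a parameter. The following is a minimal sketch of that idea, not the script I actually used: extract_blobs and its kind argument are names made up for illustration, and the sample bytes are invented test data.

```python
def extract_blobs(data, kind="VCALENDAR"):
    """Return every BEGIN:<kind> ... END:<kind> byte span found in `data`."""
    start_delim = b"BEGIN:" + kind.encode("ascii")
    end_delim = b"END:" + kind.encode("ascii")
    blobs = []
    start = data.find(start_delim)
    while start > -1:
        end = data.find(end_delim, start + 1)
        if end == -1:
            # Unterminated blob at the end of the data: stop searching.
            break
        blobs.append(data[start:end + len(end_delim)])
        start = data.find(start_delim, end + 1)
    return blobs


# Invented sample: one vCard and one iCalendar blob surrounded by binary junk.
sample = (b"\x00\x17junk BEGIN:VCARD\nFN:Alice\nEND:VCARD\x00\x00"
          b"BEGIN:VCALENDAR\nSUMMARY:Lunch\nEND:VCALENDAR\xff")
print(extract_blobs(sample, "VCARD"))
print(extract_blobs(sample, "VCALENDAR"))
```

Each call returns only the blobs of the requested kind, so the same function covers both the calendar and the contacts.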

This migration had me thinking about how important it is for an application’s data to be easily portable. I feel that not many projects rank this high enough among their priorities.

18 thoughts on “Extracting Data from Akonadi (Kontact)”

  1. Sometimes sqlite will open them happily. While not as good as a flat file for exporting, sqlite databases really feel great and offer easy tinkering for users too 🙂

  2. Just before writing this post, I ranted on the same topic over lunch with a friend, and wondered why they hadn’t gone with SQLite. I, too, am a big fan of it. It’s simple, has great tools to browse it (SQLiteman) and it’s supported out-of-the-box in Python. What else can you ask for?

    I think the problem starts when developers don’t think thoroughly about what their DB will be used for when they pick one. So you end up with MySQL serving a single user on a desktop machine.

  3. There are actually good reasons to use something better than SQLite — in particular, SQLite performs quite badly when multiple clients use it (and Akonadi serves a single user, but multiple clients — all the PIM parts).

    But on a much deeper level, Akonadi is *not* a backend storage, but only a caching server; your data should still be available in the files where it has always been.

    See: http://techbase.kde.org/Projects/PIM/Akonadi#Where_does_Akonadi_store_my_data.3F (and read more of the FAQ for comments on SQLite and other issues).

  4. @Shai, that FAQ sounds interesting, maybe it’s not Akonadi’s fault. But when I take a look at the address books I’ve created post-Akonadi, the only thing I find in their folders is a WARNING_README.txt file which reads:

    Important Warning!!!
    
    Don't create or copy vCards inside this folder manually, they are managed by the Akonadi framework!
    

    which had me ranting about Akonadi. Furthermore, grepping for data that should have appeared in my calendar files, which were updated after Akonadi was installed, turned up only Akonadi’s DBs.

    Regarding access by multiple clients, I believe, and please correct me if I’m wrong, Akonadi shouldn’t be facing any outstanding load. So, why not serialize the requests to the DB in Akonadi? This will enable using SQLite as a backend and serving multiple clients.

  5. 1) Re grep: I’ve seen similar things happen with address books — that is, Akonadi holding the data and not writing it to the storage backend. Usually, stopping Akonadi and restarting it (you can do this via the Akonadi tray applet) should make it sync out correctly; actually, I haven’t seen this thing happening since KDE4.6. YMMV.

    2) I am not a KDEPIM developer, but I trust them (with my mail and other PIM data…). Which means when they say on SQLite “we tried”, I assume they ran into real problems and not just theoretical ones.

    Guessing: Akonadi has different kinds of clients; there’s KMail etc. on one hand, and there’s the “resources” — data fetchers — on the other. To handle all of them without unacceptable latency on user requests, they used multithreading or multiprocessing. After that, synchronizing database access in order to serialize it is quite hard; add to it being not necessary for anything but SQLite, and the decision to just drop SQLite is a no-brainer.

  6. @Shai, you raise a couple of good points. I guess it would be interesting for me to go through the developers’ mailing lists and learn what problems they ran into.

  7. Good article – thanks for sharing.

    I’m using Kubuntu 11.04 and would like to inspect some Akonadi data, which uses the internal MySQL server. When I run MySQL Workbench I can see the Amarok, Digikam and other catalogs, but nothing about Akonadi. Where is the Akonadi data stored?

  8. Akonadi places the database in its data dir. So it should be under ~/.local/share/akonadi/db_data/ or something like that.

  9. Thanks, Guy. However, I still cannot see the data. Investigating a little deeper, I see that Akonadi connects to MySQL through the socket .local/share/akonadi/db_misc/mysql.socket, but this file is empty. How do I create an Akonadi connection in a visual MySQL tool like Workbench or Administrator?

    Thanks again for your patience.

  10. The file is empty because it’s a unix socket, it isn’t a regular file, it’s somewhat like a named pipe. I never used a visual administration tool for MySQL, apart from phpMyAdmin, so I don’t really know how to help you.

  11. Let me abuse your patience: I’m running phpMyAdmin but don’t know how to open the Akonadi database in this tool. Again, I can see other KDE databases, such as Amarok and DigiKam, but no Akonadi…

    Thanks again.

  12. I failed to export my contacts before moving to Kubuntu Oneiric and KMail 2, and found that the usual kabc files containing my contacts were empty. Your script was a great help in extracting the data from the database. Thanks!

  13. Your script does not appear to be a shell script.
    Can you offer any guidance on using it?

  14. Never mind.
    I added
    #!/usr/bin/python as the first line
    and saved the script as exp.py, and it works great.
    The command to run is then:
    exp.py < ibdata1 > contacts_1.vcf
    where ibdata1 is your data
    and contacts_1.vcf is your new .vcf file.

  15. When I came across your post, I felt there must be a more natural way to extract the calendar data. I spent many hours trying to figure out another solution, searching and grepping, but at the end I gave your approach a try and it worked really well. Very disappointing, however, that Akonadi calendar resources require such a big effort for a simple data migration. Thanks a lot for sharing this.

  16. @Andy, I felt the same way before resorting to this “hackish” solution.

    Good data migration is something lots of programs ignore. It’s much nicer to implement “import” functions in your programs than “export”. I can only regret not having enough time to pick up the gauntlet and contribute code for these features to Akonadi.

  17. A great post. I’ve modified the script a bit so that it iterates through files and picks only unique vCards, i.e. using a set instead of a list:

    #!/usr/bin/env python3
    # extract unique vCards from Akonadi database files
    # based on:
    # http://www.guyrutenberg.com/2011/08/28/extracting-data-from-akonadi-kontact/
    # https://ubuntuforums.org/showthread.php?t=1972435&s=ff1e5c959742784d8ffd0bbb38c98ae0
    # usage:
    # ./script_name akonadi_dir > vcards.vcf

    import sys
    import glob

    # database file extensions to loop through
    exts = ("ibd", "glass")

    # Akonadi directory
    akonadi_dir = sys.argv[1]

    enc = "utf-8"

    # vCard delimiters
    start_delim = bytes("BEGIN:VCARD", encoding=enc)
    end_delim = bytes("END:VCARD", encoding=enc)

    vcards = set()
    for ext in exts:
        for db_filename in glob.glob(akonadi_dir + "/**/*." + ext, recursive=True):
            with open(db_filename, "rb") as db_file:
                db_file_data = db_file.read()
            start_pos = db_file_data.find(start_delim)
            while start_pos > -1:
                end_pos = db_file_data.find(end_delim, start_pos + 1)
                if end_pos == -1:
                    # unterminated vCard: stop instead of looping forever
                    break
                vcard = db_file_data[start_pos : end_pos + len(end_delim)].decode(enc)
                vcards.add(vcard)
                start_pos = db_file_data.find(start_delim, end_pos + 1)
    print("\n".join(vcards))
