Encyclopaedia Britannica for Mediawiki

Saturday, August 9th, 2008

Here is a Python script that I cooked up some time ago. It feeds the articles from the Encyclopaedia Britannica 2008 Ultimate DVD into MediaWiki, which is a MUCH better interface than the one shipped on the DVD. It supports the Encyclopaedia Britannica proper, the Britannica Books of the Year, the Britannica Student Library, and the index, each of which goes to a separate namespace. To use it, run it from the root of your MediaWiki installation, which you should set up to include some additional namespaces in LocalSettings.php:

$wgAllowExternalImages = true;
$wgExtraNamespaces[100] = 'IndexEntry';
$wgExtraNamespaces[102] = 'BookOfTheYear';
$wgExtraNamespaces[104] = 'YearInReview';
$wgExtraNamespaces[106] = 'Document';
$wgExtraNamespaces[108] = 'BSL';
$wgContentNamespaces[] = 102;
$wgContentNamespaces[] = 106;
$wgContentNamespaces[] = 108;
$wgCapitalLinks = false;

You will also need to change the path to your Britannica DVD at the top of the script. The import takes several days to finish, but it can be interrupted at any time and will pick up where it left off when started again. Make sure the script is allowed to write the necessary data to your MediaWiki directory.
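To illustrate how an import of this kind can survive interruptions, here is a minimal sketch of one common approach: append the ID of each finished item to a log file and skip those IDs on the next run. This is not taken from dopidx.py; the names `load_done`, `import_all`, `import_one` and the state file are hypothetical.

```python
import os


def load_done(state_path):
    """Return the set of item IDs recorded as finished (empty if no log yet)."""
    if not os.path.exists(state_path):
        return set()
    with open(state_path, encoding="utf-8") as fh:
        return {line.rstrip("\n") for line in fh if line.strip()}


def import_all(item_ids, import_one, state_path):
    """Import every item not yet recorded in the log file at state_path.

    import_one is a caller-supplied function (hypothetical here) that does
    the real work for a single item ID, given as a string.
    """
    done = load_done(state_path)
    with open(state_path, "a", encoding="utf-8") as log:
        for item_id in item_ids:
            if item_id in done:
                continue  # finished in an earlier run
            import_one(item_id)
            # Record the item only after it succeeded, and flush right away
            # so an interruption loses at most the item in progress.
            log.write(item_id + "\n")
            log.flush()
            done.add(item_id)
    return done
```

If the process is killed halfway through an item, that item is simply imported again on the next run, so `import_one` should be safe to repeat.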

The script tries its best to transform the highly inconsistent Britannica HTML code into MediaWiki markup, including inline images and diagrams. It even adds links to the sparsely linked articles, yielding remarkable results. It is not perfect, however, and I’d like to hear of any improvements you can come up with.
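For a rough idea of what such a transformation involves, here is a small sketch that converts a handful of HTML tags to wiki markup using Python's standard `html.parser` module. It is only an illustration, not the approach dopidx.py takes: the class `WikiTextConverter` and the function `html_to_wikitext` are hypothetical, and treating every `href` as an article title is an assumption made for the example.

```python
from html.parser import HTMLParser


class WikiTextConverter(HTMLParser):
    # Tags whose opening and closing forms map to the same wiki markup.
    SIMPLE = {"b": "'''", "strong": "'''", "i": "''", "em": "''"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []          # output fragments, joined at the end
        self.open_links = []   # one flag per open <a>: did it emit "[["?

    def handle_starttag(self, tag, attrs):
        if tag in self.SIMPLE:
            self.out.append(self.SIMPLE[tag])
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Assumption: the href is already a usable article title.
                self.out.append("[[" + href + "|")
            self.open_links.append(bool(href))
        elif tag == "p":
            self.out.append("\n\n")  # paragraphs become blank lines

    def handle_endtag(self, tag):
        if tag in self.SIMPLE:
            self.out.append(self.SIMPLE[tag])
        elif tag == "a" and self.open_links:
            if self.open_links.pop():
                self.out.append("]]")

    def handle_data(self, data):
        self.out.append(data)


def html_to_wikitext(html):
    """Convert a small subset of HTML to wiki markup; unknown tags are dropped."""
    converter = WikiTextConverter()
    converter.feed(html)
    converter.close()
    return "".join(converter.out).strip()
```

Calling `html_to_wikitext('<p>The <b>aardvark</b> lives in <a href="Africa">Africa</a>.</p>')` returns `The '''aardvark''' lives in [[Africa|Africa]].` Real Britannica pages would need far more special cases than this.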

Download: dopidx.py

Migrating to Google Mail

Wednesday, August 16th, 2006

Since I often get annoyed by the peculiarities of whatever operating system I’m using at the moment, I have installed nine on my new computer, so I can switch from time to time. For mail reading, however, this would require setting up mail clients on all nine operating systems, not to mention keeping the mail folders in sync. That’s what made me get a Google Mail account at http://gmail.afraid.org/. After that, all I had to do was upload my several hundred megabytes of mail to the new account. There is one Python-based GUI tool for doing this, but it did not perform well. In the end, a simple and stupid SMTP client (Putmail) and some shell scripting did the trick.
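For anyone who would rather do this in Python than with Putmail and shell scripts, the idea can be sketched roughly as follows: walk through an mbox file and hand each raw message to an SMTP connection. This is not what I actually ran. The function names are hypothetical, and the host, port, credentials and addresses are placeholders you would have to fill in for your own account.

```python
import mailbox
import smtplib


def resend_mbox(mbox_path, send):
    """Call send(raw_bytes) for every message in the mbox; return the count."""
    box = mailbox.mbox(mbox_path, create=False)
    count = 0
    try:
        for message in box:
            send(message.as_bytes())
            count += 1
    finally:
        box.close()
    return count


def make_smtp_sender(host, port, user, password, from_addr, to_addr):
    """Hypothetical helper: open an SMTP connection and return (send, smtp).

    All arguments are placeholders for your own server and account details.
    Call smtp.quit() when you are done.
    """
    smtp = smtplib.SMTP(host, port)
    smtp.starttls()
    smtp.login(user, password)

    def send(raw_bytes):
        # The envelope addresses are set here; the message headers stay as is.
        smtp.sendmail(from_addr, [to_addr], raw_bytes)

    return send, smtp
```

Keeping the mbox loop separate from the SMTP part makes it easy to try the loop first with a harmless `send`, such as appending to a list, before any mail actually goes out.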