MC's journal

Prickle-Prickle, the 39 day of Chaos in the YOLD 3181

Converting a blog from Blosxom to Pelican

My blog has always been just static files on a web server. From 2009, when I started blogging again, I used a slightly hacked Blosxom to generate the blog from text files written in Markdown. Blosxom is a bit dated and has numerous problems. One in particular is that it relies on the modification time of each file as the publishing date, so touching an old entry changes its date. I thought it was time for a change.

Enter Pelican, a static blog generator written in Python with numerous themes and plugins. I experimented with it a bit and found it quite nice.

Many of the Pelican themes are huge piles of JavaScript and CSS, built on popular frameworks that pull in even more JS and CSS. I made a very minimalist theme instead.

Of course I wanted to convert my old Blosxom blog entries to Pelican. Some of the things I wanted to keep through the conversion:

  1. Titles should be kept from the original. In Blosxom, the first line of the file is the title. Convert that to a "Title: title" line in the new file.
  2. Modification time of the original file should be inserted into a "Date:" timestamp.
  3. Permalinks should not be changed. I want the URL to the original blog entry to be the same after conversion. This is possible with the "url:" and "save_as:" metadata lines.

This is what I came up with:

#! /usr/bin/env python

import os
import time

newprefix = '/tmp/pelicanblog/'

if not os.path.isdir(newprefix):
    os.makedirs(newprefix)

for name in os.listdir('.'):
    if os.path.isfile(os.path.join('.', name)):
        # Get the modification time of file
        mtime = os.path.getmtime(name)

        with open(name, 'r') as f:
            # Now read first line of file as title (readline() keeps
            # the trailing newline, so none is added below)
            title = f.readline()
            # Write out our collected metadata to name.md instead of
            # name.txt
            newname = name.replace('.txt', '.md')
            nameurl = name.replace('.txt', '.html')
            # Open the new file in a with block so it is flushed and
            # closed before the next entry is processed
            with open(newprefix + newname, 'w') as nf:
                nf.write('Title: {}'.format(title))
                nf.write('Date: {}\n'.format(time.ctime(mtime)))
                nf.write('url: {}\n'.format(nameurl))
                nf.write('save_as: {}\n'.format(nameurl))

                # Copy rest of file from original.
                for line in f:
                    nf.write(line)

Run this script from the directory where you keep all your Blosxom blog entries. You'll find the result under /tmp/pelicanblog. The script converts every regular file in the directory, not only the .txt entries, so weed out the unnecessary files and then copy the rest to your Pelican content directory. Done!
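For anyone who wants less weeding afterwards, the same conversion can be written as a function that works on one entry at a time. This is only a sketch, not the script I actually ran: the name convert_entry is made up for illustration, it assumes every entry ends in .txt, and it writes the date as 'YYYY-MM-DD HH:MM' (the form used in the Pelican documentation) instead of the ctime() string.

```python
import time


def convert_entry(text, mtime, name):
    """Return the Pelican version of one Blosxom entry (hypothetical helper).

    text  -- full contents of the Blosxom file; the first line is the title
    mtime -- modification time in seconds since the epoch
    name  -- file name of the entry, assumed to end in '.txt'
    """
    # Split the title (first line) from the rest of the entry.
    title, _, body = text.partition('\n')
    # Assumption: an explicit 'YYYY-MM-DD HH:MM' date in local time.
    date = time.strftime('%Y-%m-%d %H:%M', time.localtime(mtime))
    # Keep the permalink: name.txt was published as name.html.
    nameurl = name[:-len('.txt')] + '.html'
    header = [
        'Title: {}'.format(title),
        'Date: {}'.format(date),
        'url: {}'.format(nameurl),
        'save_as: {}'.format(nameurl),
    ]
    return '\n'.join(header) + '\n' + body
```

Call it from a loop over os.listdir('.') that skips every name not ending in '.txt', pass in os.path.getmtime(name) as mtime, and write the returned text to the matching .md file.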

The only thing that didn't go smoothly in the transition is the syntax highlighting of code snippets. Sometimes the heuristics guess the wrong programming language. I have marked some of the entries manually.
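In Pelican's Markdown setup, a code block is marked by putting a ":::language" line, indented like the code, just above it. I did this by hand, but for an entry where all snippets are in the same language it could be automated roughly like this. It is a rough sketch under simple assumptions: the function name mark_code_blocks is made up, a code block is taken to be any run of lines indented four spaces that follows a blank line, and indented paragraphs inside lists would be mistaken for code.

```python
def mark_code_blocks(text, lang):
    """Prefix each indented code block in text with a ':::lang' line."""
    out = []
    in_block = False
    prev_blank = True  # the start of the entry counts as a blank line
    for line in text.splitlines():
        blank = line.strip() == ''
        is_code = line.startswith('    ') and not blank
        if is_code and not in_block and prev_blank:
            # First line of a new block: add the marker unless the
            # block already carries one.
            if not line.strip().startswith(':::'):
                out.append('    :::' + lang)
            in_block = True
        elif not is_code and not blank:
            # An unindented text line ends the current block.
            in_block = False
        out.append(line)
        prev_blank = blank
    return '\n'.join(out) + '\n'
```

Blank lines inside a block do not end it, and a block that already starts with a ":::" line is left alone, so running it twice on the same entry is harmless.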

All in all a rather smooth transition to a modern blogging tool.


Written by MC using Emacs and friends.