Tuesday, August 08, 2006

Parsing TreeLine output to TiddlyWiki

I'm investigating the use of TreeLine to generate TiddlyWiki content. There are two reasons:
  • to facilitate reordering of tiddlers in a sequence
  • to allow more arcane plugin code to be generated via specialised TreeLine nodes
The TreeLine output is actually table-format HTML with markup in Twee format (basically, :: denotes the start of a tiddler, and tags go on the title line in square brackets). Sadly, the table layout thwarted my attempts to import the HTML, saved as text from a web browser, into the Gimcrack'd conversion utility, so I've had to roll my own.
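
As a rough illustration of the Twee convention just described, here is a minimal sketch that splits a title line into its title and tags. The function name parse_title_line and the exact line shape ("::Title [tag1 tag2]", tags optional and space-separated) are my assumptions for illustration, not something taken from the Twee tools.

```python
import re

# Assumed shape of a Twee title line: "::Title [tag1 tag2]" (tags optional).
TITLE_RE = re.compile(r"^::\s*(?P<title>.*?)\s*(?:\[(?P<tags>[^\]]*)\])?\s*$")


def parse_title_line(line):
    """Return (title, tags) for a Twee title line, or None if it is not one."""
    match = TITLE_RE.match(line)
    if match is None:
        return None
    # Tags are assumed to be separated by whitespace inside the brackets.
    tags = (match.group("tags") or "").split()
    return match.group("title"), tags
```

Calling parse_title_line("::Using TiddlyWiki [main]") should give the title "Using TiddlyWiki" and the single tag "main"; a line that does not start with :: gives None.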

TreeLine's HTML output seems to preserve most wiki markup, the most obvious exception being angle brackets.

Thus far I've defined three node types in TreeLine: Metadata (named thus, and containing SiteTitle and SiteSubtitle fields), Page (which corresponds to a tiddler and starts with ::) and its child SubPage. At this stage a SubPage is flattened and presented within its parent, with prepended exclamation marks signifying a wiki-style (sub)heading. There's also a Plugins node, but that's not used. In the TreeLine output each node is flagged by an HTML anchor tag and child nodes are wrapped in a div tag. Line breaks are represented as HTML br tags.

TiddlyWiki tiddlers are themselves wrapped in div tags and have the following fields in the div:
  • tiddler="Using TiddlyWiki"
  • modifier="YourName"
  • modified="200511081054", i.e. date/time
  • created="200511081054"
  • tags="main"
Within the tiddler content, line breaks are encoded as \n and single quotes are preserved. Double quotes and angle brackets (as a minimum) are represented as their HTML entity equivalents.
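
To make the storage format concrete, here is a minimal sketch that builds one tiddler div from the fields listed above. The helper names encode_body and make_tiddler_div are hypothetical. Escaping ampersands, and escaping double quotes inside attribute values, are my assumptions; the only encodings stated above are \n for line breaks and entities for double quotes and angle brackets.

```python
def encode_body(text):
    """Encode tiddler content: entities for quotes and brackets, \\n for breaks."""
    # Assumption: ampersands are escaped first so the entities added below survive.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;").replace(">", "&gt;")
    text = text.replace('"', "&quot;")
    # Line breaks become a literal backslash followed by "n"; single quotes are kept.
    return text.replace("\n", "\\n")


def make_tiddler_div(title, body, modifier, stamp, tags):
    """Return one tiddler div, using the same stamp for modified and created."""
    attrs = [
        ("tiddler", title),
        ("modifier", modifier),
        ("modified", stamp),
        ("created", stamp),
        ("tags", " ".join(tags)),
    ]
    # Assumption: double quotes in attribute values are escaped as entities.
    attr_text = " ".join(
        '%s="%s"' % (name, value.replace('"', "&quot;")) for name, value in attrs
    )
    return "<div %s>%s</div>" % (attr_text, encode_body(body))
```

The standard library's html.escape is deliberately not used here, because it also rewrites single quotes, which the tiddler format leaves alone.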

All that is required therefore is to
  • remove the head section from the HTML
  • substitute \n for br tags
  • strip any remaining HTML tags (useful discussion in the Python Cookbook)
  • replace angle brackets etc. with their entities using Peter Bengtsson's entity fixer (though I just used replace on double quotes for now)
  • for each tiddler (with prepended ::), construct the tiddler div element and its contents, incrementing the modified date (actually the minutes, so a max of 60-100 tiddlers depending on how fussy TiddlyWiki is), compiling tags as appropriate and ignoring empty lines
  • wrap the tiddlers in the boilerplate TiddlyWiki code (ignoring for the moment the TreeLine Plugins node)
  • generate a web interface in Karrigell
This is very much a first step but I have working alpha code for this that allows me to start authoring with some confidence that the content can migrate to TiddlyWiki.
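
The clean-up and splitting steps in the list above can be sketched roughly as follows. This is not my alpha code, just an illustration: the function names are hypothetical, the tag stripping is a crude regular expression rather than the Python Cookbook recipe, and the timestamps use datetime arithmetic (which rolls over into the next hour) instead of simply bumping the minutes field.

```python
import re
from datetime import datetime, timedelta


def html_to_twee(html):
    """Reduce an HTML export to plain Twee-style text (crude, regex-based)."""
    html = re.sub(r"(?is)<head\b.*?</head>", "", html)  # remove the head section
    html = re.sub(r"(?i)<br\s*/?>", "\n", html)         # br tags become newlines
    return re.sub(r"<[^>]+>", "", html)                 # strip any remaining tags


def split_tiddlers(twee):
    """Yield (title_line, body_lines) per '::' block, ignoring empty lines."""
    title, body = None, []
    for line in twee.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("::"):
            if title is not None:
                yield title, body
            title, body = line[2:].strip(), []
        elif title is not None:
            body.append(line)
    if title is not None:
        yield title, body


def timestamps(start, count):
    """Return `count` stamps in YYYYMMDDHHMM form, one minute apart."""
    first = datetime.strptime(start, "%Y%m%d%H%M")
    return [(first + timedelta(minutes=i)).strftime("%Y%m%d%H%M") for i in range(count)]
```

Each title line from split_tiddlers still carries its bracketed tags, and the body lines would be joined with newlines before being encoded into the tiddler div.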
