apropos

Arbitrary Jekyll Layouts for Hakyll

I switched this site over to Hakyll recently. It’s very nice: it’s programmatically extensible with Haskell, integrated with Pandoc, provides a pretty complete (and succinct) framework for tweaking things as desired, and is easy to run and compile locally.

This is the entirety of my configuration:

{-# LANGUAGE OverloadedStrings #-}

import Hakyll

main :: IO ()
main = hakyll $ do
  -- Compile templates for future use
  match "_templates/*" $ compile templateBodyCompiler

  -- Detect whether HTML files are standalone or in need of a template
  match ("**.html" .||. "**.htm") $ do
    route idRoute
    compile $ do
      identifier <- getUnderlying
      field <- getMetadataField identifier "layout"
      case field of
        Just _ -> pandocCompiler
        Nothing -> getResourceBody
      >>= loadLayoutTemplate defaultContext
      >>= relativizeUrls

  -- Match all other renderable files and apply their template, if it exists
  match ("**.md" .||. "**.rst" .||. "**.org" .||. "**.adoc") $ do
    route $ setExtension "html"
    compile $ pandocCompiler
      >>= loadLayoutTemplate defaultContext
      >>= relativizeUrls

  -- Additionally copy non-HTML files verbatium
  match ("**.md" .||. "**.rst" .||. "**.org" .||. "**.adoc") $ version "raw" $ do
    route idRoute
    compile getResourceBody

  -- Copy all additional files verbatium
  match "**" $ do
    route idRoute
    compile copyFileCompiler

Well, almost. One thing Hakyll lacks out-of-the-box, coming from Jekyll, is support for dynamic layouts: specifying a template to be used for the compilation process directly from a file itself.

But Hakyll is extensible: and passes front matter metadata information through the compilation process. So, after some time picking up the basics of Haskell and smashing types together into something that compiles, I ended up with the following.

-- Loads the template specified in a post's metadata, if it exists
loadLayoutTemplate :: Context String -> Item String -> Compiler (Item String)
loadLayoutTemplate context item = do
  field <- getMetadataField (itemIdentifier item) "layout"
  case field of
    Just path ->
      let templatePath = "_templates/" ++ path ++ ".html" in
      loadAndApplyTemplate (fromFilePath templatePath) context item
    _ -> return item

Now if a file contains the layout field in its front matter, Hakyll will dynamically choose the correct layout when rebuilding the site. If a layout is specified without a corresponding entry in _templates, Hakyll will throw a “template not found” error. This is implemented as a compiler function: so it can be used anywhere, without restriction on the file path structure.

---
layout: post
title: Arbitrary Jekyll Layouts for Hakyll
---

This completely obliviates the need for Jekyll for me, and decouples my site from GitHub Pages (and from Ruby). How fun! I’m increasingly uncomfortable with hosting my site on GitHub: while I didn’t previously have many qualms about GitHub being bought by the Microsoft mega-corp, they’ve started to be more aggressive about slurping up data for LLM training. I don’t like the idea of my writing being ingested by a stochastic model that may-or-may-not parrot it back up, with no way to delete it were I to want it gone. And as my student GitHub Pro membership is expiring, this comes at a good time.

Pandoc Configuration

I also have some custom flags being passed to Pandoc, to get it to spit out MathML directly and tweak some stuff around the peculiarities of the Markdown syntax I prefer.

import Hakyll hiding (pandocCompiler)
import Text.Pandoc.Options
import Text.Pandoc.Highlighting

-- Pass custom options to the Pandoc compiler
pandocCompiler :: Compiler (Item String)
pandocCompiler = pandocCompilerWith readerOptions writerOptions where
  writerOptions = defaultHakyllWriterOptions {
    writerExtensions = writerExtensions defaultHakyllWriterOptions
      <> pandocExtensions,
    writerHTMLMathMethod = MathML
  }
  readerOptions = defaultHakyllReaderOptions {
    readerExtensions = readerExtensions defaultHakyllReaderOptions
      <> pandocExtensions
      <> extensionsFromList [Ext_lists_without_preceding_blankline]
  }

Future Work

There’s a couple of things I haven’t got quite ironed out.

site watch does not serve HTML files very conveniently. GitHub Pages - and most other web hosts - will serve pages at **/foo.html at both **/foo.html and **/foo. I quite like this: HTML is the standard file format of the web, so the .html suffix is often a little redundant. And it leads to cleaner URLs.

There is no way, to my knowledge, to support template injection in things that are not templates: like how Jekyll’s {% include foo.html %} works. This is running up against the limitations of Pandoc: it has a very clear distinction between templates and source files, and only templates are allowed to use variables / control structures.

This is probably more or less for the best. There possibly might be some way to dogfood this, but it was simple enough for me to restructure anything that used direct includes to call out to a separate template instead. And Pandoc uses $...$ for template syntax: which does conflict with the common use of $...$ for inline LaTeX. Which would make things complicated.