Rasmus Pank Roulund

Coffee & Coding Chronicles

Blogging with Org

I have been revamping my site to leverage Org fully. It is automatically built with GitLab CI and uploaded to GitLab Pages whenever I push a new commit via git. GitLab Pages and CI were what finally made Org desirable for websites, as I don’t have to worry about keeping those dreaded html files up to date.

By no means is this the first blog and page generated by Org. Nicolas Petton’s beautiful page is made in Org, and Dennis Ogbe’s post on blogging with Org and using ox-publish was a good inspiration.

Like Dennis, I find that one of the main problems of blogging with Org is generating the index. org-publish-org-sitemap is already pretty good, and sorts files according to the #+DATE keywords. All it needs is some better template support. For instance, to make the index more “blog-like”, I’d like to #+include the lead block from each post. Fortunately, once this patch is merged, it will be a matter of extending org-publish-sitemap-file-entry-format. Until this is widely available, that is, when Org 9 is released (another version of the patch was merged), I’ll be using the hack shown below.

Building Org sites with GitLab CI

With GitLab CI it is easy to build a static page with one’s favorite static site generator. Needless to say, I’m a big fan of ox-html. It’s in turn easy to build a static site with ox-publish.

I keep the publishing project for the site, plus all hacks, in a single file, publish.el. Thus, I can publish the site with a single command:

emacs --batch --no-init-file --load publish.el --funcall org-publish-all

This lets me use the following CI configuration file:

image: pank/docker-ox
       # vipintm/xelatex

before_script:
  - apt-get update && apt-get install -y wget make emacs-nox
pages:
  script:
    - emacs --batch --no-init-file --load publish.el --funcall org-publish-all
  artifacts:
    paths:
      - public
  only:
    - master

I host my own Runner. To use one of the public runners provided by gitlab.com, see this bare-bones example repository.

Org configuration

Below is the publish.el I use to publish the site. What follows is an annotation of some of the functions, perhaps mainly to actually use coderef for once.

To avoid keeping the head of the document in publish.el, I use the function in line pank.eu-head, which just loads an external, common head file as a string. I can load styles and scripts in the head file.
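
The idea behind this helper can be tried in isolation. The following is a minimal sketch, not the function from publish.el: demo-file-to-string, the temporary file, and the stylesheet path are hypothetical names made up for illustration.

```elisp
;; Hypothetical helper for illustration; publish.el defines its own
;; version as pank.eu-head.
(defun demo-file-to-string (file)
  "Return the contents of FILE as a string."
  (with-temp-buffer
    (insert-file-contents file)
    (buffer-string)))

;; Write a throwaway head file, read it back, then clean up.
(let ((tmp (make-temp-file "head" nil ".html")))
  (unwind-protect
      (progn
        (with-temp-file tmp
          (insert "<link rel=\"stylesheet\" href=\"/css/demo.css\"/>\n"))
        (demo-file-to-string tmp))
    (delete-file tmp)))
```

The returned string can then be assigned to variables such as org-html-style-default, which is how the head file ends up in every exported page.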

pank.eu-preamble sets the preamble of each sub-project defined in org-publish-project-alist. Actually, org-html-preamble-format should be flexible enough to do this, so I don’t know why I’m not doing it like that.
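
As a rough illustration of that alternative, here is a minimal sketch of a preamble driven by org-html-preamble-format. The HTML markup is a made-up stand-in for the contents of preamble.html, and the "en" key assumes the default export language; %t and %s are the title and subtitle placeholders documented for this variable.

```elisp
;; Sketch only: the markup below is hypothetical, not the site's
;; actual preamble.html.
(setq org-html-preamble t  ; t means: build the preamble from the format below
      org-html-preamble-format
      '(("en" "<h1><a href=\"/\">%t</a></h1>\n<p class=\"subtitle\">%s</p>")))
```

Note that %t expands to each document's own title, whereas pank.eu-preamble passes a fixed title per sub-project, which may be one reason the two are not interchangeable as-is.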

pank.eu-add-html-lang-tag inserts the language into the opening html tag, as Hyphenator.js requires it. This functionality is already in place in Org 9.
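
To illustrate what this filter does to the final output, here is a minimal sketch that applies the same substitution to a hand-written string. demo-add-lang-tag, the sample document, and the language code "en" are assumptions for illustration; in publish.el the language comes from org-export-default-language.

```elisp
;; Hypothetical stand-alone version of the substitution; it does not
;; need Org to be loaded.
(defun demo-add-lang-tag (output lang)
  "Return OUTPUT with LANG added to a bare opening html tag."
  (replace-regexp-in-string
   "<html>" (format "<html lang=\"%s\">" lang) output t t))

(demo-add-lang-tag "<!DOCTYPE html>\n<html>\n<head></head>\n</html>" "en")
;; => "<!DOCTYPE html>\n<html lang=\"en\">\n<head></head>\n</html>"
```

A tag that already carries attributes does not match the pattern and is left alone, so the substitution does nothing when the exporter has already added the language.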

There are a couple of hacks to make blog articles a bit nicer. First, in line pank.eu-blog-article I define a function that inserts the #+title of the post as a normal headline, which in turn becomes an h2 in the output. It also removes any emphasis from the #+title line, as I use a cool css hack from Nicolas’s site, where bold is displayed as block. The function in line pank.eu-blog-index fixes up the blog index. This is a remedy for the poor template support for ox-publish sitemaps. It will be fixed in Org 9.
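
The title handling can be tried on a plain string. This is a sketch under assumptions rather than the hook itself: demo-title-to-headline and the sample text are hypothetical, and the body mirrors the core of pank.eu-blog-article without the file-name checks.

```elisp
(require 'org) ; provides `org-emph-re'

;; Hypothetical wrapper for illustration; the real hook edits the
;; export buffer in place instead of returning a string.
(defun demo-title-to-headline (text)
  "Return TEXT with its #+title line repeated as a top-level headline.
Emphasis markers are stripped from the #+title line only."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (when (re-search-forward "^#\\+title: ?" nil t)
      ;; Save the title first, so the headline keeps its emphasis.
      (let ((title (buffer-substring (point) (line-end-position))))
        (goto-char (line-beginning-position))
        (while (re-search-forward org-emph-re (line-end-position) t)
          (replace-match " \\4 "))
        (end-of-line)
        (insert "\n* " title)))
    (buffer-string)))

(demo-title-to-headline "#+title: Blogging with *Org*\n\nSome text.")
```

The result should contain the new headline "* Blogging with *Org*" directly below a #+title line whose emphasis markers are gone.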

The actual publishing project is defined in org-publish-project-alist.

;; publish.el --- Publish pank.eu using ox-html and ox-publish
;; Author: Rasmus

;; (package-initialize)
;; Initialize locally downloaded Org, if available 

(add-to-list 'load-path "/usr/share/emacs/site-lisp")

(defun pank.eu-get-latest-org ()
  "Download the latest Org if the shipped version is too old."
  (let* ((default-directory "/tmp/")
         (org-dir "/tmp/org-mode/")
         (cgit-url "http://orgmode.org/cgit.cgi/org-mode.git/snapshot/master.tar.gz")
         (htmlize-url "https://raw.githubusercontent.com/hniksic/emacs-htmlize/master/htmlize.el")
         (cgitp (zerop (shell-command (concat "wget -q --spider " cgit-url)))))
    (unless (file-directory-p org-dir)
      (url-copy-file
       (if cgitp cgit-url "http://orgmode.org/org-latest.tar.gz")
       "org.tar.gz" t)
      (shell-command
       (concat "tar xfz org.tar.gz;"
               (if cgitp "mv master org-mode;")
               (concat "cd " org-dir ";")
               "make autoloads")))
    (unless (featurep 'htmlize)
      (url-copy-file htmlize-url (concat org-dir "lisp/htmlize.el") t))
    (add-to-list 'load-path (concat org-dir "lisp/"))
    (add-to-list 'load-path (concat org-dir "contrib/lisp/"))))

(pank.eu-get-latest-org)

(require 'org)
(require 'ox)
(require 'ox-html)
(require 'ox-latex)
(require 'ox-publish)

;; Timestamps are no good since the GitLab runner always recreates
;; everything.  Though maybe it just copies over, in which case
;; timestamps might lead to incremental updates, which would be
;; better...  Maybe timestamps should be used...
(setq org-publish-use-timestamps-flag nil)
(setq user-full-name "Rasmus Pank Roulund")
(setq user-mail-address "rasmus@pank.eu")
;; (setq org-latex-pdf-process '("latexmk -g -xelatex %f"))

;; Set default settings outside of publish to not repeat settings.
;; Can be overwritten on a file basis.
(setq org-export-with-inlinetasks nil
      org-export-with-section-numbers nil
      org-export-with-smart-quotes t
      org-export-with-statistics-cookies nil
      org-export-with-toc nil
      org-export-with-tasks nil)

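(defun pank.eu-head (&optional header) ; (ref:pank.eu-head)
  "Return the contents of HEADER, by default head.html, as a string."
  (with-temp-buffer
    ;; `insert-file' is meant for interactive use.
    (insert-file-contents (or header "head.html"))
    (buffer-string)))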

;; HTML settings
(setq org-html-divs '((preamble "header" "top")
                      (content "main" "content")
                      (postamble "footer" "postamble"))
      org-html-container-element "section"
      org-html-metadata-timestamp-format "%Y-%m-%d"
      org-html-checkbox-type 'html
      org-html-html5-fancy t
      org-html-htmlize-output-type 'css
      org-html-head-include-default-style t
      org-html-style-default (pank.eu-head "head.html")
      org-html-head-include-scripts t
      org-html-scripts (pank.eu-head "head-extra.html")
      org-html-doctype "html5"
      org-html-home/up-format "%s\n%s\n")

(defun pank.eu-preamble (&optional title subtitle preamble) ; (ref:pank.eu-preamble)
  "Make a suitable preamble based on the PREAMBLE template, TITLE and SUBTITLE."
  ;; TODO: This is essentially `org-html-preamble-format'.  Port to that.
  (let ((f (lambda (str) (save-match-data
                      (replace-regexp-in-string
                       "\n?</?p>\n?" ""
                       (org-trim (or (org-export-string-as str 'html t) "")) t t)))))
    (with-temp-buffer
      (insert-file-contents (or preamble "preamble.html"))
      ;; Replace each placeholder only if the template contains it,
      ;; and insert the replacement text literally.
      (when (search-forward "%TITLE" nil t)
        (replace-match
         (funcall f (or title
                        "@@html:<a href=\"/\">@@Rasmus Pank *Roulund* @@html:</a>@@"))
         t t))
      (when (search-forward "%SUBTITLE" nil t)
        (replace-match (funcall f (or subtitle "")) t t))
      (buffer-string))))

(defun pank.eu-publish-dir (&optional dir)
  (concat (file-name-as-directory "./public")
          (and dir
               (if (file-name-extension dir) dir
                 (file-name-as-directory dir)))))

(defun pank.eu-add-html-lang-tag (output backend info) ; (ref:pank.eu-add-html-lang-tag)
  "Add a lang attribute to the opening html tag of OUTPUT.
Only the development version of Org handles the language properly."
  (when (org-export-derived-backend-p backend 'html)
    (replace-regexp-in-string "<html>" (format "<html lang=\"%s\">" org-export-default-language)
                              output)))

(add-to-list 'org-export-filter-final-output-functions
             'pank.eu-add-html-lang-tag)

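(defun pank.eu-blog-article (backend) ; (ref:pank.eu-blog-article)
  "Insert the #+title of a blog post as a top-level headline.
Also strip emphasis markers from the #+title line itself.  Only
applies to the html BACKEND and to files in blog/ other than the index."
  (let ((file (buffer-file-name (current-buffer))))
    (when (and file (eq backend 'html)
               (string-match-p "blog/$" (file-name-directory file))
               (not (string-match-p "index\\.org" file)))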
      (goto-char (point-min))
      ;; (insert "#+html_head_extra: <link rel=\"stylesheet\" type=\"text/css\" href=\"/css/blog.css\"/>\n")
      (when (re-search-forward "^#\\+title: ?" nil t)
        (let ((title (buffer-substring (point) (line-end-position))))
          (goto-char (line-beginning-position))
          (while (re-search-forward org-emph-re (line-end-position) t)
            (replace-match " \\4 "))
          (end-of-line)
          (insert "\n* " title))))))

(add-to-list 'org-export-before-processing-hook 'pank.eu-blog-article)

(defun pank.eu-blog-index (title list) ; (ref:pank.eu-blog-index)
  "Return the blog index as a string with TITLE and one subtree per entry in LIST."
  (mapconcat
   'identity
   (list
    (concat "#+TITLE: " title)
    (org-list-to-subtree list '(:istart "** "))
    "
#+OPTIONS: title:nil num:nil
#+html_head_extra: <link rel=\"stylesheet\" type=\"text/css\" href=\"/css/blog-index.css\"/>")
   "\n\n"))

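(defun pank.eu-blog-format-entry (entry style project)
  "Format ENTRY of PROJECT for the blog index; STYLE is ignored.
Directories are skipped, so only posts get an entry."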
  (when (not (directory-name-p entry))
    (concat
     (format "
[[file:%s][%s]]
#+begin_article-info
#+begin_date
%s
#+end_date
#+begin_tags
%s
#+end_tags
#+end_article-info

#+INCLUDE: \"%s::lead\"

[[file:%s][Read more]]
"
             entry
             (org-publish-find-title entry project)
             (format-time-string "%B %e, %Y" (org-publish-find-date entry project))
             (org-publish-find-property entry :keywords project 'html)
             entry
             entry))))


(defvar pank.eu-attachments (regexp-opt '("jpg" "jpeg" "gif" "png" "svg"
                                          "ico" "cur" "css" "js" "woff" "html" "pdf")))
(defvar pank.eu-basedir (file-name-directory (or load-file-name buffer-file-name)))
(defvar pank.eu-postamble "<p>Last updated: <span class=\"date\">%C</span></p>")
(defvar pank.eu-blog-title "Coffee & Coding Chronicles")

(setq org-publish-project-alist ; (org-publish-project-alist)
      (list
       ;; Various misc files in the root
       (list "pank.eu--org"
             ;; Publish details 
             :base-directory pank.eu-basedir
             ;; :exclude-tags '("noexport" "abstract")
             :base-extension "org"
             :recursive t
             :exclude (regexp-opt (list "public" "cv" "blog" "README.org"))
             :publishing-function '(org-html-publish-to-html)
             :html-postamble pank.eu-postamble
             :publishing-directory (pank.eu-publish-dir)
             ;; Org fiddling
             :auto-sitemap nil
             ;; html fiddling
             :html-preamble (pank.eu-preamble nil "PhD candidate in economics")
             )
       ;; The CV is its own project for now... 
       (list "pank.eu--cv"
             :base-directory (concat pank.eu-basedir "cv/")
             :html-preamble (pank.eu-preamble
                             nil ;; "Curriculum Vitæ"
                             (format "Curriculum Vitæ [[%s][@@html:%s@@]]"
                                     "file:../cv.pdf"
                                     (concat
                                      "<img id=\"cv-pdf-icon\" src=\"/images/pdf.svg\" "
                                      ;; img is a void element, so no closing tag.
                                      "alt=\"PDF CV\" title=\"Download CV as PDF\" />")))
             :html-postamble pank.eu-postamble
             :publishing-directory (pank.eu-publish-dir "cv")
             :publishing-function '(org-html-publish-to-html
                                    (lambda (plist filename pub-dir)
                                      (org-latex-publish-to-pdf plist filename
                                                                (pank.eu-publish-dir))
                                      (rename-file
                                       (concat (pank.eu-publish-dir)
                                               (file-name-base filename) ".pdf")
                                       (concat (pank.eu-publish-dir)
                                               "cv.pdf")
                                       t))))
       ;; Publish the blog
       (list "pank.eu--blog"
             :base-directory (concat pank.eu-basedir "blog/")
             :publishing-directory (pank.eu-publish-dir "blog")
             :publishing-function 'org-html-publish-to-html
             :exclude ".*draft.*"
             :with-title nil
             :html-preamble (pank.eu-preamble nil
                                              (format "[[https://www.pank.eu/blog/][%s]]"
                                                      pank.eu-blog-title))
             :html-postamble pank.eu-postamble
             :sitemap-filename "index.org"
             ;; :sitemap-file-entry-format "* [[file:%l][%t]]  ; (file-entry)
;; #+include: \"%f::lead\"

;; [[file:%l][Read more]]"
             :auto-sitemap t
             :sitemap-title pank.eu-blog-title
             :sitemap-function 'pank.eu-blog-index
             :sitemap-format-entry 'pank.eu-blog-format-entry
             :sitemap-style 'list
             :sitemap-sort-files 'anti-chronologically)
       ;; Move static files, maybe move into
       (list "pank.eu--static"
             :base-directory pank.eu-basedir
             :exclude (regexp-opt '("public" "head.html" "head-extra.html"))
             :base-extension pank.eu-attachments
             :publishing-directory (pank.eu-publish-dir)
             :publishing-function 'org-publish-attachment
             :recursive t)
       (list "pank.eu" :components '("pank.eu--org"
                                     "pank.eu--cv"
                                     "pank.eu--blog"
                                     "pank.eu--static"))))

;; LaTeX style

(setq org-latex-default-packages-alist
        '(("AUTO" "inputenc" t ("pdflatex"))
          ("AUTO" "polyglossia" nil ("xelatex" "lualatex"))
          ("" "graphicx" t)
          ("" "booktabs" t)
          ("" "microtype" nil)
          ;; Options are not compatible with beamer?
          ("unicode, psdextra,hidelinks" "hyperref" nil)))

(add-to-list 'org-latex-classes
             '("koma-article"
               "\\documentclass[fontsize=10pt,
captions=tableheading
]{scrartcl}
\\usepackage{scrpage2}
\\renewcommand*{\\othersectionlevelsformat}[3]{%
\\makebox[0pt][r]{{#3}\\autodot\\enskip}}
\\renewcommand\\labelitemi{\\normalfont\\bfseries\\textendash}
\\renewcommand\\labelitemii{\\normalfont\\bfseries\\textbullet}"
               ("\\section{%s}" . "\\section*{%s}")
               ("\\subsection{%s}" . "\\subsection*{%s}")
               ("\\subsubsection{%s}" . "\\subsubsection*{%s}")
               ("\\paragraph{%s}" . "\\paragraph*{%s}")
               ("\\subparagraph{%s}" . "\\subparagraph*{%s}")))


(defun pank/org-guess-textsc (content backend info)
  "Automatically downcase and wrap all-caps words in textsc.
The function is a bit slow...

TODO: make the function work with headlines, but without doing it
on subsequent text.
"
  (if (org-export-derived-backend-p backend 'latex 'html)
      (let* (case-fold-search
             (latexp (org-export-derived-backend-p backend 'latex))
             (wrap
              (if latexp
                  (cons "\\textsc{"  "}")
                (cons "<span class=\"small-caps\">"  "</span>"))))
        (replace-regexp-in-string
         ;; words with an uppercase
         ;; (rx bow (or (*? any) (1+ upper) (*? any) ) eow)
         ;; (rx bow (1+ any) eow)
         "\\w+"
         (lambda (str)
           (if (or (string-equal str (downcase str))
                   (string-equal str (capitalize str)))
               ;;(member str (list (downcase str) (capitalize str)))
               ;; (string-match-p (rx bow (? upper) (1+ lower) eow)  str)
               str
             (replace-regexp-in-string
              "[[:upper:]]+"
              (lambda (x)
                (concat (car wrap) (downcase x) (cdr wrap)))
              str t t)))
         content t t))
    content))

(add-to-list 'org-export-filter-plain-text-functions
             'pank/org-guess-textsc)