adding an RSS feed of recent activity to my org-roam digital garden with org-publish
So, I wanted to add an RSS feed for activity in my digital garden.
Why?
Mainly so that I could include a widget on my WordPress site that surfaces the latest changes from the garden. At the moment the garden tends to get more action than my stream, but it's a bit hidden away.
Also, potentially, I could add the RSS feed to a Mastodon bot, which could be kind of fun.
How?
I'm using org-roam to write and org-publish to publish my digital garden. So I need something that works with that setup.
ox-rss exists. However, it expects one file with a heading per entry in order to produce its RSS feed. That's not how org-roam works: you have one file per entry.
So you need to get something from your org-roam files in a format for ox-rss to work with. Luckily you can (ab)use org-publish's sitemap functionality.
I think this is the first post that described how to do that: Org mode blogging: RSS feed
I found a few other posts that use a similar setup; they mostly look to be based on that original one, e.g. Website With Emacs, Blogging with Emacs and Org.
I used that and it works.
Add a new component to your org-publish-project-alist:
;; The , forms assume the enclosing alist is backquoted.
("commonplace-rss"
 :base-directory ,temp-dir
 :base-extension "org"
 :publishing-directory ,publish-dir
 :publishing-function commonplace/publish-rss-feed
 :rss-extension "xml"
 :html-link-home ,commonplace/publish-url
 :html-link-use-abs-url t
 :html-link-org-files-as-html t
 ;; The sitemap is (ab)used as the single org file that ox-rss exports.
 :auto-sitemap t
 :sitemap-function commonplace/generate-org-for-rss-feed
 :sitemap-title "Recent activity in Neil's Digital Garden"
 :sitemap-filename "recentchanges-feed.org"
 :sitemap-style list
 :sitemap-sort-files anti-chronologically
 :sitemap-format-entry commonplace/format-rss-feed-entry)
(defun commonplace/generate-org-for-rss-feed (title sitemap)
  "Generate a sitemap of posts that is exported as an RSS feed.
TITLE is the title of the RSS feed.  SITEMAP is an internal
representation for the files to include."
  (let* ((posts (cdr sitemap))
         ;; The list is sorted newest first, so keep the first 100 entries.
         (last-hundred (seq-subseq posts 0 (min (length posts) 100))))
    (concat "#+TITLE: " title "\n\n"
            (org-list-to-subtree (cons (car sitemap) last-hundred)))))
Tweaks from the original: I take only the hundred most recent posts from the date-ordered list. I was already doing this for my recent changes page, I think in an attempt to speed it up. (Not sure that it does, though.)
(defun commonplace/format-rss-feed-entry (entry _style project)
  "Format ENTRY for the posts RSS feed in PROJECT."
  (let* ((title (org-publish-find-title entry project))
         (link (concat (file-name-sans-extension entry) ".html"))
         (pubdate (format-time-string (car org-time-stamp-formats)
                                      (org-publish-find-date entry project))))
    (format "%s
:properties:
:rss_permalink: %s
:pubdate: %s
:end:\n"
            title
            link
            pubdate)))
This is used to format each entry that goes into the org file that's generated. I've not made any tweaks to this.
(defun commonplace/publish-rss-feed (plist filename dir)
  "Publish PLIST to RSS when FILENAME is recentchanges-feed.org.
DIR is the location of the output."
  (when (equal "recentchanges-feed.org" (file-name-nondirectory filename))
    (org-rss-publish-to-rss plist filename dir)))
This is the publishing function that you set up to be called from the RSS feed component in your org-publish-project-alist. It only exports the generated recentchanges-feed.org file; every other org file in the component is skipped.
Some notes:
I had to (require 'ox-rss) at the top of my publish.el file. And I also had to include it in spacemacs additional packages.
The original uses rss.org as the name of the generated org page that the RSS xml file is built from. But I already have a page called rss.org - it's the page in my digital garden about RSS. So I changed the name to recentchanges-feed.org. You can use whatever name you like for an RSS feed file.
You'll note above that my :base-directory is a temporary directory. I'm playing with this as a way to build the recent changes RSS only from the files that have changed most recently. These are copied into the temp dir before the org publish process runs, with:
rm -rf tempdir/*
find . -mtime -28 -name "*.org" -not -path "./tempdir/*" -exec cp --parents -r -p '{}' tempdir \;
This is to avoid processing thousands of org-roam files just to build the recent changes list.
Some issues to be resolved
Backend confusion
A filter function that I have running on the html backend is sticking its stuff in here, which breaks the RSS file.
(defun commonplace/filter-body (text backend info)
  (when (org-export-derived-backend-p backend 'html)
    (concat "<div class='e-content'>" text "</div>")))
I'm guessing the rss backend piggybacks on the html backend or something.
Yeah looks like it: https://github.com/emacsmirror/ox-rss/blob/master/ox-rss.el#L119
(defun commonplace/filter-body (text backend info)
  (when (org-export-derived-backend-p backend 'html)
    ;; The rss backend is derived from html, so exclude it explicitly.
    (unless (org-export-derived-backend-p backend 'rss)
      (concat "<div class='e-content'>" text "</div>"))))
^ sorted it.
Subfolders
It doesn't currently produce the correct URL in the RSS feed for my journal pages, which are in a journal subfolder.
Duplicate IDs
Somehow org-roam seems to think that the IDs for various pages are those for the entries in the RSS org file, not the actual pages themselves.
Some info on that here: https://org-roam.discourse.group/t/possible-to-ignore-directories-within-the-org-directory/2454
(setq org-roam-file-exclude-regexp
      (concat "^" (expand-file-name org-roam-directory) "/tempdir/"))
This seems to have resolved the issue.
To be honest, having tempdir as a subdir of the current dir is causing lots of problems. Should try to just put it somewhere else.
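Here's a rough sketch of what moving the staging directory out of the garden might look like. It assumes GNU find and cp (for -delete and --parents). The function name stage_recent_org and the paths in the usage comment are made up for illustration, and temp-dir in the publish config would need to point at the same staging directory.

```shell
# stage_recent_org SRC_DIR DEST_DIR [DAYS]
# Copy .org files modified within the last DAYS days (default 28) from
# SRC_DIR into DEST_DIR, keeping relative paths (e.g. journal/foo.org)
# and timestamps. DEST_DIR is assumed to live outside SRC_DIR.
stage_recent_org() {
  local src="$1" dest="$2" days="${3:-28}"

  mkdir -p "$dest" || return 1
  # Resolve to an absolute path, because we cd into SRC_DIR below.
  dest="$(cd "$dest" && pwd)" || return 1

  # Clear out everything from the previous run, subdirectories included.
  find "$dest" -mindepth 1 -delete

  (
    cd "$src" || exit 1
    # cp --parents recreates ./journal/foo.org as DEST_DIR/journal/foo.org.
    find . -name '*.org' -mtime "-$days" \
      -exec cp --parents -p '{}' "$dest" \;
  )
}

# Hypothetical usage, run before the org-publish step:
#   stage_recent_org "$HOME/commonplace" "${TMPDIR:-/tmp}/commonplace-rss-staging"
```

With the staging directory outside the org-roam directory, org-roam shouldn't see the copies at all, so the -not -path exclusion in the find command and (I'd guess) the org-roam-file-exclude-regexp workaround would no longer be needed.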