git.m455.casa

m455.casa

clone url: git://git.m455.casa/m455.casa


html/archive/2020/all-about-my-awful-rss-feed-generator.html

1 <!DOCTYPE html>
2 <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
3 <head>
4 <meta charset="utf-8" />
5 <meta name="generator" content="pandoc" />
6 <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
7 <title>All about my awful RSS feed generator</title>
8 <style>
9 code{white-space: pre-wrap;}
10 span.smallcaps{font-variant: small-caps;}
11 span.underline{text-decoration: underline;}
12 div.column{display: inline-block; vertical-align: top; width: 50%;}
13 div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
14 ul.task-list{list-style: none;}
15 </style>
16 <style>
17 body {
18 line-height: 1.5;
19 font-family: sans-serif;
20 font-size: 18px;
21 margin: 20px auto;
22 max-width: 630px;
23 }
24
25 a {
26 color: blue;
27 }
28
29 code, pre {
30 background-color: #fddee3;
31 font-size: 14px;
32 }
33
34 pre {
35 padding: 25px 25px;
36 overflow: auto;
37 }
38
39 pre code {
40 white-space: pre;
41 }
42
43 img {
44 max-width: 100%;
45 }
46
47 table {
48 border-collapse: collapse;
49 }
50
51 table caption {
52 font-weight: bold;
53 margin: 10px 0px;
54 text-align: left;
55 }
56
57 th, td {
58 border: 1px solid #000;
59 padding: 4px;
60 }
61
62 blockquote {
63 border-left: 3px solid #000;
64 padding-left: 10px;
65 }
66
67 .border {
68 border: 1px solid #000;
69 margin: 25px 0px;
70 padding: 5px 25px;
71 }
72
73 @media only screen and (max-width: 700px) {
74 body {
75 margin: 10px;
76 }
77 }
78
79 @media (prefers-color-scheme: dark) {
80 body {
81 background-color: #111;
82 color: #eee;
83 }
84 a {
85 color: #009fff;
86 }
87 code, pre {
88 background-color: #111;
89 color: #fd6363;
90 }
91 pre {
92 padding: 15px 25px;
93 }
94 blockquote {
95 border-left: 3px solid #666;
96 }
97 .border, th, td {
98 border: 1px solid #666;
99 }
100 }
101 </style>
102 </head>
103 <body>
104 <main>
105 <h2 id="all-about-my-awful-rss-feed-generator">All about my awful RSS feed generator</h2>
106 <p>2020-08-11 00:00</p>
107 <p>So, I joined the RSS bandwagon not too long ago. Right now, I’m just using Thunderbird’s built-in RSS feed manager to follow friends. I used to, and still do for some friends who don’t have an RSS feed, keep a text file called <code>~/Documents/links/friends-websites.txt</code>. This file contains line-separated links to friends’ homepages, which I would check most mornings while waking up with a cup of coffee.</p>
108 <p>Something about following friends’ personal homepages is way more appealing than feed-based social media sometimes. It’s relaxing, it doesn’t require attention, it doesn’t have notifications, and I don’t even need an account to follow people. How great is that? I guess it’s like having a newspaper full of my friends’ beautiful discourse.</p>
109 <p>… Okay, RSS feeds are still feeds hahaha.</p>
110 <p>So, naturally, I wanted friends to have the same convenient access to my homepage activity as I did to theirs, so I began to research how RSS worked. I didn’t really know at all to be honest, and the XML examples on Wikipedia confused the hell out of me.</p>
111 <p>After staring at the examples for a while, I kind of got the gist of what they were and their structure.</p>
112 <p><a href="https://en.wikipedia.org/wiki/RSS#Example">The example on Wikipedia</a> indicated that I didn’t need much. In its example, it had a title, a description, a link, a build date, a publishing date, and a “ttl”–whatever the hell that is.</p>
113 <p>These elements seemed like they should exist at the top of my RSS feed, inside a <code>&lt;channel&gt;</code> element, and each of them should only occur once.</p>
114 <p>Below the <code>&lt;channel&gt;</code> element, there were a few <code>&lt;item&gt;</code> elements, and inside each <code>&lt;item&gt;</code> element there was a <code>&lt;title&gt;</code>, <code>&lt;description&gt;</code>, <code>&lt;link&gt;</code>, <code>&lt;guid&gt;</code>, and a <code>&lt;pubDate&gt;</code> element.</p>
115 <p>At this point I was starting to understand more about how RSS worked.</p>
116 <p>After reading the Wikipedia page on RSS, I went to check <a href="https://www.rssboard.org/rss-specification#requiredChannelElements">the official RSS standard</a> to see what it had to say. To my luck, it had listed “required channel elements”, which said you only need the following elements in the <code>&lt;channel&gt;</code> element:</p>
117 <ul>
118 <li><code>&lt;title&gt;</code>, which contains the name of the RSS feed</li>
119 <li><code>&lt;link&gt;</code>, which contains a link to your website, not the RSS feed file</li>
120 <li><code>&lt;description&gt;</code>, which contains a phrase describing your feed</li>
121 </ul>
122 <p>With this information I made a few definitions in Racket that would later populate the <code>&lt;title&gt;</code>, <code>&lt;link&gt;</code>, and <code>&lt;description&gt;</code> elements:</p>
123 <pre><code>#lang racket/base
124
125 (require racket/file
126 racket/string)
127
128 (define title &quot;m455&#39;s blog&quot;)
129 (define homepage-url &quot;https://m455.casa&quot;)
130 (define description &quot;A blog about programming, documentation, and anything that interests me.&quot;)</code></pre>
131 <p>Next, I had to figure out which elements were required inside of the <code>&lt;item&gt;</code> element. There was no section that was called “Required …”, but I did manage to find this phrase from the <a href="https://www.rssboard.org/rss-specification#hrelementsOfLtitemgt">Elements of &lt;item&gt; section</a>:</p>
132 <blockquote>
133 <p>All elements of an item are optional, however at least one of title or description must be present.</p>
134 </blockquote>
135 <p>According to this, I only needed a <code>&lt;title&gt;</code> <strong>or</strong> a <code>&lt;description&gt;</code> element.</p>
136 <p>This was fine, because I could use the <code>&lt;title&gt;</code> element as the title of any given blog post.</p>
137 <p>Conveniently, the specification also mentioned that you could have a <code>&lt;link&gt;</code> element inside of the <code>&lt;item&gt;</code> element. This was great, because this would mean I could include a link for each blog post, so users could click a link in an RSS feed, instead of referencing the title text and searching for it manually.</p>
138 <p>After reading that, I had decided to make a test file based on the requirements I gathered:</p>
139 <pre><code>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; ?&gt;
140 &lt;rss version=&quot;2.0&quot;&gt;
141
142 &lt;channel&gt;
143 &lt;title&gt;m455&#39;s blog&lt;/title&gt;
144 &lt;link&gt;https://m455.casa&lt;/link&gt;
145 &lt;description&gt;A blog about programming, documentation, and anything that interests me.&lt;/description&gt;
146
147 &lt;item&gt;
148 &lt;title&gt;This is test1&lt;/title&gt;
149 &lt;link&gt;https://m455.casa/posts/this-is-a-test1&lt;/link&gt;
150 &lt;/item&gt;
151
152 &lt;item&gt;
153 &lt;title&gt;This is test2&lt;/title&gt;
154 &lt;link&gt;https://m455.casa/posts/this-is-a-test2&lt;/link&gt;
155 &lt;/item&gt;
156
157 &lt;/channel&gt;
158 &lt;/rss&gt;</code></pre>
159 <p>I took this over to the <a href="https://validator.w3.org/feed/#validate_by_input">W3C Feed Validation Service</a> and decided to test it.</p>
160 <p>Unsurprisingly, the validator returned errors:</p>
161 <ul>
162 <li>“item should contain a guid element”</li>
163 <li>“Missing atom:link with rel="self"”</li>
164 </ul>
165 <p>I clicked the help link beside the guid-related error, and the help documentation said that all I needed to do was add in a <code>&lt;guid&gt;</code> element. This is great, but I had no clue what I was supposed to be populating the <code>&lt;guid&gt;</code> element with.</p>
166 <p>After some research, which was basically a bunch of Wikipedia rabbit holes, I found out that the <code>&lt;guid&gt;</code> just needs to be a “unique identifier” for each <code>&lt;item&gt;</code> in a <code>&lt;channel&gt;</code>, so what better unique identifier for an item than the link to the item itself!</p>
167 <p>After modifying my already-modified RSS feed, I threw it at the RSS validator again:</p>
168 <pre><code>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; ?&gt;
169 &lt;rss version=&quot;2.0&quot;&gt;
170
171 &lt;channel&gt;
172 &lt;title&gt;m455&#39;s blog&lt;/title&gt;
173 &lt;link&gt;https://m455.casa&lt;/link&gt;
174 &lt;description&gt;A blog about programming, documentation, and anything that interests me.&lt;/description&gt;
175
176 &lt;item&gt;
177 &lt;title&gt;This is test1&lt;/title&gt;
178 &lt;link&gt;https://m455.casa/posts/this-is-a-test1&lt;/link&gt;
179 &lt;guid&gt;https://m455.casa/posts/this-is-a-test1&lt;/guid&gt;
180 &lt;/item&gt;
181
182 &lt;item&gt;
183 &lt;title&gt;This is test2&lt;/title&gt;
184 &lt;link&gt;https://m455.casa/posts/this-is-a-test2&lt;/link&gt;
185 &lt;guid&gt;https://m455.casa/posts/this-is-a-test2&lt;/guid&gt;
186 &lt;/item&gt;
187
188 &lt;/channel&gt;
189 &lt;/rss&gt;</code></pre>
190 <p>… and that seemed to get rid of the guid-related error! Did I do it right? Who knows!</p>
191 <p>Next up was that mysterious “Missing atom:link with rel="self"” error.</p>
192 <p>I clicked the help link beside the error, and it gave me the following solution:</p>
193 <blockquote>
194 <p>If you haven’t already done so, declare the Atom namespace at the top of your feed, thus: <code>&lt;rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"&gt;</code></p>
195 <p>Then insert a atom:link to your feed in the channel section. Below is an example to get you started. Be sure to replace the value of the href attribute with the URL of your feed. <code>&lt;atom:link href="http://dallas.example.com/rss.xml" rel="self" type="application/rss+xml" /&gt;</code></p>
196 </blockquote>
197 <p>The first suggestion just required you to add the <code>xmlns:atom=...</code> at the top of the RSS feed, but the second suggestion took a bit of fiddling around to figure out.</p>
198 <p>It turns out all I needed to do was provide a link to the RSS feed itself.</p>
199 <p>So, I, yet again, modified my RSS feed to test against the RSS validator:</p>
200 <pre><code>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; ?&gt;
201 &lt;rss version=&quot;2.0&quot; xmlns:atom=&quot;http://www.w3.org/2005/Atom&quot;&gt;
202
203 &lt;channel&gt;
204 &lt;title&gt;m455&#39;s blog&lt;/title&gt;
205 &lt;link&gt;https://m455.casa&lt;/link&gt;
206 &lt;description&gt;A blog about programming, documentation, and anything that interests me.&lt;/description&gt;
207 &lt;atom:link href=&quot;https://m455.casa/feed.rss&quot; rel=&quot;self&quot; type=&quot;application/rss+xml&quot; /&gt;
208
209 &lt;item&gt;
210 &lt;title&gt;This is test1&lt;/title&gt;
211 &lt;link&gt;https://m455.casa/posts/this-is-a-test1&lt;/link&gt;
212 &lt;guid&gt;https://m455.casa/posts/this-is-a-test1&lt;/guid&gt;
213 &lt;/item&gt;
214
215 &lt;item&gt;
216 &lt;title&gt;This is test2&lt;/title&gt;
217 &lt;link&gt;https://m455.casa/posts/this-is-a-test2&lt;/link&gt;
218 &lt;guid&gt;https://m455.casa/posts/this-is-a-test2&lt;/guid&gt;
219 &lt;/item&gt;
220
221 &lt;/channel&gt;
222 &lt;/rss&gt;</code></pre>
223 <p>… which turned out to be valid!</p>
224 <p>There was an error saying “Self reference doesn’t match document location”, but that was because it was trying to follow the link to the RSS feed when the RSS feed didn’t exist at that URL yet. To make sure it did work, I added the test feed to my live website to see if it validated using the URL validator, instead of the direct-input validator, and it did!</p>
225 <p>Now I knew what it took to create a bare-minimum, valid RSS feed.</p>
226 <p>This meant I could finally start the fun part: The programming of the RSS feed generator :D</p>
227 <p>For my definitions, I came up with the following:</p>
228 <pre><code>(define title &quot;m455&#39;s blog&quot;)
229 (define homepage-url &quot;https://m455.casa&quot;)
230 (define description &quot;A blog about programming, documentation, and anything that interests me.&quot;)
231 (define reference-file &quot;pages/posts.md&quot;)
232 (define feed-file &quot;feed.rss&quot;)
233 (define feed-file-output (string-append &quot;output/&quot; feed-file))
234 (define feed-file-url (string-append homepage-url &quot;/&quot; feed-file))</code></pre>
235 <p>The <code>feed-file-url</code> definition creates the <code>https://m455.casa/feed.rss</code> link, while the <code>feed-file-output</code> definition creates the location where my <code>feed-file</code> is to be generated: <code>output/feed.rss</code>.</p>
236 <p>I needed the location of my <code>feed-file</code> because the RSS-generator script exists in the same directory as the <code>output/</code> directory, along with <code>posts/</code>, <code>pages/</code>, <code>images/</code>, etc.</p>
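<p>As a quick sanity check, here is a minimal sketch that evaluates the two path definitions and prints them. It only reuses the definitions shown above, and assumes the script runs from the directory that contains <code>output/</code>:</p>
<pre><code>#lang racket/base

(define homepage-url &quot;https://m455.casa&quot;)
(define feed-file &quot;feed.rss&quot;)

;; Local path the generator writes to, relative to the script&#39;s directory.
(define feed-file-output (string-append &quot;output/&quot; feed-file))
;; Public URL that ends up in the atom:link element.
(define feed-file-url (string-append homepage-url &quot;/&quot; feed-file))

(displayln feed-file-output) ; prints output/feed.rss
(displayln feed-file-url)    ; prints https://m455.casa/feed.rss</code></pre>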
237 <p>There is one special Markdown file I need to parse/transform, which is the <code>posts.md</code> Markdown file, which exists inside of the <code>pages/</code> directory. This file contains a bulleted list of titles and links to all of my blog posts, which would soon be converted into an RSS feed.</p>
238 <p>I chose this file because it has the two pieces of information I need for each <code>&lt;item&gt;</code> element in my RSS feed:</p>
239 <ul>
240 <li>The title of the blog post</li>
241 <li>The link to the blog post</li>
242 </ul>
243 <p>Because the <code>&lt;guid&gt;</code> element will be populated with the same information as the <code>&lt;link&gt;</code> element, I didn’t have to worry about finding data to populate the <code>&lt;guid&gt;</code> element with.</p>
244 <p>You can see what the whole <code>posts.md</code> file looks like below:</p>
245 <pre><code># Posts
246
247 * [Thoughts on technical writing and accidentally gatekeeping communities](/posts/thoughts-on-technical-writing-and-accidentally-gatekeeping-communities.html)
248 * [Having fun with Lisp(s)](/posts/having-fun-with-lisps.html)
249 * [Public Unix server etiquette](/posts/public-unix-server-etiquette.html)
250 * [What I like about the Scheme community](/posts/what-i-like-about-the-scheme-community.html)
251 * [What are social Unix servers?](/posts/what-are-social-unix-servers.html)
252 * [Redirecting your GitHub Pages website to a Dat url](/posts/redirecting-your-github-pages-website-to-a-dat-url.html)
253 * [Setting up graphical applications in Windows Subsystem for Linux](/posts/setting-up-graphical-applications-in-windows-subsystem-for-linux.html)
254 * [Setting up a Chinese input method on GNU/Linux](/posts/setting-up-a-chinese-input-method-on-gnulinux.html)
255 * [A quick guide to pronouncing Chinese words](/posts/a-quick-guide-to-pronouncing-chinese-words.html)
256 * [Interpreting second language speakers](/posts/interpreting-second-language-speakers.html)
257 * [Learn to read and type Chinese: A primer for the people of the internet](/posts/learn-to-read-and-type-chinese.html)</code></pre>
258 <p>I decided that all I would need to do to convert this file into an RSS feed is:</p>
259 <ul>
260 <li>Remove the <code># Posts</code> title</li>
261 <li>Remove the <code>*</code> bullet points</li>
262 <li>Extract the title, which exists between the <code>[</code> and <code>]</code>, and store it in a local definition</li>
263 <li>Remove the <code>/posts/</code> bit from links</li>
264 <li>Extract the link, which exists between the <code>(</code> and <code>)</code>, and store it in a local definition</li>
265 <li>Remove the brackets and parentheses around the title and links</li>
266 </ul>
267 <p>The removal of items can be emulated by searching for a string and replacing it with <code>""</code>.</p>
268 <p>The extraction of the link and title information can be done with regex.</p>
269 <p>The rest of my script just needs to create string templates that are formatted, populated, and then stitched together.</p>
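<p>The extraction step could look something like the sketch below. This is not necessarily how the real script does it: it swaps the separate search-and-replace steps for a single regex, it leaves the <code>/posts/</code> part of the link alone, and the names <code>parse-post-line</code>, <code>parse-posts</code>, and <code>sample-lines</code> are made up for illustration.</p>
<pre><code>#lang racket/base

(require racket/list
         racket/string)

;; Hypothetical helper: extract the title and link from one
;; &quot;* [title](link)&quot; line. Returns #f for lines that are not list
;; entries, such as the &quot;# Posts&quot; heading or blank lines.
(define (parse-post-line line)
  (define m
    (regexp-match #rx&quot;^\\* \\[(.*)\\]\\((.*)\\)$&quot; (string-trim line)))
  (and m (list (cadr m) (caddr m))))

;; Parse every line, keeping only the ones that matched.
(define (parse-posts lines)
  (filter-map parse-post-line lines))

;; A few lines in the same shape as posts.md.
(define sample-lines
  (list &quot;# Posts&quot;
        &quot;&quot;
        &quot;* [Having fun with Lisp(s)](/posts/having-fun-with-lisps.html)&quot;
        &quot;* [Public Unix server etiquette](/posts/public-unix-server-etiquette.html)&quot;))

(for ([post (in-list (parse-posts sample-lines))])
  (printf &quot;~a -&gt; ~a\n&quot; (car post) (cadr post)))</code></pre>
<p>To run it against the real file, <code>(file-&gt;lines reference-file)</code> from <code>racket/file</code> could replace <code>sample-lines</code>. Because the heading and blank lines simply fail to match, they never need to be removed by hand.</p>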
270 <p>One thing I really enjoyed about using Racket for this little project was that I could use string blocks, which allowed me to type in string values in a very free-form manner.</p>
271 <p>You can see what I mean by “free-form” below. Basically, everything between <code>#&lt;&lt;string-block</code> and <code>string-block</code> is treated as a string. New lines, tabs, etc. are all rendered as well, so it’s almost the same experience you would get if you were to type text into a plain-text file.</p>
272 <pre><code>(define rss-header
273 (format
274 #&lt;&lt;string-block
275 &lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; ?&gt;
276 &lt;rss version=&quot;2.0&quot; xmlns:atom=&quot;http://www.w3.org/2005/Atom&quot;&gt;
277
278 &lt;channel&gt;
279 &lt;title&gt;~a&lt;/title&gt;
280 &lt;link&gt;~a&lt;/link&gt;
281 &lt;description&gt;~a&lt;/description&gt;
282 &lt;atom:link href=&quot;~a&quot; rel=&quot;self&quot; type=&quot;application/rss+xml&quot; /&gt;
283
284 string-block
285 title
286 homepage-url
287 description
288 feed-file-url
289 ))</code></pre>
290 <p>The <code>~a</code>s are all populated with the <code>title</code>, <code>homepage-url</code>, <code>description</code>, and <code>feed-file-url</code> definitions, just the same as you would populate a string with <code>(format ... title homepage-url description feed-file-url)</code>.</p>
291 <p>Even though my RSS feed’s title, homepage link, and description aren’t directly connected to my website generator’s source files, it’s still fun to have a default RSS template that I can pass around for future websites I create, as long as I follow the same format as <code>posts.md</code>. Yeah, it’s bad design, but that’s why this RSS-feed generator is beautifully awful haha.</p>
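<p>For the remaining pieces, here is a rough sketch of how each post could become an <code>&lt;item&gt;</code> and how everything could be stitched together. The names <code>rss-item</code>, <code>rss-footer</code>, <code>make-feed</code>, and <code>sample-header</code> are hypothetical, and the sketch assumes each post is a list holding a title and a link that starts with <code>/posts/</code>:</p>
<pre><code>#lang racket/base

(require racket/string)

(define homepage-url &quot;https://m455.casa&quot;)

;; Hypothetical item template: the full link doubles as the guid.
(define (rss-item title link)
  (define url (string-append homepage-url link))
  (format &quot;&lt;item&gt;\n&lt;title&gt;~a&lt;/title&gt;\n&lt;link&gt;~a&lt;/link&gt;\n&lt;guid&gt;~a&lt;/guid&gt;\n&lt;/item&gt;\n&quot;
          title url url))

;; Hypothetical footer that closes the elements opened in the header.
(define rss-footer &quot;&lt;/channel&gt;\n&lt;/rss&gt;\n&quot;)

;; Stitch a header string, one item per post, and the footer together.
(define (make-feed header posts)
  (string-append
   header
   (string-join (for/list ([post (in-list posts)])
                  (rss-item (car post) (cadr post)))
                &quot;\n&quot;)
   &quot;\n&quot;
   rss-footer))

;; Stand-in for the rss-header definition, shortened for the demo.
(define sample-header &quot;&lt;rss version=\&quot;2.0\&quot;&gt;\n&lt;channel&gt;\n&quot;)

(display (make-feed sample-header
                    (list (list &quot;This is test1&quot; &quot;/posts/this-is-a-test1&quot;))))</code></pre>
<p>Passing <code>rss-header</code> instead of <code>sample-header</code> and handing the result to <code>display-to-file</code> from <code>racket/file</code> would write the feed to <code>feed-file-output</code>. The sketch skips XML escaping, so a title containing <code>&amp;</code> or <code>&lt;</code> would need to be escaped first.</p>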
292 </main>
293 </body>
294 </html>