Upgrading this server with SSI

A/N: As of May 2025, I am de-templating all of the html pages on this site so that we can remove all of the changes described in this post. The current state is fragile and annoying, and we don't want to be relying on it any more. To anyone reading this and considering doing it, don't! At least, don't use the template approach described here.

Something I’ve been meaning to do for a while is set up Server Side Includes (SSI) on this site. SSI is used to insert the contents of one page into another; for example, if you have a common snippet of HTML that makes the header for your webpage, you can just have a file with that HTML in it, and have your other files include it - when someone makes a request for one of those pages, the server will give them the appropriate file, but it will stick the contents of that header in first, and your page looks all nice.

			include.html:
			<h1>This line is included!</h1>
			
			index.html:
			<html>
			<body>
			<h1>This line is in the file!</h1>
			<!--# include file="include.html" -->
			</body>
			</html>
			
			When someone looks at index.html on your website, they get:
			<html>
			<body>
			<h1>This line is in the file!</h1>
			<h1>This line is included!</h1>
			</body>
			</html>

This is especially cool because it all happens on the server itself; the user requests one page, and gets one page back, so the site stays light and easy and quick to load.
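To make the mechanism concrete, here is a minimal Python sketch of what the server does with an include directive. It is only a simulation: the `expand_includes` helper and the in-memory `files` dict are made up for illustration, and the real server does this inside its SSI module rather than with a regex.

```python
import re

# Hypothetical in-memory "filesystem" standing in for the files on the server.
files = {
    "include.html": "<h1>This line is included!</h1>",
    "index.html": (
        "<html>\n<body>\n"
        "<h1>This line is in the file!</h1>\n"
        '<!--# include file="include.html" -->\n'
        "</body>\n</html>"
    ),
}

# Matches an SSI include directive and captures the file name.
INCLUDE = re.compile(r'<!--#\s*include\s+file="([^"]+)"\s*-->')


def expand_includes(name: str) -> str:
    """Return the named file with each include directive replaced by that file's contents."""
    return INCLUDE.sub(lambda m: expand_includes(m.group(1)), files[name])


if __name__ == "__main__":
    print(expand_includes("index.html"))
```

The user only ever sees the expanded text; the directive itself never leaves the server.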

Despite using this server for a lot, I don’t have a good handle on how it’s configured internally - I mostly leave that to my co-admin, who’s written a bit about it here. But today they were busy with other things, so I was left to try to figure it out or die trying, which is honestly something I should do more of.

General Structure

The usage that originally occurred to me here is as shown in the example above; create mostly-complete pages, and use common included snippets for prestructured sections, like a footer at the bottom of the page. However, various online sources (link) suggested doing things the other way - when page X is requested, return a common template file and use SSI to insert the contents that should be on page X.

On the surface, this seems like a good idea - it means you can just create new pages with their content only, rather than adding a handful of include lines to each one. In hindsight, I’m not sure this was a good idea - quite possibly, I will end up going back to the former version at some point.

Forging ahead with this plan, though: the linked page suggests inserting content into index.html, but there are a couple of problems with this in my case. Firstly, we use index as the default page returned for a directory name (that is, going to [link] returns the file [file]), which isn’t even that unusual a thing. More specifically, I use index pages as a way to navigate to other pages, so like… I can’t just go and shove other content into them. So instead, the plan is to add a file called template.html, insert content into it, then return that.

Config changes

The premise

The config we use for finding the correct file to return looks like this:

			# load files, or redirect to dir/ if there's no specific file
			location ~ ^/~(.+?)(/.*)?$ {
				try_files /home/$1/public_html$2 /home/$1/public_html$2.html @redirect;
			}

Elle has written more about how this works, but functionally, when you request a page, this server will look for the file at the exact URL you are looking at. If that doesn’t work, it will stick “.html” on the end and try that file, which is the path you’re probably using to read this, assuming you followed a link to it - see how the URL of this page doesn’t have html at the end? The file is called 002.html, so it needs to have that added.

The behaviour I am adding is that, if a file called template.html exists in the same folder as the target file, the server returns that instead. Fortunately, try_files tries things in the order they’re written, so I just need to add the check for template.html at the start of that line.
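The "first match wins" behaviour is easy to model. The Python sketch below is a rough stand-in for try_files, not how nginx implements it: the `try_files` function, the `blog` folder and the sets of existing files are all hypothetical, and the final @redirect fallback is represented by `None`.

```python
from typing import Iterable, Optional, Set


def try_files(candidates: Iterable[str], existing: Set[str]) -> Optional[str]:
    """Return the first candidate path that exists, or None if none of them do."""
    for path in candidates:
        if path in existing:
            return path
    return None


# Hypothetical folder layout, for illustration only.
BLOG = "/home/emma/public_html/blog/"
candidates = [BLOG + "template.html", BLOG + "002", BLOG + "002.html"]

with_template = {BLOG + "template.html", BLOG + "002.html"}
without_template = {BLOG + "002.html"}

if __name__ == "__main__":
    print(try_files(candidates, with_template))     # the template is found first
    print(try_files(candidates, without_template))  # falls through to 002.html
```

Because the template check comes first in the list, it shadows the real page whenever a template exists, which is exactly the behaviour wanted here.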

Regex Changes

We use regex (regular expressions) to mutate a request URL into a file location to search.

			location ~ ^/~(.+?)(/.*)?$
			matches paths:
			catgirl.software/~emma/games
			catgirl.software/~name/folder/another/file

Brackets here create capture groups - when a URL matches this condition, `$1` now refers to the part of that string that matched the first set of brackets (`emma` or `name` in the examples above), and `$2` is the part that matched the second set (`/games` or `/folder/another/file`).
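If you want to poke at the capture groups yourself, the same pattern can be tried in Python. This is just a sketch: Python's `re` module stands in for the regex engine nginx uses, and `split_old` is a made-up helper name.

```python
import re

# The original location regex, copied from the config above.
OLD_LOCATION = re.compile(r"^/~(.+?)(/.*)?$")


def split_old(path):
    """Return ($1, $2) for a request path, or None if the path does not match."""
    m = OLD_LOCATION.match(path)
    return (m.group(1), m.group(2)) if m else None


if __name__ == "__main__":
    print(split_old("/~emma/games"))
    print(split_old("/~name/folder/another/file"))
```

The first group stops at the first slash and the second group takes everything after it, so there is no separate handle on the folder versus the file name.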

This doesn’t make a distinction between the file and its location, though; finding the template.html file I am looking for requires removing the end of that path, but that separation doesn’t currently exist. So I had to rewrite our regular expression a little, and add a third capture group.

			# load files, or redirect to dir/ if there's no specific file
			location ^/~([^/]+)/?(.*/)*([^/]*)$ {
				set $inc $request_uri;
				try_files /home/$1/public_html/$2template.html /home/$1/public_html/$2$3 /home/$1/public_html/$2$3.html @redirect;
			}
			matches path:
			catgirl.software/~name/folder/another/third/page
			and looks for files:
			/home/name/public_html/folder/another/third/template.html
			/home/name/public_html/folder/another/third/page
			/home/name/public_html/folder/another/third/page.html
			in that order.

Now, when trying to access the page at this domain/~name/folder/file, the server first looks for this domain/~name/folder/template.html, and tries to load it. The line above, set $inc $request_uri, creates a variable that template.html can use to request the correct content for itself. If the server can’t find template.html, it proceeds the same way it previously did, and returns the page that was requested (if it can find it).
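Here is a rough Python sketch of how a three-group pattern turns a request path into the list of files to try. Python's `re` stands in for nginx's regex engine, and `candidate_paths` is a hypothetical helper. One assumption to flag: the sketch writes the user group as `[^/]+`, since a user name needs one or more characters to be captured whole.

```python
import re

# Three groups: user, folder part (with trailing slash), file name.
# Assumption: [^/]+ for the user so the whole name lands in the first group.
NEW_LOCATION = re.compile(r"^/~([^/]+)/?(.*/)*([^/]*)$")


def candidate_paths(path):
    """Return the files the server would try, in order, for a request path."""
    m = NEW_LOCATION.match(path)
    if m is None:
        return []
    user, folder, name = m.group(1), m.group(2) or "", m.group(3)
    base = f"/home/{user}/public_html/{folder}"
    return [base + "template.html", base + name, base + name + ".html"]


if __name__ == "__main__":
    for p in candidate_paths("/~name/folder/another/third/page"):
        print(p)
```

The middle group is what makes the template lookup possible: it holds the folder on its own, so template.html can be appended to it without the file name getting in the way.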

Having done this, I made a couple of small files to try it out, and promptly discovered that I had fucked it up entirely.

Troubleshooting

First issue - All the shit was broken. Couldn't find any pages, even the ones that shouldn't have been affected because they didn't have an associated template. Turns out, I had removed the ~ character, which tells nginx that a location should be matched as a regex, so it was trying to match that mess as a literal URL prefix.

			location ~ ^/~([^/]+)/?(.*/)*([^/]*)$ {
				set $inc $request_uri;
				try_files /home/$1/public_html/$2template.html /home/$1/public_html/$2$3 /home/$1/public_html/$2$3.html @redirect;
			}

Second issue: for any page in a folder that contained template.html, css didn't exist at all. This was the second issue I noticed, but it was immediately superseded by issue number three, which was this.

A screenshot of a webpage, repeating the lines 'A Template Title!' and 'Back to Home' endlessly.

This is actually a weirdly simple problem, though it revealed things that I didn't know about nginx. What's happening here is:

1. I request the page called "include"
2. The server sees this request, sees that there is a related template page, and returns that, after setting the variable that page will use to get content
3. The page, attempting to get content to fill its include section, requests the page called "include"
4. The server sees this request, sees that there is a related template page, and returns that again - and again, when that copy requests content for itself, and again, and so on

It is the last step here that was new to me; I didn't realise nginx would use this system for its own internal page requests as well. The solution I came up with is to add a label to the end of any file inclusion request, and add a different regex that would catch these requests specifically.

			location ~ ^/~([^/]+)/?(.*/)*([^/]*)-inclusion$ {
				try_files /home/$1/public_html/$2$3 /home/$1/public_html/$2$3.html @redirect;
			}
			
			location ~ ^/~([^/]+)/?(.*/)*([^/]*)$ {
				set $inc $request_uri;
				try_files /home/$1/public_html/$2template.html /home/$1/public_html/$2$3 /home/$1/public_html/$2$3.html @redirect;
			}
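The loop, and the way the label breaks it, can be reproduced in a toy Python model. Everything here is a sketch under assumptions: the `{include}` marker is a made-up stand-in for the template's include directive, the file contents are invented, the user group is written as `[^/]+`, and the depth cap exists only so the broken case terminates in the simulation.

```python
import re

# Regex locations, checked in config order (the first one that matches is used).
# Assumption: [^/]+ for the user so the whole name is captured.
INCLUSION = re.compile(r"^/~([^/]+)/?(.*/)*([^/]*)-inclusion$")
GENERAL = re.compile(r"^/~([^/]+)/?(.*/)*([^/]*)$")

# Hypothetical marker standing in for the template's SSI include directive.
MARKER = "{include}"

# Hypothetical files; the contents are made up for this sketch.
FILES = {
    "/home/name/public_html/template.html": "A Template Title! " + MARKER,
    "/home/name/public_html/include.html": "Actual page content",
}


def first_existing(paths):
    """Mimic try_files: return the first path that exists, else None."""
    return next((p for p in paths if p in FILES), None)


def resolve(uri, with_inclusion_location):
    """Pick the file the server would return for this URI."""
    if with_inclusion_location:
        m = INCLUSION.match(uri)
        if m:
            base = f"/home/{m.group(1)}/public_html/{m.group(2) or ''}{m.group(3)}"
            return first_existing([base, base + ".html"])
    m = GENERAL.match(uri)
    if m is None:
        return None
    folder = f"/home/{m.group(1)}/public_html/{m.group(2) or ''}"
    name = m.group(3)
    return first_existing([folder + "template.html", folder + name, folder + name + ".html"])


def render(uri, label, with_inclusion_location, depth=0, max_depth=3):
    """Render a page, following the include marker; stop after max_depth nested includes."""
    if depth > max_depth:
        return "[gave up: include loop]"
    path = resolve(uri, with_inclusion_location)
    if path is None:
        return "[not found]"
    body = FILES[path]
    if MARKER not in body:
        return body
    # The template asks for the requested URI plus the label ("" models the broken setup).
    included = render(uri + label, label, with_inclusion_location, depth + 1, max_depth)
    return body.replace(MARKER, included)


if __name__ == "__main__":
    print(render("/~name/include", "", False))           # template inside template inside...
    print(render("/~name/include", "-inclusion", True))  # template wrapping the real content
```

Without the label, every nested request lands on the template again. With it, the nested request matches the inclusion location first, the label is dropped by the capture group, and the real file is returned.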

Having done this, the solution to the css problem was also apparent - the server was seeing that a page was requesting styles.css, looking for that file, noticing that template.html existed, and returning that instead. The solution here is similar to the last, adding another section that catches css file requests specifically and finds them without looking for a template first.

			location ~ ^/~([^/]+)/?(.*/)*([^/]*)\.css$ {
				try_files /home/$1/public_html/$2$3.css @redirect;
			}
			
			location ~ ^/~([^/]+)/?(.*/)*([^/]*)-inclusion$ {
				try_files /home/$1/public_html/$2$3 /home/$1/public_html/$2$3.html @redirect;
			}
			
			location ~ ^/~([^/]+)/?(.*/)*([^/]*)$ {
				set $inc $request_uri;
				try_files /home/$1/public_html/$2template.html /home/$1/public_html/$2$3 /home/$1/public_html/$2$3.html @redirect;
			}
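The order of these three blocks matters, because regex locations are checked in the order they appear and the first match is used. A small Python sketch of that dispatch, with hypothetical labels for each block, `[^/]+` assumed for the user group, and the dot before css escaped so it only matches a literal dot:

```python
import re

# (label, pattern) pairs in the same order as the config blocks.
# The labels are made up for this sketch.
LOCATIONS = [
    ("css", re.compile(r"^/~([^/]+)/?(.*/)*([^/]*)\.css$")),
    ("inclusion", re.compile(r"^/~([^/]+)/?(.*/)*([^/]*)-inclusion$")),
    ("template", re.compile(r"^/~([^/]+)/?(.*/)*([^/]*)$")),
]


def pick_location(uri, locations=LOCATIONS):
    """Return the label of the first location whose regex matches the URI."""
    for label, pattern in locations:
        if pattern.match(uri):
            return label
    return None


if __name__ == "__main__":
    for uri in ("/~name/folder/styles.css", "/~name/folder/page-inclusion", "/~name/folder/page"):
        print(uri, "->", pick_location(uri))
    # With the catch-all first, it swallows everything, CSS included.
    print(pick_location("/~name/folder/styles.css", list(reversed(LOCATIONS))))
```

The catch-all pattern matches every one of these URLs, so it has to come last; the more specific blocks only get a look in because they sit above it.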

And then... it worked! At last! And it was glorious. Or at least, I thought it was pretty cool.

Post-mortem

That feels like a bit of a dramatic term for it, but the project is conceptually finished for now, so it's the correct one.

There are a couple of problems I have with the current implementation - it feels kind of gross to be catching requests for a page and then not actually returning that page, even if the content is there and this is just a way that a lot of websites do things. More concretely, it bugs me that inclusion works now, but in any folder that uses a template, anything that wants to include something else needs to have "-inclusion" stuck at the end of the file name. That kind of finicky shit feels like a sign of a bad solution, and it digs at me to leave it that way. I also currently have no way to let a page *not* use a template, if one exists in the folder - I have some ideas for how to change that, but they're even more gross and janky.

I will likely come back to this some time in an attempt to resolve these issues; for now, though, it works. Unfortunately it works server side, so, uh... idk, you'll just have to trust me?