How to build a simple template engine with Python and regex
Here is v1.0.0 of the Mathew templating engine: very simple and ugly, yet it does the job.
Prologue
As I mentioned previously, I want to create a static content creation system. The first step is a template engine. Rather than building a fully featured template engine, I plan to build only what is needed in this major iteration.
I have also saved some bonus features for later major iterations.
With that being said, this major iteration (namely v1.0.0) will have 2 basic features:
Including external templates into another, or inheritance, I guess
Looping over a dataset to produce multiple pages
Before anything, we should decide on the syntax. The generic one I have decided on looks like...
{ macro_name macro_parameter }
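To make the syntax concrete, here is a minimal sketch of a regex that picks both parts out of a macro. The names MACRO, name, and param are made up for illustration, and the pattern assumes parameters look like word.word, as in the examples further down.

```python
import re

# Hypothetical pattern for illustration: captures the macro name and its
# parameter, assuming parameters have the form word.word (e.g. pubs.title).
MACRO = re.compile(r"\{\s?(?P<name>\w+)\s(?P<param>\w+\.\w+)\s?\}")

match = MACRO.search("<p>{ eval pubs.title }</p>")
name, param = match.group("name"), match.group("param")
print(name, param)
```

Named groups are optional here; plain numbered groups would work just as well.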
Without further ado, let's go.
1. Including external templates into another
For this, the syntax to embed another page called index.html into base.html would look like this:
base.html
<html>
  <head>...</head>
  <body>
    <!-- some generic content -->
    { include content.main }
  </body>
</html>
index.html
<h1>Welcome to SPC</h1>
So what I want to do is read through base.html and replace each { ... } macro wherever one is encountered. We could do this in many different ways, but an easy one is the regex way.
regex stands for Regular Expression
Using regex with Python is much simpler than other languages make it seem. If you want me to do a swing-by of regex with Python, please let me know in the comments.
So to substitute the template we would do something like
import re # import the standard regex library
pattern = r'\{\s?\w+\s(\w+\.\w+)\s?\}' # regex pattern to search for; braces and the dot are escaped so they match literally
specimen = """
<html>
<head>...</head>
<body>
<!-- some generic content -->
{ include content.main }
</body>
</html>
"""
replace = "<h1>Welcome to SPC</h1>"
parsed_str = re.sub(pattern, replace, specimen) # using .sub() from library
Now if we write parsed_str to a file, it will be the page we intended. Next, let's encapsulate it in a function for modularity and to stay DRY. Thus, the function would be
def eval_include(specimen, replacement):
    global pattern
    return re.sub(pattern, replacement, specimen)
If you are disgusted by the global keyword, just so you know, I am coming from assembly language and Cheat-Engine, so I am pretty comfortable with it.
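For what it's worth, Python only needs global when a function assigns to a module-level name; reading one works without it. Below is a minimal sketch of the same function without the keyword, assuming a precompiled module-level constant (INCLUDE_PATTERN is a name made up for this example). It also passes the replacement through a lambda, because re.sub processes backslash escapes in a plain replacement string, which could bite if an included template contains backslashes.

```python
import re

# Hypothetical module-level constant; reading it inside a function needs no `global`.
INCLUDE_PATTERN = re.compile(r"\{\s?\w+\s(\w+\.\w+)\s?\}")

def eval_include(specimen, replacement):
    # A callable's return value is inserted literally, so backslashes in the
    # replacement are not treated as escape sequences or group references.
    return INCLUDE_PATTERN.sub(lambda m: replacement, specimen)

page = eval_include("<body>{ include content.main }</body>", r"<p>C:\new</p>")
print(page)
```

This is a variation for comparison, not a required change; the version with global behaves the same for templates without backslashes.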
Now, an end user might use the library like
from os.path import realpath
from mathew.macros import eval_include

base = ""
with open(realpath("templates/base.html"), "r") as b:
    base = b.read()

index_template = ""
with open(realpath("templates/index.html"), "r") as i:
    index_template = i.read()

with open(realpath("out/index.html"), "w") as i:
    i.write(
        eval_include(base, index_template)  # do the templating magic
    )
The parsed page can be found in the out/ dir. File discovery and all the other stuff will be automated later. For now, let's just focus on one thing.
2. Looping over a dataset to produce multiple pages
Let's say, we have a list of article titles to display on the homepage of the blog page. E.g.
pubslist.html
<section>
  <h2>Patrician Publications</h2>
  { include pubsdetail.html }
</section>
pubslistitem.html
<article>
  <h4>{ eval pubs.title}</h4>
  <span>{eval pubs.cat }</span>
  <p>{ eval pubs.sum }</p>
</article>
and the dataset
{"pubs": [
    {"title": "Some 404 content", "cat": "kavik", "sum": "Summary 501"},
    {"title": "Some 403 content", "cat": "eric", "sum": "Summary 502"},
    {"title": "Some 402 content", "cat": "beric", "sum": "Summary 503"},
    {"title": "Some 401 content", "cat": "manuk", "sum": "Summary 504"},
]}
The dataset can be mapped to Python's dict without any extra logic. The difference from embedding another template is that here we evaluate variables: we create many fragments by substituting the data into the template, then embed the resulting string into the destination template. Let's do it, shall we?
For evaluating the variable, we can use the groups feature of regex. That's what the () in the pattern are for: they capture the macro_parameter part. We can easily access the matched string slice with the .group() method on the match object returned by the re library functions.
import re

str_1 = "Hello 123"
pattern = r'\w+\s(\d+)'
digits = re.finditer(pattern, str_1)  # returns an iterator of `match` objects
for digit in digits:
    print(digit.group(1))  # 123
Notice we are asking for group 1, not 0. That is not because the library is 1-indexed; it is 0-indexed, but group 0 is the entire matched string, "Hello 123".
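A quick way to see the difference between the two group indexes, using the same string and pattern (the variable names whole and first are just for this example):

```python
import re

match = re.search(r"\w+\s(\d+)", "Hello 123")
whole = match.group(0)  # the entire matched text
first = match.group(1)  # only the part inside the first pair of parentheses
print(whole)
print(first)
```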
Remember the .sub() method: its second parameter accepts either a str or a callable. The callable gets a match object as its argument for each match of the pattern, so we can produce dynamic replacements based on each match, like...
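# retrieving the key from the template string (m is the match object)
# m.group(1)               == "pubs.title"
# m.group(1).split(".")    == ["pubs", "title"]
# m.group(1).split(".")[1] == "title"
# evaluating the variable with the i-th record from the dataset
re.sub(
    pattern,  # the pattern
    lambda m: dataset["pubs"][i][m.group(1).split(".")[1]],  # the replacement, computed per match
    template,  # the template string to process
)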
If lambda is mysterious to you, it is a way to define an anonymous, inline function in Python.
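As a small illustration, here is the same replacement written once with def and once with lambda. The shout function and the sample string are made up for this example.

```python
import re

def shout(m):
    # m is the match object for one occurrence of the pattern
    return m.group(0).upper()

# Both calls behave identically; the lambda just skips naming the function.
with_def = re.sub(r"\bspc\b", shout, "welcome to spc")
with_lambda = re.sub(r"\bspc\b", lambda m: m.group(0).upper(), "welcome to spc")
print(with_def)
print(with_lambda)
```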
The functions for the library API would be
# map a single record onto the template
def __eval_map(string, data):
    global pattern
    return re.sub(
        pattern, lambda m: data[m.group(1).split(".")[1]], string
    )

# parse the batch of records
def parse_template(template, data):
    return [
        __eval_map(template, datum)
        for datum in data
    ]
parse_template returns the aggregated results using list comprehension syntax. If you are unfamiliar with the syntax, let me know in the comments.
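In case it helps, a list comprehension is shorthand for building a list in a loop. The sketch below shows both forms side by side; render is a hypothetical stand-in for the per-record step and uses str.format only to keep the example short.

```python
def render(template, datum):
    # Hypothetical stand-in for the real per-record rendering.
    return template.format(**datum)

data = [{"title": "A"}, {"title": "B"}]

# list comprehension
pages = [render("<h4>{title}</h4>", datum) for datum in data]

# the same thing as an explicit loop
pages_loop = []
for datum in data:
    pages_loop.append(render("<h4>{title}</h4>", datum))

print(pages)
```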
So accessing the key to evaluate is just as breezy as...
from os.path import realpath
from mathew.macros import parse_template, eval_include
specimen = """
<article>
<h4>{ eval pubs.title}</h4>
<span>{eval pubs.cat }</span>
<p>{ eval pubs.sum }</p>
</article>
"""
dataset = {
    "pubs": [
        {"title": "Some 404 content", "cat": "kavik", "sum": "Summary 501"},
        {"title": "Some 403 content", "cat": "eric", "sum": "Summary 502"},
        {"title": "Some 402 content", "cat": "beric", "sum": "Summary 503"},
        {"title": "Some 401 content", "cat": "manuk", "sum": "Summary 504"},
    ],
}
# parse each `<article>` tag for each list item
parsed_str = parse_template(specimen, dataset["pubs"])
# join the `<article>` tag-group
pubs_list_items = "".join(parsed_str)
pubs_list_template = ""
with open(realpath("templates/pubslist.html"), "r") as p:
    pubs_list_template = p.read()
# parse the `pubs_list` itself
parsed_list = eval_include(pubs_list_template, pubs_list_items)
# read the base template, as in the first example
base = ""
with open(realpath("templates/base.html"), "r") as b:
    base = b.read()
# write the final file with base
with open(realpath("out/pubs.html"), "w") as i:
    i.write(
        eval_include(base, parsed_list)
    )
The final page, pubs.html, will be in the out/ directory.
Done?
Not quite. Did you notice that we still have to read the template strings manually, that the data has to be populated in a specific format, and that the parsing of each template is still triggered by hand?
These are for later. For now, we have a simple working template engine that does the job I intended it for. I am happy with it.
Another thing keen eyes might have noticed is that the macro_name in the template does nothing. In fact, if you swap include with eval or anything else, the script still does its job as long as the latter part is valid. This is bad design, but the worst part is that our eval_include allows only one template. Gotta fix that!
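One possible direction, sketched under my own assumptions and not how Mathew currently works: capture the macro name as well and dispatch on it, so include looks up a template by its parameter and eval looks up a value in the current record. The names MACRO, render, templates, and record are all hypothetical.

```python
import re

# Hypothetical pattern: group 1 is the macro name, group 2 its parameter.
MACRO = re.compile(r"\{\s?(\w+)\s(\w+\.\w+)\s?\}")

def render(template, templates, record):
    """Expand every macro in `template`, dispatching on the macro name."""
    def expand(m):
        name, param = m.group(1), m.group(2)
        if name == "include":
            # the parameter is used as a key into a dict of template strings
            return render(templates[param], templates, record)
        if name == "eval":
            # "pubs.title" -> look up "title" in the current record
            return str(record[param.split(".")[1]])
        return m.group(0)  # unknown macro: leave it untouched
    return MACRO.sub(expand, template)

templates = {
    "content.main": "<h1>{ eval pubs.title }</h1>",
    "content.side": "<aside>side</aside>",
}
base = "<body>{ include content.main }{ include content.side }</body>"
page = render(base, templates, {"title": "Welcome to SPC"})
print(page)
```

Note that this sketch has no protection against templates that include each other in a cycle, and a missing key raises KeyError.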
Epilogue
I guess I don't have anything further, so I will just sign off. This is BE, signing off.
Cover by Suzy Hazelwood