The Unfolding of a Content Management System

A sequence in eighteen steps, with drawings — for Google App Engine, Flask and Datastore, in the manner of List and Items (2008), after Christopher Alexander.

by Greg Bryant

In 2008 I tried to grow the smallest possible program — a list and its items — as a smoothly unfolding sequence: a series of steps, each one leaving a working, end-to-end system behind it, each one chosen because the most subsequent steps would depend on it. The idea came from Christopher Alexander, who argued in The Nature of Order that living structure is not assembled but differentiated — that a whole becomes more whole by a sequence of transformations, each of which preserves and deepens the structure that already exists.

This document tries the same discipline on something less trivial: a complete, good, small content management system — approximately Google Sites, but simpler and better in one particular way — for Google App Engine, in Python and Flask, with Datastore beneath and HTML, CSS and JavaScript in front. The particular way: alongside the pages grown in its own editor, it can accept whole HTML pages — scripts, styles and all — transplanted from elsewhere, served exactly as given, never touched by the editing tool.

Three rules govern the sequence. First: after every step the program runs, end to end, in the habitat where it will live. Second: each step is the differentiation that the most subsequent steps depend upon — which is why the order, not the code, is the hard part. Third: when a step turns out to have been placed wrong, you do not patch; you re-run the sequence with the order corrected. This sequence was re-run three times before it was written down, and the corrections are confessed at the end.

Beside each step, a drawing. It is not a diagram; it does not label anything. It tries to show what the system feels like at that moment of its growth — the way a seed, a gate, a tree, a house feel like stages of one continuous thing.

I · The Seed

Step 1 · I · The Seed

The seed

There is a smallest living whole: a program that runs, end to end, in the habitat where it will spend its life.

Before any decision about pages or data, the habitat. A configuration file, one route, one sentence of HTML — deployed. This is not scaffolding to be thrown away later; it is the organism at its smallest. Every subsequent step will be a differentiation of this seed, never a replacement of it.

Why this is first: the first rule of the sequence (always end to end) is only checkable if end-to-end exists, and it must exist in the real habitat, because App Engine's routing, its static handling, its environment variables are all decisions the seed makes on our behalf. A program that has never lived in production has never lived. In 2008 the seed was webapp's handler map; today it is Flask's route table. Same organ.

# app.yaml
runtime: python312
# Assumes requirements.txt lists Flask and gunicorn.
entrypoint: gunicorn -b :$PORT main:app

# main.py — the whole program
from flask import Flask
app = Flask(__name__)

@app.route("/")
def home():
    return "<p>A site will unfold here.</p>"

The entire system at step one. It deploys.

Step 2 · I · The Seed

The name and the thing

Every piece of content has a name, and the name is a URL; the deepest act of the system is to answer a name with a thing.

Here the essential nature of a CMS is decided: it is a mapping from paths to pages. One catch-all route; a dictionary standing in for the database that does not exist yet. The dictionary is not a mock — it is the real interface, in embryo.

Why this comes before storage: the shape of the name determines the shape of everything downstream: storage keys, editing addresses, feeds, sitemaps, redirects. Had we chosen numeric identity (?page=42), every machine-facing step would fight it forever. Choosing the path as the key now means the datastore entity can later be keyed by the path itself (the key we pass, in the 2008 article's terms), and it makes renaming a visible, important event, from which step eleven will one day grow.

from flask import abort   # joins the Flask import already in main.py

PAGES = {"": "<p>A site will unfold here.</p>",
         "about": "<p>It is early days.</p>"}

@app.route("/", defaults={"path": ""})
@app.route("/<path:path>")
def view_page(path):
    if path not in PAGES:
        abort(404)
    return PAGES[path]

A database of the smallest possible kind.

Step 3 · I · The Seed

The ground

Content outlives the process that serves it.

The dictionary becomes Datastore: a Page entity, keyed by its path. This is the trunk of the system — everything after this step grows out of the Page's shape — and it could not come sooner, because until step two we did not know the key.

One deliberate seam: the program talks to store.get_page and store.put_page, never to the datastore library directly. On App Engine the ground is Datastore; on a laptop with no cloud in sight it is a single JSON file. This is not indirection for its own sake — a sequence must remain demonstrable on any machine, or no one can walk it. And note what is resisted: metadata. Only path, body, and timestamps. We do not yet know which metadata matters; we will know at step eight, when its consumers appear on the horizon.

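from dataclasses import dataclass, field

# now() is the store's own timestamp helper (store.now elsewhere).
@dataclass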
class Page:
    path: str          # canonical; "" is home; the path IS the key
    body: str = ""
    created: str = field(default_factory=now)
    updated: str = field(default_factory=now)

# main.py now reads the ground instead of the dict:
page = store.get_page(path)

The trunk: few fields, on purpose.
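
The laptop side of the seam can be sketched as follows. This is a minimal sketch under stated assumptions, not the packaged store module: the JsonStore class, the default file name pages.json and the ISO 8601 timestamp format are illustrative choices. Only get_page, put_page, now and the four Page fields come from the text above.

```python
import json
import os
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def now() -> str:
    """Current UTC time as an ISO 8601 string (assumed timestamp format)."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class Page:
    path: str  # canonical; "" is home; the path is the key
    body: str = ""
    created: str = field(default_factory=now)
    updated: str = field(default_factory=now)


class JsonStore:
    """Hypothetical laptop ground: every page in one JSON file, keyed by path."""

    def __init__(self, filename: str = "pages.json"):
        self.filename = filename

    def _load(self) -> dict:
        if not os.path.exists(self.filename):
            return {}
        with open(self.filename, encoding="utf-8") as f:
            return json.load(f)

    def get_page(self, path: str):
        record = self._load().get(path)
        return Page(**record) if record else None

    def put_page(self, page: Page) -> None:
        pages = self._load()
        pages[page.path] = asdict(page)
        # Write a temporary file, then swap it in, so a crash cannot leave half a file.
        tmp = self.filename + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(pages, f, indent=2)
        os.replace(tmp, self.filename)
```

A Datastore-backed class with the same two methods could stand behind the same names, which is all the seam asks for.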

II · The Loop

Step 4 · II · The Loop

The loop closes

Whatever the reader can see, a writer can change, through the same living system.

A form; a GET that shows it filled; a POST that saves and redirects to the page it just made. Create-link → form → entity → render — the same cycle the 2008 article ended on, the one developers walk so many times that a time-lapse of it would make you dizzy.

Why the loop closes before anything is made beautiful: from this step onward the system can grow itself. Every later feature will be exercised by creating pages through the front door rather than by seeding a database by hand. A CMS whose editing arrives late gets an editor bolted to its side; a CMS whose editing arrives now grows around its editor, the way a tree grows around whatever touches it early.

from flask import redirect, render_template, request

@app.route("/edit/<path:path>", methods=["GET", "POST"])
def edit(path):
    page = store.get_page(path)
    if request.method == "POST":
        if page is None:
            page = store.Page(path=canonical(request.form["path"]))
        page.body = request.form["body"]
        page.updated = store.now()
        store.put_page(page)
        return redirect("/" + page.path)
    return render_template("edit.html", page=page)

Read and write, one loop, end to end.
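
The handler leans on a canonical helper that the excerpt does not show. One plausible shape is sketched here. The rules are assumptions made for illustration (lower-casing, an ASCII-only character set, dropping empty and dot segments) and may differ from what the packaged program does.

```python
import re


def canonical(raw: str) -> str:
    """Normalize a typed path into a storage key; "" means home."""
    segments = []
    for seg in raw.strip().lower().split("/"):
        # Keep ASCII letters, digits, dot, underscore and hyphen;
        # any other run of characters becomes a single hyphen.
        seg = re.sub(r"[^a-z0-9._-]+", "-", seg).strip("-")
        if seg in ("", ".", ".."):
            continue  # drop empty segments and dot segments
        segments.append(seg)
    return "/".join(segments)
```

Because the path is the key, two spellings of one name must collapse to one string before they ever reach the ground; that is the only job of this function.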

Step 5 · II · The Loop

The two kinds of page

Some pages are grown inside the system; some are transplanted whole. Both stand on the same ground, and the tool never touches a transplanted one.

This is the step the whole sequence was reordered for. Every page acquires a kind: managed or raw. A raw page is an entire document — its own doctype, its own scripts and styles — stored exactly as pasted and served exactly as stored, byte for byte, under its own content type. And because a transplanted page may bring companions, raw pages may be of any type — a stylesheet at /demo/style.css, a script beside it — so the transplant arrives with its own roots attached.

Why so early, when the rendering system barely exists? Because this decision forks the serving pathway at its root. Discovered late (as it was, in the first draft of this sequence), rawness becomes a hack — a flag consulted in a dozen places, escaping fights, template exceptions. Discovered now, it is one clean branch, and everything built afterwards is built knowing there are two kinds. This is precisely what it means for an early step to be the one that the most subsequent steps depend upon.

if page.kind == "raw":
    # Transplanted whole: served exactly as stored, never wrapped.
    return Response(page.body, content_type=page.content_type)

# ...only grown pages continue on toward templates and chrome.

One clean branch at the root of the serving pathway.
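
To make the fork concrete without Flask, here is a small sketch of the same branch as a pure function. The names serve and wrap_in_chrome are hypothetical stand-ins, and the wrapper is a placeholder for real template rendering; only kind, body and content_type come from the text.

```python
from dataclasses import dataclass


@dataclass
class Page:
    path: str
    body: str = ""
    kind: str = "managed"            # "managed" or "raw"
    content_type: str = "text/html"


def wrap_in_chrome(body: str) -> str:
    """Placeholder for template rendering (hypothetical)."""
    return "<!doctype html><main>" + body + "</main>"


def serve(page: Page):
    """Return (body, content_type) for a page."""
    if page.kind == "raw":
        # Transplanted whole: body and content type pass through untouched.
        return page.body, page.content_type
    # Only grown pages continue on toward templates and chrome.
    return wrap_in_chrome(page.body), "text/html; charset=utf-8"
```

The raw branch returns before anything else has a chance to look at the body, so later steps cannot touch it by accident.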

Step 6 · II · The Loop

The chrome

The site's frame is worn by grown pages and never imposed on transplanted ones.

Now, and only now, the base template: masthead, a place where navigation will grow, a quiet footer. The chrome could not be defined correctly before step five, because its correct definition is something only managed pages wear. Had the chrome come first — and it is the great temptation, because it is the visible part — rawness would have had to be carved out of it afterwards.

Notice one line in the template's first breath: the viewport meta tag. Device-friendliness is not a feature to be added near the end (step fourteen will only deepen it); it is a property you either preserve from birth or retrofit with grief.

<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>{% block title %}{{ site_title }}{% endblock %}</title>
</head>
<body>
<header class="chrome"> ... </header>
<main>{% block content %}{% endblock %}</main>
<footer class="chrome"> ... </footer>
</body>
</html>

The frame — defined, after step five, as optional clothing.

III · The Boundary

Step 7 · III · The Boundary

The gate

The write pathway, now that it is valuable, acquires a boundary.

A login, a session, and @writer_required on every route that mutates. There is a genuine tension here, better stated than smoothed over: the security instinct says the gate should be step zero; the unfolding instinct says you cannot wrap a pathway that does not exist. Both are right, and the resolution is about habitat, not order-among-features: the gate must exist before the first deployment that strangers can reach. Until now the organism lived in its egg — a development machine. It is about to be announced. So the gate is now.

The mechanism is one password and one session flag, and the seam is one function, is_writer() — so the whole thing can be replaced by Identity-Aware Proxy in front of the write paths without touching a single handler.

from functools import wraps
from flask import redirect, request, url_for

def writer_required(view):
    @wraps(view)
    def wrapped(*args, **kwargs):
        if not is_writer():
            return redirect(url_for("login", next=request.path))
        return view(*args, **kwargs)
    return wrapped

The boundary wraps pathways that already exist.
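
The seam itself might look like the sketch below. It is an illustration under assumptions: the session is passed in as a plain dict so the code runs without Flask (the article's is_writer() takes no arguments), and the names log_in and check_password and the "writer" session key are hypothetical.

```python
import hmac


def check_password(attempt: str, expected: str) -> bool:
    """Compare in constant time, so timing does not reveal how much matched."""
    return hmac.compare_digest(attempt.encode(), expected.encode())


def log_in(session: dict, attempt: str, expected: str) -> bool:
    """Set the one session flag if the one password matches."""
    if check_password(attempt, expected):
        session["writer"] = True
        return True
    return False


def is_writer(session: dict) -> bool:
    """The single question every mutating route asks."""
    return session.get("writer") is True
```

Since handlers only ever ask is_writer, swapping the password for a proxy-supplied identity would change this function and nothing else.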

Step 8 · III · The Boundary

Names, times, and states

Every page carries its own record: what it is, who wrote it, when, and whether it is ready.

Title, description, author. Created, updated, published. A status — draft or published — and an article flag for pages that are news rather than furniture. Drafts are visible to the writer and invisible to everyone else, which changes writing more than any editor feature will: pages can now be grown slowly in place.

Why exactly here — after the loop, before the fabric: metadata is worthless before there is an editor to enter it, and it exists for its consumers — navigation, the home page, the feed, the news sitemap, the structured data — which are steps nine through thirteen, all of them about to eat this same meal. Deferred, each consumer invents private fields and the model silts up (the first draft of this sequence proved it). One rule with long consequences: published is stamped once, the first time a draft goes public, and never again. The feed and Google News both depend on publication time being a fact, not an opinion.

new_status = form.get("status", "draft")
if new_status == "published" and not page.published:
    page.published = store.now()   # stamped once, a fact forever
page.status = new_status
page.updated = store.now()

Publication time is a fact, not an opinion.
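
The stamping rule and the draft rule can be checked in isolation. The sketch below is a hedged restatement of the excerpt as two functions; apply_status and visible_to are hypothetical names, and timestamps are passed in explicitly so the behavior is easy to test.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    path: str
    status: str = "draft"
    published: Optional[str] = None
    updated: Optional[str] = None


def apply_status(page: Page, new_status: str, timestamp: str) -> None:
    """Change status; stamp `published` only the first time a draft goes public."""
    if new_status not in ("draft", "published"):
        new_status = "draft"  # assumption: unknown values fall back to draft
    if new_status == "published" and not page.published:
        page.published = timestamp
    page.status = new_status
    page.updated = timestamp


def visible_to(page: Page, writer: bool) -> bool:
    """Drafts are visible to the writer and to no one else."""
    return writer or page.status == "published"
```

Unpublishing and republishing moves updated forward but leaves published where it was first set.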

IV · The Fabric

Step 9 · IV · The Fabric

The tree

Paths imply a tree; the tree becomes visible as navigation.

notes/first-post was always a child of notes; we merely stop ignoring it. Breadcrumbs are computed from the path's own segments — the hierarchy costs nothing to store because it was stored at step two, inside the shape of the name. The navigation menu is made of pages that ask to be in it: an in_nav checkbox, a decision, not an automatism. (Google Sites automates its sidebar, and thereby fills it with junk; a navigation should be edited the way a page is.)

Why after metadata: navigation entries need titles. Why before the front door: the home page is only the most important node of a tree, and the tree must exist first.

def breadcrumbs(path):
    crumbs, segs = [], path.split("/")
    for i in range(len(segs) - 1):
        partial = "/".join(segs[:i + 1])
        p = store.get_page(partial)
        crumbs.append({"path": partial,
                       "title": (p.title if p and p.title else segs[i])})
    return crumbs

The hierarchy was stored at step two, inside the name.
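
The menu side of this step is small enough to sketch too. What follows is an assumption-laden illustration: nav_entries is a hypothetical name, ordering by path is a guess, and hiding drafts from the menu is inferred from the draft rule of step eight rather than stated in the text.

```python
from dataclasses import dataclass


@dataclass
class Page:
    path: str
    title: str = ""
    status: str = "draft"
    in_nav: bool = False


def nav_entries(pages):
    """Menu entries for published pages that asked to be in the menu."""
    chosen = [p for p in pages if p.in_nav and p.status == "published"]
    chosen.sort(key=lambda p: p.path)  # assumed order: by path, home ("") first
    return [{"path": p.path, "title": p.title or p.path or "Home"}
            for p in chosen]
```

Nothing here discovers pages on its own; a page appears only because someone ticked its box.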

Step 10 · IV · The Fabric

The front door

The site has a face, and the face shows its freshest growth.

The home page: the site's description, then published pages, newest first by publication time. It is almost no code — a sort and a template — and that is the point worth dwelling on. When a step is nearly effortless, that is the sequence paying you back: the effort was spent at steps three and eight, in the right order. When a step is unexpectedly hard, that is the sequence telling you that an earlier step was too trivial — and in this method, that is a signal to start again, not to push through.

@app.route("/")
def home():
    pages = [p for p in store.published_pages()
             if p.path and p.content_type == "text/html"]
    return render_template("home.html", pages=pages, **chrome())

Nearly effortless — the sequence paying you back.
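
The sort behind the front door might be as small as this. It is a sketch, assuming that published is stored as an ISO 8601 string in one fixed format and offset, so that string order equals time order; the article names store.published_pages but does not show its body.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    path: str
    status: str = "draft"
    published: Optional[str] = None


def published_pages(pages):
    """Published pages, newest first by publication time."""
    live = [p for p in pages if p.status == "published" and p.published]
    # Assumes uniform ISO 8601 UTC strings, which sort chronologically.
    return sorted(live, key=lambda p: p.published, reverse=True)
```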

Step 11 · IV · The Fabric

Not breaking

A name once given is never silently taken away.

Rename a page, and its old path is remembered on the entity itself; requests to the old name answer 301 — moved permanently — to the new one. And a 404 page that behaves like a person: it says what it does not know and points back to the front door.

Why now: renaming only becomes a real event once paths, editing and publishing all exist. And it must come before the next phase — before the site starts speaking to machines — because crawlers remember names far longer than people do. Cool URIs don't change; when they must, the change itself is recorded, and the system keeps the promise the web made.

if new_path != page.path and page.path:
    page.old_paths = list(set(page.old_paths + [page.path]))
    store.delete_page(page.path)
    page.path = new_path

# and in the serving pathway, when nothing is found:
moved = store.find_redirect(path)
if moved and moved.status == "published":
    return redirect("/" + moved.path, code=301)

Old names live on, as the entity's own memory.
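
How find_redirect might work is sketched below over an in-memory dict. The function bodies are assumptions: the article shows only the call sites. A linear scan is fine for a JSON file; on Datastore a query against the old_paths property would presumably take its place.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    path: str
    status: str = "draft"
    old_paths: list = field(default_factory=list)


def rename(pages: dict, page: Page, new_path: str) -> None:
    """Move a page to a new key and remember the old name on the page itself."""
    if new_path != page.path and page.path:
        if page.path not in page.old_paths:
            page.old_paths.append(page.path)
        # A page never lists its own current name as an old one.
        page.old_paths = [p for p in page.old_paths if p != new_path]
        del pages[page.path]
        page.path = new_path
    pages[page.path] = page


def find_redirect(pages: dict, path: str):
    """Return the page that used to live at `path`, or None."""
    for page in pages.values():
        if path in page.old_paths:
            return page
    return None
```

Because the lookup runs only when nothing is found at the requested path, a new page created at an old name simply wins over the redirect.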

V · The Voice

Step 12 · V · The Voice

The feed

The site can be read without being visited.

RSS 2.0: articles, newest first, twenty at most. Look at what the feed is made of — title, link, publication date, description — and notice that every one of them already exists. The feed stores nothing new; it speaks something new. It is a projection of step eight through one small function.

This is the shape of the whole phase that begins here: the organism has stopped growing new organs and has started developing voices.

for p in pages:
    if not p.article:
        continue
    url = f"{base_url()}/{p.path}"
    items.append(f"<item><title>{e(p.title)}</title>"
                 f"<link>{e(url)}</link>"   # escaped too: a path may contain "&"
                 f"<pubDate>{rfc822(p.published)}</pubDate>"
                 f"<description>{e(p.description)}</description></item>")

Nothing new stored; something new spoken.
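
The two helpers the loop calls, e and rfc822, are not shown in the excerpt. Here is one way they could be written with the standard library alone. This is a sketch under the assumption that timestamps are ISO 8601 strings; feed_item is a hypothetical wrapper added so the pieces can be exercised together.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from xml.sax.saxutils import escape


def e(text) -> str:
    """Escape &, < and > for use as XML text; None becomes the empty string."""
    return escape(text or "")


def rfc822(iso_timestamp: str) -> str:
    """Turn an ISO 8601 timestamp into the date format RSS 2.0 expects."""
    dt = datetime.fromisoformat(iso_timestamp)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return format_datetime(dt.astimezone(timezone.utc), usegmt=True)


def feed_item(title, url, published, description) -> str:
    """One <item>, built only from fields the page already carries."""
    return (f"<item><title>{e(title)}</title>"
            f"<link>{e(url)}</link>"
            f"<pubDate>{rfc822(published)}</pubDate>"
            f"<description>{e(description)}</description></item>")
```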

Step 13 · V · The Voice

Speaking to machines

The site describes itself in each dialect the crawlers speak.

Four artifacts in one step, because they are one organ grown in four dialects: a sitemap of everything; a Google News sitemap of articles from the last forty-eight hours (News reads only the fresh — the cutoff is in the protocol, so it is in the code); NewsArticle structured data in each article's head; Open Graph metadata on every managed page.

And a quiet vindication: building all four, not one field had to be added to the model. That is the proof that step eight came at the right moment. If you ever build this phase and find yourself reaching back to add fields, the sequence is speaking to you: next time, the metadata step belongs earlier.

from datetime import datetime, timedelta, timezone

cutoff = datetime.now(timezone.utc) - timedelta(hours=48)
for p in pages:
    if not (p.article and p.published):
        continue
    if parse(p.published) < cutoff:
        continue          # Google News reads only the fresh
    urls.append(news_url_element(p))

The protocol's 48-hour rule, visible in the code.
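
The excerpt calls news_url_element without showing it. A hedged sketch of what such a function could emit follows. The signature is an assumption (the article passes a page; this version takes explicit fields so it runs alone), and news_sitemap is a hypothetical wrapper that adds the two namespaces a news sitemap declares.

```python
from xml.sax.saxutils import escape


def news_url_element(loc, title, published, publication_name, language="en") -> str:
    """One <url> entry carrying the news-specific child elements."""
    return (
        "<url>"
        f"<loc>{escape(loc)}</loc>"
        "<news:news>"
        "<news:publication>"
        f"<news:name>{escape(publication_name)}</news:name>"
        f"<news:language>{escape(language)}</news:language>"
        "</news:publication>"
        f"<news:publication_date>{escape(published)}</news:publication_date>"
        f"<news:title>{escape(title)}</news:title>"
        "</news:news>"
        "</url>"
    )


def news_sitemap(url_elements) -> str:
    """Wrap the entries in a <urlset> that declares both namespaces."""
    return ('<?xml version="1.0" encoding="UTF-8"?>'
            '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" '
            'xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">'
            + "".join(url_elements) + "</urlset>")
```

Every value it needs (address, title, publication time) was already on the page after step eight, which is the point the step is making.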

Step 14 · V · The Voice

One page, many windows

One page must live well in every window that looks at it.

The typographic pass: a fluid single column with a readable measure, images that shrink to fit, a print stylesheet, visible focus for keyboards. Notice how little there is to do, and why: the viewport tag has been present since step six, and the layout is one column because the content's structure is one column. In this sequence, device-friendliness is not a feature at all — it is the absence of a mistake, preserved from birth.

Raw pages are on their own here, deliberately. A transplanted page keeps its own physiology, and the system respects it to the byte.

main { max-width: 42rem; margin: 0 auto; padding: 1rem 1.25rem; }
article img { max-width: 100%; height: auto; }

@media (max-width: 640px) { html { font-size: 16px; } }
@media print { header.chrome nav, footer.chrome { display: none; } }

Not a feature: the absence of a mistake, preserved.

VI · The Hand

Step 15 · VI · The Hand

The writing surface

The tool takes the shape of the kind of page it edits.

For managed pages, an editable canvas and a short, honest toolbar — bold, headings, lists, quotes, links — which writes the very HTML the page will serve. What you edit is what is stored is what is served: no shadow format, no translation layer to leak. For raw pages, a monospace field and a promise: nothing will ever be done to your bytes.

Why the rich surface arrives this late: an editor is ornament upon a loop, and the loop (step four) had to be sound first — polish applied to an unsound loop is polish that will be redone. And honestly: the surface is the most tempting thing to build early, because it is the most visible. A sequence exists partly to resist exactly that temptation.

form.addEventListener("submit", function () {
  bodyField.value = currentKind() === "raw" ? rawBody.value
                                            : canvas.innerHTML;
});

Two surfaces, one field: the body travels as it will be served.

Step 16 · VI · The Hand

Images

Pictures enter pages through the same gate as words.

Pick a file; it is posted to /media/; the returned address drops into the canvas where the cursor stood. Media lives beside the pages, behind the same storage seam — when images grow heavy, Cloud Storage slides in behind that seam without a single handler changing.

Why after the surface: the image button needs a surface to sit on. Why not much later: a Google-Sites-like tool without pictures is not approximately that tool at all.

// Note: document.execCommand is deprecated, though browsers still support it.
fetch("/media/", { method: "POST", body: formData })
  .then(r => r.json())
  .then(j => document.execCommand("insertImage", false, j.url));

Upload, receive a name, place the thing the name names.

VII · The Whole

Step 17 · VII · The Whole

The memory of having answered

The system remembers what it has already said, and says “nothing has changed” as cheaply as possible.

ETags computed from the content itself; a conditional request answered with 304; caching headers for anonymous readers; a small render cache, emptied entirely on every save. Invalidation — the famously hard problem — is one call in one place, and its correctness is easy to see.

Why the cache is nearly the last step: a cache is a memory of behavior, and the behavior had to finish forming. A cache added early is a bug generator with a speedup attached, because every subsequent step must remember to invalidate it. Premature optimization is not wrong because optimization is wrong; it is wrong because it is out of sequence.

import hashlib

etag = hashlib.sha1(body.encode()).hexdigest()[:16]
CACHE[key] = (etag, body, content_type)
if request.if_none_match and etag in request.if_none_match:
    return Response(status=304)      # nothing has changed

# and on every save, in one place:
cache_clear()

Invalidation as one call in one place.
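
The whole memory fits in a few lines once the web framework is set aside. The sketch below is a simplification under stated assumptions: respond is a hypothetical name, the If-None-Match header is reduced to a single unquoted tag, and the render callable stands in for whatever produces the page body.

```python
import hashlib

CACHE = {}  # key -> (etag, body)


def cache_clear() -> None:
    """Called on every save: forget everything, in one place."""
    CACHE.clear()


def respond(key, render, if_none_match=None):
    """Return (status, etag, body); `render` runs only on a cache miss."""
    if key not in CACHE:
        body = render()
        etag = hashlib.sha1(body.encode()).hexdigest()[:16]
        CACHE[key] = (etag, body)
    etag, body = CACHE[key]
    if if_none_match == etag:
        return 304, etag, ""  # nothing has changed
    return 200, etag, body
```

Emptying the whole cache on every save gives up some hits in exchange for an invalidation rule that cannot be forgotten.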

Step 18 · VII · The Whole

The whole, and the re-reading

When the last step is taken, the first is still visible: the program reads as the story of its own growth.

Read the finished route table from top to bottom. It is the 2008 handler map, grown up — and it is also a table of contents of this very sequence, because the pathways stand in roughly the order they emerged: the page, the loop, the gate, the media, the voices. Each part exists for a reason you can point to; each reason is a dependency you can trace. That is what Alexander meant by organized complexity — not a pile of features, but a record of a growth, the way the rings of a tree are still present in its trunk.

The last act of the sequence is to re-read the sequence itself — which is where this document ends, below, with the corrections that had to be made before the order was right.

GET  /<path>           the name answered with the thing  (2, 5)
GET  /                 the front door                    (10)
G/P  /edit/<path>      the loop                          (4, 15)
POST /delete/<path>    ... and its shadow
G/P  /login            the gate                          (7)
POST /media/           pictures                          (16)
GET  /feed.xml         the first voice                   (12)
GET  /sitemap.xml      speaking to machines              (13)
GET  /news-sitemap.xml ...

The route table as a table of contents of the growth.

Coda

Re-reading the sequence

Now the confession, which is the most important part, because a sequence you have not had to correct is a sequence you have not examined.

Raw pages moved from step eleven to step five. In the first draft, transplanted pages arrived after navigation, as a feature among features. By the time that draft was built, the chrome and the templates had quietly assumed that every page wears the frame (escaping rules, a wrapping layout, a title slot), and rawness had to be carved out of them with exceptions in a dozen places. That is exactly the smell of a trivial decision made early ("all pages are templated", made implicitly, which is the worst way) damaging important decisions later. Moved to step five, the two kinds of page became one clean branch at the root of the serving pathway, and everything after was built knowing there are two kinds.

Metadata moved before its consumers. Navigation and the home page originally came first, and each invented private fields — a nav_title here, a home_blurb there — which then had to be merged when the feed arrived wanting a third set. The test that the corrected order is right: in steps twelve and thirteen, which live entirely on metadata, not one field had to be added to the model.

The gate would not sit still. Authentication was step three in one draft (it felt responsible) and step twelve in another (it felt in the way). Both drafts were answering the wrong question. The right question is about habitat, not order-among-features: the gate must exist before the first deployment that strangers can reach. In this sequence that moment is step seven, after the write pathway exists and is worth protecting, before anything is announced to the world.

And the unfoldings not taken, which are the proof that the structure is alive: search (a projection of pages, like the feed — it attaches at step twelve's shoulder); multiple writers (the author string differentiates into a reference — in the 2008 article's language, a key to pass); page history (the updated timestamp unfolds into a list of versions); themes (the chrome differentiates, as the pages did at step five). Each of these attaches at a point in the existing structure you can name. When a proposed feature has no such point — when it would have to be smeared across the system — that is the system telling you it belongs to a different sequence, or that an earlier step is still too trivial.

Alexander called transformations like these structure-preserving: each one takes the centers that exist and makes them stronger. Read the finished route table again and you can see every step still present in it, the way the rings of a tree are still present in the trunk. That is what organized complexity looks like when the order was right — not a pile of features, but the record of a growth.

The package

The finished program travels with this document as unfolding-cms.zip. It runs anywhere:

pip install Flask
python3 main.py
# http://localhost:8080 — log in at /login, password: unfold

On a laptop it keeps everything in one JSON file; deployed with gcloud app deploy it stands on Datastore, through the same seam. The sequence in brief is in UNFOLDING.md; the drawings and their generator are in drawings/.