
Rhizomic Software Architecture

This is a broad-level overview of an approach to micro-services that leans heavily on structured data to create a system of linked data, facilitating intelligent heterogeneity across a fragmented content infrastructure.


Collected Notes


Any node in the system can directly connect to any other node in the system.


The nodes in our system are micro-services, content management systems, digital asset managers, identity providers, and audience experiences.


Micro-services include systems like Agora for products and Encore for backlinks.


Contrast with macro-services, which are things like Hermano for structured data, the Content API for entry data, and SBN for everything.


Splitting apart the macro-services into many micro-services allows for greater performance, stability, and domain-driven data structures.


A distributed system requires a thoughtful approach to both querying data from across the system, as well as emitting data into the system.


A universal event stream layer can synchronize and coordinate data transfer from an originating service to whichever services require that data.


A Linked Data approach can allow services to reference each other without reifying data.


Assume two micro-services: one for Entries and one for Authors. Each Entry and each Author is identified by a URL that resolves to that node's data. An Entry has an array of Authors, each identified by the URL of the Author node. An Author has an array of Entries, each identified by the URL of the Entry.


An Author node:

  {
    "name": "Sarah Jeong",
    "profile": "",
    "bio": "Sarah Jeong is smart and cool.",
    "contributedTo": […]
  }

An Entry node:

  {
    "hed": "Bluesky showed everyone’s ass",
    "dek": "In many cases, literally.",
    "byline": […]
  }


Connections can be traversed by resolving the URLs. If a service needs to know the titles of the entries that Sarah Jeong has published, it can request the data from the array of URLs under contributedTo.
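Sketched as code, that traversal is a small helper; `entryTitlesFor` and the injected `fetchJson` are illustrative assumptions, not an existing API:

```javascript
// Resolve the titles of every entry an author has contributed to by
// following the URLs stored under contributedTo. `fetchJson` is injected
// so any transport (HTTP, cache, or a test stub) can supply node data.
const entryTitlesFor = async (authorUrl, fetchJson) => {
  const author = await fetchJson(authorUrl)
  const entries = await Promise.all(author.contributedTo.map(fetchJson))
  return entries.map((entry) => entry.hed)
}
```

In a live system, `fetchJson` could wrap `fetch(url).then((res) => res.json())`.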


Relationships between nodes can be aggregated into a single micro-service. This service can track the relationship author:sarahjeong contributedTo anthem:someuuid or anthem:someuuid hasProduct agora:someproductid and make those relationships queryable.


A network of entity relationships can be explored with complex queries, for instance “find all authors who have contributed to the same stories this author has contributed to” or “find all authors who have contributed to entries that mention this product”.
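A minimal in-memory sketch of such a relationship service, using the subject/predicate/object identifiers from the notes; the `relate`, `query`, and `coAuthors` helpers are hypothetical:

```javascript
// Triples use the shape from the notes, e.g.
// ["author:sarahjeong", "contributedTo", "anthem:someuuid"].
const triples = []
const relate = (s, p, o) => triples.push([s, p, o])

// null acts as a wildcard in any position.
const query = (s, p, o) =>
  triples.filter(([ts, tp, to]) =>
    (s == null || ts === s) && (p == null || tp === p) && (o == null || to === o))

// “Find all authors who have contributed to the same stories this author has”:
const coAuthors = (author) => {
  const stories = query(author, "contributedTo", null).map(([, , o]) => o)
  const authors = stories.flatMap((story) =>
    query(null, "contributedTo", story).map(([s]) => s))
  return [...new Set(authors)].filter((a) => a !== author)
}
```

A production service would back this with a real triplestore or graph database, but the query shape stays the same.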


A micro-service like Encore is CMS agnostic. Any CMS can access the API for manipulating the scoped backlinks structure.


Entries across CMSs have different data structures, but many of the concepts are the same. It’s possible to translate these concepts between data structures, provided there is an approach to handle concepts that simply don’t exist from one to the other.


With JSON-LD, the data in any JSON blob can be separated from the format of that blob, and cast into different formats.


const jsonld = require("jsonld") // assumes the jsonld npm library

const context = {
  id: "@id",
  name: "uri:uuid:thing-name",
  uri: "@id",
  title: "uri:uuid:thing-name"
}

const objects = {
  "@context": context,
  "@graph": [
    {
      id: "0x01",
      name: "Widget"
    },
    {
      uri: "thing-two-slug",
      title: "Fidget"
    }
  ]
}

// Expand to full IRIs, then compact against an empty context,
// harmonizing the two key styles into one.
const harmonize = async () => {
  const expanded = await jsonld.expand(objects)
  const terser = await jsonld.compact(expanded, {})
  return terser
}

harmonize().then(console.log)


{
  "@graph": [
    {
      "@id": "0x01",
      "uri:uuid:thing-name": "Widget"
    },
    {
      "@id": "thing-two-slug",
      "uri:uuid:thing-name": "Fidget"
    }
  ]
}


No service within the system should need to refactor or make serious changes in its architecture to participate in the linked data system.


The first step in integrating with the Linked Data system is for a micro-service to expose a node’s data as JSON at a URL.


Any other service that wants to reference external data should store the external node's URL as an ID. The external node data could be reified into the service's data model, resolved in the service's runtime logic, or pulled from a cache.
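That resolve-or-cache pattern can be sketched as follows; `resolveNode`, the `Map` cache, and the injected `fetchJson` are illustrative assumptions:

```javascript
// Store only the external node's URL; resolve its data lazily,
// consulting an in-memory cache before hitting the network.
const cache = new Map()

const resolveNode = async (url, fetchJson) => {
  if (!cache.has(url)) {
    cache.set(url, await fetchJson(url))
  }
  return cache.get(url)
}
```

A cache invalidation hook (driven by the universal event system described below) would evict or refresh entries when the originating service emits a change.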


A universal event system could be used to update caches across the system, as well as track when relationships between service nodes change and record those changes in the relationship tracking service.


A universal query service like Tower can easily resolve data from micro-services by integrating with those services' APIs, either REST or GraphQL.


The worst-case query across Tower would be something that hops back and forth across services multiple times; for instance “get stories from this author, get authors from those stories, get stories from those authors”. This cross-boundary query can be done in a single request with the relationship tracking service.


Micro-services can have their own stand-alone UIs for manipulating their data, apart from any CMS integration.


Micro-services can provide a simple CRUD API or a web component that allows a CMS to directly integrate with them rather than send users to an external UI.


Any service in the system can be removed without disrupting the rest of the system, and any other service can rely on just a subset of the entire service.


Cache layers for each service provide resilience to outages, latency, and other problems.


Each micro-service runs on its own K8s cluster, allowing workloads to be scaled up under load.


The macro-services can be split up into a collection of micro-services.


The Hermano macro-service can split into Map, Venue, Product, Game, and other domain-specific micro-services.


The macro-services do not need to be split up before realizing efficiency improvements.


The SBN macro-service can be split into audience data, identity management, authors, link sets, hub layouts, taxonomies, communities/networks, and RSS feed filters.


Micro-services can provide web components that render the data they expose. Any given audience layer could query any given service node – say, a Product – and attach the web component provided by that product.


// Product Service
{
  "name": "Widget",
  "id": "",
  "simpleComponent": "",
  "complexComponent": ""
}


This approach to linked data and components is very similar to the structure of Clay.


Encore was a service that was created quickly, and was a light lift. Most of the work was in setting up the systems that manage its deployment, and deciding on its architecture. Now that that is settled, it should be possible to rapidly spin up additional services in the same model.


The Content API macro-service can be broken into smaller services for signaling and managing Operational Transforms, and storing and querying entry data.


Operational Transforms are a form of Conflict-free Replicated Data Type that allows multiple authors to edit a single text field at the same time. They are how Google Docs works.


Complexity in the Anthem CMS is currently in managing how components interact with the Autosave system, and creating user interfaces for those components.


Adding a single key:value pair into Anthem autosave currently involves editing the document schema, adjusting the Content API GraphQL schema, and adjusting the Tower GraphQL schema. This can be simplified.


The process of users pushing their document edits into the OT and CRDT system is called Autosave.


Creating new components with autosaving data fields in the Anthem CMS can be largely automated with scripts.


It should be possible to create new entry body component types and entry metadata fields, exposed to downstream consumers, in under an hour in the Anthem CMS.


Autosave works by accepting a PATCH request to the endpoint{id}/contents


The request JSON has some metadata:

  {
    "content_hash": "5b8b1aaf01498da24d86bddb6c55408e",
    "delta": […],
    "schema_version": 4,
    "sequence": 17,
    "user_id": 8465140,
    "uuid": "2b50230a-4085-4a95-8340-61f844fe4e3f"
  }

This contains which schema version we need, which user is making the change, and the entry uuid. The content hash is presumably a hash of the deltas.


The delta array contains the operational transforms themselves:

"delta": [
  {
    "o": {
      "ops": [
        {
          "retain": 17
        },
        {
          "insert": " signature plz"
        }
      ]
    },
    "p": […],
    "t": "rich-text"
  }
]

o is the operations, p is the path through the document tree to the proper key, and t is the format, either rich text or plain text.
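A minimal sketch of applying those ops to a plain-text value, assuming the common retain/insert/delete op model (the `delete` op and the `applyOps` name are assumptions; rich-text attributes are ignored here):

```javascript
// Apply a list of operational-transform ops to a plain-text string.
// retain copies characters through, insert adds new text, and delete
// skips characters from the source.
const applyOps = (text, ops) => {
  let result = ""
  let cursor = 0
  for (const op of ops) {
    if (op.retain != null) {
      result += text.slice(cursor, cursor + op.retain)
      cursor += op.retain
    } else if (op.insert != null) {
      result += op.insert
    } else if (op.delete != null) {
      cursor += op.delete
    }
  }
  // Any untouched remainder of the source is kept.
  return result + text.slice(cursor)
}
```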


The Autosave service needs to collect every operational transform sent to it. It does not need a schema, but it does need the author id and the node id. Autosave then handles the intersection of the data pipeline.


Autosave allows clients to connect via web socket, and sends operational transforms it receives from its PATCH endpoint down to those clients.


When autosave receives an operational transform for a given field on a given document, it resolves all the operational transforms for that field, then emits an event with the document id, the field path, and the new computed value.


Autosave becomes a micro-service that's decoupled from any given CMS or datastore, allowing it to serve any editing experience within the entire system.


Autosave's own datastore can be domain-specific. Types include documents, paths, transforms, and versions.


A Taxonomy micro-service is entirely independent from any expression of content or any particular CMS.


A Collections micro-service is a generic CMS-agnostic feed-building tool that can be used to create feeds that power RSS, activity pub, or any component that consumes a feed. Decisions about what feeds exist, what source query (groups, entry type, etc) and what stories are “pinned” in the feed happen there, in a destination-neutral way.


The Taxonomy micro-service can implement the IPTC SKOS vocabulary, or we can layer our own vocabulary over the top of it and extend it for our own uses.


It may be worth forking the IPTC triplestore and hosting our own instance of it.


The Taxonomy micro-service only needs to know the associations between entry identifiers and IPTC identifiers. The relationships between those identifiers can be nuanced – for instance a given entry could have a subject, mentions, or references relationship with any IPTC identifier.


Feed building from the Taxonomy Service can leverage the transitive nature of the IPTC identifier relationships to capture the full web of entries associated with an identifier. As in: get all stories with this subject, as well as all stories with subjects that are transitively narrower than this subject. This reduces data reification and the need for free-form tag population (i.e., hand-applying every relevant tag to every entry).
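That transitive expansion can be sketched as a simple graph walk; the `narrower` map and the concept ids are placeholders standing in for the IPTC SKOS narrower relation:

```javascript
// Collect a subject plus everything transitively narrower than it.
// `narrower` maps a concept id to its directly narrower concepts;
// the ids here stand in for IPTC Media Topic identifiers.
const transitiveNarrower = (subject, narrower) => {
  const seen = new Set()
  const visit = (id) => {
    if (seen.has(id)) return // guard against cycles
    seen.add(id)
    for (const child of narrower[id] ?? []) visit(child)
  }
  visit(subject)
  return [...seen]
}
```

A feed for a subject would then match entries tagged with any id in the returned set, rather than requiring every entry to carry every ancestor tag.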


A structured-data store like SKOS is an example of a rhizomic system – allowing arbitrary cross-connections of heterogeneous nodes while allowing the seamless transition along the spectrum of smooth-to-striated, and able to cast a given network of flat connections into an arborescent tree structure.


A proof-of-concept app ecosystem could be interesting. We would have:


A data-store layer would provide API access to entry data via REST and GraphQL. It would accept events from the Autosave service, and store them on the appropriate documents. It would act as a consumer client app for the rhizome system, structuring data into a shape for consumption by a presentation app further downstream.


We do not yet know which ideas will work to achieve these goals.


Anthem can create experimental components that can be re-used across a community. These can be created incredibly quickly, and have minimal risk. If they are successful, they can be formalized easily into official components and distributed across the entire organization.


Experimental components (mc) deliver their own front-end HTML and require no adjustments to schema, client app, or any other system.


Experimental components (kv) accept a json blob as a schema, which generates a UI in the entry body compose. The generated form accepts values according to the schema, and delivers the JSON blob downstream to the consumer apps.


Flyvbjerg tells us that successful big systems are often made from an agglomeration of small, modular pieces. His example is the solar farm — a project where solar cells aggregate into solar panels, panels into arrays, and arrays into the farm — which reliably comes in on-time and under-budget.


Alexander has a nuanced understanding of modularity – no two things in a living system are truly identical, but instead have minor variations that allow them to respond to local conditions.


Flyvbjerg notes that a project to build schools across Nepal was accomplished under budget and ahead of schedule, thanks to the modular nature of the classrooms built using local processes. He notes that timelines and budgets could have been improved further by prefabricating classroom modules and delivering them to the site. Alexander tells us that this would in fact have damaged the project, by removing the ability for local labor to respond to local conditions with local solutions.


Flyvbjerg and Alexander both identify a need for the “construction site” to become an “assembly site”.


Flyvbjerg endorses the removal of the situational context, and the abstraction of the site by creating detailed computer models. Alexander endorses transforming the site into a model of itself, a machine of maximum information designed to simulate its own next step.


Key to Alexander’s way of working, which is supported by Flyvbjerg’s insistence on a “maximum virtual model”, is engaging in what Easterling calls “medium design”: the act of planning and designing a process that the work follows.


Flyvbjerg identifies conceptual labor becoming necessary during the delivery of a project as a key indicator that the project will blow its budget and timeline. The solution is to do as much conceptual labor as possible at the front of the project, leaving conventional labor to occur in a single lump.


Alexander’s work on process suggests that front-loading all of a project’s conceptual labor is not feasible, and deprives the project of critical information that can only be known through the doing of the project. The hard part becomes understanding which segments of a project will require conceptual labor, and ordering them so that conceptual labor builds on previous steps instead of undermining them.


Deleuze and Guattari identify an information-system structure they call the “rhizome”, juxtaposed against the traditional tree structure of hierarchy. They lay out the principles of the rhizome as connection, heterogeneity, multiplicity, asignifying rupture, cartography, and decalcomania.


Connectivity in a rhizome system is the principle that any given section of the system can and must connect to any other given section. Lateral exchange of information is a necessity.


Heterogeneity in a rhizome system is the principle that there may be many different kinds of things. Deleuze and Guattari show that the rhizome in fact has no nodes, and is composed only of lines. The “nodes” in a rhizome are assemblages of lines, dense knots that come together in the form of a bulb or tuber.


These principles can be stated in a software architecture model as a rejection of centralized, tree-shaped data structures and flows. Instead, data and systems can and should be cross-compatible, able to exist independently or within a network of endless other segments of the system. The data and structure of the system must devolve entirely into edges, while services and documents form the tuber-like aggregation of those edges.


The rhizome is an assemblage of statements, with no subject or object and no central order or control. Language and vocabulary create pressure systems of order which can seize power or be disrupted.


In concrete terms, our software architecture has scalar typed values and relationships. Scalar values may be related to each other and to relationships; relationships themselves can be related to other relationships. Any relationship is directional, with an implied inverse.
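The implied inverse can be sketched by deriving a reversed triple; the inverse predicate names here are an assumed convention for illustration, not part of any existing service:

```javascript
// A relationship is a directional [subject, predicate, object] triple.
// Its inverse is derived, not stored: swap subject and object and map
// the predicate through an assumed naming convention.
const inverseOf = { contributedTo: "hasContributor", hasProduct: "productOf" }

const invert = ([s, p, o]) => [o, inverseOf[p] ?? `inverseOf:${p}`, s]
```

Storing one direction and deriving the other keeps the relationship service's data model small while still answering queries from either side.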


Relationships and values can be interpreted across varying contexts, with the underlying information they contain preserved, destroyed, translated, or transformed.


A micro-service for the domain of “Authors” is an example of lines and edges aggregating into a useful tuber or bulb within the Rhizome.


An ‘Authors’ micro-service would create a domain vocabulary for speaking about what we want to communicate about an author. This would include statements about their name, their biography, images, articles they have contributed to, and anything else relevant to the domain of the author.


The Author micro-service exposes basic CRUD operations on that data for any other service that wants to operate on it, whether editing and updating the data or consuming and rendering it. The micro-service may have its own stand-alone interface that may be exposed as a dynamic island to other services.


A micro-service may implement the Autosave model of service interaction, allowing for multiple concurrent edits of any given piece of information, along with complete version control and attribution of changes.


A micro-service can have its data consumed and translated into any given context for any other service, given that a context-mapping resolution is written for that moment of connection.


Data in the author service is considered canonical for its domain. Ideally, that data is not reified into other services.


Micro-services can be queried against as a single, federated endpoint. In this way, the collected services can be treated as a single large service which maintains relationships and connections across them.


The federated query endpoint can be accessed from any of the individual services.


Autosave is an implementation for a service. It works by defining a document. Each document has a stack of versions, each referencing the one before. Versions are immutable, and each references a static collection of operational transforms. Each document also has a draft. Drafts are mutable, and each has a stack of Operational Transforms that define its difference from the version at the head of the document. As operational transforms come in, Autosave re-emits them to other clients listening over a web socket connection. When the draft is published, it is attached to the head of the versions and a new draft is created.
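The document/version/draft model described above can be sketched as follows; the `createDocument`, `patch`, and `publish` names are hypothetical:

```javascript
// Immutable versions chained by reference, a mutable draft holding
// pending transforms, and publish() freezing the draft into a new version.
const createDocument = (id) => ({ id, versions: [], draft: { transforms: [] } })

const patch = (doc, transform) => {
  doc.draft.transforms.push(transform)
  // A real service would also re-emit the transform to web-socket clients here.
}

const publish = (doc) => {
  const head = doc.versions[doc.versions.length - 1] ?? null
  const version = Object.freeze({
    previous: head, // each version references the one before it
    transforms: Object.freeze([...doc.draft.transforms])
  })
  doc.versions.push(version)
  doc.draft = { transforms: [] } // a fresh draft starts from the new head
  return version
}
```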


The read APIs for an autosave service are oriented around requesting a document’s head, its draft, or a specific version and its transforms.


The write APIs for an autosave service cover creating new documents and patching autosave transforms onto a draft.


The strength of the rhizome system is in embracing heterogeneity and the loosening of control. It allows for many agents within a system to control their own domains, it can grow, shrink, spread, and incorporate new systems and ideas easily. It creates a framework for maximum interoperability and minimum lock-in to structure, shape, and past decisions.


The weakness of the rhizome system is in its chaotic complexity. Understanding the whole system becomes difficult, and reasoning through complex third order consequences may be challenging.


Methods of addressing the weaknesses in the rhizome system hinge on re-imagined ways of working that are individually smaller and individually simpler.


Alongside working small, the rhizome should be thought large. Models of the system should exist within the system, and allow for testing and understanding changes that can happen to the system. The rhizome requires maintenance and thought at every step along the specificity gradient, not just at the code.


The assemblage is a body without organs, it builds upon itself through density and multiplicity.


Individual editors of individual publications don’t want or need to be constrained by


The purpose of this system is twofold:

  1. Track and expose complex relationships between items in order to facilitate queries.
  2. Reduce reification of canonical data across systems to allow for abstracted re-use of items across sites, verticals, and content management systems.

For example, The Verge has a new format of short entries called Quickposts. These often link out to the wider internet, and have some additional contextual information. Staff at The Verge want to know if any previous Quickpost has linked to any given URL in order to avoid duplicating content. A variety of other properties have the same question when it comes to the use of images to illustrate stories. A common question is “how many other stories have used this image”.

As an example of reducing reification, our Agora system has a collection of Products that many different stories can reference. Ideally, this data lives in a single record in Agora, and is referenced whenever it is needed by any story inside any Content Management System. Another example of data reification today is with Authors. The data for any given Author is reified across all the stories that author has contributed to, making updates to that data large, slow affairs. Additionally, an Author in Anthem cannot be reused in either Pinnacle or Clay.


The Rhizome consists of a large number of separate services that must interact with each other. These include: