Understanding event-sourcing using the Booster Framework
I have a confession to make. Before I joined The Agile Monkeys and the Booster project, I had no clue about event-sourcing. OK, that statement is not entirely accurate. I had been exposed to the concept before (the idea of appending events to a log and using them to reconstitute state) but not in a way related to information systems. Where had I encountered such a concept? Every time I logged in to my bank's website and looked at my account.
You see, in banking, it is vital to keep a record of every event (the most obvious being withdrawals and deposits) that occurred in an account; you don't expect to look at your bank statement and have it simply tell you you have a total of $0 in your account. You want to know how and when you earned or spent money so you can see how you got to $0. Imagine you open an account and start with $0; days later, your first paycheck is deposited and you have $100. You then go to the ATM and withdraw $50 for groceries; a few days later, you withdraw another $50 to pay back a friend who loaned you money. Your total balance after those transactions? $0.
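To make the arithmetic concrete, here is a minimal TypeScript sketch of that account. The type and function names (`LedgerEvent`, `balanceOf`) are made up for illustration; only the three transactions come from the example above.

```typescript
// Hypothetical ledger entry type; the names are illustrative only.
type LedgerEvent =
  | { type: 'Deposited'; amount: number }
  | { type: 'Withdrawn'; amount: number }

// The three transactions from the example, oldest first.
const ledger: LedgerEvent[] = [
  { type: 'Deposited', amount: 100 }, // first paycheck
  { type: 'Withdrawn', amount: 50 }, // groceries
  { type: 'Withdrawn', amount: 50 }, // paying back a friend
]

// Replays the events in order, starting from the opening balance,
// and returns the balance after the last event.
function balanceOf(events: LedgerEvent[], openingBalance = 0): number {
  return events.reduce(
    (balance, event) =>
      event.type === 'Deposited' ? balance + event.amount : balance - event.amount,
    openingBalance
  )
}

const finalBalance = balanceOf(ledger) // 0
const balanceAfterPaycheck = balanceOf(ledger.slice(0, 1)) // 100
```

Because every transaction is kept, the same log answers both "what is my balance now?" and "what was it after my paycheck arrived?".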
This method of record-keeping and reconstituting state has been around for centuries; it just wasn't called event-sourcing until recently. The bank account ledger is just one example, but there are many others in other areas, such as addendums in legal contracts.
Record-keeping and state in traditional software systems
Event-sourcing appears to be a natural way of modeling data in information systems. Yet most of us who learned to build information systems in software engineering school were only taught the CRUD (create, read, update, and delete) approach to storing data. ALVYS co-founder and CTO Leo Gorodinski even went so far as to say that someone could've come up with event-sourcing by paying no attention to the software engineering literature.
The traditional CRUD style of persisting only the current state has its historical reasons: there was a time when computer storage was expensive, so storing every event was unaffordable. But today, in the age of cloud computing, this is no longer a problem (Adam Dymitruk, creator of Event Modeling, has a great explanation of this on the Event Modeling blog).
An example of CRUD vs. event-sourcing
Let's compare the different approaches with a very simple example: an address book or a list of contacts.
Traditional CRUD approach
For most people, the most obvious way of storing these contacts would probably be a contacts table in a relational database that would look something like this:
Want to add a new contact? Just run an INSERT SQL query. Has your contact changed their address or phone number? Execute an UPDATE SQL query and update those fields. Want to know where a person with a specific ID lives? Execute a SELECT Address query for that ID. This works pretty well most of the time, but what if you need to know the previous addresses where someone lived and not their current address? All those UPDATE queries changed those fields in that ID's row, and the old data is lost forever. That means finding the previous addresses is not possible.
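The loss of history can be shown with a small in-memory stand-in for that table. This is only a sketch: the column names and the helper functions are assumptions, and a `Map` plays the role of the relational database.

```typescript
// Assumed shape of one row in the contacts table.
interface ContactRow {
  id: string
  firstName: string
  lastName: string
  address: string
  phoneNumber: string
}

// In-memory stand-in for the table, keyed by contact ID.
const contactsTable = new Map<string, ContactRow>()

// Rough equivalent of an INSERT.
function insertContact(row: ContactRow): void {
  contactsTable.set(row.id, { ...row })
}

// Rough equivalent of an UPDATE: the old address is overwritten in place.
function updateAddress(id: string, address: string): void {
  const row = contactsTable.get(id)
  if (!row) throw new Error(`No contact with ID ${id}`)
  contactsTable.set(id, { ...row, address })
}

// Rough equivalent of SELECT Address ... WHERE ID = id.
function selectAddress(id: string): string | undefined {
  return contactsTable.get(id)?.address
}

insertContact({
  id: '90125',
  firstName: 'John',
  lastName: 'Doe',
  address: '22 Acacia Avenue',
  phoneNumber: '634-5789',
})
updateAddress('90125', '2120 South Michigan Avenue, Chicago, IL')

// Only the latest address is left; nothing in the table remembers the old one.
const currentAddress = selectAddress('90125')
```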
Instead of updating a row on a database and storing the latest state, with event-sourcing, you would record everything that happens as an event. Let's say that instead of a relational database, we store the events as JSON objects in a NoSQL database, like MongoDB. (However! I am not saying you need to use a NoSQL database for event-sourcing. You can create an event-sourced system using a traditional, relational database, but the schema-less nature of NoSQL databases makes it easier to explain how events are stored.) Let's look at a sample event stream for one of the contacts represented as an array of JSON objects:
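A stream like this could be sketched in TypeScript as follows, using the values this article quotes for John Doe. The field names (`contactId`, `occurredOn`) and the two helper functions are assumptions made for illustration, and the creation event carries no date because the article does not give one.

```typescript
// Assumed event shapes; each plain object would serialize to one JSON document.
type ContactEvent =
  | { type: 'ContactCreated'; contactId: string; firstName: string; lastName: string; address: string; phoneNumber: string }
  | { type: 'PhoneNumberChanged'; contactId: string; phoneNumber: string; occurredOn: string }
  | { type: 'AddressChanged'; contactId: string; address: string; occurredOn: string }

// The event stream for contact "90125", oldest first.
const johnDoeEvents: ContactEvent[] = [
  { type: 'ContactCreated', contactId: '90125', firstName: 'John', lastName: 'Doe', address: '22 Acacia Avenue', phoneNumber: '634-5789' },
  { type: 'PhoneNumberChanged', contactId: '90125', phoneNumber: '867-5309', occurredOn: '2021-03-21' },
  { type: 'AddressChanged', contactId: '90125', address: '2120 South Michigan Avenue, Chicago, IL', occurredOn: '2022-01-01' },
]

// Every address the contact has had, oldest first.
function addressHistory(events: ContactEvent[]): string[] {
  const addresses: string[] = []
  for (const event of events) {
    if (event.type === 'ContactCreated' || event.type === 'AddressChanged') {
      addresses.push(event.address)
    }
  }
  return addresses
}

// The dates on which the phone number changed.
function phoneNumberChangeDates(events: ContactEvent[]): string[] {
  const dates: string[] = []
  for (const event of events) {
    if (event.type === 'PhoneNumberChanged') dates.push(event.occurredOn)
  }
  return dates
}

const addresses = addressHistory(johnDoeEvents)
const phoneChangeDates = phoneNumberChangeDates(johnDoeEvents)
```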
If we were to query this contact's phone and address, we would see that John Doe, with ID "90125", lives at 2120 South Michigan Avenue, Chicago, IL, and his phone number is 867–5309. But storing this stream of events lets us also easily find out that on March 21, 2021, John Doe changed his phone number from 634–5789 to 867–5309 and then changed his address from 22 Acacia Avenue to 2120 South Michigan Avenue on January 1, 2022. With the traditional CRUD approach that many developers (myself included) have been accustomed to, this information would've been lost.
For the example I just used, the business value of modeling the data around events and storing them is not evident; you pretty much only care about where your contact currently lives and their current phone number, but think of an e-commerce application. Imagine that every time a person modifies their shopping cart, an event is registered with the items that were added to or removed from the cart, and a final event is registered when the person proceeds to checkout. All these events happening between the customer adding the first item to their cart and proceeding to checkout provide a gold mine for analytics. Do people start removing items once they cross a certain price threshold? Which items do people start removing? Do people save items for later? All of these insights can then be translated to better business decisions; do we offer a discount once people cross that price threshold so they don't start removing items from the cart? Do we lower the prices of all the items on our site?
From a business perspective, this is one of the most evident examples of the advantages of modeling your system around events. Another great benefit of this approach is that it provides an audit log right out of the box; this benefit is of particular importance in financial applications. There are also advantages from a technical perspective in terms of scalability, resilience, and decoupling since different applications could tap into a shared event store and derive their own models (I recommend reading Martin Fowler's article on event-sourcing). Of course, like all things in life, nothing is perfect, and there are things you need to watch out for when implementing an event-sourced system, such as eventual consistency and versioning (these topics merit their own articles).
Using Booster to implement an event-sourced application
Since most developers are not accustomed to implementing event-sourced systems, it might seem technically challenging, especially from the infrastructure and security perspective, and not worth the effort for those who want to ship their product quickly and need a shorter time-to-market. Luckily, the Booster team has come up with an opinionated framework for coding event-sourced applications using TypeScript, removing many of the barriers to getting such an application up and running quickly. It doesn't just provide you with a good starting point to structure your code; it even generates the whole infrastructure in your cloud provider of choice, ready for production from minute 1.
Booster's cornerstones are event-sourcing, CQRS, and elements of domain-driven design. I'm not going to do a deep dive on this—that's what Booster's documentation is for (and I strongly recommend reading it). Here's the abridged version of what a developer needs to code when building an app with Booster:
- Commands: User actions to interact with the application (e.g., AddContact, UpdatePhoneNumber, UpdateAddress)
- Events: Things that happened (e.g., ContactCreated, PhoneNumberChanged, AddressChanged)
- Entities: Representation of a domain entity's state (e.g., Contact)
- Read Models: Cached data from the different entities, optimized for read operations
Let's look at the code for our sample address book, starting with the commands. Here's what the code for an AddContact command looks like:
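As a rough, framework-free sketch of the shape such a command takes (this is not the code from the repo), the class below mirrors the pieces described in this section: constructor fields for the request, and a `handle` function that validates and then registers an event. The `EventRegister` interface is a hypothetical stand-in for the object Booster hands to the handler, the field list and the validation rule are assumptions, and the `@Command` decorator is only indicated in a comment so the snippet runs without the framework installed.

```typescript
// Hypothetical stand-in for the register object a command handler receives.
// Only the part used in this sketch is modeled.
interface EventRegister {
  events(...events: object[]): void
}

// Plain class standing in for the ContactCreated event.
class ContactCreated {
  constructor(
    readonly id: string,
    readonly firstName: string,
    readonly lastName: string,
    readonly address: string,
    readonly phoneNumber: string
  ) {}
}

// In a Booster project this class would carry @Command({ authorize: 'all' }).
class AddContact {
  // The constructor fields are the fields of the request (assumed here).
  constructor(
    readonly id: string,
    readonly firstName: string,
    readonly lastName: string,
    readonly address: string,
    readonly phoneNumber: string
  ) {}

  // Validates the command, then registers the resulting event.
  static handle(command: AddContact, register: EventRegister): void {
    if (command.firstName.trim() === '' || command.lastName.trim() === '') {
      throw new Error('A contact needs a first and a last name')
    }
    register.events(
      new ContactCreated(command.id, command.firstName, command.lastName, command.address, command.phoneNumber)
    )
  }
}

// Usage with a register that simply records what it is given.
const registered: object[] = []
const recordingRegister: EventRegister = {
  events: (...events) => {
    registered.push(...events)
  },
}
AddContact.handle(new AddContact('90125', 'John', 'Doe', '22 Acacia Avenue', '634-5789'), recordingRegister)
```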
The @Command class decorator is used to define a command. The class constructor must have the fields that will be part of the request (a GraphQL mutation, as we'll see later) to submit the command. The authorize: 'all' property in the decorator is part of Booster's authorization mechanism. We're not going to talk about that in this tutorial; you can take a look at the documentation to learn more about that.
The command's handle function can do all the necessary logic and validation before registering an event; in this case, it's a new ContactCreated event. Let's look at the code for that event:
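A framework-free sketch of such an event might look like the class below. The property list is an assumption, and Booster's `@Event` decorator is left as a comment so the snippet stands on its own.

```typescript
// In a Booster project this class would carry the @Event decorator.
class ContactCreated {
  // The constructor properties define the structure of the event (assumed fields).
  constructor(
    readonly id: string,
    readonly firstName: string,
    readonly lastName: string,
    readonly address: string,
    readonly phoneNumber: string
  ) {}

  // Tells the framework which property identifies the affected entity.
  entityID(): string {
    return this.id
  }
}

const created = new ContactCreated('90125', 'John', 'Doe', '22 Acacia Avenue', '634-5789')
const affectedEntityId = created.entityID() // '90125'
```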
To define an event, a class decorator is also used; in this case, it's the @Event decorator. The code for events is pretty straightforward. The event's structure is defined by its properties in the constructor, and in the entityID function, we need to define which of the properties is the unique identifier for the domain entity affected by this event.
The code for the UpdatePhoneNumber and UpdateAddress commands, as well as for the PhoneNumberChanged and AddressChanged events, follows a similar pattern, so I'm going to skip it here, but you can take a look at it in the GitHub repo.
Now we need to define our domain entity. In the case of the address book, we're managing a bunch of contacts; these contacts all have details for people's first and last names, addresses, phone numbers, and unique IDs to distinguish contacts from each other. The events we've defined make up the source of truth for our system, and we'll use them to reconstitute the state for the different entities. Let's look at the code:
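The following is a plain TypeScript sketch of an entity with one reducer per event, not the code from the repo. The event fields and the `replay` helper are assumptions added so the example runs by itself; in Booster, the framework calls the reducers for you and the class carries the `@Entity` decorator.

```typescript
// Plain classes standing in for the three events (assumed fields).
class ContactCreated {
  constructor(
    readonly id: string,
    readonly firstName: string,
    readonly lastName: string,
    readonly address: string,
    readonly phoneNumber: string
  ) {}
}
class PhoneNumberChanged {
  constructor(readonly id: string, readonly phoneNumber: string) {}
}
class AddressChanged {
  constructor(readonly id: string, readonly address: string) {}
}
type ContactEvent = ContactCreated | PhoneNumberChanged | AddressChanged

// In a Booster project this class would carry the @Entity decorator.
class Contact {
  constructor(
    readonly id: string,
    readonly firstName: string,
    readonly lastName: string,
    readonly address: string,
    readonly phoneNumber: string
  ) {}

  // Each reducer takes an event (and the state so far) and returns the next state.
  static reduceContactCreated(event: ContactCreated): Contact {
    return new Contact(event.id, event.firstName, event.lastName, event.address, event.phoneNumber)
  }
  static reducePhoneNumberChanged(event: PhoneNumberChanged, currentContact: Contact): Contact {
    return new Contact(currentContact.id, currentContact.firstName, currentContact.lastName, currentContact.address, event.phoneNumber)
  }
  static reduceAddressChanged(event: AddressChanged, currentContact: Contact): Contact {
    return new Contact(currentContact.id, currentContact.firstName, currentContact.lastName, event.address, currentContact.phoneNumber)
  }
}

// Hypothetical helper: applies the reducers from the oldest event to the newest.
function replay(events: ContactEvent[]): Contact | undefined {
  let current: Contact | undefined
  for (const event of events) {
    if (event instanceof ContactCreated) {
      current = Contact.reduceContactCreated(event)
    } else if (!current) {
      throw new Error('A change event arrived before the contact was created')
    } else if (event instanceof PhoneNumberChanged) {
      current = Contact.reducePhoneNumberChanged(event, current)
    } else {
      current = Contact.reduceAddressChanged(event, current)
    }
  }
  return current
}

const johnDoe = replay([
  new ContactCreated('90125', 'John', 'Doe', '22 Acacia Avenue', '634-5789'),
  new PhoneNumberChanged('90125', '867-5309'),
  new AddressChanged('90125', '2120 South Michigan Avenue, Chicago, IL'),
])
```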
Just like commands and events, we annotate our class with a decorator, in this case @Entity, to define an entity, and in the constructor, we define the properties we want our Contact entity to have. Then, we need to define a reducer function for each event.
You see, the state is reconstituted by taking all the events in an event stream, going from oldest to newest, and applying a function to the state of the entity up to the point of that event and the event itself. For example, take the reducePhoneNumberChanged function, which reduces the PhoneNumberChanged event; it's going to return the same first name, last name, and address as the currentContact (i.e., the state of the entity up to that event) and change the phone number to the one registered in the event.
Now that we can successfully reconstitute the state of an entity, let's expose it as a read model for the world to view! To do this, we define a read model that projects an entity. Let's look at the code for the ContactReadModel:
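As a hedged, framework-free sketch (not the code from the repo), a read model that projects the Contact entity could look like this. The property list and the `projectContact` name are assumptions, and Booster's read model decorators (`@ReadModel` on the class, `@Projects` on the projection function) are only indicated in comments.

```typescript
// Minimal stand-in for the Contact entity (assumed fields).
class Contact {
  constructor(
    readonly id: string,
    readonly firstName: string,
    readonly lastName: string,
    readonly address: string,
    readonly phoneNumber: string
  ) {}
}

// In a Booster project this class would carry the @ReadModel decorator.
class ContactReadModel {
  // The constructor properties are what a query on this read model can return.
  constructor(
    readonly id: string,
    readonly firstName: string,
    readonly lastName: string,
    readonly address: string,
    readonly phoneNumber: string
  ) {}

  // In Booster, a @Projects decorator would link this function to the Contact entity.
  // It maps the entity's current state to the shape exposed to readers.
  static projectContact(entity: Contact): ContactReadModel {
    return new ContactReadModel(entity.id, entity.firstName, entity.lastName, entity.address, entity.phoneNumber)
  }
}

const contactView = ContactReadModel.projectContact(
  new Contact('90125', 'John', 'Doe', '2120 South Michigan Avenue, Chicago, IL', '867-5309')
)
```

Here the read model copies the entity one to one, but a projection is also the natural place to reshape or trim the data for readers.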
Once again, we use decorators, and in our read model class constructor, we define the structure or the properties we'll return to the user when they run a GraphQL query on this read model.
And that's it! The cool thing about Booster is that we can now deploy directly to our cloud provider of choice without having to think about setting up API endpoints, databases, etc., because in Booster, all that is inferred from the code. Once your app is deployed (more on that in the documentation), you can run GraphQL queries and mutations like the following:
1. Add a new contact:
2. Change that contact's phone number:
3. Query that contact's current info:
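The three operations could be sketched as the GraphQL documents below, wrapped in TypeScript strings. The shapes assume Booster's convention of naming each mutation after its command and each query after its read model; the input and result field names are assumptions, and the endpoint URL depends on your deployment, so it is left out.

```typescript
// 1. Add a new contact (assumed input fields).
const addContactMutation = `
  mutation {
    AddContact(input: {
      id: "90125"
      firstName: "John"
      lastName: "Doe"
      address: "22 Acacia Avenue"
      phoneNumber: "634-5789"
    })
  }`

// 2. Change that contact's phone number.
const updatePhoneNumberMutation = `
  mutation {
    UpdatePhoneNumber(input: { id: "90125", phoneNumber: "867-5309" })
  }`

// 3. Query that contact's current info.
const contactQuery = `
  query {
    ContactReadModel(id: "90125") {
      id
      firstName
      lastName
      address
      phoneNumber
    }
  }`

// Wraps a GraphQL document in the JSON body that GraphQL-over-HTTP servers expect.
function toRequestBody(operation: string): string {
  return JSON.stringify({ query: operation })
}

const requestBodies = [addContactMutation, updatePhoneNumberMutation, contactQuery].map(toRequestBody)
```

Each body can then be sent in an HTTP POST with a `Content-Type: application/json` header to the GraphQL endpoint that the deployment prints.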
You can check out the complete code on GitHub and try it out yourself!
Event-sourcing gives us a new way to model data in information systems that's different from the CRUD way of thinking many software developers are accustomed to. This new method offers benefits in terms of scalability, resiliency, decoupling, and access to information previously lost in systems that only handle the latest state.
Booster provides an easy way to harness the benefits of event-sourcing by way of an opinionated framework for developing applications with TypeScript and cloud infrastructure that's inferred from the code.
I encourage you all to try out Booster and to model your systems around events. Learn more by visiting Booster's website or GitHub repo, or join the conversation on the Booster Discord server!
I have a confession to make. Before I joined The Agile Monkeys and the Booster project, I had no clue about event-sourcing...