Turning the database inside-out (2015)
(martin.kleppmann.com)188 points by andection 4 days ago | 95 comments
mkleczek 2 days ago | root | parent | next |
I use the architecture you described as the go-to architecture for most so-called "business" applications (with PostgreSQL as the DBMS, though).
The missing pieces are:
- incremental materialized view maintenance
- (bi)temporal primary and foreign keys (upcoming in Pg18)
Lately I found out about DBSP and Feldera, which look very promising as they are based on sound theory: https://github.com/feldera/feldera?tab=readme-ov-file#-theor...
dagss 2 days ago | root | parent |
Great link, thanks!
What do you do in Postgres for consuming old events and smoothly transitioning to consuming new events? Is there anything usable like an event ID allocated at commit time? I talk about this in a sibling comment.
hans_castorp 2 days ago | root | parent | prev | next |
> perhaps it is somehow built into postgres?
Postgres has a built-in listen/notify mechanism. The problem is that it doesn't guarantee delivery: if no process is listening, notifications are lost.
Most solutions that need something like that use "logical decoding" these days. That's the built-in change data capture exposed as a public API as part of the logical replication.
dagss 2 days ago | root | parent |
Yes, listen/notify is something very different. We would often write new projections that consume events from years back up until today.
You want sequence numbers that indicate the event's position in a partitioned log.
Something like "int identity", except that the int is assigned during commit, so that you have a guarantee that if you see IDs 5 and 7, then 6 will never show up. Each consumer can then store a cursor of its progress through the table, and that cursor is safe against concurrent inserts.
I was hoping to do it using CDC, but Microsoft SQL has a minimum 1 minute delay on CDC which destroys any live data usecase. Perhaps postgres allows listening to the replication log with lower latency?
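To illustrate the cursor problem described above, here is a minimal in-memory sketch. The `Table` class and its methods are hypothetical and do not model any particular database's API. IDs are handed out at insert time, but rows only become visible at commit, so a consumer that stores "highest ID seen" as its cursor can skip a row that commits late.

```python
class Table:
    """Toy table: IDs are assigned at insert time, rows become visible at commit."""

    def __init__(self):
        self.next_id = 1
        self.pending = {}    # tx name -> (id, payload), not yet visible
        self.committed = {}  # id -> payload, visible to readers

    def insert(self, tx, payload):
        row_id = self.next_id
        self.next_id += 1
        self.pending[tx] = (row_id, payload)
        return row_id

    def commit(self, tx):
        row_id, payload = self.pending.pop(tx)
        self.committed[row_id] = payload

    def read_after(self, cursor):
        """Return visible rows with id > cursor, in id order."""
        return sorted((i, p) for i, p in self.committed.items() if i > cursor)


table = Table()
table.insert("tx1", "event A")   # gets id 1, still uncommitted
table.insert("tx2", "event B")   # gets id 2
table.commit("tx2")              # tx2 commits first

cursor = 0
seen = []
for row_id, payload in table.read_after(cursor):
    seen.append(payload)
    cursor = row_id              # cursor advances past id 1

table.commit("tx1")              # id 1 now appears behind the cursor
for row_id, payload in table.read_after(cursor):
    seen.append(payload)

missed = [p for p in table.committed.values() if p not in seen]
print(seen, missed)
```

A sequence number assigned at commit time would avoid this, because a row could never become visible with a number lower than one already handed to a reader.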
andyferris 2 days ago | root | parent | next |
> Perhaps postgres allows listening to the replication log with lower latency?
Yes, I think that's what the "logical decoding" referred to. Postgres can emit a "logical" version of the WAL (something with a stable spec written down so that other services can stream and decode it). My understanding was that "logical replication" was designed for low latency situations like creating read replicas.
I haven't heard of the logical log being preserved for "years back" but that's an interesting case...
dagss 2 days ago | root | parent |
That is OK, guess I would write a job to listen to the logical WAL and use it to do an update that writes an event sequence number.
valiant55 2 days ago | root | parent | prev | next |
I'm not aware of any one minute minimum delay for CDC. We currently are running an on prem SQL Server -> CDC -> Debezium -> Azure Event Hub -> Azure Function App back to on prem SQL Server and that has 5-10 second delay from source system transaction commit till update/insert.
williamdclt 2 days ago | root | parent | prev |
> Something like "int identity" except that the int is assigned during commit, so that you have guarantee that if you see IDs 5 and 7, then 6 will never show up
I don't think that's possible, nor is it something you should actually need.
If two transactions tx1 and tx2 are concurrent (let's say tx2 begins after tx1 and also finishes after tx1), then tx2 has done some work without access to tx1's data (as tx1 hadn't committed yet when tx2 began). So either:
- tx1's data is relevant to tx2, so tx2 needs to abort and retry. In which case the sequence number doesn't _need_ to be assigned at commit time, it can be assigned at any time and will be increasing monotonically between related transactions.
- tx1's data is irrelevant to tx2, in which case the ordering is irrelevant and you don't need to assign the sequence number as late as commit-time.
The "relevance" is what partition keys encode: if tx1 and tx2 are potentially conflicting, they should use the same partition key. It doesn't enforce that sequence numbers increase monotonically within a physical partition, but it enforces that they do for a given _partition key_ (which is what should matter, the key->partition assignment is arbitrary).
> Perhaps postgres allows listening to the replication log with lower latency?
Pretty sure it does, you can listen to the WAL which is as instant as it gets. We were doing that in a previous company: a process (debezium) would listen to the WAL for a specific "events" table and write to Kafka. The main downside is that the table isn't an outbox, it keeps growing despite the events having been pushed to Kafka.
dagss 2 days ago | root | parent |
You explain why I don't need them for a very different usecase than what I refer to.
My point is I want a new primitive -- a pub/sub sequence number -- to avoid having Kafka around at all.
What Kafka does is "only" to generate and store such a sequence number after all (it orders events on a partition, but the sequence number I talk about is the same thing, just in a different storage format). So you do need it in the setup you describe as well; you just let Kafka generate it instead of having it in SQL.
Assuming your workload is fine with a single DB, the only thing Kafka gives you is in fact assigning such a post-commit row sequence number (+API/libraries building on it).
This is the mechanism used to implement pub/sub: Every consumer tracks what sequence number it has read to (and Kafka guarantees that the sequence number is increasing).
That is what mssql-changefeed linked above is about: Assigning those event log sequence numbers in the DB instead. And not use any event brokers (or outboxes) at all.
For postgres I would likely then consume the WAL and write sequence numbers to another table based on those...
It may seem clunky but IMO installing and operating Kafka just to get those pub/sub sequence numbers assigned is even clunkier.
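A rough sketch of the mechanism described above: an ordered log plus one cursor per consumer. `EventLog` and `Consumer` are hypothetical names for illustration, not the API of Kafka or of mssql-changefeed, and everything lives in memory.

```python
class EventLog:
    """Toy append-only log; sequence numbers are assigned when an event is appended."""

    def __init__(self):
        self._events = []  # index i holds the event with sequence number i + 1

    def append(self, payload):
        self._events.append(payload)
        return len(self._events)  # sequence number of the new event

    def read_from(self, after_seq, limit=100):
        """Return (seq, payload) pairs with seq > after_seq."""
        batch = self._events[after_seq:after_seq + limit]
        return [(after_seq + 1 + i, p) for i, p in enumerate(batch)]


class Consumer:
    """Each consumer keeps its own cursor, so consumers are independent of each other."""

    def __init__(self, log):
        self.log = log
        self.cursor = 0
        self.handled = []

    def poll(self):
        for seq, payload in self.log.read_from(self.cursor):
            self.handled.append(payload)
            self.cursor = seq  # a real system would persist this with the side effects


log = EventLog()
for name in ["order-created", "order-paid"]:
    log.append(name)

fast = Consumer(log)
fast.poll()
log.append("order-shipped")
late = Consumer(log)   # a new projection can start from the beginning
late.poll()
fast.poll()
```

The only state a consumer needs is its cursor, which is why a trustworthy, gap-free sequence number is the key primitive.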
oa335 2 days ago | root | parent |
It sounds like the Log Sequence Number in Postgres is what you are looking for. If you subscribe to a Postgres publication via logical replication, each commit will be emitted with a monotonically increasing LSN.
EvanAnderson 2 days ago | root | parent | prev | next |
> The main issue is "listening to new data in a SQL table".
You may want to take a look at Service Broker[0]. It's the idiomatic messaging and queuing bit of SQL Server. It's a bit of an obscure feature and has a bit of a steep learning curve. If I were trying to implement what you're doing it would be the tool I'd reach for.
[0] https://learn.microsoft.com/en-us/sql/database-engine/config...
agumonkey 2 days ago | root | parent | prev | next |
I would love to know if other people in the industry (beside hickey/datomic) use the immutable log/stream + integrators. From my small experience in enterprise app: auditability and time travelling are always bolted on good old sql tables/snapshots after the fact and the pain is already baked in.
mike_hearn 2 days ago | root | parent | next |
Depends which industry. If you look at a lot of non-tech industry then they'll use a commercial DB with all those features in place already, rather than hacking up their own data layer. A few years ago I spent some time in the enterprise finance space, and learned some unfashionable tech you don't see talked about on Hacker News much. It left me with a new appreciation for what goes on there. A staggering amount of time spent in tech startups is spent on solving and resolving problems that you can buy off the shelf solutions for and have been able to for a long time.
After all, this talk is now 10 years old but appears to be describing features that have been around for much longer. Take your average bank - it will have a bunch of Oracle databases in it. Those already have every feature discussed in this thread and in the talk:
• Incremental materialized view maintenance (with automatic query rewrite to use it, so users don't have to know it exists).
• Exposing logical commit logs as an API, with tooling (e.g. GoldenGate, LogMiner, query change notifications).
• Time travelling SELECT (... AS OF).
• Lots of audit features.
• Integrated transactional and scalable MQ (no need for Kafka).
My experience was that faced with a data processing problem, enterprise devs will tend to just read the user guide for their corporation's database, or ask for advice from a graybeard who already did so. They go write some SQL or an Excel plugin or something old school, ship it, close the ticket, go home. Then a few years later you look at HN and find there's a whole startup trying to sell the same feature.
timacles 2 days ago | root | parent | next |
Who besides Oracle offers this stuff though?
Yeah Oracle has a bunch of nice features, it also costs a gajillion dollars that no one besides a large enterprise can afford.
zbentley 2 days ago | root | parent | next |
Debezium is popular in this space, though it does bring more tools into the CDC/CQRS stack: https://debezium.io/
mike_hearn 2 days ago | root | parent | prev | next |
I got curious about this and took a look at some pricing calculators. The results are pretty counter-intuitive.
Compared to PostgreSQL that you run yourself, it's expensive because PostgreSQL you run yourself costs nothing if you assume your time is free. But, how many people want to run it themselves? Especially as Postgres isn't much fun to admin (fiddling with vacuuming, setting up replication by hand, managing major version upgrades etc and you may not be able to scale this way).
So in reality a lot of companies and especially startups these days pay Amazon to run the database for them, and so then the cost question is how much more does it cost for a cloud hosted Oracle DB vs a cloud hosted Postgres DB?
Well, an 8 vCPU hosted RDS Postgres in AWS with 32 GB of memory and 100 GB of storage plus another 200 GB of backup storage - so one less powerful than a local DB on my laptop - costs $1,200/month in US East. That's expensive! AWS doesn't let you scale CPU and RAM independently, so I tried to pick something in the middle. For only 100 GB of data you probably don't need 4 physical cores.
So then I checked the OCI (Oracle Cloud) price calculator and specced out a similar database. I picked autonomous serverless (i.e. fully managed), transaction processing+mixed, autoscaling with 8 ECPUs and same amount of primary/backup storage. They don't let you spec RAM independently, I guess because it's a shared DB so RAM usage is transient and not a VM allocation. The cost came to ~$800/month - that's significantly cheaper than RDS Postgres despite that Oracle DBs have drastically more features. Many of which are optimizations that can reduce your database load anyway, so presumably you need more Postgres cores to match the equivalent performance if those features are used smartly (honestly I haven't ported an app between postgres and oracle so I don't have experience with this).
This is pretty surprising. I'd have expected an Oracle DB to cost more, not less. Auto-scaling is part of it (cost is double RDS if you turn that off), but then again, this is possible because Oracle has more multi-tenant and resource isolation features to begin with so it's reasonable to share a database server and overcommit CPU. With AWS it's a full VM so you have to stop the db server manually if you want to save money. Also OCI is a cheaper cloud than AWS as it has less brand recognition I guess. This feels a bit like Amazon is exploiting people's mental defaults. Lots of devs think AWS and Postgres are the only cloud+db combination that is reasonable to consider, and apparently they charge on that basis?
I haven't specced out what a hosted bare metal cluster would cost. You can't cluster Postgres in the same way anyway (multi-write master with full SQL, no sharding).
agumonkey 2 days ago | root | parent | prev |
time based snapshots are in datomic and also possible in other dbms (maybe via extensions)
for the rest i don't know
guskel 2 days ago | root | parent |
XTDB (inspired by datomic) also has bitemporal queries.
agumonkey 2 days ago | root | parent |
thanks I couldn't remember that name
agumonkey 2 days ago | root | parent | prev |
Wouldn't surprise me a bit. Thanks for the detailed comment.
bubbleRefuge 2 days ago | root | parent | prev | next |
In large-scale business integration platforms/apps, you have operational systems like SAP and Oracle Service Cloud that generate/stream raw or business events, which are published to message brokers in topics (orders, incidents, suppliers, logistics, etc). There the data is published, validated, transformed (filtered, routed, formatted, enriched, aggregated, etc) into other downstream topics which can be used to egress to other apps or enterprise data stores/data lakes. Data governance apps control who has access. Elasticsearch or Splunk is used for data lineage and debugging. You also have observability systems sandwiched in there as well.
082349872349872 2 days ago | root | parent | prev | next |
thoughts from 1992 (Gray+Reuter): https://news.ycombinator.com/item?id=42829878
agumonkey 2 days ago | root | parent |
very interesting, every generation foresees the same solutions somehow
reubenmorais 2 days ago | root | parent | prev |
LinkedIn supposedly: https://engineering.linkedin.com/distributed-systems/log-wha...
dakiol 2 days ago | root | parent | prev | next |
I always thought that the most flexible approach was:
- good old mutable relational tables
- a separate db to store immutable events (could be the same kind of db you use for business transactions or something fancy like big query)
I feel like mixing both into one has more disadvantages
dagss 2 days ago | root | parent |
This is of course the more common approach, I was aware that our approach is not usual which is why I posted.
You don't list the disadvantages so cannot respond to that. But I really like the code resulting from flipping it around. It just fits how I think and I now feel "unsafe" messing around with code that mutates state directly.
The programming style of those mutable relational tables is a bit like mutable objects in Java -- eventually people moved to more stateless code and immutable objects. The same shift doesn't have to happen in the storage layer, but it is what the OP (and I) argue for.
I really enjoy having the tools around to replay any object from history as part of the online backend. For instance, consider if there was a bug so that customer input was buggy in some timeframe. Instead of writing a job that looks through the history of each customer and tries to figure out if you should mutate the data, then roll out that ad hoc mutation while holding your breath -- you can add some rules when fetching the relevant events on lookup-time and change how they reflect the state; i.e. a change that is only read only and only changes deployed code and not data, and roll back by only rolling back the service not by reverting a data change.
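As a hedged illustration of the read-time fix described above: the event shape, the bug (amounts stored in cents during one week), and the `correct` rule are all made up for this sketch. The point is only that the stored events are never modified; the correction is code that can be deployed and rolled back.

```python
from datetime import datetime

# Hypothetical bug: events recorded inside this window stored amounts in cents.
BUG_START = datetime(2024, 3, 1)
BUG_END = datetime(2024, 3, 8)

events = [
    {"at": datetime(2024, 2, 20), "type": "deposit", "amount": 50},
    {"at": datetime(2024, 3, 3), "type": "deposit", "amount": 2000},  # should be 20
    {"at": datetime(2024, 3, 10), "type": "deposit", "amount": 30},
]


def correct(event):
    """Read-time rule: reinterpret events from the buggy window, leave storage untouched."""
    if BUG_START <= event["at"] < BUG_END:
        return {**event, "amount": event["amount"] // 100}
    return event


def balance(events, rules=()):
    """Fold events into a balance, passing each event through the rules first."""
    total = 0
    for event in events:
        for rule in rules:
            event = rule(event)
        total += event["amount"]
    return total


print(balance(events))             # projection as originally deployed
print(balance(events, [correct]))  # projection with the fix deployed
```

Rolling back means deploying the service without `correct` in the rule list; no data migration is involved either way.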
oa335 2 days ago | root | parent | prev | next |
> The main issue is "listening to new data in a SQL table". I wrote this code to achieve it in MSSQL (perhaps it is somehow built into postgres?): https://github.com/vippsas
Postgres uses “publications” for this purpose. Clients can subscribe to a publication that gets updates to a given table.
fzeindl 2 days ago | root | parent | prev |
I have implemented something similar (updating a projection in the background and serving the projection automatically via REST) and wrote a high-level article about it: https://www.fabianzeindl.com/posts/the-api-database-architec...
agentultra 2 days ago | prev | next |
This is also known as event sourcing [0] and is a common pattern used inside of databases, in git, in lots of popular software.
I don't generally recommend it for every application as the tooling is not as well integrated as it is in an RDBMS and the data model doesn't fit every use-case.
However, if you have a system that needs to know "when" something happened in an on-going process, it can be a very handy architecture... although with data-retention laws it can get tricky quickly (among other reasons).
arialdomartini 2 days ago | root | parent |
I have seen Git related to event sourcing before, but the argument is wrong.
Most of the VCSs before Git (RCS, CVS, SVN) used to store deltas and rebuild the state by reapplying them.
The very reason why Git took them by storm is exactly because, on the contrary, Git does not store deltas but snapshots. Each commit is not the collection of the changes that occurred but a complete snapshot of the whole project. Git is very efficient in reusing the blob objects to save space, but it’s still a whole snapshot. The changes that occurred are not stored; they are calculated on demand.
This is the very opposite of event sourcing, where the state is calculated and the changes (events) are stored.
Git is really the demonstration that for code versioning state sourcing is way more efficient than event sourcing.
jdkoeck 2 days ago | root | parent |
Git is still event sourced, it’s just there is only one kind of event (a commit), and its payload is the whole state ¯\_(ツ)_/¯
arialdomartini 2 days ago | root | parent |
Eh eh, this is an interesting point of view, but it’s really not like this.
Take the case of the event of “deleting a file”. There was an interesting discussion between Linus and the other developers when Git was initially being designed: some of them wanted to capture and track this event. Linus firmly rejected the whole idea of tracking events, providing very solid arguments.
mrkeen a day ago | root | parent |
jdkoeck is right.
Storing-events-and-calculating-state is the dual of storing-state-and-calculating-diffs. It's an implementation detail, and if it works well, who cares?
I personally use git as an event log.
When I commit something, I write out a description of what I changed, not the current state of the whole repository (even if that's what a git hash is supposed to represent).
When I pull or push, I send and receive only diffs, not the state of the multi-GB repo.
When I rebase, I grab the latest diffs that went into master and put them under my feature branch diffs.
When I run 'git show {commit}', I see a diff, not a state.
It doesn't matter which way it is in any case: the evergreen debate about fast-mutations vs slow-immutations is that it's easier and faster to just mutate someone's bank balance in-place, rather than the slow way of appending transactions, or just quickly delete a bad transaction rather than slowly append a refund event to the log.
Taking that debate over to git land, "it would be faster" to just mutate the state of the codebase in some single, central location rather than the "slow way of sending commits" back and forth - regardless of whether those commits are technically states or diffs.
joshlemer 2 days ago | prev | next |
The thing I always get stuck on with these techniques is, how do you handle transactions which perform validations/enforce invariants on data when you’re just writing writes to a log and computing materialized views down the line? How can you do essentially, an “add item to shopping cart” if for example, users can only have max 10 items and so you need to validate that there aren’t already 10 items in the cart?
adamcharnock 2 days ago | root | parent | next |
This all sounds to me very close to the event-sourcing/CQRS/DDD area of thinking. In which case you look at it in two parts:
- Event firing: Here is where you fire an event saying that the thing has happened (i.e. item_added_to_cart, not add_item_to_cart). Crucially, this event states the thing has happened. This isn't a request, it is a past-tense statement of fact, which is oddly important. It is therefore at this point where you must do the validation.
- Event handling: Here you receive information about an event that has already happened. You don't get to argue about it, it has happened. So you either have to handle it, or accept you have an incomplete view of reality. So perhaps you have to accept that the cart can have more than 10 items in some circumstances, in which case you prompt the user to correct the problem before checking out.
In fact, this is typically how it goes with this kind of eventual consistency. First fire the event that is as valid as possible. Then, when handling an 'invalid' event, accept that it has just got to happen (the cart has 11 items now), then prompt the user (if there is one) to fix it.
Not sure how helpful this is here, but thought it a useful perspective.
williamdclt 2 days ago | root | parent | prev | next |
You're not "just" writing to a log. You always need a view of the state of the world to take decisions that enforce invariants (in this case, the "world" is the cart, the "decision" is whether an item can be added).
What you'd do is, when you receive the "addToCart" command, construct the current state of the cart by reading the log stream (`reduce` it into an in-memory object), which has enough data to decide what to do with the command (eg throw some sort of validation exception). Plus some concurrency control to make sure you don't add multiple items concurrently.
For reading data, you could just read the log stream to construct the projection (which doesn't need to be the same structure as the object you use for writes) in-memory, it's a completely reasonable thing to do.
So at the core, the only thing you persist is a log stream and every write/read model only exists in-memory. Anything else is an optimization.
DDD calls this "view of the world" an "aggregate". Reading the log stream and constructing your aggregate from it is usually fast (log streams shouldn't be very long), if it's not fast enough there's caching techniques (aka snapshots).
Similarly, if reducing the log stream into read models is too slow, you can cache these read models (updated asynchronously as new events are written), this is just an optimization. This comes at the cost of eventual consistency though.
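A minimal sketch of the approach above, assuming a made-up event shape and an in-memory list as the log stream. Concurrency control is deliberately left out here, so this only shows the "reduce, then decide" part.

```python
from functools import reduce

MAX_ITEMS = 10  # the limit from the question above


class CartLimitReached(Exception):
    pass


def apply(cart, event):
    """Fold one event into the in-memory cart state (a tuple of item names)."""
    if event["type"] == "item_added":
        return cart + (event["item"],)
    if event["type"] == "item_removed":
        items = list(cart)
        items.remove(event["item"])
        return tuple(items)
    return cart


def load_cart(stream):
    """Rebuild the current cart by reducing the whole log stream."""
    return reduce(apply, stream, ())


def add_to_cart(stream, item):
    """Command handler: rebuild state, check the invariant, then append the event."""
    cart = load_cart(stream)
    if len(cart) >= MAX_ITEMS:
        raise CartLimitReached(f"cart already has {len(cart)} items")
    stream.append({"type": "item_added", "item": item})


stream = []
for n in range(10):
    add_to_cart(stream, f"item-{n}")

try:
    add_to_cart(stream, "item-10")
    rejected = False
except CartLimitReached:
    rejected = True
```

The rejected command never reaches the log, so every reader that replays the stream sees a cart that respects the limit.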
mrkeen 2 days ago | root | parent | prev | next |
This kind of thing only works if your whole universe belongs to your database.
Transactions and conditional-updates work smoothly if it's your customer browsing your shop in your database - up to a point.
But I usually end up with partner integrations where those techniques don't work. For instance, partners will just tell you true facts about what happened - a customer quit, or a product was removed from the catalogue. Your system can't reject these just because your Db can't or won't accept that state change.
nejsjsjsbsb 2 days ago | root | parent | prev | next |
You don't use it for that sort of thing.
But if you did you'd need an aggregatable (commutative) rule.
Like you can't aggregate P99 metrics. (To see why, it is similar to why you can't aggregate P50. You can't because a median of a bunch of medians is not the total median)
So you measure number of requests of latency < 100ms and number of requests. Both of these aggregate nicely. Divide one by the other. Now you get Pxx for 100ms. So if your P99 target was 100ms you set your 100ms target to 99%.
Anyway you'd need something like this for your shopping cart. It is probably doable as a top 10 (and anything else gets abandoned). Top 10 is aggregatable. You just need an order. Could be added to cart time or price.
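A small numeric sketch of the metrics point above, with made-up latencies for two hypothetical servers and an arbitrary 60 ms threshold. It shows that a median of medians differs from the true median, while "count under threshold" and "total count" merge by plain addition.

```python
from statistics import median

# Latencies (ms) observed on two hypothetical servers.
server_a = [10, 20, 30]
server_b = [40, 50, 60, 70, 80, 90, 100]

true_median = median(server_a + server_b)
median_of_medians = median([median(server_a), median(server_b)])


def summarize(latencies, threshold_ms=60):
    """Per-server summary that can be merged by simple addition."""
    return {
        "fast": sum(1 for x in latencies if x < threshold_ms),
        "total": len(latencies),
    }


def merge(a, b):
    return {"fast": a["fast"] + b["fast"], "total": a["total"] + b["total"]}


combined = merge(summarize(server_a), summarize(server_b))
fraction_fast = combined["fast"] / combined["total"]

print(true_median, median_of_medians, fraction_fast)
```

Because `merge` is just addition, the summaries can be combined in any order and grouping, which is the property the comment asks of a shopping-cart rule.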
anonzzzies 2 days ago | root | parent | prev | next |
I was thinking about this as well as these streaming things become more popular. Would you write add-to-cart events and those trigger add-cart events; the latter containing a valid field which will become false after the 10th add-cart. So after that you remove-from-cart which triggers add-cart which then becomes valid again < 11 items? And transactions similarly roll back by running the inverse of what happened after the transaction started. I'm just thinking out loud. I understand you probably wouldn't use this for that, but let's have some fun shall we?
dagss 2 days ago | root | parent | prev | next |
All systems like this that I have worked on have some concept of a version number for each entity / aggregate.
So you get that the account_balance was 100$ on version 10 of the account, and write an event that deducts 10$ on version 11.
If another writer did the same at exact same time, they would write an event to deduct 100$ at version 11. There will be a conflict and only one version will win.
This is exactly like any optimistic concurrency control, even without events as the primary storage.
Didn't check if the system linked to supports this, I guess it might not? But this primitive seems quite crucial to me.
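The version-number scheme above can be sketched as follows. `EventStore` is a hypothetical in-memory stand-in; a database would typically get the same effect from a uniqueness constraint on (entity, version), which is an assumption of this sketch rather than something stated in the thread.

```python
class VersionConflict(Exception):
    pass


class EventStore:
    """Toy store: one event list per entity; the version is the number of events so far."""

    def __init__(self):
        self.streams = {}

    def load(self, entity_id):
        events = self.streams.get(entity_id, [])
        return list(events), len(events)

    def append(self, entity_id, event, expected_version):
        events = self.streams.setdefault(entity_id, [])
        if len(events) != expected_version:
            raise VersionConflict(
                f"expected version {expected_version}, found {len(events)}"
            )
        events.append(event)


store = EventStore()
store.append("acct-1", {"type": "deposited", "amount": 100}, expected_version=0)

# Two writers read the same version and both try to write the next one.
_, version_seen_by_a = store.load("acct-1")
_, version_seen_by_b = store.load("acct-1")

store.append("acct-1", {"type": "withdrawn", "amount": 10}, version_seen_by_a)
try:
    store.append("acct-1", {"type": "withdrawn", "amount": 100}, version_seen_by_b)
    conflict = False
except VersionConflict:
    conflict = True  # writer B must reload, re-check the balance, and retry
```

On retry, writer B would rebuild the balance (now 90) and find that deducting 100 is no longer allowed, which is how the invariant survives concurrent writers.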
swiftcoder 2 days ago | root | parent | prev | next |
I assume that shopping cart limit is a made-up example, but I'm curious what preconditions are you actually enforcing in the real world via DB transaction rollback?
eyads 2 days ago | root | parent | prev | next |
TLDR; You evaluate your preconditions and invariants before the event is published, using the current state of the aggregate.
Here's what that looks like in a DDD world:
* An aggregate is responsible for encapsulating business rules and emitting events
* An aggregate is responsible for maintaining the validity of its own state (ensuring invariants are valid)
* When a command/request is received, the aggregate first rehydrates its current state by replaying all previous events
* The aggregate then validates the command against its business rules using the current state
* Only if validation passes does the aggregate emit the new event
* If validation fails, the command is rejected (e.g., throws CartMaxLimitReached error)
Example flow: Command "AddItemToCart" arrives
>> System loads CartAggregate by replaying all its events
>> CartAggregate checks its invariants (current items count < 10)
>> If valid: emits "ItemAddedToCart" event. If invalid: throws CartMaxLimitReached error
cwalv 2 days ago | root | parent | prev |
You write the 'add item' event regardless, and when building the 'cart' view you handle the limit.
philbo 2 days ago | root | parent | next |
Alternatively "invalid cart" could itself become an event.
dagss 2 days ago | root | parent | prev | next |
Well, assume the non-overdraftable bank account example instead then, what do you do then?
RedShift1 2 days ago | root | parent | prev |
Sounds like an easy way to run out of storage space
dang 2 days ago | prev | next |
Turning the database inside out (2014) [video] - https://news.ycombinator.com/item?id=41664271 - Sept 2024 (1 comment)
Turning the database inside-out with Apache Samza (2015) - https://news.ycombinator.com/item?id=13581096 - Feb 2017 (30 comments)
Turning the database inside-out with Apache Samza - https://news.ycombinator.com/item?id=9145197 - March 2015 (64 comments)
sriku 2 days ago | prev | next |
https://materialize.com/ provides another approach, based on "timely dataflow" (https://timelydataflow.github.io/timely-dataflow/) - originated at MS.
Joker_vD 2 days ago | prev | next |
> Databases are global, shared, mutable state. [...] However, most self-respecting developers have got rid of mutable global variables in their code long ago. So why do we tolerate databases as they are?
Because the world itself is a global, shared, mutable state (which, incidentally, is also a single source of truth) and databases were invented to mirror it (well, relevant parts of it) 1-to-1, or close to it. This style of "we use the database as a proxy of the physical world itself" is still pretty common, see e.g. the example with the shopping cart somewhere else in these comments.
reubenmorais 2 days ago | root | parent | next |
Seeing the world as mutable is a matter of perspective, if you explicitly model time as a dimension it can instead be seen as a sequence of transitions from immutable state to immutable state, an accumulation of events over time, which fits the log abstraction perfectly.
Swizec 2 days ago | root | parent | next |
> if you explicitly model time as a dimension it can instead be seen as a sequence of transitions from immutable state to immutable state, an accumulation of events over time, which fits the log abstraction perfectly
I worked with a feature that used this approach once. It even made sense for the feature (an immutable history log of patient chart data). It was absolute hell to work with. Querying current state, which was 99% of the usecases, was cumbersome and extremely slow.
Turns out doctors rarely care about any of that immutable history. They just wanna know what’s up right now. In their ideal world, you’d re-answer all the same questions 30 seconds before walking in the door.
Turns out a combination mutable table of current state + derived immutable log/snapshot/audit table works much better for most things.
mrkeen 2 days ago | root | parent | next |
> I worked with a feature that used this approach once.
You work with many features that use this approach.
Izkata 2 days ago | root | parent |
From this:
> was cumbersome and extremely slow.
I'm pretty sure those sit somewhere in between the two sides GP was describing: All of your links have "current state" as the interface and changes are logged as they're applied. The system they worked with apparently has the log as a first-class citizen and no "current state" interface, while what they wanted was a "current state" interface with snapshots.
mrkeen a day ago | root | parent |
Understood, but when we look at all the databases and source-control products, we're standing on the customer side. We get to experience stuff that just works, because the people on the development side opted for write-AHEAD-log and/or copy-on-write, as opposed to write-afterward-log or write-but-store-backups-in-case-something-goes-wrong.
When we're on the developer side of things, we get to choose whether to CRUD or to go eventy. If we want to build systems as good as those DBs, we should do what they did (events) rather than what they say (insert/update/delete). It's unfortunate that there's not much upstream tech to help us with this (it's pretty much just kafka) and that people who choose to use kafka get accused of 'resume-driven-development', etc.
>> Querying current state, which was 99% of the usecases, was cumbersome and extremely slow.
It's an easy fix if the events are there. Just cache the current state. It sounds glib but it's one of the first principles of event-sourcing that makes me choose it over CRUD. If your events are immutable, they can be shared. If they can be shared, they can be folded into a fast-current-state-view. If they are instead mutable (or non-existent), you can't share them unless you also have a plan on how to sync them (which no one gets around to and which probably runs into some classical impossibility result like two-generals).
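A sketch of what "just cache the current state" could look like, using the patient-chart example from upthread with invented data. `CurrentStateView` is a hypothetical name; it folds only events it has not seen yet, so reads never replay history.

```python
class CurrentStateView:
    """Caches the folded state and remembers how far into the log it has read."""

    def __init__(self):
        self.state = {}     # patient id -> latest recorded value
        self.position = 0   # number of log entries already folded in

    def catch_up(self, log):
        for event in log[self.position:]:
            self.state[event["patient"]] = event["value"]
        self.position = len(log)

    def get(self, patient):
        return self.state.get(patient)  # no replay needed on the read path


log = [
    {"patient": "p1", "value": "BP 120/80"},
    {"patient": "p2", "value": "BP 140/90"},
    {"patient": "p1", "value": "BP 118/76"},
]

view = CurrentStateView()
view.catch_up(log)
log.append({"patient": "p2", "value": "BP 135/85"})
view.catch_up(log)  # only the one new event is applied
```

The view is disposable: if its logic changes, it can be dropped and rebuilt from the immutable log.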
robertlagrant 2 days ago | root | parent | prev |
You can use snapshotting to increase performance of the sequence of transitions approach.
crabbone 2 days ago | root | parent | prev | next |
> perfectly
That's a joke, right? If you tried to live in this world, as a human being, not in some abstract database-building sense, you'd be completely lost in the first second of your existence, and would probably die in minutes because your body parts would "forget" how to function properly.
We make sense of the world because we have "object permanence", which requires the concept of larger things made up of components, with multiple possible configurations. This object identity is what allows to understand that you can do things like "type on a keyboard" (a keypress that changes the physical configuration of the keyboard doesn't destroy it, it's still a keyboard, just slightly different, but functionally equivalent to the one you had a moment ago).
If all you can do is transitions, you don't know upon transitioning if you still have a keyboard you are typing on or not. This also means that if you want to be able to function somehow in this world, then after each transition, you'd have to reassess all properties of all interesting aspects of the world just to make sure they are still there.
Only in a very restricted number of cases can you use an append-only log as a useful model. And you'd be bending over backwards with this approach when modeling trivial things, like a realtor business or an online book store.
reubenmorais 2 days ago | root | parent |
> If all you can do is transitions, you don't know upon transitioning if you still have a keyboard you are typing on or not. This also means that if you want to be able to function somehow in this world, then after each transition, you'd have to reassess all properties of all interesting aspects of the world just to make sure they are still there.
You're assuming that at every timestep all transitions are equally valid or equally likely. That's not a given, you can and of course should carefully model which transitions your system allows or not. In real life this model is Schrödinger's equation, as best as we know it. In your information system it can be whatever you design it to be.
crabbone 2 days ago | root | parent |
If you want to build a stateless system with transitions, you are limited to regular expressions (finite automata).
I don't think you'd like to write real-life applications using a regular language. It might be interesting as an experiment, but life without recursion or potentially infinite loops sounds bleak... Just imagine the struggle of creating some generic containers like trees / hash-tables.
I don't think it's impossible, and it may even be interesting to see how far one can take such a language to make it useful, but enjoyable this is not...
hahn-kev 2 days ago | root | parent | prev |
Except lots of applications need to be able to forget things, similar to how things can be destroyed in the real world
fnordsensei 2 days ago | root | parent |
Forgetting what you had for lunch last Friday is not necessarily the same as changing what you had for lunch last Friday.
You could, for example, throw away the encryption key for that fact. What you had for lunch is now inaccessible, but the fact remains unchanged.
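A toy sketch of that idea, assuming a separate mutable key store next to the append-only log. The XOR one-time pad here is only a stand-in so the example runs on the standard library; a real system would use a proper authenticated cipher. `record`, `read` and `forget` are hypothetical names.

```python
import secrets

log = []   # append-only: entries are never modified or removed
keys = {}  # mutable key store, kept apart from the log


def _xor(data: bytes, key: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, key))


def record(fact_id: str, text: str) -> None:
    """Encrypt a fact with its own random key and append it to the log."""
    data = text.encode("utf-8")
    key = secrets.token_bytes(len(data))  # one key per fact
    keys[fact_id] = key
    log.append((fact_id, _xor(data, key)))


def read(fact_id: str):
    """Return the fact, or None if it is unknown or its key was discarded."""
    key = keys.get(fact_id)
    if key is None:
        return None
    for entry_id, ciphertext in log:
        if entry_id == fact_id:
            return _xor(ciphertext, key).decode("utf-8")
    return None


def forget(fact_id: str) -> None:
    """'Forget' a fact by discarding its key; the log entry stays as it was."""
    keys.pop(fact_id, None)


record("lunch-last-friday", "pizza")
```

After `forget("lunch-last-friday")` the log is byte-for-byte unchanged, but `read` can no longer recover the fact.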
mwgalloway 2 days ago | root | parent | prev | next |
The "world" can just as easily be conceived of as being composed of discrete immutable facts rather than global mutable state. Either way, I kind of don't think that unfalsifiable ontological premises should be an important factor in system design.
Joker_vD 2 days ago | root | parent |
While the world can tentatively be conceived to be like that, it is not nearly "just as easily". Lots of those "immutable facts" can't be realistically discovered: e.g. good luck recovering "this blackboard had a drawing of a cat and a bird until five minutes ago when it was cleared with a wet sponge" if you weren't there in time. The approach with mutable state forces this destructible nature of many things upon you and makes you cope with it, somehow.
mwgalloway 2 days ago | root | parent |
Ok, but why should I even care about what was on this hypothetical blackboard? Do you have any real world examples with business or technical significance where this "cope with it, somehow" approach to mutability is a clear win?
draw_down 2 days ago | root | parent | prev |
Just get rid of side effects, man! (Turns out side effects are the whole reason we write code, whoops)
crabbone 2 days ago | prev | next |
> So why do we tolerate databases as they are?
Because they reflect the way we understand the world? We understand that things are made of smaller things, and that sometimes the smaller things making up larger things may change, while the larger thing stays the same? The idea that as soon as one component changes the whole thing needs to be discarded and rebuilt from the ground up is insane and creates a lot of problems. It's absolutely not worth it to try to redefine the way we deal with the world to get the benefits of stateless code.
Making a database stateless is making it worthless. The world has a state, and if you want a useful program, it needs to accept this "unfortunate" aspect of the world. The alternative is the world where as soon as you finish drinking your coffee, your cup, your table, your kitchen, your credit card history, your grandparents and all planets in the solar system disappear, and have to be built fresh. But you wouldn't know about it, because your memory of how the world used to be would disappear too.
esafak 2 days ago | prev | next |
swyx 2 days ago | prev | next |
Xenoamorphous 2 days ago | prev | next |
Does anyone have some resources where a real, practical example is implemented? Because I can only find fairly theoretical resources but not real world examples.
Say, like how some simple CMS would work with a datastore like this. What does the event to update the headline of an article look like? How are integrity constraints enforced, e.g. an article can't reference an author that doesn't exist? Things like that.
briankelly 2 days ago | root | parent |
Double-entry bookkeeping is similar enough, and generally I think event sourcing is more common for fraud detection, log analysis for security, etc.: use cases where how the application got into its state is as important as the state.
belter 2 days ago | prev | next |
This is Aurora...
mrkeen 2 days ago | root | parent |
This is the technique that developers used to build Aurora, not Aurora the end-product.
Customers writing code against Aurora are still doing plain ol' destructive CRUD mutations "now".
Event-sourcing is write-ahead-logging is CQRS is journaling-file-systems is Git-reflog is persistent-data-structures is copy-on-write. It's all good stuff and is decades old.
belter 2 days ago | root | parent |
> This is the technique that developers used to build Aurora, not Aurora the end-product.
It's what I meant. But I am behaving like an LLM and economizing on tokens... :-)
chikere232 2 days ago | prev | next |
Isn't this essentially how a modern transactional database works anyway? All mutations end up in the Write Ahead Log (WAL) and you can replicate or back up that to be able to recover the state at a point in time?
mrkeen 2 days ago | root | parent | next |
Two points:
* Technically the data is probably there, but I really don't think you want back-up ops invoked by your REST call to /getUserHistory/. Is it even possible to mix old data and new data within the same SQL expression?
* The DB is still a god object at the centre of your system. It doesn't give you consistency across partner systems and end users. If a partner sends the event CustomerBanned(2025-02-04, 1234) and you try to translate it into CRUD with 'UPDATE Customer SET Banned=True WHERE id=1234' it could fail (or worse - be rejected by an invariant for "data integrity" reasons) and then it's gone. If you just blindly write the event, then you always know that fact about customer 1234 in any future query.
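A small sketch of the second point, using SQLite from the Python standard library. The `event` table and the `record` / `is_banned` helpers are assumptions for illustration. The event is appended as-is, and "is this customer banned?" is answered from the log at query time, so nothing depends on a `Customer` row existing when the event arrives.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE event (
           seq     INTEGER PRIMARY KEY AUTOINCREMENT,
           kind    TEXT NOT NULL,
           payload TEXT NOT NULL
       )"""
)


def record(kind: str, **payload) -> None:
    """Append the event unconditionally; no lookup against other tables."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO event (kind, payload) VALUES (?, ?)",
            (kind, json.dumps(payload)),
        )


def is_banned(customer_id: int) -> bool:
    """Derive the answer from the recorded facts instead of a mutable flag."""
    rows = conn.execute(
        "SELECT payload FROM event WHERE kind = 'CustomerBanned' ORDER BY seq"
    )
    return any(json.loads(p)["customer_id"] == customer_id for (p,) in rows)


# There is no Customer table at all here, yet the fact is still captured.
record("CustomerBanned", date="2025-02-04", customer_id=1234)
```

A linear scan is used only to keep the sketch short; in practice the answer would come from a projection maintained from these events.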
zbentley 2 days ago | root | parent |
> If you just blindly write the event, then you always know that fact about customer 1234 in any future query.
Unless the write times out or the DB is down for maintenance when the event arrives.
Sure, you could block acknowledgement of the event until the event log receives it, but can your DB handle synchronous write volume from however many people are out there sending events? If your RPC servers listening for events are unavailable, do you trust event senders to retry when they're back?
Down that road lies "let's put every event in a fast message bus with higher insert volume and availability than the database, and feed that into the DB asynchronously", hence Kafka and friends.
mrkeen a day ago | root | parent |
> Down that road lies "let's put every event in a fast message bus with higher insert volume and availability than the database, and feed that into the DB asynchronously", hence Kafka and friends.
Yes, this is the way. Which is why it's not
>> Isn't this essentially how a modern transactional database works anyway
and needed a comment.
isbvhodnvemrwvn 2 days ago | root | parent | prev | next |
One difficult-to-replicate thing is visibility rules and rollbacks: with Postgres you can abort and your changes are hidden. There are no such luxuries with this architecture unless you make it very complex with partial states, drafts or something similar.
timacles 2 days ago | root | parent | prev |
That's how the database works internally, that's not how the interface to the database works.
Apps don't know about the WAL, and the database has internal logic for processing the WAL. Things like transactions, rollback etc...
anacrolix 2 days ago | prev | next |
Isn't this what you get with Datomic?
fungiblecog 2 days ago | prev | next |
agumonkey 2 days ago | root | parent | next |
My first thought. And then I realized that the talk is from 2015, so I wonder if there was cross pollination between people at the time (datomic was released in 2012 according to wikipedia, so it's plausible it predates Martin's ideas but I don't know)
amelius 2 days ago | root | parent | prev |
The concept is even older than that.
guskel 2 days ago | root | parent |
Yeah, like 15th century old.
blacklion 2 days ago | root | parent |
More like 5th or 6th century BCE ;-)
kragen 2 days ago | prev | next |
It seems like event sourcing keeps gaining mindshare.
alecco 2 days ago | root | parent |
Event sourcing is a PITA, from experience.
Something like Datomic makes a lot more sense:
kragen 2 days ago | root | parent |
Thanks! Does the article accurately describe the PITA in your experience? Because it seems to say that it's separable from the core architectural principle of event sourcing.
alecco a day ago | root | parent |
In my experience most event-sourcing was implemented as storing versions of objects (it came from the OOP camp). All the consistency checks had to be done manually in imperative code across countless classes. Many large investment banks use it. And then all the actual DB stuff has to be exported to SQL or other actual database engines to be processed properly.
jackeyzhang 3 days ago | prev | next |
Awesome, very useful
mgaunard 2 days ago | prev |
Stopped reading at the word "Kafka".
chikere232 2 days ago | root | parent |
dagss 2 days ago | next |
We did this style on top of plain MSSQL. Each event would have a SQL table which is the primary storage. Then we have workers that listen for new data in the tables and update the projections we needed. (Sometimes DB triggers but mostly async workers.)
The main issue is "listening to new data in a SQL table". I wrote this code to achieve it in MSSQL (perhaps it is somehow built into postgres?): https://github.com/vippsas/mssql-changefeed
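A rough sketch of the worker side of this pattern, using SQLite so it runs anywhere. It is not how mssql-changefeed works; the table names and `poll_once` are made up. The worker stores a cursor (the last `seq` it processed) and updates the projection and the cursor in the same transaction. SQLite serialises writers, so `seq` order matches commit order here; on a server database with concurrent writers an ordinary identity column does not guarantee that, which is the problem described in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE headline_changed (      -- event table: primary storage
        seq        INTEGER PRIMARY KEY AUTOINCREMENT,
        article_id INTEGER NOT NULL,
        headline   TEXT NOT NULL
    );
    CREATE TABLE current_headline (      -- projection: derived, rebuildable
        article_id INTEGER PRIMARY KEY,
        headline   TEXT NOT NULL
    );
    CREATE TABLE projection_cursor (     -- consumer progress
        name     TEXT PRIMARY KEY,
        last_seq INTEGER NOT NULL
    );
    """
)


def poll_once(batch_size: int = 100) -> int:
    """Apply unseen events to the projection; return how many were applied."""
    with conn:  # projection and cursor are committed together
        row = conn.execute(
            "SELECT last_seq FROM projection_cursor WHERE name = 'current_headline'"
        ).fetchone()
        last_seq = row[0] if row else 0
        events = conn.execute(
            "SELECT seq, article_id, headline FROM headline_changed "
            "WHERE seq > ? ORDER BY seq LIMIT ?",
            (last_seq, batch_size),
        ).fetchall()
        for seq, article_id, headline in events:
            conn.execute(
                "INSERT OR REPLACE INTO current_headline (article_id, headline) "
                "VALUES (?, ?)",
                (article_id, headline),
            )
            last_seq = seq
        conn.execute(
            "INSERT OR REPLACE INTO projection_cursor (name, last_seq) "
            "VALUES ('current_headline', ?)",
            (last_seq,),
        )
    return len(events)
```

A real worker would call `poll_once` in a loop, or be woken by a notification, and a new projection can be built by starting a second cursor at zero.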
In my experience this approach is beautiful; as Martin says, our backend code is mostly stateless and functional these days, why have mutable objects in the DB? And the approach is extremely useful for dry-running business logic etc.
But we didn't like the prospect of adopting Kafka wholesale. Having all the data in a SQL DB is extremely convenient for debugging, and since we already used SQL it was a smaller change that was done first where it made most sense and then spread out.
It would be great with more DB features targeting this style. Explicit partition event tables (kafka-in-SQL), and writing a projection simply as a SQL query which is inverted into an async trigger by the DB would be awesome. (MSSQL has indexed views, but it cannot be done online...)
Materialize is the DB I know about in this territory.