Approaches to live collaborative editing
A team at a large company is tasked with preparing a proposal for a client by the next morning. The company really needs to land this client, and the proposal is the key to closing the deal.
Lots of staff are working on this proposal. The marketing lead updates the executive summary, the finance manager revises the pricing section, and the project manager adjusts the delivery timeline.
As the deadline approaches, the team rushes to assemble the final version. Each person adds their sections into the final document in a mad scramble.
Unfortunately, in the confusion, an outdated pricing table is sent to the client. This jeopardizes the proposal and forces the team into last-minute damage control with the client.
The worst part is that this could have been avoided with better tooling that truly supported collaborative editing.
So why live collaborative editing? #
When several people work on the same document, they have to pass it back and forth somehow. However that is done, it requires a manual push/pull model and some way to know which copy reflects the true state.
That’s clunky. A better way is to use something like Google Docs, where you and many other people can edit the same document at the same time.
Live shared editing is great, but how? #
This is the million-dollar question, and it’s not a simple problem to solve. After decades of work on it, though, two approaches have emerged: Operational Transformation (OT) and Conflict-Free Replicated Data Types (CRDTs).
Let’s look at those approaches.
Operational transformation (OT) #
The approach of operational transformation was first introduced in 1989 [1].
There are two concepts at work in OT:
- An operation, which is the basic action that modifies the document (like a text insert or delete)
- A transformation, which adjusts an operation (usually against another operation) to account for concurrent edits
Each person involved in the collaboration has their own copy of the document. A person makes a change operation, their changes are broadcast, transformations are applied, and the document is updated. OT ensures that all editors see a consistent document state at the end of this process.
A central server is not strictly required, but in practice one almost always coordinates the changes.
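To make the operation/transformation split concrete, here is a minimal sketch that handles only concurrent inserts. The `Insert` type, the `site` tie-breaker, and the `transform` function are illustrative assumptions, not the design of any particular OT system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int   # index in the document where the text goes
    text: str  # text to insert
    site: int  # hypothetical replica id, used only to break ties


def apply(doc: str, op: Insert) -> str:
    """Apply an insert operation to a document."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Adjust `op` so it can be applied after the concurrent `against`."""
    lands_before = against.pos < op.pos or (
        against.pos == op.pos and against.site < op.site
    )
    if lands_before:
        # `against` added text in front of `op`, so shift `op` right.
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op


base = "Hello world"
op_a = Insert(6, "splendid ", site=1)  # made on replica A
op_b = Insert(6, "amazing ", site=2)   # made concurrently on replica B

# Each replica applies its own edit first, then the transformed remote edit.
doc_a = apply(apply(base, op_a), transform(op_b, op_a))
doc_b = apply(apply(base, op_b), transform(op_a, op_b))
print(doc_a == doc_b, doc_a)
```

Both replicas end up with the same text even though they applied the edits in different orders. A real system also needs transforms for deletes and for every pairing of operation types, which is where most of the complexity comes from.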
Some clarification on transformations #
There are two categories of transformations.
- An inclusion transformation adjusts an operation to include the effects of another operation (for example, shifting an insert’s position to account for a concurrent edit earlier in the document)
- Similarly, an exclusion transformation adjusts an operation to exclude the effects of another operation (for example, restoring the position an insert would have had if an earlier delete had not happened)
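The two categories can be sketched for a single pairing, an insert adjusted against a delete. The `include` and `exclude` helpers and the operation types below are hypothetical names chosen for illustration, and the sketch ignores every other pairing of operations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int
    text: str


@dataclass(frozen=True)
class Delete:
    pos: int     # index of the first deleted character
    length: int  # number of characters removed


def include(ins: Insert, dele: Delete) -> Insert:
    """Inclusion: re-express `ins` for a document where `dele` has been applied."""
    if ins.pos <= dele.pos:
        return ins  # insert is before the deleted range
    if ins.pos >= dele.pos + dele.length:
        return Insert(ins.pos - dele.length, ins.text)  # shift left
    return Insert(dele.pos, ins.text)  # inside the range: collapse to its start


def exclude(ins: Insert, dele: Delete) -> Insert:
    """Exclusion: re-express `ins` as if `dele` had never been applied."""
    if ins.pos <= dele.pos:
        return ins
    return Insert(ins.pos + dele.length, ins.text)  # shift right


dele = Delete(6, 4)          # removes "big " from "Hello big world"
original = Insert(15, "!")   # appends "!" to the 15-character original
included = include(original, dele)
restored = exclude(included, dele)
print(included, restored)
```

In this example the exclusion undoes the inclusion exactly. That is not true in general: an insert that fell inside the deleted range loses its original offset once it is collapsed to the start of the range.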
Conflict-free replicated data types (CRDTs) #
CRDTs came about a bit more recently than OT, around 2011 [2]. These are data structures that can be used to hold replicas of data in a distributed system.
The goal of these structures is to provide strong eventual consistency, so that after all updates are propagated and processed, all replicas will match.
These rely on at least the following mathematical properties:
- Commutativity: changes can happen in any order and have the same effect
- Associativity: changes can be grouped in any fashion without changing the result
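As a small illustration of these two properties, set union behaves this way. The `merge` function below is only a stand-in chosen for the example, not a complete CRDT.

```python
def merge(a: frozenset, b: frozenset) -> frozenset:
    """Combine two sets of changes; set union is the stand-in merge here."""
    return a | b


x, y, z = frozenset({"a"}), frozenset({"b"}), frozenset({"c"})

# Commutativity: the order of the two arguments does not matter.
commutative = merge(x, y) == merge(y, x)

# Associativity: the grouping of successive merges does not matter.
associative = merge(merge(x, y), z) == merge(x, merge(y, z))

print(commutative, associative)
```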
There are two flavors of CRDTs.
State-based CRDTs propagate a replica’s entire state to other replicas. A merge function combines states, and it must be commutative, associative, and idempotent, so that receiving the same state more than once changes nothing.
On the other hand, operation-based CRDTs transmit operations instead of full states. The operations aren’t required to be idempotent as long as the other properties hold and each operation is delivered to every replica exactly once.
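A grow-only counter is a commonly used example of a state-based CRDT. The sketch below assumes each replica has a unique string id and keeps one count per replica; the function names are illustrative.

```python
def increment(state: dict, replica: str, amount: int = 1) -> dict:
    """Return a new state in which `replica` has counted `amount` more."""
    new_state = dict(state)
    new_state[replica] = new_state.get(replica, 0) + amount
    return new_state


def merge(a: dict, b: dict) -> dict:
    """Combine two full states by taking the per-replica maximum."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}


def value(state: dict) -> int:
    """The counter's value is the sum of all per-replica counts."""
    return sum(state.values())


state_a = increment(increment({}, "A"), "A")  # replica A counted twice
state_b = increment({}, "B")                  # replica B counted once

merged = merge(state_a, state_b)
print(value(merged))
```

Taking the maximum per replica is what makes the merge idempotent: merging `state_b` into `merged` a second time leaves it unchanged.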
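An operation-based counter shows why idempotence can be dropped only if something else prevents double application. In this sketch the `op_id` field and the `seen` set are assumptions standing in for a delivery layer that hands each operation to a replica exactly once.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Add:
    op_id: str   # hypothetical unique id assigned by the sending replica
    amount: int


@dataclass
class Counter:
    total: int = 0
    seen: set = field(default_factory=set)

    def apply(self, op: Add) -> None:
        # Addition is commutative but not idempotent, so a redelivered
        # operation must be dropped rather than applied a second time.
        if op.op_id in self.seen:
            return
        self.seen.add(op.op_id)
        self.total += op.amount


a1, b1 = Add("a1", 2), Add("b1", 5)

replica_1, replica_2 = Counter(), Counter()
for op in [a1, b1]:
    replica_1.apply(op)
for op in [b1, a1, b1]:  # different order, and b1 arrives twice
    replica_2.apply(op)

print(replica_1.total, replica_2.total)
```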
In practice, operation-based CRDTs are more common because transmitting increasingly large states at high scale is inefficient.
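For text, one way to get convergence is to give every inserted element a position key that never changes, so replicas only have to sort. The sketch below is a deliberately simplified position-identifier scheme working on whole words; the key layout `(position, site)` and the helper names are assumptions for illustration, not a production algorithm.

```python
from fractions import Fraction


def insert_between(left: Fraction, right: Fraction, site: str, text: str):
    """Build an insert operation keyed midway between two neighbours."""
    return ((left + right) / 2, site), text


def apply(replica: dict, op) -> dict:
    """Apply an operation by adding its keyed element to the replica."""
    key, text = op
    return {**replica, key: text}


def render(replica: dict) -> str:
    """The document is the elements sorted by (position, site)."""
    return "".join(text for _, text in sorted(replica.items()))


base = {
    (Fraction(1), "base"): "Hello ",
    (Fraction(2), "base"): "world",
}

# Both users insert between the same two words at the same time.
op1 = insert_between(Fraction(1), Fraction(2), "user1", "splendid ")
op2 = insert_between(Fraction(1), Fraction(2), "user2", "amazing ")

user1 = apply(apply(base, op1), op2)  # own edit first, then the remote one
user2 = apply(apply(base, op2), op1)

print(render(user1))
print(render(user2))
```

Because the two keys share a position, the site id breaks the tie the same way on both replicas, so the order in which the operations arrive does not matter.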
[Sequence diagram: User1 broadcasts op1 to User2 and User2 broadcasts op2 to User1, asynchronously. Each user applies the other’s operation using CRDT rules, with the position adjusted based on concurrent ops. Both converge to: "Hello splendid amazing world"]
Tradeoffs #
Server dependence #
OT typically requires a central server, whereas CRDT-based systems can work without one. Centralization can help with keeping things organized, whereas not needing a central server helps with scalability.
Convergence #
Both OT and CRDTs aim for eventual convergence: once every edit has been delivered and processed, all local copies match. OT can provide stronger semantic guarantees because of how its transformations are applied, so its merged results may line up more closely with what each editor intended.
Complexity #
Loosely speaking, OT needs more complex transformations but ends up with a simpler state. CRDTs, on the other hand, have simpler merges but require more complex data structures.
Where are things headed? #
Most collaboration systems have primarily used OT, especially because of its usefulness for rich text editing in products like Google Docs. The strengths and weaknesses of OT have been studied, so it’ll remain a known quantity.
CRDTs are increasingly used in offline-first apps. Because the approach is newer, we’re still learning how to make CRDTs more efficient and how to mitigate their weaknesses.
Over time, it seems like OT will remain a solid choice for rich text or schema-centric editing, while other apps will combine OT and CRDTs in a hybrid fashion, choosing between them as each situation requires.