Rails Gem for Collaborative Editing with ProseMirror

I’m working on a Rails gem to make collaborative editing with ProseMirror drop-in for Rails apps.

Currently, it is an extremely early prototype (notably, no automated tests yet), but I wanted to get feedback on the current implementation. I’ll be working to make it more stable so we can use the gem in production.

The gem uses Rails ActionCable to handle the communication (websockets) and runs ProseMirror in a NodeJS child process to apply steps server-side.

I use a modified version of the prosemirror-collab algorithm, basically:

  • Steps are grouped into “commits” before being sent to the server.
  • Rather than use a client id, the client tracks commits by attaching a random “ref”. This allows the client or server to merge the steps within a commit without affecting the client’s internal representation (related to this post). It also helps prevent misbehaving clients from causing trouble.
  • Due to the gem’s reliance on background processing for broadcasting commits & applying changes to documents, it’s possible a client may receive commits out of order. The client-side code handles this by queueing future commits until prior ones are received.
  • I throttle commits on the client with lodash’s throttle function so that the server is less likely to be overwhelmed by a single client. I will likely want to add some form of rate-limiting on the server side as well.
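To make the out-of-order handling concrete, here is a minimal sketch of a client-side queue that holds future commits until the missing ones arrive. It is not the gem's actual code: the `Commit` shape, the `CommitQueue` class, and the assumption that each commit carries a `version` that increases by exactly 1 are all made up for illustration.

```typescript
// Hypothetical shape of a commit as broadcast by the server.
interface Commit {
  version: number;  // document version this commit produces (assumed to increase by 1)
  ref: string;      // random ref chosen by the client that created the commit
  steps: unknown[]; // serialized ProseMirror steps (opaque in this sketch)
}

class CommitQueue {
  private pending = new Map<number, Commit>();
  private version: number;
  private readonly apply: (commit: Commit) => void;

  constructor(startVersion: number, apply: (commit: Commit) => void) {
    this.version = startVersion;
    this.apply = apply;
  }

  get queued(): number {
    return this.pending.size;
  }

  receive(commit: Commit): void {
    // Ignore commits we have already applied (e.g. duplicate broadcasts).
    if (commit.version <= this.version) return;
    this.pending.set(commit.version, commit);

    // Apply every commit that is now contiguous with the current version.
    let next = this.pending.get(this.version + 1);
    while (next !== undefined) {
      this.pending.delete(next.version);
      this.apply(next);
      this.version = next.version;
      next = this.pending.get(this.version + 1);
    }
  }
}

// Usage: commits 2 and 3 arrive before commit 1.
const applied: string[] = [];
const queue = new CommitQueue(0, (commit) => { applied.push(commit.ref); });
queue.receive({ version: 2, ref: "b", steps: [] }); // held: version 1 is missing
queue.receive({ version: 3, ref: "c", steps: [] }); // held
queue.receive({ version: 1, ref: "a", steps: [] }); // applies a, then b, then c
console.log(applied);
```

In a real client the `apply` callback would also compare `commit.ref` against the ref of its own in-flight commit to tell a confirmation apart from a remote change.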

Currently, there is no handling for connection interruption. This should be easily solvable by simply reconnecting & recommitting on disconnect.

I also plan to use prosemirror-recreate-steps to handle a client who goes offline for a long period with uncommitted changes. The gem stores old steps, but a client could go offline for long enough that old steps were thrown out - in this case, it would be important to not only recover the changes, but support some type of interactive merging. Again though, this likely would require UI which would have to be implemented by library users. How has this been handled by others?

We’ve been talking internally about building some sort of diffing tool and I know the New York Times has some sort of internal tool for this purpose. We’ve never gone beyond sketches and discussions and the NYT tool is, as far as I know, still internal, but I’d be all ears for what you land on in that regard. We’re using PM on top of a Rails app at MPR/APM, but we have avoided the collab server portion entirely for now.

I think we got as far as, “we should wire up something like https://github.com/tnwinc/htmldiff.js (or one of its slightly more recently updated forks) and then figure out a way to convert the <ins> and <del> tags into steps to be applied against one side or the other”.

So, this is what I understand after a bit more than two weeks of working with ProseMirror:

There’s no built-in support for diffing in ProseMirror. Its support for interactive collaboration relies on rebasing & best-guess conflict handling. That’s fine for real-time collaboration because a user is in the loop and conflicts are generally minor (the versions have only diverged by a few seconds or so of edits), but when documents have diverged significantly, you probably want to make conflict resolution interactive.

Here’s the blog post where ProseMirror’s author describes the library’s approach to collaborative editing. prosemirror-collab’s OT algorithm isn’t ideal and wastes a lot of bandwidth retransmitting steps unnecessarily - but it’s basically possible to implement any OT algorithm on top of ProseMirror.


For rails-collab, I rewrote the collaboration algorithm based on Apache Wave/Google Docs’ approach to OT. I’ve actually extracted that into an npm module, prosemirror-collab-plus.

Here’s the blog post from Apache Wave that really led me to “get” the simplified OT algorithm Google uses.


If you’re not doing collaborative editing, you need to somehow generate the diffs yourself. Manuscripts released a package to handle this - https://gitlab.com/mpapp-public/prosemirror-recreate-steps. It claims to be able to recreate the Steps that occurred between two documents and then do an interactive merge between groups of steps. While I haven’t tried it myself, I’ve skimmed the source and it seems very well written.

We’ll likely be using prosemirror-recreate-steps to handle syncing after offline editing & branching.

However, we decided to avoid interactive merging if possible - the hard part is designing a UI/UX for conflict resolution that makes the process intuitive & painless for non-technical users. Git conflicts are a headache for developers - it’s not something that we want to force our writers to deal with, if possible.

For that type of thing, we’re much more into Multiplayer. It’s extremely intuitive for non-technical users as conflicts are minor (thus easily fixed) and basically only happen when two people edit the same section of a document.

Here’s another good resource that describes Figma’s approach to “multiplayer editing” and why they picked it. It’s not quite OT and their approach is (self-admittedly) poorly suited for a writing environment, but they make the case for realtime editing over other approaches quite well.

Understood. Our newsroom is our biggest pool of users and has to deal with things like NPR and AP stories that need minor cleanup or local context post-import and then change upstream, so we’re probably stuck implementing a diffing view at some point no matter what. I can see the appeal of skipping it entirely, though.

That makes a lot of sense. Implementing change tracking/blame is, in my opinion, easier, and ProseMirror has a few examples of how you could do that (see https://prosemirror.net/examples/track/). However, it doesn’t solve collaboration.

Also anything with @johanneswilm’s signature on it tends to be well done. Fidus Writer showed early on that a really full-featured writing app could be built using ProseMirror.


Yes! Actually, his comment is how I found out about prosemirror-recreate-steps.

So, my startup is actually now weighing dropping the gem entirely (it adds a good amount of load to our ActionCable servers & database - the gem relies on database locks and deserializing then serializing the document for every change so that it can work over a server cluster) and replacing it with a dedicated service that we’re considering offering as a SaaS solution (each active document gets its own V8 isolate). If we get further along, would you want to try it out in your app?

Since y’all are non-profit, we’d be willing to let you use it for free/at cost (depending on the cloud cost) in exchange for feedback if we do get further along.

It probably doesn’t mesh well with our earlier (possibly limiting) architectural decisions such as storing entire documents in our revision history vs pm steps. Right now all the PM code is pretty well isolated from the Rails side (it was actually a separate repo entirely for a bit) and we’ve had our own challenges running Node side-car processes for handling import (we landed on running a separate service that just takes an HTML string and returns a PM JSON doc). We also don’t have Action Cable going. Still cool to know a complete PM/Rails integration like this exists.

Yup - this is exactly what we had to do in the gem. We built a very simple RPC bridge that communicates with a pool of NodeJS processes. The server-side JS for the gem is actually in a separate repo.

Steps need to get stored to a) catch up clients who fell behind due to a network issue or b) map a step targeting a past version of the document so that it applies to the current version (“rebasing”). But steps are really only useful short term - for long-term history and diffing, your approach of storing the entire document is likely better. Having to roll back the document via steps sounds painful.
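As a rough illustration of the catch-up case, here is a sketch of a bounded step log. It is only an assumption about how such a store could look, not the gem's schema: `StepLog`, `maxSteps`, and `since` are hypothetical names, and steps are treated as opaque values. A client reports the last version it saw and gets back the steps it missed, or `null` when those steps have already been thrown out and it has to fetch the whole document instead.

```typescript
class StepLog {
  private steps: unknown[] = [];
  private baseVersion: number;        // document version before the oldest retained step
  private readonly maxSteps: number;  // how many recent steps to keep

  constructor(baseVersion: number, maxSteps: number) {
    this.baseVersion = baseVersion;
    this.maxSteps = maxSteps;
  }

  // Current document version: one version per step applied.
  get version(): number {
    return this.baseVersion + this.steps.length;
  }

  // Record a newly applied step and discard the oldest ones beyond maxSteps.
  append(step: unknown): number {
    this.steps.push(step);
    if (this.steps.length > this.maxSteps) {
      const drop = this.steps.length - this.maxSteps;
      this.steps.splice(0, drop);
      this.baseVersion += drop;
    }
    return this.version;
  }

  // Steps a client at `clientVersion` is missing, or null if the log
  // no longer reaches back that far (or the version is from the future).
  since(clientVersion: number): unknown[] | null {
    if (clientVersion < this.baseVersion || clientVersion > this.version) {
      return null;
    }
    return this.steps.slice(clientVersion - this.baseVersion);
  }
}

// Usage: keep only the last 3 steps.
const log = new StepLog(0, 3);
for (const step of ["s1", "s2", "s3", "s4"]) log.append(step);
const catchUp = log.since(2); // the two steps after version 2
const tooOld = log.since(0);  // null: s1 was already discarded
console.log(catchUp, tooOld);
```

The same `since` slice is what a server would map an incoming step over when rebasing it from an older version; that mapping itself is not shown here.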

The gem stores steps in the database (it has to deal with clients who are connected to any ActionCable server, and it doesn’t track step acknowledgement from connected clients - so it doesn’t know which steps are actually still useful). In reality, for collaborative editing, you only need to store steps back to the oldest unconfirmed one.
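A sketch of what tracking acknowledgements could buy you, under stated assumptions: the server knows every connected client and each client confirms the latest version it has applied. None of this exists in the gem; `AckedStepLog`, `acknowledge`, and the `connectionId` strings are hypothetical. Once every connected client has confirmed a version, the steps up to that version can be dropped.

```typescript
class AckedStepLog {
  private steps: unknown[] = [];
  private baseVersion = 0;                        // version before the oldest retained step
  private confirmed = new Map<string, number>();  // connectionId -> last confirmed version

  get version(): number {
    return this.baseVersion + this.steps.length;
  }

  get retained(): number {
    return this.steps.length;
  }

  append(step: unknown): void {
    this.steps.push(step);
  }

  // Called when a client connects (with its starting version) and
  // whenever it confirms that it has applied steps up to `version`.
  acknowledge(connectionId: string, version: number): void {
    const previous = this.confirmed.get(connectionId) ?? 0;
    const clamped = Math.min(version, this.version);
    this.confirmed.set(connectionId, Math.max(previous, clamped));
  }

  disconnect(connectionId: string): void {
    this.confirmed.delete(connectionId);
  }

  // Drop every step that all connected clients have already confirmed.
  prune(): void {
    let oldest = this.version;
    this.confirmed.forEach((version) => {
      if (version < oldest) oldest = version;
    });
    const drop = oldest - this.baseVersion;
    if (drop > 0) {
      this.steps.splice(0, drop);
      this.baseVersion = oldest;
    }
  }
}

// Usage: client "a" is up to date, client "b" lags at version 2.
const ackLog = new AckedStepLog();
for (const step of ["s1", "s2", "s3", "s4", "s5"]) ackLog.append(step);
ackLog.acknowledge("a", 5);
ackLog.acknowledge("b", 2);
ackLog.prune();
const retainedWhileLagging = ackLog.retained; // s3..s5 are kept for "b"
ackLog.disconnect("b");
ackLog.prune();
const retainedAfterDisconnect = ackLog.retained; // nothing left to keep
console.log(retainedWhileLagging, retainedAfterDisconnect);
```

The trade-off is that a client reconnecting after its steps were pruned has to reload the full document rather than catch up incrementally.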

Our standalone service keeps steps in memory and (at least right now) only persists the most recent version of the document. Basically, an app starts a session by providing a serialized document to our server. We spin up a V8 isolate and clients use a session key to connect to our gateway. Later, the app issues an HTTP request to get the edited document.