How we went about prosemirror-collab at the New York Times

My colleague Sophia and I just wrote up some of our team’s experience building a collaborative ProseMirror editor that we now use in article production at The New York Times:

…we’d be happy to talk here about any of the more technical bits from this for a ProseMirror enthusiast audience 🙂


Nice! Thanks for the shout-out.

I’ve built most of the realtime collab layer for our PM implementation. I’d certainly be interested in learning more about whether you did any work to sync documents before saves, or if you used rollups/checkpoints at all. We do, because our documents are very long-lived, and being absolutely certain that stepwise edits and rolled-up documents are compatible has proven non-trivial.

being absolutely certain that stepwise edits and rolled-up documents are compatible has proven non-trivial

We ran into this exact set of problems in a previous, pre-collab implementation of step storage for showing a version history; it was a nightmare! When we began building collaborative editing, we chose to treat steps as the source of truth for the state of the document, and the history of “rolled-up documents” as a materialized view of steps. While we no longer run into synchronization issues between steps and point-in-time documents, it’s been helpful to design the point-in-time documents with the assumption that we could blow them away and re-index them at some point if need be.
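To make the “materialized view” idea concrete, here is a minimal sketch under heavy simplifying assumptions: the document is a plain string and a step just replaces a range. `TextStep`, `Snapshot`, `applyStep` and `rebuildSnapshots` are hypothetical names for illustration, not ProseMirror or prosemirror-collab APIs, and this is not our production code. The point is only that every snapshot can be derived again from the step log alone.

```typescript
// Simplified model: a document is a plain string and a step replaces a range.
// Real ProseMirror steps are richer; these types are hypothetical stand-ins.
interface TextStep {
  version: number; // 1-based position of this step in the log
  from: number;
  to: number;
  insert: string;
}

interface Snapshot {
  version: number; // number of steps already folded into `doc`
  doc: string;
}

// Apply one step, refusing steps whose range does not fit the document.
function applyStep(doc: string, step: TextStep): string {
  if (step.from < 0 || step.to < step.from || step.to > doc.length) {
    throw new Error(`step ${step.version} does not fit the document`);
  }
  return doc.slice(0, step.from) + step.insert + doc.slice(step.to);
}

// Rebuild every snapshot from the step log alone, keeping one every `interval` steps.
function rebuildSnapshots(initial: string, steps: TextStep[], interval: number): Snapshot[] {
  const snapshots: Snapshot[] = [{ version: 0, doc: initial }];
  let doc = initial;
  steps.forEach((step, index) => {
    if (step.version !== index + 1) {
      throw new Error(`step log has a gap at version ${index + 1}`);
    }
    doc = applyStep(doc, step);
    if (step.version % interval === 0) {
      snapshots.push({ version: step.version, doc });
    }
  });
  return snapshots;
}

const log: TextStep[] = [
  { version: 1, from: 0, to: 0, insert: "Hello" },
  { version: 2, from: 5, to: 5, insert: " world" },
  { version: 3, from: 0, to: 5, insert: "Goodbye" },
  { version: 4, from: 13, to: 13, insert: "!" },
];

// Throw away all stored snapshots and re-index them from the log.
const rebuilt = rebuildSnapshots("", log, 2);
```

Because the snapshots are a pure function of the log, deleting and re-indexing them is always safe in this model; the only thing that must never be lost is the ordered step log itself.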

As for how we solved the sync issue itself: any database with transaction support should guarantee that updates made to the “rolled-up documents” are current and consistent with any step insertions that may have happened while an update to a document is in progress. We also chose to use the most recent rolled-up document (plus any not-yet-“harvested” steps) as the starting point when loading a collaborative editor, which further solidified this philosophy: the few errors we had early on with out-of-sync documents were quickly ironed out because they otherwise blocked loading the collaborative editor altogether.
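The loading strategy can be sketched the same way. This is an illustration under assumptions, not our actual implementation: documents are plain strings, and `LoggedStep`, `Checkpoint`, `applyLogged` and `loadDocument` are hypothetical names rather than ProseMirror or Firestore APIs. The loader starts from the most recent checkpoint, replays only the steps newer than it, and refuses to load if a step is missing, which is what makes out-of-sync data show up immediately.

```typescript
// Hypothetical simplified model: a document is a string, a step replaces a range.
interface LoggedStep {
  version: number; // 1-based position of this step in the log
  from: number;
  to: number;
  insert: string;
}

interface Checkpoint {
  version: number; // number of steps already folded into `doc`
  doc: string;
}

function applyLogged(doc: string, step: LoggedStep): string {
  if (step.from < 0 || step.to < step.from || step.to > doc.length) {
    throw new Error(`step ${step.version} does not fit the document`);
  }
  return doc.slice(0, step.from) + step.insert + doc.slice(step.to);
}

// Start from the most recent checkpoint and replay only the steps newer than it.
// Throws instead of returning a document that might be out of sync.
function loadDocument(checkpoints: Checkpoint[], steps: LoggedStep[]): Checkpoint {
  if (checkpoints.length === 0) {
    throw new Error("no checkpoint to load from");
  }
  const latest = checkpoints.reduce((a, b) => (b.version > a.version ? b : a));
  const pending = steps
    .filter((s) => s.version > latest.version)
    .sort((a, b) => a.version - b.version);

  let doc = latest.doc;
  let version = latest.version;
  for (const step of pending) {
    if (step.version !== version + 1) {
      throw new Error(`missing step ${version + 1}; refusing to load`);
    }
    doc = applyLogged(doc, step);
    version = step.version;
  }
  return { version, doc };
}

const checkpoints: Checkpoint[] = [
  { version: 0, doc: "" },
  { version: 2, doc: "Hello world" },
];
const stepLog: LoggedStep[] = [
  { version: 1, from: 0, to: 0, insert: "Hello" },
  { version: 2, from: 5, to: 5, insert: " world" },
  { version: 3, from: 0, to: 5, insert: "Goodbye" }, // not yet harvested
  { version: 4, from: 13, to: 13, insert: "!" }, // not yet harvested
];

const loaded = loadDocument(checkpoints, stepLog);
```

In a real system the checkpoint write and the check on which steps it covers would happen inside one database transaction; the sketch leaves that part out and only shows the fail-loudly loading path.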


Two more questions:

  1. How much editing goes into your PM docs? Do you add rich content directly? Or are you doing mostly simple text editing?
  2. Do you do all “validity” checks at the client level? Or do you send steps to an application server running PM and do saves at that level?

  1. How much editing goes into your PM docs? Do you add rich content directly? Or are you doing mostly simple text editing?

It’s very much a rich text affair: lots of leaf nodes with somewhat complex schema shapes.

  2. Do you do all “validity” checks at the client level? Or do you send steps to an application server running PM and do saves at that level?

I’m not sure what you mean by validity, but Firestore is a client-facing database, so the steps are presumed to be valid by the time they are inserted into the database. They are effectively double-checked because there are server-side processes that consume steps and apply them to a shared/persisted document as well, but the insertion is determined by the client-side code.

I was asking about checkpoint/save validity. You mentioned that you’ve resolved your issues with steps vs persisted documents, and I was wondering if that was done at the server level or the client level. For various reasons we chose to do our saves from the client layer. It sounds like you’ve done things the other way, with your persistence done by a server layer.

I’m not sure it actually changes the picture that much, but as we’re still seeing occasional issues I figured I’d ask.

Any chance you could talk about how you represent the other users’ cursors in the editor?

We (the FT) are currently building that “collaborative cursor” functionality. We are finding that using a widget Decoration for the cursor and an inline Decoration for any text selection causes a few confusing things to happen in the browser, such as the browser cursor jumping around as ProseMirror moves decorations around the document.
