Hints for building a version history

Hi Marijn, I’m working on building a version history system similar to the one shown in the NYT’s article and wanted to run my plan by someone who understands ProseMirror better:

Currently my plan is to:

  • Collect steps as they are applied to the document server-side in our collaboration server
  • Store versions with a snapshot of the document at the start of the version and the steps applied in that version
  • Use the snapshot and the steps applied to build a diff document with any additions wrapped in an addition mark and any deletions instead wrapped in a deletion mark

Thorny problems seem to be:

  • How to apply some step B after a deletion step A has caused its from and to positions to point to incorrect locations?

    Theory: Use the inverse mapping of A to get B as it was before A was applied, then use the mapping from A-deletion (A with deletion marks applied instead of the text being deleted) to map B-before-A into B-after-A-deletion?

  • How to decide when to create a new version versus appending to an existing one?

    Theory: Use recency (within 5 minutes of the last update to a version), whether the author is the same, and how close the step is to the other steps in the version

  • How to display changes to NodeViews?

    Theory: Highly dependent on each NodeView. Using decorations, we can tell the node view about its diffs; each NodeView would then need to display those differences on a case-by-case basis. Simplest approach: just wrap the whole NodeView in a “changed” style

Does this sound like a reasonable approach?

You can indeed map steps to get this kind of thing to work. Doing so can get a bit subtle: you have to map steps through the entire ‘route’ the document took from the version the step originally applied to, to the version you want to apply it to. If that route involves both step A and the inverse of step A, you have to mark those as mirroring each other in the Mapping to get proper results. But I’ve managed to do all kinds of funky things with this, and it usually works great.
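To make the mirroring point concrete, here is a toy model in TypeScript. It is not ProseMirror's API: ToyMap, invert, mapPos and mapThrough are hypothetical names invented for this illustration, the document is treated as a flat run of positions, and each map replaces a single range. It only shows why a position inside a deleted range cannot be recovered after "A then inverse of A" unless the two maps are known to mirror each other. In prosemirror-transform the corresponding information is, as far as I know, passed through the optional mirrors argument of Mapping.appendMap.

```typescript
// Toy model only: flat positions, one replaced range per map, no ProseMirror imports.
interface ToyMap {
  start: number;   // where the replaced range begins
  oldSize: number; // how many positions the step removed
  newSize: number; // how many positions the step inserted
}

// The inverse of a replacement swaps what was removed with what was inserted.
function invert(m: ToyMap): ToyMap {
  return { start: m.start, oldSize: m.newSize, newSize: m.oldSize };
}

// Map one position through one replacement.
// assoc > 0 moves ambiguous positions to the right side of inserted content.
function mapPos(m: ToyMap, pos: number, assoc: number = 1): number {
  const end = m.start + m.oldSize;
  if (pos < m.start) return pos;
  if (pos > end) return pos + m.newSize - m.oldSize;
  const side =
    m.oldSize === 0 ? assoc : pos === m.start ? -1 : pos === end ? 1 : assoc;
  return m.start + (side < 0 ? 0 : m.newSize);
}

// Map a position along a whole route. `mirrors` maps the index of a map to the
// index of a later map that undoes it. When a position falls strictly inside a
// range deleted by a mirrored map, jump straight to the mirror and restore the
// position's offset inside the reinserted range.
function mapThrough(
  route: ToyMap[],
  pos: number,
  mirrors: Map<number, number> = new Map()
): number {
  for (let i = 0; i < route.length; i++) {
    const m = route[i];
    const inside = pos > m.start && pos < m.start + m.oldSize;
    const mirror = mirrors.get(i);
    if (inside && mirror !== undefined && mirror > i) {
      pos = route[mirror].start + (pos - m.start);
      i = mirror; // skip everything up to and including the mirror
      continue;
    }
    pos = mapPos(m, pos);
  }
  return pos;
}

// Step A deletes positions 2..8; the route then undoes it again.
const stepA: ToyMap = { start: 2, oldSize: 6, newSize: 0 };
const route = [stepA, invert(stepA)];

const unmirrored = mapThrough(route, 5);                   // collapses, then lands at 8
const mirrored = mapThrough(route, 5, new Map([[0, 1]])); // recovers 5
console.log(unmirrored, mirrored); // 8 5
```

Without the mirror hint, position 5 collapses to the start of the deletion and then ends up on the far side of the reinserted content. With the hint, the offset inside the deleted range survives the round trip.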

You basically come up with your own heuristics. Using time works. You could also do extremely complicated things with the time between steps and how close together the steps are, but initially, slicing by blocks of time should be good.
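As a rough sketch of the time-slicing heuristic, the TypeScript below groups recorded steps into versions. Everything in it is an assumption made for illustration: the StepRecord and Version shapes, the groupIntoVersions name, the same-author rule, and the five-minute window (taken from the question, not from any library). The per-version document snapshot from the original plan is left out to keep it short.

```typescript
// Hypothetical record of one step as stored by a collaboration server.
interface StepRecord {
  authorId: string;
  timestamp: number; // milliseconds since epoch
  stepJSON: unknown; // serialized step, opaque to this sketch
}

// Hypothetical version: a run of steps by one author, close together in time.
interface Version {
  authorId: string;
  startedAt: number;
  lastUpdatedAt: number;
  steps: StepRecord[];
}

const FIVE_MINUTES_MS = 5 * 60 * 1000;

// Walk the steps in order. Append to the current version when the author
// matches and the gap since its last update is within the window; otherwise
// start a new version.
function groupIntoVersions(
  records: StepRecord[],
  windowMs: number = FIVE_MINUTES_MS
): Version[] {
  const versions: Version[] = [];
  for (const rec of records) {
    const current = versions[versions.length - 1];
    if (
      current &&
      current.authorId === rec.authorId &&
      rec.timestamp - current.lastUpdatedAt <= windowMs
    ) {
      current.steps.push(rec);
      current.lastUpdatedAt = rec.timestamp;
    } else {
      versions.push({
        authorId: rec.authorId,
        startedAt: rec.timestamp,
        lastUpdatedAt: rec.timestamp,
        steps: [rec],
      });
    }
  }
  return versions;
}

// Usage: two quick edits by "a", a late edit by "a", then an edit by "b".
const minute = 60 * 1000;
const sampleVersions = groupIntoVersions([
  { authorId: "a", timestamp: 0, stepJSON: null },
  { authorId: "a", timestamp: 1 * minute, stepJSON: null },
  { authorId: "a", timestamp: 7 * minute, stepJSON: null },
  { authorId: "b", timestamp: 8 * minute, stepJSON: null },
]);
console.log(sampleVersions.map((v) => v.steps.length)); // [ 2, 1, 1 ]
```

Positional closeness between steps could be added as a further condition in the same if, but plain time blocks are probably enough to start with.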

If the node views have local state that’s not represented in the nodes they display, yeah, you’ll have to take care to keep them synced with your state—but I don’t think this problem is much different from that of displaying them with the right state in the first place, without version history. (It is probably a good idea to recreate an editor state when the user skips to a specific version, so the code that initializes your state and node views should apply as normal.)

Thanks for the reply. I’ll return with a show post and hopefully a write up once we’ve got something properly done.