Rebasing when steps partially overlap

I am trying to understand rebasing better (BTW, setMirror seems not documented).

So if I have confirmed steps [A, B, C] and unconfirmed step E, then if I get D to be before E, rebasing does:

  • Apply inverse of E.
  • Apply D.
  • Apply E mapped over [D].

I then get [A, B, C, D, E]. I understand this.

But if I have a more complicated example, where I have [A, B, C] confirmed, and unconfirmed [E, F]. And I get [D, E'], where E' is the same as E, only that it has already been rebased on top of D. In my case I can get something like this because I am trying to get my clients to share some state directly between each other, before they get to confirm it on the server. Let’s assume I can detect that relation between E and E'. My question is if there is a better way to do a rebase in such case. The straightforward approach based on existing code would be to do:

  • Apply inverse of F,
  • Apply inverse of E.
  • Apply D.
  • Ignore E'.
  • Apply E mapped over [D].
  • Apply F mapped over [inverse of E, D, E mapped over D].

But it would be slightly easier for me to implement this if I could start with [A, B, C, D, E'] and somehow rebase F on top of that. I am not sure how to map F in such case.

Have you read this blog post yet?

Oh, yes. This was one of the firs things I read and why I got attracted to the ProseMirror in the first place. I have also read the transform guide and a large part of the source code for all this.

Are you pointing to argument about centralization? So I agree generally with all that and I am using a central server. But I am also trying to add more interesting modes of collaboration between users than traditional real-time editing and this is why I am exploring such scenario of rebasing as described above.

No, I was just making sure you had seen the example in that blog post. I don’t entirely understand the scenario. Do you have kind of multi-layered synchonization design, where a group of clients synchronizes locally first and then globally?

It can help to see the documents as a tree. So in this case, I guess you have

doc0 --A--> docA --B--> docB --C--> docC
    `--D--> docD --E'--> docE'
    `--E--> docE --F--> docF

And you want to move D, E, and F to apply after docC. To map a step from one start document to another, you have to move it through these arrows (mapping back when going against the arrow, and forward otherwise) from its actual start doc to the desired start doc. So you could map F through [-E, D, E’] and then apply it to docE', but I’m not sure that helps a lot, since you eventually need to expand the branch at the top from docC—it seems more useful to first map D there, and then E and F directly, ignoring E’.

Do you have kind of multi-layered synchonization design, where a group of clients synchronizes locally first and then globally?

Yes, kinda. It still all happens through the server. But the idea is that a working group can fork the main document and work in real-time on the fork, independent from the main document, and then once it is ready they can merge it into the main document.

If there are any changes done in the main document after it has been forked (not often, generally when some other working group finished their work and merges their changes into the main document), I rebase those changes from the main document to all forks.

The example above in my mind has a tree more like:

main document --> fork2 --> fork3
             `--> fork1

And I am looking if I can optimize rebasing of fork3 when it already shares rebased steps with fork2 from before. So tree including steps, before merging, would be something like (in [] brackets I have which steps are part of that document):

main document [A, B, C] --> fork2 [A, B, C, E] --> fork3 [A, B, C, E, F]
                       `--> fork1 [A, B, C, D]

So I see at this moment [A, B, C] as confirmed steps. So fork2 has one additional step E, and fork3 has one more step on top, F. Now, because all forks are kept rebased all the time, when fork1 gets merged into main document, this is simple: main document just gets all additional step added. So in this example D. So after merging, main document has steps [A, B, C, D].

Now, first fork2 has to get rebased. This is easy, I apply inverse of E, add D, and then add rebased E'. So steps would be [A, B, C, E, inverse of E, D, E']. Because I allow rewriting history in my case, I simplify this into [A, B, C, D, E'].

Now, to get fork3 rebased, I could do a similar thing (as described in the initial post) and get steps [A, B, C, E, F, inverse of F, inverse of E, D, E', F'] and I can simplify this into [A, B, C, D, E', F']. This is what i am currently doing.

But what I wonder is, is there a simpler way to rebase fork3 if I have already rebased fork2. So as you see, during rebasing of fork3 currently I have to invert both F and E and then re-apply them. But if I have already obtained [A, B, C, D, E'], is there an easier way to get from [A, B, C, E, F] to [A, B, C, E', F']?

Did you ever make progress with this by any chance?

Yes, I did. I made a blog post about results. And there is also a open source code for it.

2 Likes