Offline, Peer-to-Peer, Collaborative Editing using Yjs

@dmonad I am working on a new app which uses Prosemirror and has full offline support. The missing piece was providing offline support for Prosemirror. So this is a fantastic and most welcome addition.

The use of WebRTC and peer-to-peer is impressive. However, if I understand correctly, this relies on at least one PC being up and running and accessible over the Internet at all times, so that other devices can come and go and all instances stay in sync.

I can envisage this becoming a problem in the real world. Instead I’d like to (optionally) see the ability to have a central server that is always up to date as these clients come and go. Ideally all data on this server would be encrypted with a key only the end-users know, thereby maintaining data privacy.

Thoughts?

PS. I looked at Yjs ages ago, clearly it’s time to revisit. Keep up the great work.

@nevf Isn’t this solved by websockets? There’s a yjs websocket client and server library. And as mentioned above, there’s a prosemirror demo of those libraries:

@bhl Thanks for that. I did see a mention of a websocket client, but missed that there was a server. I also was under the impression there was only the webrtc implementation - my mistake.

I’ve just tried the websocket demo, and it appears that offline edits aren’t saved in the browser (IndexedDB). So if you close a tab with the websocket demo open while you are offline, then a) when you re-open the tab you don’t see any content, and b) when you go back online, any edits you made offline before closing the tab are lost.

Of course this may well be resolved using the y-indexeddb provider. Any idea?

Yeah, I think based off http://y-js.org/, you also need a database adapter for persistence while offline. Indexeddb is one way to do that.

Hi @nevf, Yjs by itself is just a small CRDT implementation. There are several modules around Yjs that allow you to do different things.

Editor bindings, like y-prosemirror, y-codemirror, or y-quill make a specific editor collaborative. Connectors, like y-webrtc, or y-websocket handle how to sync to other peers. And Persistence adapters handle how to persist data to a database to make it available offline (e.g. y-indexeddb for the browser, or y-leveldb for the server).

The idea is to make Yjs as modular and unopinionated as possible and build a huge ecosystem around it. There is a list of stable extensions for Yjs here: https://github.com/yjs/yjs. Yjs version 13 is pretty new and I haven’t ported everything yet, so please bear with me.

You can make the demos in https://github.com/yjs/yjs-demos offline ready by defining a service worker (e.g. pwabuilder) and including y-indexeddb.

FYI y-websocket-client/server are now bundled into a single repository (y-websocket).

Awesome, thanks for sharing the tiptap demo @holtwick. Let’s put that into https://github.com/yjs/yjs-demos

Sure @dmonad, I added a pull request with a very basic sample for TipTap: https://github.com/yjs/yjs-demos/pull/7 cc @philippkuehn

@dmonad Thanks for the extra info. I assume switching between connectors and adapters requires no changes to the code using these, is that right?

I have other more general questions about Yjs which I assume would best be posted in https://discuss.yjs.dev

OMG this is awesome! :heart:

@nevf Correct, that would be the best place to ask questions about Yjs.

Thanks @philippkuehn ^^ I demonstrated this last week at FOSDEM. After an initial issue with the network everyone in the lecture room synced up. https://fosdem.org/2020/schedule/event/yjs_shared_editing/

@dmonad, the yjs ecosystem is incredibly impressive work. I was able to very quickly get a yjs websocket server running alongside the client-side prosemirror-yjs. There’s another thread that discusses CRDTs and their ramifications for automatic conflict resolution, and I agree with you that automatic merging is preferable if revisions / change tracking are easily accessible / auditable / viewable. With that said, I was excited to see a minimal implementation in the prosemirror-versions demo.

Since my application has a custom comment / annotation plugin that works via decorations / prosemirror position map, I am extra curious as to where you are now with regards to a functional prosemirror decoration implementation (but also general support of prosemirror plugins as discussed in this thread)? This is critical for me to be able to adopt this library, and I’d like to know what to expect / timeline. Thanks so much!

It looks as if replacing the entire doc at once when syncing changes is a bit of a deal breaker in many contexts, especially for large documents and when you have to track positions or create lots of decorations. Has anyone been able to take an approach that uses a diff or some other mechanism to do smaller, more targeted updates?

As you guys stated, an identity based approach seems optimal but I’d have to dig into the bindings more.

suggestion: use matrix as transport layer, instead of y-websocket.
matrix-element can embed etherpad-lite as widget (video)

(etherpad frontend is based on ace-editor, jquery, …)

@jessejorgenson did you manage to figure out how to avoid replacing the entire doc? We’ve noticed slowdowns when rendering multiple prosemirrors that use y-prosemirror to sync content.

@namitc Sort of. We are still exploring a better solution using mapping and references, but you can get this working in a naive way with diffs using something like sueddeutsche/prosemirror-recreate-transform on GitHub, which creates a set of steps transforming one ProseMirror JSON document into another.

Basically, you could serialize the Y.js document to an HTML document, parse it into a ProseMirror document, run a diff between the parsed doc and the current doc, then dispatch the recreated transform.
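
To make the diff step concrete, here is a minimal sketch of the idea. It is not the code of prosemirror-recreate-transform or y-prosemirror. It treats a document as a flat list of block strings (real ProseMirror documents are trees), and `Block`, `minimalReplace` and `applyReplace` are hypothetical names chosen for illustration. It skips the common prefix and suffix and replaces only what lies between them.

```typescript
// Toy stand-in for a document: a flat list of block texts.
// Real ProseMirror documents are trees; this only illustrates the idea.
type Block = string;

interface ReplaceRange {
  from: number;    // index of the first block that differs
  to: number;      // end (exclusive) of the differing range in the old doc
  insert: Block[]; // blocks from the new doc that replace old[from..to)
}

// Find the smallest contiguous range that turns `oldDoc` into `newDoc`
// by skipping the common prefix and the common suffix.
function minimalReplace(oldDoc: Block[], newDoc: Block[]): ReplaceRange | null {
  let start = 0;
  const maxStart = Math.min(oldDoc.length, newDoc.length);
  while (start < maxStart && oldDoc[start] === newDoc[start]) start++;

  // Both documents are identical: nothing to dispatch.
  if (start === oldDoc.length && start === newDoc.length) return null;

  let oldEnd = oldDoc.length;
  let newEnd = newDoc.length;
  while (oldEnd > start && newEnd > start && oldDoc[oldEnd - 1] === newDoc[newEnd - 1]) {
    oldEnd--;
    newEnd--;
  }
  return { from: start, to: oldEnd, insert: newDoc.slice(start, newEnd) };
}

// Apply the range to the old document (the analogue of dispatching one step).
function applyReplace(doc: Block[], r: ReplaceRange): Block[] {
  return [...doc.slice(0, r.from), ...r.insert, ...doc.slice(r.to)];
}

const current: Block[] = ["Title", "First paragraph", "Second paragraph", "Footer"];
const incoming: Block[] = ["Title", "First paragraph (edited)", "Second paragraph", "Footer"];

// Only block 1 is touched; blocks 0, 2 and 3 keep their identity.
const patch = minimalReplace(current, incoming);
```

Because everything outside `from..to` is left alone, positions and decorations that live outside the changed range would not need to be rebuilt. A tree-aware diff would be needed to get the same effect inside nested nodes.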

The current approach will work fine as long as you don’t use too many decorations. Replacing sounds bad, but as I explained before, ProseMirror will only apply the diff to the DOM. The replacement is efficient because y-prosemirror preserves the identity of the nodes.

However, working with decorations and y-prosemirror has been one of the biggest pain points. It’s particularly hard to share decorations between collaborators. Not that it’s impossible, it just requires working directly with Yjs-based relative positions that refer to the Yjs model and syncing them to the ProseMirror state. I feel we need better abstractions.
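
For readers unfamiliar with relative positions, the following is a toy model of the concept, not the Yjs implementation. Yjs has its own relative-position API (for example `Y.createRelativePositionFromTypeIndex` and `Y.createAbsolutePositionFromRelativePosition`). The `Item` shape, the id scheme and the helper names below are assumptions made for illustration only. The point is that a position is anchored to the identity of a character rather than to a numeric index, so it survives remote inserts.

```typescript
// Toy model: every character carries a unique id, as in a sequence CRDT.
interface Item { id: string; char: string; deleted: boolean }

// A relative position names the item to its right (null = end of text).
interface RelativePosition { rightId: string | null }

function makeItems(client: string, text: string): Item[] {
  return Array.from(text).map((char, i) => ({ id: `${client}:${i}`, char, deleted: false }));
}

function visibleText(items: Item[]): string {
  return items.filter(i => !i.deleted).map(i => i.char).join("");
}

// Convert a plain index into an identity-based position.
function toRelative(items: Item[], index: number): RelativePosition {
  const right = items.filter(i => !i.deleted)[index];
  return { rightId: right ? right.id : null };
}

// Resolve the position against the current state of the document.
// If the anchor item was deleted, the position collapses to where it used to be.
function toAbsolute(items: Item[], pos: RelativePosition): number {
  let index = 0;
  for (const item of items) {
    if (item.id === pos.rightId) return index;
    if (!item.deleted) index++;
  }
  return index; // anchor is the end of the text
}

const local = makeItems("a", "hello world");
const anchor = toRelative(local, 6); // just before "w"

// A collaborator prepends text; a plain index of 6 would now point at "l".
const afterRemoteInsert = [...makeItems("b", "Oh, "), ...local];
const resolved = toAbsolute(afterRemoteInsert, anchor);
const charAtResolved = visibleText(afterRemoteInsert)[resolved];
```

A shared decoration (a comment, say) would store two such positions and resolve them to ProseMirror positions on every update, which is the bookkeeping described above as painful.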

The y-prosemirror binding is one of the most used editor bindings for Yjs. This was my first implementation and I think there is a lot of room for improvement. Especially the change-tracking / versioning extension is just an ugly hack that I made possible for the FOSDEM demo.

I’d like to create a rewrite this year with the feedback that I accumulated. A big focus will be to enable offline-editing scenarios. This will include:

  • A cleaner codebase.
  • Applying minimal diffs to ProseMirror without losing decorations.
  • A better abstraction around tracking shared positions that will replace the too complex relative position API.
  • An extension to track shared decorations that are synced between collaborators (e.g. for implementing comments).
  • A rewrite of the snapshots API to create a nice abstraction for implementing change tracking.
  • Potentially using the new move API to handle split-node scenarios better. Explanation: When splitting a paragraph into two separate paragraphs, we are removing the remainder of the first paragraph and creating a new node with the copied content from the first paragraph. Concurrent changes that are synced to the remainder of the first paragraph are still synced to the first paragraph. Yjs now supports moving of ranges of content. With the new Move-API (still WIP) we can “move” the remainder of the first paragraph to the new paragraph and retain concurrent edits. This is especially relevant for offline editing (changes that are only synced after a long time).
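
The split-node problem in the last bullet can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the Yjs Move-API (which the post describes as still WIP): paragraphs are arrays of uniquely identified characters, and a concurrent remote insert is expressed as "insert after the item with this id". All type and function names are hypothetical.

```typescript
interface Item { id: string; char: string; deleted: boolean }
type Para = Item[];
type Doc = Para[];

// A remote edit that targets an item by identity, not by index.
interface InsertOp { afterId: string; item: Item }

const text = (p: Para): string => p.filter(i => !i.deleted).map(i => i.char).join("");

function makePara(client: string, s: string): Para {
  return Array.from(s).map((char, i) => ({ id: `${client}:${i}`, char, deleted: false }));
}

// Split by delete + copy: the remainder is tombstoned in the first paragraph
// and re-created with fresh ids in a new paragraph.
// `at` indexes raw items and assumes no tombstones before the split point.
function splitByCopy(doc: Doc, paraIndex: number, at: number, client: string): Doc {
  const para = doc[paraIndex] ?? [];
  const head = para.map((item, i) => (i >= at ? { ...item, deleted: true } : item));
  const tail = para.slice(at).map((item, i) => ({ id: `${client}:${i}`, char: item.char, deleted: false }));
  return [...doc.slice(0, paraIndex), head, tail, ...doc.slice(paraIndex + 1)];
}

// Split by move: the same items (same ids) are relocated to the new paragraph.
function splitByMove(doc: Doc, paraIndex: number, at: number): Doc {
  const para = doc[paraIndex] ?? [];
  return [...doc.slice(0, paraIndex), para.slice(0, at), para.slice(at), ...doc.slice(paraIndex + 1)];
}

// Integrate a concurrent insert wherever its anchor item currently lives.
function applyInsert(doc: Doc, op: InsertOp): Doc {
  return doc.map(para => {
    const i = para.findIndex(item => item.id === op.afterId);
    return i === -1 ? para : [...para.slice(0, i + 1), op.item, ...para.slice(i + 1)];
  });
}

const base: Doc = [makePara("a", "Hello world")];
// Concurrently, another peer appends "!" after the "d" (id "a:10").
const remote: InsertOp = { afterId: "a:10", item: { id: "c:0", char: "!", deleted: false } };

// Copy: "!" follows the tombstoned "d" and stays in the first paragraph.
const copied = applyInsert(splitByCopy(base, 0, 6, "b"), remote).map(text);
// Move: "!" follows the moved "d" into the second paragraph.
const moved = applyInsert(splitByMove(base, 0, 6), remote).map(text);
```

In this model the copy variant yields the paragraphs "Hello !" and "world", while the move variant yields "Hello " and "world!", which is the behaviour a user would expect after a long offline period.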

I talked to another editor project and we thought about creating a common abstraction around editor bindings. There are a lot of similarities between editor projects and how they represent editor state. The “editor abstraction” will handle representing editor state efficiently using Yjs’ types and provide a common abstraction for editor projects.

Looking for funding

Together with the creators of the TipTap editor @hanspagel & @philippkuehn I created an Open Collective (y-collective) for funding collaborative technologies. Basically, approved projects can charge up to $100/h for working on their projects. So far we made some really awesome things happen through the open-collective like Yrs (the Rust port of Yjs) and Hocuspocus. If we receive at least $30k in funding for the y-prosemirror rewrite, I will start development. Please direct your contributions to the y-prosemirror sub-project - which is part of the y-collective. Thank you!

If you want to fund a significant portion of the requested funds, feel free to ping me first so we can talk about expectations.

Cheers all, and thanks for all the feedback.

I think it’s more than just decorations that are problematic. You run into the following problems when replacing the whole doc:

  • Tracking positions in plugin state becomes nearly impossible. You have to rely on Y.js positions only, which may not work.
  • Node views get constantly re-rendered. For example, if you have a React integration, all of your React node views re-render, which causes a huge performance bottleneck.
  • Plugins that use appended transactions on just the affected positions now have to iterate over the entire document to apply their changes. This can cause a giant bottleneck for larger notes.

We’ve also had issues with how opinionated the structure of the Y.doc is. It uses a 1-1 mapping of whatever the schema is defined as and bypasses any use of serialization. This means that if you want to save your content with a different structure than the exact Prosemirror document, you can’t use the bindings. We also have scenarios where certain Prosemirror nodes have children that are never saved due to privacy concerns and that is also not possible with the current bindings.

I think the current bindings work if you are starting a new project and you lock yourself in to using Y.js from the beginning. However, we are experimenting with adding Y.js to a current project and we had to create our own bindings because of these limitations.

Happy you found a new approach. However, creating your own editor bindings based on a custom schema might not be feasible for everyone. Writing an editor binding for an editor like ProseMirror is a lot of work if you care about retaining editing semantics of, for example, concurrent changes on formatting attributes.

In any case, I think you’ll find your issues addressed in the proposal for the rewrite.

There is another good reason why y-prosemirror syncs the ProseMirror model and not the serialized document. Concurrent edits can break a schema, resulting in a document that cannot be rendered by ProseMirror anymore. y-prosemirror syncs ProseMirror’s schema and recovers when the schema is broken due to concurrent edits.

For backend serialization, I suggest that you use the utility functions to transform the Yjs document to a ProseMirror JSON. You can use the ProseMirror JSON and serialize it to your custom format. I wouldn’t know how to make this process less opinionated.
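
As a rough illustration of the second step, here is a sketch that takes ProseMirror JSON, drops node types that must not be persisted, and serializes the rest to a custom format. The first step (Yjs document to ProseMirror JSON) is not shown; I believe y-prosemirror exports `yDocToProsemirrorJSON` for it, but check the README. The `private_note` node type, the plain-text output format and the function names are assumptions made up for this example.

```typescript
// Shape of ProseMirror's JSON output, reduced to the fields used here.
interface PMNode {
  type: string;
  text?: string;
  attrs?: Record<string, unknown>;
  content?: PMNode[];
}

// Hypothetical node types that must never reach the backend.
const PRIVATE_TYPES = new Set<string>(["private_note"]);

// Return a copy of the tree without private nodes (and their children).
function stripPrivate(node: PMNode): PMNode | null {
  if (PRIVATE_TYPES.has(node.type)) return null;
  if (!node.content) return node;
  const content = node.content
    .map(stripPrivate)
    .filter((child): child is PMNode => child !== null);
  return { ...node, content };
}

// Hypothetical custom format: one plain-text line per top-level block.
function toPlainText(node: PMNode): string {
  if (node.type === "text") return node.text ?? "";
  const children = node.content ?? [];
  const separator = node.type === "doc" ? "\n" : "";
  return children.map(toPlainText).join(separator);
}

const pmJson: PMNode = {
  type: "doc",
  content: [
    { type: "paragraph", content: [{ type: "text", text: "Public text" }] },
    { type: "private_note", content: [{ type: "paragraph", content: [{ type: "text", text: "secret" }] }] },
    { type: "paragraph", content: [{ type: "text", text: "More" }] },
  ],
};

const sanitized = stripPrivate(pmJson);
const serialized = sanitized ? toPlainText(sanitized) : "";
```

Note that this only controls what the backend stores after conversion. The private children are still part of the synced Yjs document, so it does not by itself address the privacy concern raised earlier in the thread.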

This sounds great @dmonad!

I ran into the following that causes difficulties in syncing across nodes:

y-prosemirror

(some of the edits are lost)

Would this be fixed by your suggestion to handle “split-node” scenarios better? (it’s kind of like the reverse of your example I think)

@YousefED As I wrote, only potentially. This is an issue that I want to address, and I’d like to work on it. However, the move feature for rich text might prove to be too complex, which is why I don’t want to make any promises.
