Easy way to construct ProseMirror document

elimarable · September 27, 2018, 8:39pm

Hello, I’ve been tasked with writing a parser/serializer that can take documents in a proprietary format as input and output ProseMirror markup. I was wondering if anyone else has tried to assemble ProseMirror markup outside of an editor state context?

I was thinking that the easiest thing to do would be to build an intermediate HTML tree such that running it through a DOMParser will produce the expected result, but taking that intermediate step kind of rubs me the wrong way.

The other idea I had was to build the markup bottom-up, starting with the leaf nodes and using NodeType.create(attrs, content) to build the tree up to the document root.

Do I have the right idea here or is there an easier way that I’ve missed?

bradleyayers · September 27, 2018, 11:43pm

How were you planning on building the intermediate HTML tree?

Building the tree bottom-up from the leaf nodes sounds closer to what I’d do; going via HTML sounds like it would complicate testing.

Here’s a few other implementations of transforming proprietary ↔ ProseMirror:

marijn · September 28, 2018, 6:30am

If there is a clear correspondence between the input format’s structure and the resulting ProseMirror document, a recursive function that builds up ProseMirror nodes for each document element seems the best approach. If the process is messier or lossy, you might get somewhere by rendering the old format to HTML and parsing that with a DOMParser, but you’d have to inspect the result closely to see whether any important data is being lost in the process. It is entirely possible for the existing HTML renderer to output HTML nodes that your parser doesn’t handle properly.
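
As a rough illustration of the recursive approach, here is a minimal sketch that walks a source tree and emits ProseMirror's JSON document representation. The `SourceNode` shape (`document`, `para`, `title`, `run`) is entirely made up for this example, and the output node and mark names (`doc`, `paragraph`, `heading`, `text`, `strong`, `em`) assume a schema along the lines of prosemirror-schema-basic; a real converter would need to match your own format and schema.

```typescript
// Hypothetical input format, invented for illustration only.
type SourceNode =
  | { kind: "document"; children: SourceNode[] }
  | { kind: "para"; children: SourceNode[] }
  | { kind: "title"; depth: number; children: SourceNode[] }
  | { kind: "run"; value: string; bold?: boolean; italic?: boolean };

// Shape of ProseMirror's JSON document representation.
interface PMNodeJSON {
  type: string;
  attrs?: Record<string, unknown>;
  content?: PMNodeJSON[];
  text?: string;
  marks?: { type: string }[];
}

// Convert a list of children, dropping anything that maps to nothing.
function convertChildren(children: SourceNode[]): PMNodeJSON[] {
  const out: PMNodeJSON[] = [];
  for (const child of children) {
    const node = convert(child);
    if (node) out.push(node);
  }
  return out;
}

// Recursively map one source element to one ProseMirror JSON node.
// Node and mark names assume a prosemirror-schema-basic style schema.
function convert(node: SourceNode): PMNodeJSON | null {
  switch (node.kind) {
    case "document":
      return { type: "doc", content: convertChildren(node.children) };
    case "para":
      return { type: "paragraph", content: convertChildren(node.children) };
    case "title":
      return {
        type: "heading",
        attrs: { level: node.depth },
        content: convertChildren(node.children),
      };
    case "run": {
      // ProseMirror does not allow empty text nodes, so skip them.
      if (node.value === "") return null;
      const marks: { type: string }[] = [];
      if (node.bold) marks.push({ type: "strong" });
      if (node.italic) marks.push({ type: "em" });
      const text: PMNodeJSON = { type: "text", text: node.value };
      if (marks.length > 0) text.marks = marks;
      return text;
    }
  }
}

// Example input in the hypothetical format.
const sample: SourceNode = {
  kind: "document",
  children: [
    { kind: "title", depth: 1, children: [{ kind: "run", value: "Hello" }] },
    {
      kind: "para",
      children: [
        { kind: "run", value: "plain " },
        { kind: "run", value: "bold", bold: true },
        { kind: "run", value: "" },
      ],
    },
  ],
};

const docJSON = convert(sample);
console.log(JSON.stringify(docJSON, null, 2));
```

The resulting object can then be handed to `Node.fromJSON(schema, docJSON)` from prosemirror-model to get a real document node, and calling `check()` on that node should flag content that doesn't fit the schema. Producing JSON rather than calling `NodeType.create` directly keeps the converter testable without a schema, though building nodes directly would work just as well.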

elimarable · September 28, 2018, 4:03pm

Thank you for the replies! The proprietary format I’m working with can export to JSON, so it was not too hard to build a recursive function like you said.