Hello @johanneswilm,
I’m writing a Prosemirror-based editor for Pandoc’s internal format. Here’s the link, but for now you won’t find anything, because I haven’t published it yet, sorry.
It reads and writes Pandoc’s JSON format, so I had to write the code to convert Pandoc’s `Block`s and `Inline`s to Prosemirror’s `Node`s and `Mark`s, and vice versa.
Pandoc vs Prosemirror
The conversion is pretty straightforward for blocks, but it’s rather complex at the inline level, because you have to match the tree-like nature of Pandoc `Inline`s with the flat model of Prosemirror `Mark`s.
It is perfectly fine for Pandoc to nest an `Emph` inside another `Emph`, but it’s difficult to model that with a `Mark` in Prosemirror, unless you differentiate the two `Emph`s with some attribute and set the `excludes` property to an empty string in its `MarkSpec`.
That’s because in Prosemirror a `Mark` is either set or not set on a span of text; you can’t set it twice.
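To make the attribute-plus-`excludes` idea concrete, here is a minimal sketch of what such a mark spec could look like. It is a plain object shaped like a Prosemirror `MarkSpec`, not code from my editor; the `level` attribute and the `data-level` DOM attribute are hypothetical names chosen for illustration.

```typescript
// Plain object shaped like a MarkSpec; in a real editor it would be passed to
// `new Schema({ marks: { emph: emphSpec }, ... })` from prosemirror-model.
const emphSpec = {
  // Hypothetical attribute: nesting depth of the Pandoc Emph this mark came from.
  attrs: { level: { default: 1 } },
  // Empty string: this mark excludes no marks, not even its own type, so two
  // emph marks with different `level` values can coexist on the same text.
  excludes: "",
  // Render as <em>, keeping the level so it can be read back when parsing.
  toDOM: (mark: { attrs: { level: number } }) =>
    ["em", { "data-level": String(mark.attrs.level) }, 0] as const,
};
```

Without `excludes: ""`, adding a second `emph` mark would simply replace the first one, even if the attributes differ.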
Even for a given document model – I’m focusing on the Pandoc AST now – you can imagine a bunch of slightly different Prosemirror schemas.
For example, how do you model a Pandoc `RawInline`? Since it’s an `Inline`, I first thought of a `Mark` in Prosemirror. I eventually decided on an atomic inline `Node` instead, providing a textual sub-editor in the GUI.
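As an illustration of the atomic inline `Node` option, a node spec along these lines would do. This is only a sketch: the node name, the attribute names `format` and `text` (mirroring the two fields of Pandoc’s `RawInline`), and the rendering as a `<code>` element are assumptions, and the sub-editor itself (which would live in a node view) is left out.

```typescript
// Plain object shaped like a NodeSpec; in a real editor it would be passed to
// `new Schema({ nodes: { ..., rawInline: rawInlineSpec } })` from prosemirror-model.
const rawInlineSpec = {
  inline: true,
  group: "inline",
  // Atom: the editor treats the node as a single unit with no editable content,
  // so the raw text has to be edited through a separate UI.
  atom: true,
  attrs: {
    format: { default: "html" }, // hypothetical name for RawInline's format field
    text: { default: "" },       // hypothetical name for RawInline's raw text
  },
  // Leaf node: the raw text is emitted as a plain text child, with no content hole.
  toDOM: (node: { attrs: { format: string; text: string } }) =>
    ["code", { class: "raw-inline", "data-format": node.attrs.format }, node.attrs.text] as const,
};
```

Storing the raw text in an attribute rather than as node content keeps it out of reach of marks and ordinary text editing, which is what you want for raw markup.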
Back to your question
I think it’s nevertheless possible to abstract some functions to help in the conversion between Prosemirror and Pandoc, or any format that is tree-like at inline level. The trickiest part of that job is solving the flat vs tree-like translation.
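A minimal sketch of the tree-to-flat direction, to show what I mean. The `Inline` type below is a simplified stand-in for Pandoc inlines (only three constructors, and not the actual `t`/`c` JSON encoding), and `Span` and `flatten` are hypothetical names, not part of Prosemirror or Pandoc.

```typescript
// Simplified stand-in for Pandoc inlines (assumption: only three constructors).
type Inline =
  | { t: "Str"; text: string }
  | { t: "Emph"; content: Inline[] }
  | { t: "Strong"; content: Inline[] };

// A flat run of text carrying the names of the marks that apply to it.
interface Span {
  text: string;
  marks: string[];
}

// Walk the inline tree depth-first, carrying the set of active marks down to
// the text leaves. The tree structure disappears; only the mark sets remain.
function flatten(inlines: Inline[], active: string[] = []): Span[] {
  const spans: Span[] = [];
  for (const inline of inlines) {
    if (inline.t === "Str") {
      spans.push({ text: inline.text, marks: [...active] });
    } else {
      const name = inline.t === "Emph" ? "emph" : "strong";
      // A mark is either set or not set, so a nested duplicate collapses here.
      const next = active.includes(name) ? active : [...active, name];
      spans.push(...flatten(inline.content, next));
    }
  }
  return spans;
}

// Emph containing "a", a Strong with "b", and a nested Emph with "c".
const sample: Inline[] = [
  {
    t: "Emph",
    content: [
      { t: "Str", text: "a" },
      { t: "Strong", content: [{ t: "Str", text: "b" }] },
      { t: "Emph", content: [{ t: "Str", text: "c" }] },
    ],
  },
];

console.log(flatten(sample));
```

In this sketch the nested `Emph` around "c" is lost, which is exactly the information you would need an extra attribute for. The opposite direction (rebuilding a tree from runs of marked text) is the harder half, because several different trees can produce the same flat result.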
Here I’m describing the path I followed, because I think it relates to your question:
- I started thinking of different Prosemirror-based editors for different models;
- for each one I wanted to provide an export function to Pandoc JSON, and through it an export to any format supported by Pandoc;
- that meant maintaining a bunch of editors sharing parts of their code and the ability to export to Pandoc JSON;
- eventually I opted for a single editor based on Pandoc JSON that can be configured to adapt to different models and workflows, that way becoming “multiple editors”;
- choosing Pandoc’s internal model is clearly a strong requirement, but you can use all its input and output formats, and you can even support further ones through custom readers and writers;
- the challenge I face is making a single editor become “multiple editors” without changing the editor’s code, only through configuration files or custom readers, writers and filters.