How would I go about creating a Wiki Parser?

We use a custom wiki format, namely Tiki, and I was hoping to get some guidance on how best to create a Wiki Parser for it. The Serializer part seems pretty straightforward (based on to_markdown.js), but the Parser part (from_markdown.js) has me thinking.

Should I:

1.) Use some existing wiki -> HTML parser and then convert that to a pm.doc using Schema.parseDOM()? This seems to be the easier way.

2.) Create my own parser? Not sure where to start here. I would probably need a tokenizer library, but markdown-it does not seem to accept custom blocks other than https://github.com/markdown-it/markdown-it-container. If there were some example of custom markdown to ProseMirror, that would help a lot :slight_smile:

Perhaps I have to extend markdown-it with plugins for my own syntax?

Thank you, Andre

If you have a parser that outputs some kind of token stream or AST, it’s probably not too hard to use that to emit a ProseMirror document. But failing that, yes, going through HTML might be the most effective approach (writing a parser from scratch for wiki-style languages tends to be a lot of work, since they tend to have messy and informal grammars).
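
Concretely, the HTML route can look something like this. This is only a minimal sketch, assuming a browser environment and some existing Tiki-to-HTML converter, called wikiToHtml here purely as a placeholder:

import {DOMParser as PMDOMParser} from 'prosemirror-model';

// wikiToHtml is a placeholder for whatever existing Tiki -> HTML
// converter you pick; schema is your ProseMirror schema.
function wikiToDoc(schema, wikiToHtml, wikiText) {
  const html = wikiToHtml(wikiText);
  // Let the browser turn the HTML string into a DOM tree
  // (window.DOMParser, not ProseMirror's, hence the import alias)...
  const dom = new window.DOMParser().parseFromString(html, 'text/html');
  // ...then build a ProseMirror document from it via the schema's
  // parseDOM rules.
  return PMDOMParser.fromSchema(schema).parse(dom.body);
}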


I just wrote an answer in another thread. Short version: Going through HTML worked well for me when converting DokuWiki and Confluence content, which was exported as a static website.

Thanks, I ended up using markdown-it with the 'zero' preset and my own custom markdown-it plugins for the WikiCode:

import markdownit from 'markdown-it';
// heading, link, autolink, emphasis, strong are the custom markdown-it
// plugins written for the WikiCode (imported from local modules).

// Start from the 'zero' preset (nothing enabled by default) and switch
// on only the rules and plugins the wiki format needs.
const md = markdownit('zero', {
  html: false,
})
  .enable(['blockquote', 'link', 'list'])
  .use(heading)
  .use(link)
  .use(autolink)
  .use(emphasis)
  .use(strong);
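
To get from those tokens to a ProseMirror document (the from_markdown.js side), the same markdown-it instance can be handed to prosemirror-markdown’s MarkdownParser together with a token-to-node mapping. A rough sketch, assuming a schema with the usual node and mark names; the exact token names depend on what the custom plugins emit, and every token type the tokenizer produces needs an entry or the parser will throw:

import {MarkdownParser} from 'prosemirror-markdown';

// schema is your ProseMirror schema, md is the instance configured above.
// Keys are markdown-it token names (without _open/_close); values say
// which node or mark they become. Adjust both sides to your own setup,
// and add entries for any custom tokens your WikiCode plugins emit.
const wikiParser = new MarkdownParser(schema, md, {
  paragraph: {block: 'paragraph'},
  blockquote: {block: 'blockquote'},
  bullet_list: {block: 'bullet_list'},
  ordered_list: {block: 'ordered_list'},
  list_item: {block: 'list_item'},
  heading: {block: 'heading', getAttrs: tok => ({level: +tok.tag.slice(1)})},
  em: {mark: 'em'},
  strong: {mark: 'strong'},
  link: {
    mark: 'link',
    getAttrs: tok => ({href: tok.attrGet('href'), title: tok.attrGet('title') || null}),
  },
});

// wikiText is the raw wiki source string.
const doc = wikiParser.parse(wikiText); // returns a ProseMirror document node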

I’ll post my code when I’m done, but right now I’m able to switch between CodeMirror and ProseMirror and the content converts seamlessly.
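
The switch between the two editors boils down to a parse/serialize round trip, roughly along these lines. This is only a sketch mirroring to_markdown.js: the node and mark handlers and the markup strings below are illustrative, need to match the real WikiCode syntax, and every node and mark type in the schema needs an entry:

import {MarkdownSerializer} from 'prosemirror-markdown';

// Illustrative serializer in the style of to_markdown.js: one function
// per node type, open/close strings per mark. The markup used here is
// just an example, not the real WikiCode rules.
const wikiSerializer = new MarkdownSerializer({
  paragraph(state, node) {
    state.renderInline(node);
    state.closeBlock(node);
  },
  heading(state, node) {
    state.write('!'.repeat(node.attrs.level) + ' ');
    state.renderInline(node);
    state.closeBlock(node);
  },
  text(state, node) {
    state.text(node.text);
  },
}, {
  em: {open: "''", close: "''", mixable: true, expelEnclosingWhitespace: true},
  strong: {open: '__', close: '__', mixable: true, expelEnclosingWhitespace: true},
});

// pmDoc is the current ProseMirror document; wikiParser is the
// MarkdownParser built from the markdown-it instance above.
const wikiText = wikiSerializer.serialize(pmDoc); // ProseMirror -> CodeMirror
const pmDocAgain = wikiParser.parse(wikiText);    // CodeMirror -> ProseMirror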