Position of hard break in markdown just after `em` or `strong` node

klaftertief · November 20, 2018, 1:41pm

We are running a WYSIWYM text editor based mostly on the example setup. The content is serialised to markdown with the `prosemirror-markdown` package in its default configuration. When the user inserts a hard break directly after strong or emphasised text, the generated markdown looks like the following, which cannot be parsed back correctly.

Foo **Bar\
** Baz

You can see the same behaviour in https://prosemirror.net/examples/markdown/

I’m wondering if this is a bug or expected behaviour.

I suspect this is expected, since the JSON of the fragment looks like this (note the `strong` mark on the `hard_break` node).

[{"type":"text","text":"Foo "},{"type":"text","marks":[{"type":"strong"}],"text":"Bar"},{"type":"hard_break","marks":[{"type":"strong"}]},{"type":"text","text":" Baz"}]

I played around with some workarounds, but all had their drawbacks.

- Setting `inclusive: false` on `marks.strong` and `marks.em` in the schema fixed the rendering, but then one could not keep writing marked text.
- Disallowing marks on `nodes.hard_break` resulted in an additional line break when typing in the editor, and the markdown could not be parsed back.
- I thought about creating a specialised keymap that exits the marked node, creates a break and opens a marked node again, but I have not actually tried it.
- At the moment I’m subclassing `MarkdownSerializerState` and overriding `renderInline` so that all `strong` and `em` marks are removed from `hard_break` nodes. This fixes the problem, but does not feel right: the method does a lot, and I’m only adding a small piece of specialised behaviour.
- I have a working regex to post-process the serialised markdown, but this is of course only a last resort.
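To illustrate the last-resort option, a post-processing step along these lines might look like the sketch below. It is not the regex referred to above, which is not shown in this thread. The helper name `moveBreakAfterClosingDelimiter` is hypothetical, and the sketch assumes that a hard break is serialised as a backslash followed by a newline and that a delimiter run followed by whitespace is a closing one.

```typescript
// Hypothetical helper, not part of prosemirror-markdown.
// Assumptions: a hard break is written as "\" + newline, and a run of
// "*" or "_" that is followed by whitespace (or end of input) closes a mark.
function moveBreakAfterClosingDelimiter(markdown: string): string {
  // Capture the delimiter run that directly follows the break and
  // re-emit it before the break instead.
  return markdown.replace(/\\\n([*_]{1,3})(?=\s|$)/g, "$1\\\n");
}

// The broken output from the first post: the closing "**" sits after the break.
const brokenMarkdown = "Foo **Bar\\\n** Baz";
const repairedMarkdown = moveBreakAfterClosingDelimiter(brokenMarkdown);
// repairedMarkdown is now "Foo **Bar**" + "\" + newline + " Baz"

// An opening delimiter after a break is followed by text, so it is left alone.
const untouchedMarkdown = moveBreakAfterClosingDelimiter("Foo\\\n**Bar** Baz");
```

The lookahead is what separates closing from opening delimiters, so a break followed directly by unmarked text with no leading space would not be repaired. That kind of ambiguity is one reason a regex is only a stopgap here.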

So I’m not sure if this should be fixed in user land or if this is something that should be handled in the markdown plugin, even if it might not be a real bug. Have you experienced the same? Do you know of other solutions?

marijn · November 20, 2018, 1:59pm

Hm, yeah, Markdown is a pretty terrible target format because of corner cases like these. I think it’d make sense to modify the default serializer to always render hard breaks without any marks on them, which should work around stuff like this. I think that’s a relatively small adjustment to the current code. Do you want to take a stab at implementing it?
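The idea of treating hard breaks as unmarked can be sketched on the JSON form of the fragment from the first post. This is only an illustration on plain data, not the serializer change itself: the `NodeJSON` and `MarkJSON` types and the `stripHardBreakMarks` function are hypothetical names, not part of `prosemirror-model` or `prosemirror-markdown`.

```typescript
// Illustrative shapes for the JSON shown earlier in the thread
// (assumed for this sketch; these are not the prosemirror-model classes).
interface MarkJSON { type: string }
interface NodeJSON {
  type: string;
  text?: string;
  marks?: MarkJSON[];
  content?: NodeJSON[];
}

// Return a copy of the node tree in which hard_break nodes carry no marks.
// The input is left unmodified.
function stripHardBreakMarks(node: NodeJSON): NodeJSON {
  const copy: NodeJSON = { ...node };
  if (node.type === "hard_break") {
    delete copy.marks;
  }
  if (node.content) {
    copy.content = node.content.map((child) => stripHardBreakMarks(child));
  }
  return copy;
}

// The fragment from the first post.
const fragment: NodeJSON[] = [
  { type: "text", text: "Foo " },
  { type: "text", marks: [{ type: "strong" }], text: "Bar" },
  { type: "hard_break", marks: [{ type: "strong" }] },
  { type: "text", text: " Baz" },
];

const cleanedFragment = fragment.map((node) => stripHardBreakMarks(node));
```

With the mark gone from the `hard_break`, the `strong` range ends at "Bar", so a serializer walking this data would presumably close `**` before writing the break. Doing the equivalent inside the serializer keeps the document itself unchanged, which is the point of the suggestion above.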

klaftertief · November 20, 2018, 2:15pm

Sure, thanks for the quick reply and the confirmation.