Mark extension splits codeblock and links

I have created a custom mark extension, but when I apply it to a paragraph containing links, it splits the paragraph at the link into multiple marks. How do I group everything in a single <mark />?

For example:

This is an example link paragraph

on applying mark:

This is an example link paragraph

expected:

This is an examplelink paragraph

I am not using the ProseMirror API directly, but a library built on it called “tiptap”.

It’s a consequence of the ProseMirror model, see here.

It’s up to you to join adjacent Mark ranges into one when you save the content in an output format (e.g. HTML).

You should iterate over the text nodes of a paragraph, read their Marks, and store them in a temporary structure together with their range.

Something like this (I have not tested it):

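import { Mark } from 'prosemirror-model'

interface MarkRange {
  mark: Mark
  from: number
  to: number
}

const markranges: MarkRange[] = []

// paragraphNode is assumed to be the paragraph's ProseMirror Node.
// Positions passed to the callback are relative to the start of its content.
paragraphNode.descendants((node, pos) => {
  const from = pos
  // nodeSize, not content.size: a text node has no content, so content.size is 0
  const to = pos + node.nodeSize
  node.marks.forEach((mark) => {
    markranges.push({ mark, from, to })
  })
})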

Then you can:

  • sort them by mark and start position (from)
  • reduce them to join adjacent marks of the same type into a single range
  • sort them again by start position (from)
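The three steps above could be sketched roughly as follows. This is only an illustration (not tested inside an editor): KeyedRange and joinAdjacentRanges are hypothetical names, and the string key is an assumption standing in for "mark type plus attributes", which you would have to derive from each Mark yourself.

```typescript
// Hypothetical shape: `key` stands in for "mark type + attributes".
interface KeyedRange {
  key: string
  from: number
  to: number
}

function joinAdjacentRanges(ranges: KeyedRange[]): KeyedRange[] {
  // 1. Sort by key, then by start position (on a copy, the input is untouched).
  const sorted = [...ranges].sort((a, b) =>
    a.key < b.key ? -1 : a.key > b.key ? 1 : a.from - b.from
  )

  // 2. Join ranges with the same key that touch or overlap.
  const joined: KeyedRange[] = []
  for (const range of sorted) {
    const last = joined[joined.length - 1]
    if (last && last.key === range.key && range.from <= last.to) {
      last.to = Math.max(last.to, range.to)
    } else {
      joined.push({ ...range })
    }
  }

  // 3. Sort again by start position, ready for output.
  return joined.sort((a, b) => a.from - b.from || a.to - b.to)
}
```

With a highlight over a whole paragraph that contains a link, the three highlight pieces collapse into one range and the link range stays as it is.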

There are some things you should take care of:

  • not all the paragraph content is made of text nodes
  • Mark.eq() compares both type and attributes, so two Links with a different URL are considered different marks (this is important when you sort Marks); if such Links must be able to coexist on the same text, you should set the excludes field of the Link mark spec to an empty string

Okay, leave adjacent marks aside. What about code blocks? These are also being split in a similar fashion. Does your code handle that as well?

The idea for this mark is to wrap/group all the underlying HTML with a parent tag. I understand that marks do not work this way; I am okay with using a custom extension if possible.

So you need a sort of highlighter that starts e.g. in the middle of a paragraph and ends in the middle of another one, with some blocks in between; is it like that?

Something like this:

<p>A paragraph <start-highlight/>blah blah...</p>
<div>
...
</div>
<p>Another paragraph...</p>
<p>Another<stop-highlight/> paragraph.</p>

The content you want to mark would be between <start-highlight/> and <stop-highlight/>.

Correct. Similar to comments in Google Docs. It does not matter which blocks come in between (code blocks, inline code, whatever).

If you used a <mark> tag, it would be split anyway in HTML:

<p>A paragraph <mark>blah blah...</mark></p>
<div>
...
</div>
<p><mark>Another paragraph...</mark></p>
<p><mark>Another</mark> paragraph.</p>

To avoid splitting tags, I’d use point markers, i.e. empty tags, like the <start-highlight/> and <stop-highlight/> in my previous message.

In ProseMirror, you may use leaf/atom nodes without content, track their positions and call doc.slice() to get the content between them.

You may attach attributes (e.g. the text of a comment) to the opening marker.

To highlight what’s between them, you can use inline decorations.

Those decorations will cover the same ranges as the <mark>...</mark> pairs in the example above, but they won’t be part of your document, so you don’t need to join the ranges before saving to an output format, where only the opening and closing marker nodes will appear.
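A rough sketch of the position tracking, under some assumptions: the node type names "highlightStart" and "highlightStop" are hypothetical, findHighlightRanges is a made-up helper, and NodeLike only mirrors the members of ProseMirror's Node that the sketch relies on (type.name, nodeSize, descendants), so it can be tried without an editor. I have not run it inside tiptap.

```typescript
// Minimal structural view of a node: only the members used below.
interface NodeLike {
  type: { name: string }
  nodeSize: number
  descendants(f: (node: NodeLike, pos: number) => boolean | void): void
}

interface HighlightRange {
  from: number
  to: number
}

// "highlightStart" / "highlightStop" are hypothetical marker node names.
function findHighlightRanges(
  doc: NodeLike,
  startName = "highlightStart",
  stopName = "highlightStop"
): HighlightRange[] {
  const ranges: HighlightRange[] = []
  let openFrom: number | null = null

  doc.descendants((node, pos) => {
    if (node.type.name === startName) {
      // The highlighted content begins right after the opening marker.
      openFrom = pos + node.nodeSize
    } else if (node.type.name === stopName && openFrom !== null) {
      // ...and ends right before the closing marker.
      ranges.push({ from: openFrom, to: pos })
      openFrom = null
    }
  })

  return ranges
}
```

Each resulting { from, to } pair can then be passed to doc.slice(from, to) to read the content, or to Decoration.inline(from, to, { class: "highlight" }) from prosemirror-view to draw the highlight (the class name is just an example).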