Underline ParseRule that excludes Links

bhl · November 29, 2020, 5:15am

My current underline mark schema is

  get schema(): MarkSpec {
    return {
      parseDOM: [{tag: 'u'}, {style: 'text-decoration=underline'}],
      toDOM() { return ['u', 0] },
    }
  }

One issue I found with this schema, by inspecting the editor's HTML after pasting a link, is that an underline mark also gets created around the link.

This is because most links have a text-decoration: underline style applied to them. Is there a way of constructing a ParseRule that still checks for text-decoration=underline but excludes anchor (link) tags?

marijn · November 29, 2020, 1:13pm

I think you should be able to add a getAttrs function that returns false for elements where the mark shouldn’t apply.

bhl · November 30, 2020, 4:27am

If I use a ParseRule with style, getAttrs only gets the style’s value as the argument so I wouldn’t be able to detect if the underline came from an anchor or not.

If I use a ParseRule with tag, I would be able to get the html element as an argument, but what tag would I use? Do I use all inline html nodes, or something as simple as “span” only?

With the current ParseRule api with style or tag, how would I say “text-decoration=underline but not in an anchor tag”? Is there an inverse operator like {tag: 'not(a)', style: 'text-decoration=underline'} that can combine both?

marijn · November 30, 2020, 6:42am

Oh, right, getAttrs is only passed the style value in this case, that’s indeed no use. You could use tag: "*" with a low priority as a kludge, I suppose.
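
A rough sketch of what that kludge could look like, not tested against a real editor. The getAttrs callback of a tag rule receives the element, so it can reject anchors itself and return false when there is no underline. `ElementLike`, `ParseRuleLike` and `underlineUnlessAnchor` are made-up names standing in for HTMLElement and ParseRule so the snippet runs without a DOM, and the priority of 10 is an arbitrary value below the default of 50.

```typescript
// Hypothetical stand-in for the parts of HTMLElement the rule reads.
interface ElementLike {
  nodeName: string;
  style: { textDecoration?: string; textDecorationLine?: string };
}

// Hypothetical, simplified stand-in for ProseMirror's ParseRule.
interface ParseRuleLike {
  tag: string;
  priority?: number;
  getAttrs?: (dom: ElementLike) => null | false;
}

// getAttrs returns false to reject an element and null to accept it
// with default attributes.
function underlineUnlessAnchor(dom: ElementLike): null | false {
  if (dom.nodeName.toUpperCase() === "A") return false; // never underline links
  const decoration = `${dom.style.textDecoration ?? ""} ${dom.style.textDecorationLine ?? ""}`;
  return /\bunderline\b/.test(decoration) ? null : false;
}

// Rules without an explicit priority count as 50, so 10 is tried after them.
const underlineParseRules: ParseRuleLike[] = [
  { tag: "u" },
  { tag: "*", priority: 10, getAttrs: underlineUnlessAnchor },
];
```

The low priority is there so that more specific tag rules get the first chance at an element, since a matching tag rule stops later ones from being tried.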

bhl · December 1, 2020, 12:30am

Not sure why I didn’t do this earlier, but I dug into the prosemirror-model source code and realized the tag value is just a CSS selector. So to negate anchor tags, we can just use :not(a).

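{
  tag: ':not(a)',
  // getAttrs must return false to reject the element; null accepts it with default attrs
  getAttrs: dom =>
    (dom.style.textDecoration.includes("underline") ||
     dom.style.textDecorationLine.includes("underline")) ? null : false
}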

One thing I need to double-check, though: we don't need to account for priority with the tag selector when applying marks, right (assuming no conflicts between the marks themselves)? Are all matching tag rules applied, not only the first one found?

marijn · December 1, 2020, 8:27am

Only the first tag rule for a given element will be applied (multiple style rules may match, if they match different style properties), so there is a difference between the old rule and this one in that it’ll ‘consume’ the styled element, and it won’t apply if a higher-precedence rule matches the tag (i.e. <blockquote style="text-decoration: underline"> will just be a blockquote, and not add the mark).
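
To make the "consuming" behaviour concrete, here is a toy model of it. This is not prosemirror-model's actual code (the real parser matches DOM elements against CSS selectors); `ToyElement`, `ToyRule`, and `firstMatch` are made up for illustration. Rules are tried in precedence order and only the first one that matches is applied.

```typescript
// Toy model only: a simplified element and rule shape, invented for this sketch.
interface ToyElement { nodeName: string; underlined: boolean }
interface ToyRule { name: string; matches: (el: ToyElement) => boolean }

// Rules in precedence order: blockquote comes before the generic underline rule.
const toyRules: ToyRule[] = [
  { name: "blockquote", matches: el => el.nodeName === "BLOCKQUOTE" },
  { name: "underline", matches: el => el.nodeName !== "A" && el.underlined },
];

// Only the first matching rule is applied; later rules never see the element.
function firstMatch(el: ToyElement, rules: ToyRule[]): string | null {
  for (const rule of rules) {
    if (rule.matches(el)) return rule.name;
  }
  return null;
}

const quoteResult = firstMatch({ nodeName: "BLOCKQUOTE", underlined: true }, toyRules); // "blockquote"
const spanResult = firstMatch({ nodeName: "SPAN", underlined: true }, toyRules); // "underline"
```

In this model the underlined blockquote never reaches the underline rule, which mirrors the blockquote example above.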

bhl · December 1, 2020, 10:40am

Hmm, why is this the case when using ParseRule and tag for marks?

I understand the reasoning for a node ParseRule consuming the element, since a DOM node can only map to one node type; but there isn't the same restriction on applying multiple mark types to a DOM node.

Would it be possible to modify prosemirror-model to allow for this behavior? I.e. node parseRules are consumptive and mark parseRules are not, rather than the current behavior where tag rules are consumptive and style rules are not.

I’m thinking the changes would be centered around addDOM:

  addDOM(dom) {
    if (dom.nodeType == 3) {
      this.addTextNode(dom)
    } else if (dom.nodeType == 1) {
      let style = dom.getAttribute("style")
      // parse not only mark rules (rule.mark != null) with styles, but also marks with tags here
      let marks = style ? this.readStyles(parseStyles(style)) : null, top = this.top
      if (marks != null) for (let i = 0; i < marks.length; i++) this.addPendingMark(marks[i])
      this.addElement(dom) // remove mark parsing rules here to avoid re-parsing?
      if (marks != null) for (let i = 0; i < marks.length; i++) this.removePendingMark(marks[i], top)
    }
  }

Two edge cases I’ve thought of with this are:

1. Are parseRules with style and node defined, but with rule.mark = null, allowed? This question came from looking at readStyles and seeing that only marks are returned and that rule.mark is used to index into schema.marks.
2. Ensuring backwards compatibility, i.e. keeping mark parseRules with tag attributes consumptive, may require an additional boolean such as returnOnFirstMatch, defaulting to true.

marijn · December 2, 2020, 10:57am

> Hmm, why is this the case when using ParseRule and tag for marks?

Because there, too, matching the same node to multiple rules might be nonsense (if the rule uses the node name to create a mark, it’d be weird to use it again in another rule to create a node).

Would an opt-in consuming: false flag on parse rules help you here?

bhl · December 3, 2020, 12:00am

> if the rule uses the node name to create a mark, it’d be weird to use it again in another rule to create a node

Ah, that’s a use case I didn’t account for: I was only thinking about parseRules in terms of model output and not html dom inputs.

> Would an opt-in consuming: false flag on parse rules help you here?

Yes! This would allow me to use tags for mark parseRules, and help with the text-decoration: underline scenario.

marijn · December 3, 2020, 8:32am

I’ve added an RFC for this to allow community feedback, will follow up in a week.

marijn · December 11, 2020, 9:01am

This has now been implemented in prosemirror-model.
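
For later readers, a sketch of how the underline rules from the start of the thread might look with the new flag. The `consuming: false` option is the one discussed above. The local `ElementLike` and `ParseRuleLike` types and the `hasUnderline` helper are simplified stand-ins made up so the snippet runs without prosemirror-model or a DOM. In a real schema these rules would go in the mark's parseDOM, and the exact semantics are worth checking against the prosemirror-model reference.

```typescript
// Hypothetical stand-ins for HTMLElement and ProseMirror's ParseRule.
interface ElementLike {
  style: { textDecoration?: string; textDecorationLine?: string };
}
interface ParseRuleLike {
  tag?: string;
  style?: string;
  consuming?: boolean;
  getAttrs?: (dom: ElementLike) => null | false;
}

// True when the element's inline style asks for an underline.
const hasUnderline = (dom: ElementLike): boolean =>
  /\bunderline\b/.test(`${dom.style.textDecoration ?? ""} ${dom.style.textDecorationLine ?? ""}`);

const underlineParseDOM: ParseRuleLike[] = [
  { tag: "u" },
  {
    tag: ":not(a)", // any element except anchors
    consuming: false, // rules that come after this one may still match the element
    getAttrs: dom => (hasUnderline(dom) ? null : false), // false means "no match"
  },
];
```

With the flag set, the styled element is no longer swallowed by the underline rule, which was the drawback of the plain `:not(a)` rule noted earlier in the thread.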