DOM parsing and .getAttrs

I just stumbled upon the following mark declaration in schema-basic. It took me a bit to understand what’s actually going on there, so I’m writing it down here in the hope that it helps others (and as a future reference for myself). ProseMirror pros can safely skip this post.

  em: {
    parseDOM: [{tag: "i"}, {tag: "em"},
               {style: "font-style", getAttrs: value => value == "italic" && null}],
    toDOM() { return ["em"] }
  },

The weird thing is the getAttrs definition. First of all, it’s a bit hard to read and to figure out what it actually does. Transforming the ES6 arrow function into an old-school function helps a little:

function (value) {
  return value == "italic" && null;
}

So what is it good for? Whenever HTML needs to be converted into ProseMirror’s internal document representation, the parse rules of all nodes and marks are checked. For emphasized text, there are three rules. The first two are straightforward: an <i> or <em> tag is converted to an em mark. The third rule matches all elements that have a font-style CSS property and then applies the getAttrs() function to that property’s value.

The getAttrs() function returns false for every value except "italic". For "italic" it returns null. Why is that? Because getAttrs() does two jobs:

  1. it works as an additional filter to decide if the rule matches - it returns false if it doesn’t
  2. it modifies the set of attributes of the mark, by returning an object

This explains why it returns false for everything but "italic": we only want elements with font-style: italic to be recognized as an em mark.
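To see the filter at work, here is a small sketch that spells the one-liner out with explicit branches and compares the two. The names getAttrsShort and getAttrsExplicit are made up for this post and are not part of ProseMirror.

```javascript
// The one-liner from the em parse rule, given a name for illustration.
const getAttrsShort = value => value == "italic" && null;

// The same logic with explicit branches (hypothetical rewrite for readability).
function getAttrsExplicit(value) {
  if (value == "italic") {
    return null;  // rule matches, no attributes to set
  }
  return false;   // rule does not match
}

// Both versions agree on every input we try.
for (const value of ["italic", "normal", "oblique"]) {
  console.log(value, getAttrsShort(value), getAttrsExplicit(value));
}
```

The short form works because && returns its left operand when that operand is falsy (here false) and its right operand (here null) otherwise.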

But why null? The documentation gives a hint:

When it returns null or undefined, that is interpreted as an empty/default set of attributes.

What set of attributes? Well, em marks simply do not have any attributes. If they had, they would be defined in the attrs property.
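As a rough illustration of how a parser could interpret the three kinds of return value, here is a sketch. The applyRule helper and its defaultAttrs parameter are hypothetical and only mirror the documented contract; this is not ProseMirror’s actual implementation.

```javascript
// Hypothetical helper, NOT ProseMirror's real code: it only mirrors the
// documented meaning of getAttrs return values for a single parse rule.
function applyRule(rule, input, defaultAttrs = {}) {
  const result = rule.getAttrs ? rule.getAttrs(input) : null;
  if (result === false) {
    return { matched: false };  // false: the rule does not match
  }
  // null or undefined: the rule matches, use the empty/default attributes.
  // An object: the rule matches, use that object as the attributes.
  return { matched: true, attrs: result || defaultAttrs };
}

const italicRule = {
  style: "font-style",
  getAttrs: value => value == "italic" && null,
};

console.log(applyRule(italicRule, "italic")); // { matched: true, attrs: {} }
console.log(applyRule(italicRule, "normal")); // { matched: false }
```

Since em declares no attrs, the default set is simply empty, which is why returning null is enough.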

An example of a mark with attributes can be seen further down in the same file: the link mark.
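For comparison, here is a sketch of a getAttrs that returns an object, modeled on the link mark’s parse rule as I remember it from schema-basic (check the file for the exact code). The fakeAnchor object is a made-up stand-in for a DOM element so that the snippet runs without a browser.

```javascript
// Parse rule modeled on the link mark in schema-basic (reproduced from memory,
// so treat the details as an assumption).
const linkRule = {
  tag: "a[href]",
  getAttrs(dom) {
    // Returning an object sets the attributes of the resulting mark.
    return {
      href: dom.getAttribute("href"),
      title: dom.getAttribute("title"),
    };
  },
};

// Hypothetical stand-in for an <a href="https://example.com"> element.
const fakeAnchor = {
  attributes: { href: "https://example.com" },
  getAttribute(name) {
    return name in this.attributes ? this.attributes[name] : null;
  },
};

const linkAttrs = linkRule.getAttrs(fakeAnchor);
console.log(linkAttrs);
```

Here the tag selector already does the filtering, so getAttrs never needs to return false and is only used for its second job.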

Mystery solved.


Your description is correct. Apologies for the confusion, I just can’t resist good one-liners.
