Better TypeScript types for ParseRule?

ParseRule currently has the following type:

interface ParseRule {
  /// A single rule should have _either_ a `tag` or a `style` property.
  tag?: string
  style?: string
  /// Called with a DOM Element for `tag` rules, and with a string (the
  /// style's value) for `style` rules.
  getAttrs?: (node: HTMLElement | string) => Attrs | false | null
  // ... other fields
}

Now if I have a ParseRule like

const rule: ParseRule = {
  tag: 'div',
  getAttrs(node: HTMLElement) { // type error
    return { class: node.className }
  },
  // ... other fields
}

which is perfectly fine at runtime, getAttrs gets incorrectly marked as a type error by TypeScript compiler.

TypeScript compiler is happy with the much more verbose version:

const rule: ParseRule = {
  tag: 'div',
  getAttrs(node /* inferred as (HTMLElement | string) */) {
    if (typeof node === 'string') throw new Error('IMPOSSIBLE')
    return { class: node.className }
  },
  // ... other fields
}

But writing something like this every time is cumbersome.

ProseMirror/prosemirror-schema-basic bypasses this problem by casting an object blindly into NodeSpec or MarkSpec, which works because

(node: HTMLElement) => Attrs | false | null

is a superset of

(node: HTMLElement | string) => Attrs | false | null

(not the other way around!)

I think it is not right to force users to write needlessly verbose code or fallback to type assertions to work around the types. Two solutions come to my mind:

Make ParseRule a union type

This makes the dependency of getAttrs on whether tag or style is present clear. Among other benefits, it also ensures that one and only one of tag or style exists on the type level.

type ParseRule = (
  | {
    tag: string
    namespace?: string
    getAttrs: (node: HTMLElement) => Attrs | false | null
  }
  | {
    style: string
    getAttrs: (node: string) => Attrs | false | null
  }
) & {
  priority?: number
  // ... other fields
}

The downside is that the type looks complex. Converting an interface to a type alias also means that other code cannot extend the interface with additional fields anymore, making it technically a breaking change in types, but I don’t think anybody should be augmenting ParseRule anyway.

Tweak the type a bit for ease of use

Since a getAttrs function should accept either HTMLElement or string but not both, it is reasonable to retype getAttrs as

getAttrs?: ((node: HTMLElement) => Attrs | false | null)
  | ((node: string) => Attrs | false | null)

This does not prevent errors like expecting an element with style rules, but is a smaller and more compatible change to the current interface type. prosemirror-model needs to be modified accordingly to cast getAttrs wherever it is called; the cast is unavoidable, because it is impossible to differentiate function types at runtime. Fortunately, there are only two call sites in matchTag and matchStyle.

I’d love to contribute a PR to fix the problem, but I am not sure about which solution would work better.

I tried the union approach and it does clean some things up, but it will cause code that currently type checks to stop typechecking, because in a context where the type checker knows you are typing a parse rule, you can currently leave the argument to getAttrs to have its type inferred. But as soon as ParseRule becomes a union (and the same issue exists when only getAttrs becomes a union), that type cannot be inferred and this becomes a type error.

Unfortunately, TypeScript seems unable to infer which element type of a union we are picking from sibling properties (see this simplified example).

The breakage is only on the type level, so though it might break people’s builds it won’t break people’s running code, but I’m still not sure I want to inflict that on users.

Oops. My first intuition is that, the inference in the simplified example fails because TypeScript does not know if propB exists even if propA exists.

type T = {propA: string, pred: (a: string) => string}
  | {propB: string, pred: (a: number) => string}

let x: T = {propA: "x", propB: "x", pred: x /* rightfully any */ => x + ""}

It is possible to provide good inference by explicitly saying that the other property does not exist.

type T = {propA: string, propB?: undefined, pred: (a: string) => string}
  | {propB: string, pred: (a: number) => string}
  
let a: T = {propA: "x", pred: x => x} // type checks
let b: T = {propB: "x", pred: x => x.toString()} // type checks
let c: T = {propA: "x", propB: "x", pred: x => x.toString()} // error

After some tinkering, I find that TypeScript is currently designed to provide best support for a union type discriminated by a literal field. The reason why the example above works is that propB is acting as the discriminator, where undefined is the literal triggering the TypeScript compiler to think so. Unfortunately, given the current status of discriminator types in TypeScript, I’m not aware of a solution that feels less hacky. The intent is clear, at least.

Oh, that’s clever. Indeed, adding a dummy tag property to StyleParseRule makes this issue go away. I’ve released prosemirror-model 1.20.0 with these types. We’ll see whether it causes any further fallout.

1 Like