Conversion to and from Markdown standard library

The Markdown standard library in Julia provides an alternative representation of the Markdown AST. In particular, its parser and AST are used internally by Julia for docstrings, and also by some of the tooling in the Julia ecosystem that deals with Markdown.

MarkdownAST supports bi-directional conversion between the two AST representations via the convert function. The conversion is not lossless, however, because the two libraries differ both in what they represent and in how they represent it.

Conversion from standard library representation

Any AST produced by the Markdown standard library parser should convert into a MarkdownAST AST. However, because the data structures for the standard library Markdown elements are quite loose in what they allow, converting a user-crafted Markdown AST may throw an error if the AST does not exactly follow the conventions of the Markdown parser.

Base.convert (Method)
convert(::Type{Node}, md::Markdown.MD) -> Node
convert(::Type{Node{M}}, md::Markdown.MD, meta=M) where M -> Node{M}

Converts a standard library Markdown AST into MarkdownAST representation.

Note that it is not possible to convert subtrees, as only MD objects can be converted. The result will be a tree with Document as the root element.

When the type argument passed is Node, the resulting tree will be constructed of objects of the default node type Node{Nothing}. However, it is also possible to convert into MarkdownAST trees that have a custom metadata field of type M, in which case the M type must have a zero-argument constructor available, which will be called whenever a new Node object gets constructed.

It is also possible to use a custom function to construct the .meta objects via the meta argument, which must be a callable object with a zero-argument method. It gets called every time a new node is constructed.
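The three calling conventions described above can be sketched as follows. This is a minimal illustration based on the signatures listed in the docstring; the input string, the Dict{Symbol,Any} and Vector{String} metadata types, and the variable names are arbitrary choices made for the example.

```julia
using Markdown
using MarkdownAST: MarkdownAST, Node

# Parse some Markdown with the standard library parser.
md = Markdown.parse("Hello **world**!")

# Default node type: the tree is built from Node{Nothing} objects.
ast = convert(Node, md)

# Custom metadata type: Dict{Symbol,Any} has a zero-argument constructor,
# which is called for every node that gets created.
ast_dict = convert(Node{Dict{Symbol,Any}}, md)

# Custom meta function: a zero-argument callable that returns the .meta value.
ast_tags = convert(Node{Vector{String}}, md, () -> String["converted"])

# The root of the converted tree is a Document element.
println(typeof(ast.element))
println(typeof(ast_dict.meta))
println(ast_tags.meta)
```

Since the meta callable runs once per node, each node should receive its own freshly constructed metadata object rather than a shared one.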

Due to the differences between the Markdown representations, the following things should be kept in mind when converting from the standard library AST into MarkdownAST representation:

  • The standard library parser does not have a dedicated type for representing backslashes, and instead stores them as separate single-character text nodes containing a backslash (i.e. "\\").
  • Soft line breaks are ignored in the standard library parser and represented with a space instead.
  • Strings (or Markdown elements) interpolated into the standard library Markdown (e.g. in docstrings or with the @md_str macro) are indistinguishable from text (or corresponding Markdown) nodes in the standard library AST, and therefore will not be converted into JuliaValues.
  • The standard library allows for block-level interpolation. These get converted into inline JuliaValues wrapped in a Paragraph element.
  • In case the standard library AST contains any inline nodes in block context (e.g. as children of Markdown.MD), they get wrapped in a Paragraph element too.
  • When converting a Markdown.Table, the resulting table will be normalized, e.g. by adding empty cells to rows so that all rows have the same number of cells.
  • For links and images, the .title attribute is set to an empty string, since the standard library AST does not support parsing titles.
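A couple of the points above can be observed directly. The sketch below assumes the behavior described in the list (a soft line break turning into a space, and an empty link title); the input text and variable names are made up for the example.

```julia
using Markdown
using MarkdownAST: MarkdownAST, Node

# The input contains a soft line break (a single newline) and a link.
md = Markdown.parse("first line\nsecond line with a [link](https://example.com)")
ast = convert(Node, md)

# The only block in the document is a paragraph.
para = first(ast.children)

# Print the inline elements; per the list above, no SoftBreak is expected here.
for child in para.children
    println(child.element)
end

# The last inline element of the paragraph is the link.
link = last(collect(para.children)).element
println(repr(link.title))
```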

Conversion to standard library representation

Any AST that contains only the native MarkdownAST elements can be converted into the standard library representation. The conversion of user-defined elements, however, is not supported and will lead to an error.

Base.convert (Method)
convert(::Type{Markdown.MD}, node::Node) -> Markdown.MD

Converts a MarkdownAST representation of a Markdown document into the Markdown standard library representation.

Note that the root node passed as node must be a Document element.
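As a minimal sketch of this direction, the example below builds a small tree with the @ast macro from MarkdownAST (which treats string literals as Text elements) and converts it. The document content is invented for illustration.

```julia
using Markdown
using MarkdownAST: @ast, Document, Paragraph, Strong

# Build a small MarkdownAST tree; the root is a Document element.
ast = @ast Document() do
    Paragraph() do
        "Hello "
        Strong() do
            "world"
        end
    end
end

# Convert into the standard library representation.
md = convert(Markdown.MD, ast)

# Render it back to Markdown text with the standard library.
println(Markdown.plain(md))
```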

Due to the differences between the Markdown representations, the following things should be kept in mind when converting from the MarkdownAST representation into the standard library AST:

  • The value from a JuliaValue element (i.e. .ref) gets stored directly in the AST (just like variable interpolation with docstrings and the @md_str macro). This means that, for example, an interpolated string or Markdown element would become a valid AST element, losing the information that it used to be interpolated.
  • The expression information in a JuliaValue element (i.e. the .ex field) gets discarded.
  • The .title attribute of Link and Image elements gets discarded.
  • The standard library does not support storing the child nodes of Image elements (i.e. the "alt text" in ![alt text]()) as AST, so they are instead reduced to a string with the help of the Markdown.plain function.
  • The standard library AST does not have dedicated elements for SoftBreak and Backslash, and these get converted into strings (i.e. text elements) instead.
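The loss of the .title attribute can be illustrated with a link. This is a sketch under the assumption that the tree is built with the @ast macro; the URL, title and link text are placeholders.

```julia
using Markdown
using MarkdownAST: @ast, Document, Paragraph, Link

# A link with a non-empty title in the MarkdownAST representation.
ast = @ast Document() do
    Paragraph() do
        Link("https://example.com", "A title") do
            "link text"
        end
    end
end

md = convert(Markdown.MD, ast)

# The first inline element of the first paragraph is the converted link.
stdlib_link = first(first(md.content).content)

# Markdown.Link only has text and url fields, so the title has nowhere to go.
println(fieldnames(typeof(stdlib_link)))
println(stdlib_link.url)
```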
