Tree node interface

The Markdown abstract syntax tree (AST) is a tree of Markdown elements. In order to avoid type instabilities when performing basic operations on a tree, such as traversin it, it is implemented by linking together instances of the Node type. Each Node instance functions as a container for some AbstractElement.

The Node type has various properties that can be used to access information about the structure of the tree, but it is generally not possible to set them directly. Changing the structure of a tree (e.g. to adding child nodes), should be done with the help of the various functions and methods to MarkdownAST provides for mutating the tree.

MarkdownAST.NodeType
mutable struct Node{M}

Implements a linked list type representation of a Markdown abstract syntax tree, where each node contains pointers to the children and parent nodes, to make it possible to easily traverse the whole tree in any direction. Each node also contains an "element", which is an instance of some AbstractElement subtype, and can be accessed via the .element property. The element object contains the semantic information about the node (e.g. wheter it is a list or a paragraph).

Optionally, each node can also store additional meta information, which will be an object of type M (see also the .meta property). By default, the node does not contain any extra meta information and M = Nothing.

Constructors

Node(element :: AbstractElement)

Constructs a simple standalone node (not part of any tree) without any additional metadata (M = Nothing) containing the Markdown AST element c.

Node{M}(element :: AbstractElement, meta :: M)

Constructs a simple standalone node (not part of any tree) with the meta information meta, containing the Markdown AST element c.

Extended help

There are various properties that can be used to access the details of a node. Many of them can not be set directly though, as that could lead to an inconsistent tree. Similarly, the underlying fields of the struct should not be accessed directly.

  • .meta :: M: can be used to access or set the extra meta information of the node.
  • .element :: T where {T <: AbstractElement}: can be used to access or set the element corresponding to the node
  • .next :: Union{Node{M},Nothing}: access the next child node after this one, with the value set to nothing if there is no next child
  • .previous :: Union{Node{M},Nothing}: access the previous child node before this one, with the value set to nothing if there is no such node
  • .parent :: Union{Node{M},Nothing}: access the parent node of this node, with the value set to nothing if the node does not have a parent
  • .children: an iterable object of type NodeChildren that can be used to access and modify the child nodes of this node

The .children field is implemented with a wrapper type that implemements the iteration protocol. However, the exact type information etc. is an implementation detail, and one should only rely on the following documented APIs:

Other ways to work with child nodes that do not directly reference .children are:

  • To add new children between others, the insert_after!, insert_before! functions can be used to insert new children relative to a reference child node.
  • To remove a child from a node, the unlink! function can be used on the corresponding child node.

In addition, there are other functions and methods that can be used to work with nodes and trees:

source
Base.:==Method
==(x::Node, y::Node) -> Bool

Determines if two trees are equal by recursively walking through the whole tree (if need be) and comparing each node. Parent nodes are ignored when comparing for equality (so that it would be possible to compare subtrees). If the metadata type does not match, the two trees are not considered equal.

source

Accessing child nodes

Internally, to store the children, a node simply stores the reference to the first and the last child node, and each child stores the references to the next and previous child. The .children property is implemented simply as a lazy iterator of type NodeChildren that traverses the linked list. As such, some operations, such as determining the number of children a node has with length, can have unexpected $O(n)$ complexity.

MarkdownAST.NodeChildrenType
struct NodeChildren

The type of the the .children property of a Node which acts as an iterator over the children of a node. This type is mostly considered to be an implementation detail, and only has the following publicly defined APIs:

  • The name of the type NodeChildren, so that it could be dispatched on.
  • The .parent :: Node field that allows the user to access the parent node of the children.
source
Base.eltypeMethod
eltype(node.children::NodeChildren) = Node{M}

Returns the exact Node type of the tree, corresponding to the type of the elements of the .children iterator.

source
Base.lengthMethod
length(node.children::NodeChildren) -> Int

Returns the number of children of node :: Node.

As the children are stored as a linked list, this method has O(n) complexity. As such, to check there are any children at all, it is generally preferable to use isempty.

source
Base.isemptyMethod
isemtpy(node.children::NodeChildren) -> Bool

Can be called on the .children field of a node :: Node to determine whether or not the node has any child nodes.

source
Base.firstMethod
first(node.children::NodeChildren) -> Node

Returns the first child of the node :: Node, or throws an error if the node has no children.

source
Base.lastMethod
last(node.children::NodeChildren) -> Node

Returns the last child of the node :: Node, or throws an error if the node has no children.

source

Mutating the tree

The following functions and methods can be used to mutate the Markdown AST trees represented using Node objects. When using these methods, the consistency of the tree is preserved (i.e. the references between the affected nodes are correctly updated). Changing the structure of the tree in any other way should generally be avoided, since the code that operates on trees generally assumes a consistent tree, and will likely error or behave in unexpected ways on inconsistent trees.

Mutating the tree while traversing

Mutating the structure of the tree while traversing it with some iterator (e.g. .children or one of the AbstractTrees iterators) can lead to unexpected behavior and should generally be avoided. Updating the .element of a node while traversing, on the other hand, is fine. In general, replace! can be used to mutate a tree in arbitrary ways.

Base.replaceMethod
replace(f::Function, root::Node) -> Node

Creates a copy of the tree where all child nodes of root are recursively (post-order depth-first) replaced by the result of f(child).

The function f(child::Node) must return either a new Node to replace child or a Vector of nodes that will be inserted as siblings, replacing child.

Note that replace does not allow the construction of invalid trees, and element replacements that require invalid parent-child relationships (e.g., a block element as a child to an element expecting inlines) will throw an error.

Example

The following snippet removes links from the given AST. That is, it replaces Link nodes with their link text (which may contain nested inline markdown elements):

new_mdast = replace(mdast) do node
    if node.element isa MarkdownAST.Link
        return [MarkdownAST.copy_tree(child) for child in node.children]
    else
        return node
    end
end
source
MarkdownAST.unlink!Function
unlink!(node::Node) -> Node

Isolates and removes the node from the tree by removing all of its links to its neighboring nodes. Returns the updated node, which is now a single, isolate root node.

source
MarkdownAST.insert_before!Function
insert_before!(node::Node, sibling::Node) -> Node

Inserts a new child node sibling as the child right before node. node must not be a root node. If sibling is part of another tree, then it is unlinked from that tree first (see unlink!). Returns the original reference node.

source
MarkdownAST.insert_after!Function
insert_after!(node::Node, sibling::Node) -> Node

Inserts a new child node sibling as the next child after node. node must not be a root node. If sibling is part of another tree, then it is unlinked from that tree first (see unlink!). Returns the original reference node.

source
Base.push!Method
Base.push!(node.children::NodeChildren, child::Node) -> NodeChildren

Adds child as the last child node of node :: Node. If child is part of another tree, then it is unlinked from that tree first (see unlink!). Returns the iterator over children.

source
Base.pushfirst!Method
Base.pushfirst!(node.children::NodeChildren, child::Node) -> NodeChildren

Adds child as the first child node of node :: Node. If child is part of another tree, then it is unlinked from that tree first (see unlink!). Returns the iterator over children.

source
Base.append!Method
append!(node.children::NodeChildren, children) -> NodeChildren

Adds all the elements of the iterable children to the end of the list of children of node. If any of children are part of another tree, then they are unlinked from that tree first (see unlink!). Returns the iterator over children.

Error during an append

The operation is not atomic, and an error during an append! (e.g. due to an element of the wrong type in children) can result in a partial append of the new children, similar to how append! behaves with arrays (see JuliaLang/julia#15868).

source
Base.prepend!Method
prepend!(node.children::NodeChildren, children) -> NodeChildren

Adds all the elements of the iterable children to the beginning of the list of children of node. If any of children are part of another tree, then they are unlinked from that tree first (see unlink!). Returns the iterator over children.

Error during a prepend

The operation is not atomic, and an error during a prepend! (e.g. due to an element of the wrong type in children) can result in a partial prepend of the new children, similar to how append! behaves with arrays (see JuliaLang/julia#15868).

source
Base.empty!Method
empty!(node.children::NodeChildren) -> NodeChildren

Removes all the children of a node.

source
Mutating the .children property

The choice to apparently mutate the .children property when adding child nodes is purely syntactic, and in reality the operation affects the parent Node object. Internally the .children iterator is simply a thin wrapper around the parent node.

Copying trees

The copy_tree function can be used to easily copy a tree.

MarkdownAST.copy_treeFunction
copy_tree(root::Node)
copy_tree(f, root::Node)

Creates a copy of the tree, starting from node as the root node, and optionally calling f on each of the nodes to determine the corresponding .element in the copied tree.

If node is not the root of its tree, its parent nodes are ignored, and the root node of the copied node corresponds to node.

The function f should have the signature (::Node, ::AbstractElement) -> AbstractElement, and it gets passed the current node being copied and its element. It must return an instance of some AbstractElement, which will then be assigned to the .element field of the copied node. By default, copy_tree performs a deepcopy of both the element (.element) and the node metadata (.meta).

Extended help

For example, to perform a copy instead of deepcopy on the elements, copy_tree can be called as follows

copy_tree((_, e) -> copy(e), node::Node)

Note that copy_tree does not allow the construction of invalid trees, and element replacements that require invalid parent-child relationships (e.g. a block element as a child to an element expecting inlines) will throw an error.

source

This can be particularly useful in circumstances where a tree is passed to other code that while processing the tree also mutates it. As Node is a mutable type, this means that the original tree also mutates. Passing the result from copy_tree can be used to avoid that issue.

Index