Design Notes

AST Traversal

During the AST traversal stage, the complete AST (generated by the clang frontend) is walked, beginning with the root TranslationUnitDecl node. It is during this stage that USRs (Unified Symbol Resolutions) are generated and hashed with SHA-1 to form the 160-bit SymbolID for an entity. With the exception of built-in types, every entity referenced in the corpus is traversed and assigned a SymbolID, including those from the standard library. This is necessary to generate the full interface for user-defined types.
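The hashing step can be illustrated with a small standalone sketch. It assumes the USR is already available as a string (the example USR in the usage note is purely illustrative) and uses a hand-rolled SHA-1 in place of whatever hashing library the tool actually links against. The names sha1 and toHex are hypothetical helpers, not part of MrDocs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// 160 bits = 20 bytes, matching the SymbolID width described above.
using SymbolID = std::array<std::uint8_t, 20>;

inline std::uint32_t rotl(std::uint32_t x, int n)
{
    return (x << n) | (x >> (32 - n));
}

// Plain SHA-1 (FIPS 180-4). Hypothetical stand-in for the real hashing code.
inline SymbolID sha1(std::string_view data)
{
    std::uint32_t h[5] = {0x67452301u, 0xEFCDAB89u, 0x98BADCFEu,
                          0x10325476u, 0xC3D2E1F0u};

    // Padding: 0x80, zeros up to 56 mod 64, then the bit length (big-endian).
    std::vector<std::uint8_t> msg(data.begin(), data.end());
    const std::uint64_t bitLen = static_cast<std::uint64_t>(data.size()) * 8;
    msg.push_back(0x80);
    while (msg.size() % 64 != 56)
        msg.push_back(0);
    for (int i = 7; i >= 0; --i)
        msg.push_back(static_cast<std::uint8_t>(bitLen >> (i * 8)));
    assert(msg.size() % 64 == 0);

    for (std::size_t off = 0; off < msg.size(); off += 64) {
        // Expand the 16 big-endian words of this block to 80 words.
        std::uint32_t w[80];
        for (std::size_t i = 0; i < 16; ++i)
            w[i] = (std::uint32_t(msg[off + 4 * i]) << 24)
                 | (std::uint32_t(msg[off + 4 * i + 1]) << 16)
                 | (std::uint32_t(msg[off + 4 * i + 2]) << 8)
                 |  std::uint32_t(msg[off + 4 * i + 3]);
        for (std::size_t i = 16; i < 80; ++i)
            w[i] = rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);

        std::uint32_t a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
        for (std::size_t i = 0; i < 80; ++i) {
            std::uint32_t f, k;
            if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999u; }
            else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1u; }
            else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDCu; }
            else             { f = b ^ c ^ d;                   k = 0xCA62C1D6u; }
            const std::uint32_t t = rotl(a, 5) + f + e + k + w[i];
            e = d; d = c; c = rotl(b, 30); b = a; a = t;
        }
        h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
    }

    // Serialize the five 32-bit state words as 20 big-endian bytes.
    SymbolID id{};
    for (std::size_t i = 0; i < 5; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            id[4 * i + j] = static_cast<std::uint8_t>(h[i] >> (24 - 8 * j));
    return id;
}

// Lowercase hex rendering, handy for logging or debugging an id.
inline std::string toHex(SymbolID const& id)
{
    static constexpr char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(id.size() * 2);
    for (std::uint8_t byte : id) {
        out.push_back(digits[byte >> 4]);
        out.push_back(digits[byte & 0x0F]);
    }
    return out;
}
```

Calling sha1 on a USR string such as "c:@F@foo#" yields the same 20 bytes in every translation unit that mentions the entity, which is what lets the id act as a stable key across the whole corpus.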

Bitcode

AST traversal is performed in parallel on a per-translation-unit basis. To maximize the size of the code base MrDocs is capable of processing, Info types generated during traversal are serialized to a compressed bitcode representation. Once AST traversal is complete for all translation units, the bitcode is deserialized back into Info types, and then merged to form the corpus. The merging step is necessary because there may be multiple identical definitions of the same entity (e.g. for class types, templates, inline functions, etc.), as well as functions declared in one translation unit and defined in another.
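The merge step might look roughly like the sketch below. It leaves out bitcode serialization entirely and assumes a simplified, hypothetical FunctionInfo record. The real Info types carry far more data, and the names Location, FunctionInfo, merge and mergeInto are invented for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

using SymbolID = std::array<std::uint8_t, 20>;

struct Location {
    std::string file;
    unsigned line = 0;
    friend bool operator==(Location const& a, Location const& b)
    {
        return a.file == b.file && a.line == b.line;
    }
};

// Hypothetical, simplified record produced for one function by one TU.
struct FunctionInfo {
    SymbolID id{};
    std::string name;
    std::optional<Location> definition;  // set only by the TU that defines it
    std::vector<Location> declarations;  // every declaration seen in that TU
};

// Fold `other` into `into`. Both must describe the same entity.
inline void merge(FunctionInfo& into, FunctionInfo&& other)
{
    assert(into.id == other.id);
    if (into.name.empty())
        into.name = std::move(other.name);
    if (!into.definition)
        into.definition = std::move(other.definition);
    for (auto& loc : other.declarations) {
        // Identical declarations (e.g. from a shared header) are kept once.
        auto const& seen = into.declarations;
        if (std::find(seen.begin(), seen.end(), loc) == seen.end())
            into.declarations.push_back(std::move(loc));
    }
}

using Corpus = std::map<SymbolID, FunctionInfo>;

// Add the results of one translation unit to the corpus.
inline void mergeInto(Corpus& corpus, std::vector<FunctionInfo> perTU)
{
    for (auto& info : perTU) {
        auto it = corpus.find(info.id);
        if (it == corpus.end())
            corpus.emplace(info.id, std::move(info));
        else
            merge(it->second, std::move(info));
    }
}
```

Under these assumptions, a function declared in a header and defined in one source file ends up as a single entry holding both the declaration and the definition location, regardless of the order in which the translation units are merged.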

The Corpus

After AST traversal and Info merging, the result is stored as a map of `Info`s indexed by their respective `SymbolID`s. Documentation generators may traverse this structure by calling `Corpus::traverse` with a `Corpus::Visitor`-derived visitor and the `SymbolID` of the entity to visit (e.g. the global namespace).
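The following sketch shows one way a generator could drive such a traversal. It is not the actual MrDocs interface: the member list on Info, the bool return convention, and the recursive descent into members are all assumptions made for illustration, and NameCollector is a hypothetical visitor.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

using SymbolID = std::array<std::uint8_t, 20>;

// Hypothetical, stripped-down Info: a name plus the ids of its members.
struct Info {
    SymbolID id{};
    std::string name;
    std::vector<SymbolID> members;
};

class Corpus {
public:
    // Generators derive from this and override visit().
    struct Visitor {
        virtual ~Visitor() = default;
        // Return false to stop the traversal early (assumed convention).
        virtual bool visit(Info const& info) = 0;
    };

    void insert(Info info)
    {
        SymbolID key = info.id;
        bool inserted = infos_.emplace(key, std::move(info)).second;
        assert(inserted && "after merging, each SymbolID appears once");
        (void)inserted;
    }

    // Visit the entity `id`, then each of its members, depth-first.
    bool traverse(Visitor& visitor, SymbolID const& id) const
    {
        auto it = infos_.find(id);
        if (it == infos_.end())
            return true;  // ids without an Info are skipped in this sketch
        if (!visitor.visit(it->second))
            return false;
        for (auto const& member : it->second.members)
            if (!traverse(visitor, member))
                return false;
        return true;
    }

private:
    std::map<SymbolID, Info> infos_;
};

// Example visitor: records the name of every entity it is shown.
struct NameCollector : Corpus::Visitor {
    std::vector<std::string> names;
    bool visit(Info const& info) override
    {
        names.push_back(info.name);
        return true;
    }
};
```

A generator would construct its visitor, call traverse with the SymbolID of the global namespace, and emit output from inside visit as each entity arrives.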

Namespaces

Namespaces do not have a source location. This is because the same namespace can be opened in many places, so no single location identifies it. We probably don’t want to store any javadocs for namespaces either.

Paths

The AST visitor and metadata all use forward slashes to represent file pathnames, even on Windows. This is so the generated reference documentation does not vary based on the platform.
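A minimal sketch of this normalization is shown below. The function name toForwardSlashes is hypothetical, and the sketch assumes every backslash in the input is a directory separator, which holds for Windows paths but not necessarily for POSIX file names.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <string_view>

// Return a copy of `path` with every backslash replaced by a forward slash.
inline std::string toForwardSlashes(std::string_view path)
{
    std::string out(path);
    std::replace(out.begin(), out.end(), '\\', '/');
    assert(out.find('\\') == std::string::npos);
    return out;
}
```

Applying the function to a path that already uses forward slashes returns it unchanged, so it is safe to call on every path regardless of platform.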

Exceptions

Errors thrown by the program should always have type Exception. Objects of this type are capable of transporting an Error object. This is important for scripting to work; exceptions are used to propagate errors from library code to scripts and back to the invoking code. For exceptional cases, these thrown exceptions should be left uncaught. The tool installs an uncaught exception handler that prints a stack trace and exits the process immediately.
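The relationship between the two types could be sketched as follows. The class layouts here are assumptions, not the real definitions: Error is reduced to a single message string, the handler is installed with std::set_terminate (the text above does not say which mechanism the tool uses), and the stack trace is replaced by a plain message. requirePositive is a hypothetical piece of library code used only to show the throw site.

```cpp
#include <cassert>
#include <cstdlib>
#include <exception>
#include <iostream>
#include <string>
#include <utility>

// Hypothetical, minimal Error: just a human-readable reason.
class Error {
public:
    explicit Error(std::string reason) : reason_(std::move(reason))
    {
        assert(!reason_.empty());
    }
    std::string const& reason() const noexcept { return reason_; }

private:
    std::string reason_;
};

// The only type that is thrown; it transports an Error object.
class Exception : public std::exception {
public:
    explicit Exception(Error error) : error_(std::move(error)) {}
    Error const& error() const noexcept { return error_; }
    char const* what() const noexcept override
    {
        return error_.reason().c_str();
    }

private:
    Error error_;
};

// Assumed handler: the real tool prints a stack trace before exiting.
[[noreturn]] inline void onUncaughtException() noexcept
{
    std::cerr << "fatal: uncaught exception\n";
    std::abort();
}

inline void installUncaughtHandler()
{
    std::set_terminate(onUncaughtException);
}

// Hypothetical library code: reports failure by throwing Exception.
inline int requirePositive(int value)
{
    if (value <= 0)
        throw Exception(Error("value must be positive"));
    return value;
}
```

Code that can recover catches Exception and inspects error(). Code that cannot recover simply lets the exception escape, at which point the installed handler ends the process.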