HTML output

The final stage of indexing is to output a static HTML file for:

  • Every source file via tools/src/bin/output-file.rs
  • Every directory, linking to subdirectories and source files via scripts/output-dir.js
  • The search file template used by router/router.py via scripts/output-template.js. (The template is little more than the HTML UI boilerplate, a place to inline the JSON-style results object, and a "load" listener to trigger the JS logic to render the results.)
  • The help.html file at the root of the output tree. scripts/output-help.html wraps the contents of the config tree's help.html in the HTML boilerplate of the UI so that the standard search bar is at the top of the page.

Because output logic is currently split between Rust and JS code, any structural change requires matching changes in both scripts/output.js and tools/src/output.rs.

Indexed Source Files

This code lives in tools/src/bin/output-file.rs and tools/src/output.rs. The main formatting loop is in tools/src/format.rs. The inputs to this process are:

  • The original source code, either from the file system (for the current version) or from version control (for historical versions).
  • Blame information from the blame repository.
  • Analysis records generated for the given file.
  • Jump information generated by the cross referencer.

The original code is tokenized using one of two hand-coded tokenizers (both in tools/src/tokenize.rs). One tokenizer recognizes C-like languages (JS, C++, IDL, Python) and the other recognizes tag-based languages (HTML, XML).
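As a rough illustration of the two-tokenizer split described above, the sketch below picks a tokenizer family from a file's extension. The `TokenizerKind` enum, the `select_tokenizer` function, and the exact extension list are assumptions made for this example; the real selection logic in tools/src/tokenize.rs and its callers may look quite different.

```rust
use std::path::Path;

/// Hypothetical tokenizer families, mirroring the two hand-coded
/// tokenizers described in the text plus a plain-text fallback.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TokenizerKind {
    /// JS, C++, IDL, Python
    CLike,
    /// HTML, XML
    TagBased,
    /// Anything else: no tokenization (an assumption of this sketch)
    Plain,
}

/// Pick a tokenizer family from the file extension.
/// The extension list is illustrative, not the project's actual table.
pub fn select_tokenizer(path: &str) -> TokenizerKind {
    let ext = Path::new(path)
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase());

    match ext.as_deref() {
        Some("js") | Some("cpp") | Some("h") | Some("idl") | Some("py") => TokenizerKind::CLike,
        Some("html") | Some("xml") => TokenizerKind::TagBased,
        _ => TokenizerKind::Plain,
    }
}
```

For instance, under these assumptions `select_tokenizer("dom/base/Element.cpp")` yields `TokenizerKind::CLike`, while a file with no extension falls back to `TokenizerKind::Plain`.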

The central loop in format.rs iterates over tokens. When it finds an identifier token, it outputs markup for all text between the previous identifier and this one. Then it checks whether there is an analysis source record for the token's location. If there is, it adds data- attributes to the markup that describe what the context menu should do for that identifier. In either case, the markup colors the identifier according to whether it is a reserved word and according to the syntax property of the source record (if there is one).
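The loop described above can be sketched roughly as follows. This is a simplified illustration, not the code in format.rs: the `Token` and `SourceRecord` types, the `syn_reserved` class, and the `data-symbols` attribute name are all assumptions made for the example, and text between identifiers is only escaped rather than syntax-colored.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TokenKind {
    Identifier,
    Other,
}

/// Hypothetical token: byte offsets into the source text.
pub struct Token {
    pub start: usize,
    pub end: usize,
    pub kind: TokenKind,
}

/// Hypothetical analysis source record keyed by the token's start offset.
pub struct SourceRecord {
    pub start: usize,
    pub syntax: &'static str,
    pub symbol: &'static str,
}

fn escape(s: &str) -> String {
    s.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
        .replace('"', "&quot;")
}

pub fn format_source(
    src: &str,
    tokens: &[Token],
    records: &[SourceRecord],
    reserved: &[&str],
) -> String {
    let mut out = String::new();
    let mut last = 0;

    for tok in tokens.iter().filter(|t| t.kind == TokenKind::Identifier) {
        // Emit everything between the previous identifier and this one.
        out.push_str(&escape(&src[last..tok.start]));

        let text = &src[tok.start..tok.end];
        let record = records.iter().find(|r| r.start == tok.start);

        // Color by reserved-word status and by the record's syntax property.
        let mut classes: Vec<&str> = Vec::new();
        if reserved.contains(&text) {
            classes.push("syn_reserved"); // assumed class name
        }
        if let Some(r) = record {
            classes.push(r.syntax);
        }

        out.push_str("<span");
        if !classes.is_empty() {
            out.push_str(&format!(" class=\"{}\"", escape(&classes.join(" "))));
        }
        // Only identifiers with a source record get context-menu data.
        if let Some(r) = record {
            out.push_str(&format!(" data-symbols=\"{}\"", escape(r.symbol)));
        }
        out.push('>');
        out.push_str(&escape(text));
        out.push_str("</span>");

        last = tok.end;
    }

    // Emit whatever follows the final identifier.
    out.push_str(&escape(&src[last..]));
    out
}
```

Tracking only the end of the previous identifier keeps the loop single-pass: every byte of the source is written exactly once, either inside an identifier span or as escaped text between spans.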

Blame diffs

The output code can also show annotated commit diffs. These diffs are generated dynamically by the web server when the user requests one. The diff is produced by running git diff -U100000; the very large context count means the diff normally contains the entire file rather than just the changed regions. All the lines forming the "new" version of the file are run through format.rs to syntax highlight them (although no analysis records are available). The removed ("-") lines from the old version are then merged in at the right locations, and the appropriate blame information is fetched for both the unchanged lines and the "-" lines.
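The merge step can be pictured with the sketch below, which walks the text of a single-file unified diff and classifies each body line while tracking old and new line numbers. The `DiffLine` type and the `merge_diff` and `hunk_starts` functions are hypothetical names for this example; the web server's actual implementation is not shown in this document and may differ. Unchanged and added lines carry a new-file line number (these are the lines that would be syntax highlighted), while unchanged and removed lines carry an old-file line number (these are the lines that would need blame lookups).

```rust
#[derive(Debug, PartialEq)]
pub enum DiffLine {
    Unchanged { old_no: usize, new_no: usize, text: String },
    Added { new_no: usize, text: String },
    Removed { old_no: usize, text: String },
}

/// Parse the leading number of a range such as "12,7" or "12".
fn first_number(range: &str) -> Option<usize> {
    range.split(',').next()?.parse().ok()
}

/// Extract the old and new starting line numbers from "@@ -a,b +c,d @@".
fn hunk_starts(header: &str) -> Option<(usize, usize)> {
    let mut parts = header.split_whitespace().skip(1);
    let old = parts.next()?.strip_prefix('-')?;
    let new = parts.next()?.strip_prefix('+')?;
    Some((first_number(old)?, first_number(new)?))
}

/// Classify every body line of a unified diff, in display order.
pub fn merge_diff(diff: &str) -> Vec<DiffLine> {
    let mut out = Vec::new();
    let (mut old_no, mut new_no) = (0usize, 0usize);
    let mut in_hunk = false;

    for line in diff.lines() {
        if line.starts_with("diff ") {
            // A new file header: ignore "---"/"+++" lines until the next hunk.
            in_hunk = false;
            continue;
        }
        if line.starts_with("@@") {
            if let Some((o, n)) = hunk_starts(line) {
                old_no = o;
                new_no = n;
                in_hunk = true;
            }
            continue;
        }
        if !in_hunk {
            continue;
        }

        let mut chars = line.chars();
        match chars.next() {
            Some(' ') => {
                out.push(DiffLine::Unchanged { old_no, new_no, text: chars.as_str().to_string() });
                old_no += 1;
                new_no += 1;
            }
            Some('+') => {
                out.push(DiffLine::Added { new_no, text: chars.as_str().to_string() });
                new_no += 1;
            }
            Some('-') => {
                out.push(DiffLine::Removed { old_no, text: chars.as_str().to_string() });
                old_no += 1;
            }
            // Skips markers such as "\ No newline at end of file".
            _ => {}
        }
    }
    out
}
```

Because the diff is requested with a very large context size, a sketch like this usually sees one hunk covering the whole file, so the resulting list doubles as the full annotated view.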