WebAssembly Debugging Capabilities

Is this document missing something? Send a pull request or open an issue on GitHub!

1. Motivation

WebAssembly is not typically written by hand. A higher-level language, such as C, C++, C#, or Rust, is compiled into WebAssembly. Tools that operate on the emitted WebAssembly should present their results in terms of the original source language, not the generated WebAssembly code.

1.1. Supported Tools

Here is an incomplete list of the kinds of tools that we expect to consume or manipulate debugging capabilities:

Stepping debuggers: A stepping debugger allows users to step through their program’s execution line-by-line (or expression-by-expression) in their original source language. gdb is a popular stepping debugger, and most Web browsers’ developer tools also include a stepping debugger for JavaScript.
Logging consoles: Logging consoles collect and display messages logged by the program. These messages are not limited to simple strings; they may contain objects or even tables. Most Web browsers’ developer tools include a logging console.
Time profilers: Time profilers analyze where a running program is spending its time. They help users improve their program’s throughput or latency. They might be implemented by periodically sampling the running program’s stack or by precisely tracing certain operations.
Code size profilers: A code size profiler analyzes which source language functions and constructs are responsible for which portions of a binary. It helps users produce small binaries. Twiggy is a code size profiler for WebAssembly.
Linkers and bundlers: Linkers and bundlers combine incomplete fragments of programs (object code) into a complete program. They typically compose debugging capabilities, rather than consume or generate them directly. Examples include lld and webpack.
Static analysis tools: Static analysis tools analyze a program and report their results to developers without ever running it. Examples include providing a conservative estimate of how deep the stack may grow, or visualizing why a compound type’s size is so large.
Instrumentation and rewriting tools: It can be useful to instrument binaries with extra instructions, or to completely rewrite blocks of code. For example, binaryen can instrument all memory accesses, helping users debug heap corruption, and wasm-snip can replace a function’s body with an unreachable instruction.

This list is not exhaustive, but we hope it is representative.

2. Requirements

This section attempts, as much as possible, to avoid embedding assumptions about whether debugging capabilities are provided by parsing a static format (a la DWARF), via a programmatic interface (a la language servers), a combination of those two, or by something else entirely. Rather than proposing a particular solution, this is an attempt to enumerate the requirements and constraints that any solution must satisfy. Therefore, I use the term "debugging capability" rather than "debugging information" or "debugger server".

Furthermore, I’ve categorized these requirements as "musts" or "shoulds". A "must" is a hard requirement for an MVP: generally something that is already supported by source maps, or is required now in order to support a "should" later. A "should" is something that is not a hard requirement for the MVP, but for which we have supporting use cases.

2.1. General

2.1.1. Must be future extensible

So that we can ship incrementally, we need to be able to evolve the debugging capabilities over time, as we add functionality. For example, we might initially ship without support for describing packed struct types, but we must not preclude ourselves from ever supporting them.

Note: The source map format is not future extensible. The number of variable-length quantities that each mapping contains is fixed, and adding more information to a mapping will only break old consumers of the format. On the other hand, JSON would be future-extensible, in that one could add new fields to an object for new debugging information. Old consumers should ignore fields they do not recognize, while new consumers leverage the new information. Of course, source maps used to be JSON-based, but moved to variable-length quantities to be more compact on disk and over the network.
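
As a minimal sketch of the "ignore what you do not recognize" rule described in the note above, the following consumer keeps only the fields it knows about. The field names (`generated_offset`, `source`, `line`, `column`, `inlined_from`) are made up for illustration; this document does not define any concrete format.

```python
import json

# Hypothetical field names understood by an "old" consumer.
KNOWN_FIELDS = {"generated_offset", "source", "line", "column"}

def read_mapping(raw):
    """Parse one mapping object, keeping only the fields this consumer knows."""
    record = json.loads(raw)
    return {key: value for key, value in record.items() if key in KNOWN_FIELDS}

old_producer = '{"generated_offset": 42, "source": "main.rs", "line": 7, "column": 3}'
# A newer producer adds a field that the old consumer has never heard of.
new_producer = ('{"generated_offset": 42, "source": "main.rs", "line": 7, '
                '"column": 3, "inlined_from": "helper"}')

# The old consumer extracts the same data from both producers.
same = read_mapping(old_producer) == read_mapping(new_producer)
```

A fixed-arity encoding has no equivalent of the dictionary comprehension above: a consumer that expects exactly N quantities per mapping cannot skip an unexpected one.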

2.1.2. Must be embedder agnostic

WebAssembly is portable and embedder agnostic, and its debugging capabilities must be as well. The debugging capabilities must not be specific to any single kind of embedder or domain. For example, the debugging capabilities must not assume that the embedder is a Web browser.

2.1.3. Must support querying static properties without running the debuggee program

Every query describes either a static or dynamic property of the debuggee program. What is a static property for one binary might be a dynamic property for another. The debugging capabilities must support querying both static and dynamic properties, and static properties must be queryable even when the debuggee program is not running.

For example, a CLI code size profiler must be able to consume information about inlined functions or generic function monomorphization without running the debuggee program. A static analysis tool that never runs its input program must be able to report its results in terms of source locations. A linker or bundler that is composing two wasm object files must not have to run the incomplete programs contained in the object files in order to merge their debugging capabilities.

2.1.4. Must be embeddable within the WebAssembly

For ease of development, the debugging capabilities must be embeddable within a custom section or sections of the debuggee program’s .wasm binary. This avoids wrangling multiple files and associated issues like moving one file but not the other and breaking relative references between them.

It is common to embed source maps as a data URL within //# sourceMappingURL pragmas by base64 encoding them.
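
To make "embeddable within a custom section" concrete, here is a rough sketch of how a tool might list the custom sections of a .wasm binary. It relies only on the WebAssembly binary layout (8-byte header, then sections made of an id byte, a LEB128 size, and contents, with custom sections using id 0 and starting with a name). The section name `example_debug` is an assumption for illustration, not a proposed name.

```python
def read_u32_leb128(data, pos):
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7

def custom_sections(wasm):
    """Return {name: payload} for every custom section (id 0) in the binary.

    If two custom sections share a name, the later one wins in this sketch.
    """
    if wasm[:8] != b"\x00asm\x01\x00\x00\x00":
        raise ValueError("not a version 1 WebAssembly binary")
    sections = {}
    pos = 8
    while pos < len(wasm):
        section_id = wasm[pos]
        size, pos = read_u32_leb128(wasm, pos + 1)
        end = pos + size
        if section_id == 0:
            name_len, name_start = read_u32_leb128(wasm, pos)
            name = wasm[name_start:name_start + name_len].decode("utf-8")
            sections[name] = wasm[name_start + name_len:end]
        pos = end  # skip to the next section without decoding its contents
    return sections

# A tiny module containing only one custom section with a hypothetical name.
name = b"example_debug"
payload = b"\x01\x02"
body = bytes([len(name)]) + name + payload
wasm = b"\x00asm\x01\x00\x00\x00" + bytes([0, len(body)]) + body
found = custom_sections(wasm)
```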

2.1.5. Must be separable from the WebAssembly

On the other hand, you most certainly do not want to include the debugging capabilities within the .wasm binary you ship in production because of size concerns and the associated network costs. But bugs happen in production, and you might want to use a stepping debugger with your live production code base, or apply debugging information to stack traces captured via telemetry and then processed offline, a la Sentry.

2.1.6. Must be compact on disk and over the network

A tool might query for a large amount of information, on the order of the size of the debuggee program itself. Much of this information is regular and repetitious. Source maps encode only a fraction of the information that we ultimately want to support queries for, and they are regularly many megabytes in size. The current source map specification is in its third revision, and each revision has focused on making the format more dense. Empirically, relying on general compression algorithms has not been enough.
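
For a sense of what "more dense" means in practice, the sketch below delta-encodes a sorted list of offsets using the base64 variable-length quantity scheme that source maps use for their mappings. The helper names are my own, and the offsets are invented sample data.

```python
BASE64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_vlq(value):
    """Encode one signed integer as a base64 VLQ string."""
    # The sign goes in the lowest bit; the magnitude fills the remaining bits.
    vlq = ((-value << 1) | 1) if value < 0 else (value << 1)
    out = ""
    while True:
        digit = vlq & 0b11111
        vlq >>= 5
        if vlq:
            digit |= 0b100000  # continuation bit: more digits follow
        out += BASE64[digit]
        if not vlq:
            return out

def encode_offsets(offsets):
    """Delta-encode a sorted list of offsets as concatenated base64 VLQs."""
    previous = 0
    out = ""
    for offset in offsets:
        out += encode_vlq(offset - previous)
        previous = offset
    return out

compact = encode_offsets([1000, 1004, 1010])
plain = "1000,1004,1010"
```

Here `compact` is 5 characters against 14 for `plain`, because nearby offsets produce small deltas that fit in a single digit each.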

2.1.7. Should be compact in memory

Ideally, only portions of the debugging capabilities need to be loaded into memory at any given time. For example, if the debuggee program consists of multiple linked compilation units, then a debugger should only need to initially load the capabilities for the current function’s compilation unit. Answering a single query (e.g. finding the original source location for a generated code location) should not require loading the full debugging capabilities for the whole debuggee program.

Note: DWARF, for example, has various index sections that allow for accelerated access into other sections. Those other sections do not need to be loaded into memory, and one can use the index to find an offset and then read just the relevant bits. The .debug_aranges section allows a debugger to quickly find the compilation unit and associated data in the .debug_info section for a given address, without parsing and loading into memory all of the .debug_info section. See section 6.1 of DWARF 5 for details.
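
The index-then-seek pattern from the note above can be sketched as follows. This is not the DWARF encoding; the index layout (start address, end address, data offset, data length) and the sample bytes are assumptions chosen to keep the example small.

```python
import bisect
import io

# Hypothetical index, sorted by start address. Each entry is:
# (start address, end address, offset of unit data, length of unit data)
INDEX = [
    (0x000, 0x100, 0, 4),
    (0x100, 0x180, 4, 6),
    (0x200, 0x240, 10, 5),
]
STARTS = [entry[0] for entry in INDEX]

def load_unit_for(address, debug_data):
    """Read only the unit that covers `address`, or return None."""
    i = bisect.bisect_right(STARTS, address) - 1
    if i < 0:
        return None
    start, end, offset, length = INDEX[i]
    if address >= end:
        return None  # the address falls in a gap between units
    debug_data.seek(offset)
    return debug_data.read(length)

# Stand-in for a large on-disk section; a real tool would open a file here.
debug_data = io.BytesIO(b"AAAABBBBBBCCCCC")
unit = load_unit_for(0x120, debug_data)
```

Only the small index has to stay resident; the bulk data is read on demand, one unit at a time.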

2.1.8. Should be fast to consume, generate, and manipulate

A fast tool provides a better user experience than a slow tool.

For example, a linker would ideally not need to eagerly apply relocs within the debugging capabilities. Instead, the debugging capabilities consumers or debugging capabilities themselves would lazily apply relocs when a particular query forces it. The indexed variant of source maps provides similar lazy-reloc functionality.
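
One way to picture lazy relocs is to keep each object file's mappings in section-relative form and subtract the section's base address at query time. The sketch below assumes a made-up `Section` record; it shows the idea, not a proposed design.

```python
class Section:
    """Mappings for one object file, stored with section-relative offsets."""

    def __init__(self, base, size, mappings):
        self.base = base          # where the linker placed this section's code
        self.size = size
        self.mappings = mappings  # {relative offset: source location}, never rewritten

    def contains(self, absolute_offset):
        return self.base <= absolute_offset < self.base + self.size

    def lookup(self, absolute_offset):
        # The relocation is applied here, per query, not eagerly at link time.
        return self.mappings.get(absolute_offset - self.base)

def lookup(sections, absolute_offset):
    """Find the section covering an offset and query it."""
    for section in sections:
        if section.contains(absolute_offset):
            return section.lookup(absolute_offset)
    return None

# "Linking" only records where each section landed.
a = Section(base=0x00, size=0x40, mappings={0x10: ("a.c", 3)})
b = Section(base=0x40, size=0x20, mappings={0x10: ("b.c", 8)})
linked = [a, b]
```

With this arrangement the linker's work is proportional to the number of sections, while the per-mapping arithmetic is paid only for the mappings that are actually queried.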

2.1.9. Should support interpreted debuggees, where the interpreter is written in Wasm

With interpreted debuggees, there are nearly zero static properties of the debuggee that are reflected in the wasm itself. Instead of being the debuggee program, the wasm is interpreting yet another language in which the debuggee program is authored. This has profound impacts on the way that tools, debugging capabilities, and the debuggee program interact:

For a stepping debugger to set a breakpoint, it cannot simply translate source locations into a set of wasm bytecode offsets and use an engine-specific API to pause when those bytecodes are executed. There are no wasm bytecodes that directly correspond to a given source location. We could be paused at the same wasm bytecode multiple different times and logically be paused in multiple different source locations. The interpreter’s cooperation is necessary to set a breakpoint, since it is the only entity that knows what source location is executing at any given time.
For a sampling profiler to capture a stack, it cannot simply save the wasm functions and their bytecode offsets on the stack, and symbolicate and expand inlined functions offline. In fact, the active wasm functions have almost no correlation with the active interpreted debuggee functions. Only the interpreter itself knows which interpreted debuggee functions are active, and therefore its cooperation is required to correctly sample the stack.
Etc.
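
The breakpoint case can be illustrated with a toy interpreter. In the scenario above the interpreter would itself be compiled to wasm; Python stands in here, and the stack language, the `run` function, and the `on_pause` callback are all invented for this sketch.

```python
def run(program, breakpoints, on_pause):
    """Interpret `program`, pausing when a guest source line has a breakpoint.

    `program` is a list of (source line, operation, operand) tuples for a
    made-up stack language. Every guest instruction passes through this same
    loop, so a breakpoint on the loop's own code could not tell guest lines apart.
    """
    stack = []
    for line, op, operand in program:
        if line in breakpoints:
            # Only the interpreter knows which guest line is executing.
            on_pause(line, list(stack))
        if op == "push":
            stack.append(operand)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    return stack

program = [(1, "push", 2), (2, "push", 3), (3, "add", None)]
paused = []
result = run(program, {3}, lambda line, stack: paused.append((line, stack)))
```

The breakpoint check lives inside the interpreter loop, which is the cooperation the first item above calls for.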

2.2. Locations

2.2.1. Must support querying the original source location for a given generated code location

This is a common query. A trap is raised in the debuggee program and a stepping debugger would like to know in which source location it should place its cursor. A static analysis tool found something interesting at some location in the generated code, and would like to present this to its user in terms of the original source location. A profiler would like to express its sampled stack trace in terms of original source locations.
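
A minimal sketch of this query, assuming the mappings are available as a table sorted by generated offset (the table contents and file names are invented):

```python
import bisect

# Hypothetical mapping table, sorted by generated code offset.
# Each entry: (generated offset, source file, line, column).
MAPPINGS = [
    (0x10, "main.c", 3, 1),
    (0x18, "main.c", 4, 5),
    (0x2C, "util.c", 10, 1),
]
OFFSETS = [mapping[0] for mapping in MAPPINGS]

def original_location(generated_offset):
    """Return the source location covering a generated offset, or None."""
    # Find the last mapping that starts at or before the requested offset.
    i = bisect.bisect_right(OFFSETS, generated_offset) - 1
    if i < 0:
        return None  # before the first mapped instruction
    _, source, line, column = MAPPINGS[i]
    return (source, line, column)

trap_location = original_location(0x1A)
```

Each mapping is treated as covering every offset up to the next mapping, which is a simplification; a real format might record explicit range ends.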

2.2.2. Must support querying the generated code location(s) for a given original source location

Another common query. A user sets a breakpoint on some original source line, and the debugger must use the debugging capabilities to translate that into a set of generated code locations that it should pause execution upon reaching. An instrumentation tool that inserts tracepoints or logging at some original source location would translate the input source location into a generated code location where it should insert new instructions.
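
The reverse direction is one-to-many, since a single source line can be compiled to several places. A small sketch, again over an invented mapping table:

```python
from collections import defaultdict

# Hypothetical mappings: (generated offset, source file, line).
MAPPINGS = [
    (0x10, "main.c", 3),
    (0x18, "main.c", 4),
    (0x40, "main.c", 4),   # the same line again, e.g. after loop unrolling
    (0x2C, "util.c", 10),
]

def build_reverse_index(mappings):
    """Group generated offsets by the source location they came from."""
    index = defaultdict(list)
    for offset, source, line in mappings:
        index[(source, line)].append(offset)
    return index

REVERSE = build_reverse_index(MAPPINGS)

def breakpoint_offsets(source, line):
    """Return every generated offset at which to pause for a source line."""
    return sorted(REVERSE.get((source, line), []))

offsets = breakpoint_offsets("main.c", 4)
```

A debugger would install a pause at each returned offset; an empty list means the line produced no code of its own.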

2.2.3. Must support enumerating all bidirectional mappings between original source and generated code locations

Firefox’s stepping debugger will enumerate all mappings to determine the set of lines in the original source files that can have breakpoints set on them, and highlight this information in the UI. Since pretty much every generated code location ends up needing to be relocated when bundling JavaScript files, bundlers will enumerate every mapping and apply relocs directly, rather than maintaining a side table for these relocs.

2.3. Source Text

2.3.1. Must support embedding the original source text

Moving a source file or a generated .wasm binary might break the link between the debugging capabilities and the source text, if the source text is external. For ease of development, the debugging capabilities must support embedding the source text within themselves to render this scenario impossible.

Note: source maps support embedding the original source text into the source map itself with the "sourcesContent" JSON field.

Note: this is not desirable for production or with larger projects, since it will inflate the size of the debugging capabilities and conflict with §2.1.6 Must be compact on disk and over the network.

2.3.2. Must support referencing external source text

For large projects, embedding the original source text within the debugging capabilities is not worth the trade-off. There may be many megabytes of library source text that the user will not view, because they are (for example) stepping in their business logic code. In this scenario, that library source text serves only to slow down the initial download of the debugging capabilities.

Note: source maps support referencing external source text by URL on the Web, and (absolute or relative) file path on Node.js.

2.4. Inlined Functions

2.4.1. Should support querying the inline function frames that are logically on the stack at some generated code location

A sampling profiler should augment its sampled stack of physical frames with the logical frames that were inlined. JavaScript profilers will do this today with the help of the JavaScript engine, rather than using debugging capabilities.

The set of logical frames on the stack might also be a dynamic property. For example, an interpreter written in WebAssembly should be able to expose its interpreted language’s activation frames.
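
For the static case, a rough sketch of the query might look like the following. The record shape (function name, inlining depth, list of code ranges) and the function names are assumptions for illustration.

```python
# Hypothetical records: (function name, inlining depth, [(start, end), ...]).
# Depth 0 is the physical function; higher depths were inlined into it.
FRAMES = [
    ("main", 0, [(0x00, 0x80)]),
    ("parse", 1, [(0x10, 0x30), (0x50, 0x60)]),
    ("next_token", 2, [(0x18, 0x24)]),
]

def logical_frames(offset):
    """Return function names, outermost first, that are active at `offset`."""
    active = [
        (depth, name)
        for name, depth, ranges in FRAMES
        if any(start <= offset < end for start, end in ranges)
    ]
    return [name for depth, name in sorted(active)]

frames = logical_frames(0x1C)
```

A sampling profiler that captured the physical frame `main` at offset `0x1C` could expand it into the three logical frames in `frames`.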

2.4.2. Should support enumerating the logical inlined function invocations within a physical function

What code ranges are due to an inlined function invocation? Note that a single inlined function invocation may be spread across multiple, distinct ranges due to code motion by the compiler.

Code size profilers want to find large functions that are getting aggressively inlined many times, and might be causing size bloat.
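
As a sketch of how a code size profiler might use this enumeration, the following sums the bytes attributed to each inlined function across all of its invocations. The records and function names are invented sample data.

```python
from collections import Counter

# Hypothetical records:
# (inlined function, physical function it was inlined into, code ranges)
INLINED = [
    ("vec_push", "build_list", [(0x10, 0x30)]),
    # One invocation split across two ranges by code motion.
    ("vec_push", "merge", [(0x100, 0x118), (0x140, 0x148)]),
    ("hash", "lookup", [(0x200, 0x210)]),
]

def inlined_bytes_by_function(invocations):
    """Total the code bytes attributed to each inlined function."""
    totals = Counter()
    for name, _host, ranges in invocations:
        totals[name] += sum(end - start for start, end in ranges)
    return totals

totals = inlined_bytes_by_function(INLINED)
worst = totals.most_common(1)[0]
```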

2.5. Types

2.5.1. Should support describing scalar types

Examples include signed and unsigned integers, floats, booleans, etc. What is the name of the type? What is the type’s size and alignment? These things are of particular importance to stepping debuggers and anything that wishes to understand a value in a running program’s live memory.

2.5.2. Should support describing compound types

Examples include pointers, arrays, structs, packed structs, structs with bitfields, C-style enums, C-style untagged unions, Rust-style tagged unions, etc. What is the name of the type? At what bit or byte offset is each member field?

In addition to stepping debuggers, this information is also useful to tools that are not actually running the debuggee program, like [ddbug](https://github.com/gimli-rs/ddbug), which lets you inspect the sizes and layouts of a program’s types, so you can optimize their representation.
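
A minimal sketch of what a compound type description could carry, and of the kind of layout question a tool might ask of it. The `Member` and `StructType` records are hypothetical, and the example layout is what I would expect for a C struct with `uint8_t`, `uint32_t`, `uint8_t` members under natural alignment.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Member:
    name: str
    type_name: str
    offset: int  # byte offset within the struct
    size: int    # size in bytes

@dataclass
class StructType:
    name: str
    size: int
    members: List[Member]

    def padding_bytes(self):
        """Bytes not covered by any member (assumes members do not overlap)."""
        return self.size - sum(member.size for member in self.members)

# Assumed layout of: struct header { uint8_t tag; uint32_t len; uint8_t flag; }
header = StructType(
    name="header",
    size=12,
    members=[
        Member("tag", "uint8_t", 0, 1),
        Member("len", "uint32_t", 4, 4),
        Member("flag", "uint8_t", 8, 1),
    ],
)
wasted = header.padding_bytes()
```

Here half of the struct is padding, which is the kind of finding that suggests reordering the fields.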

2.5.3. Should support type-based pretty printing

A stepping debugger or logging console should be able to pretty print values based on each value’s type. For example, a string might be represented as a pointer and length pair, but should be displayed as its characters.
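
The string example can be sketched as below. The layout is an assumption: a pointer followed by a length, both little-endian 32-bit integers, with the characters stored as UTF-8 in linear memory.

```python
import struct

def pretty_print_string(memory, address):
    """Render a (pointer, length) pair stored at `address` as its characters."""
    # Assumed layout: two little-endian u32 values, pointer then length.
    pointer, length = struct.unpack_from("<II", memory, address)
    raw = bytes(memory[pointer:pointer + length])
    return raw.decode("utf-8", errors="replace")

# A tiny stand-in for linear memory: the string bytes live at offset 16,
# and the (pointer=16, length=5) pair lives at offset 0.
memory = bytearray(32)
memory[16:21] = b"hello"
struct.pack_into("<II", memory, 0, 16, 5)

shown = pretty_print_string(memory, 0)
```

Without type information a debugger could only show the raw pair `(16, 5)`; with it, the value displays as `hello`.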

2.6. Scopes and Bindings

2.6.1. Should support querying for the scope chain at a given generated code location

When a stepping debugger pauses at a location, it should display the scopes it is paused within.

2.6.2. Should support enumerating all bindings within a scope

When a stepping debugger is paused, it should display the bindings within each scope it is paused within. Is the binding a formal parameter? A local variable? Is it a mutable or constant binding?

2.6.3. Should support querying a binding’s type

When a stepping debugger is paused, for each binding that is in scope, it should be able to display the binding’s type.

2.6.4. Should support describing a method to find the location of or reconstruct a binding’s value

When a stepping debugger is paused, for each binding it displays, it should also display the binding’s value.

"Reconstruct" because a value might not live in a single contiguous region of memory or a single local. The compiler might have exploded it into its component parts and passed each part by value in different parameters. It might be on the stack during this bytecode range but in memory during that bytecode range within the same function.
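
To illustrate, here is a sketch of a per-range location description for a single binding, including a value split into pieces. The tuple encoding (`"local"`, `"memory"`, `"pieces"`) is invented for this example and is not a proposed format.

```python
# Hypothetical location descriptions for one binding, keyed by bytecode range.
# ("local", i):     the value is in wasm local i.
# ("memory", addr): the value is a little-endian u32 at addr in linear memory.
# ("pieces", [...]): the value was split; each piece is (location, bit width),
#                    listed from lowest-order bits to highest.
LOCATIONS = [
    ((0x00, 0x20), ("local", 2)),
    ((0x20, 0x40), ("memory", 8)),
    ((0x40, 0x60), ("pieces", [(("local", 0), 16), (("local", 1), 16)])),
]

def read_location(location, wasm_locals, memory):
    kind, where = location
    if kind == "local":
        return wasm_locals[where]
    if kind == "memory":
        return int.from_bytes(memory[where:where + 4], "little")
    # "pieces": mask each part to its width and stitch the parts together.
    value = 0
    shift = 0
    for piece, bits in where:
        part = read_location(piece, wasm_locals, memory) & ((1 << bits) - 1)
        value |= part << shift
        shift += bits
    return value

def binding_value(offset, wasm_locals, memory):
    """Find or reconstruct the binding's value at a bytecode offset."""
    for (start, end), location in LOCATIONS:
        if start <= offset < end:
            return read_location(location, wasm_locals, memory)
    return None  # not available at this offset

wasm_locals = [0x5678, 0x1234, 99]
memory = bytes(8) + (7).to_bytes(4, "little")
```

The same binding reads as `99` from a local at offset `0x10`, as `7` from memory at offset `0x30`, and as `0x12345678` reassembled from two locals at offset `0x50`.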

2.7. Generic Functions and Monomorphizations

2.7.1. Should support querying whether a function is a monomorphization of a generic function

A code size profiler should identify generic functions that have been monomorphized "too many" times and are leading to code bloat. It should be able to do this whether or not the function is inlined.
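
A sketch of the report such a profiler might build, assuming each emitted function can be queried for the generic function it was instantiated from (the function names and sizes are invented):

```python
from collections import defaultdict

# Hypothetical records: (emitted function, generic origin or None, code size)
FUNCTIONS = [
    ("Vec<u8>::push", "Vec<T>::push", 120),
    ("Vec<u32>::push", "Vec<T>::push", 132),
    ("Vec<String>::push", "Vec<T>::push", 180),
    ("main", None, 64),
]

def monomorphization_report(functions):
    """Count instantiations and total bytes for each generic function."""
    groups = defaultdict(lambda: {"count": 0, "bytes": 0})
    for _name, generic, size in functions:
        if generic is not None:  # skip functions that are not monomorphizations
            groups[generic]["count"] += 1
            groups[generic]["bytes"] += size
    return dict(groups)

report = monomorphization_report(FUNCTIONS)
```

What counts as "too many" is left to the tool; the debugging capabilities only need to make the grouping possible.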

3. Specification

TODO: after we have fleshed out and agreed upon requirements.