Adds a regression test based on klauspost/compress#186. This necessitated some related changes:
* Mark "RET" as a terminal instruction
* printer refactor to maintain compatibility with asmfmt
* Tweaks to other regression tests to ensure they are run correctly in CI
Updates #100#65#8
Issue #100 demonstrated that register allocation for aliased registers is
fundamentally broken. The root of the issue is that currently accesses to the
same virtual register with different masks are treated as different registers.
This PR takes a different approach:
* Liveness analysis is masked: we now properly consider which parts of a register are live
* Register allocation produces a mapping from virtual to physical ID, and aliasing is applied later
In addition, a new pass ZeroExtend32BitOutputs accounts for the fact that 32-bit writes in 64-bit mode should actually be treated as 64-bit writes (the result is zero-extended).
Closes#100
In jump-table-like constructs, the natural way of writing the code can sometimes produce redundant jumps or labels. Therefore some basic cleanup steps have been proposed. This diff adds two transforms:
1. Remove unconditional jumps to a label immediately following.
2. Remove labels with no references at all.
Fixes#75
In some cases natural use of abstraction in avo programs can lead to redundant move instructions, specifically self-moves such as MOVQ AX, AX. This does not produce incorrect code but it is incorrect and inelegant.
This diff introduces a PruneSelfMoves pass that removes such unnecessary instructions.
Closes#76
The Context.Label method and LABEL global function did not agree. Also
breaks the convention I'd like to set that capitalized functions must
agree with existing Go assembly syntax.
To help avoid a conflict with `avo.Label`, attributes were moved to
their own package.
Fixes#35
Adds methods for referencing sub- or super-registers. For example, for
general purpose registers you can now reference As8(), As16(), ... and
for vector AsX(), AsY(), AsZ().
Closes#1
Previously we updated the set of live in registers before live out. This
was extremely inefficient, since on each pass through live in depends on
live out.
We also change to processing the instructions in reverse order, which is
more likely to be efficient, although we should replace this with
topological sort order soon.