Commit Graph

9 Commits

Author SHA1 Message Date
Michael McLoughlin
b76e849b5c all: AVX-512 (#217)
Extends avo to support most AVX-512 instruction sets.

The instruction type is extended to support suffixes. The K family of opmask
registers is added to the register package, and the operand package is updated
to support the new operand types. Move instruction deduction in `Load` and
`Store` is extended to support KMOV* and VMOV* forms.

Internal code generation packages were overhauled. Instruction database loading
required various messy changes to account for the additional complexities of the
AVX-512 instruction sets. The internal/api package was added to introduce a
separation between instruction forms in the database, and the functions avo
provides to create them. This was required since with instruction suffixes there
is no longer a one-to-one mapping between instruction constructors and opcodes.

AVX-512 bloated generated source code size substantially, initially increasing
compilation and CI test times to an unacceptable level. Two changes were made to
address this:

1.  Instruction constructors in the `x86` package moved to an optab-based
    approach. This compiles substantially faster than the verbose code
    generation we had before.

2.  The most verbose code-generated tests are moved under build tags and
    limited to a stress test mode. Stress test builds are run on
    schedule but not in regular CI.

An example of AVX-512 accelerated 16-lane MD5 is provided to demonstrate and
test the new functionality.

Updates #20 #163 #229

Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>
2021-11-12 19:02:39 -08:00
Michael McLoughlin
e089a6c93c tests/fixedbugs: regression test for issue 100 (#129)
Adds a regression test based on klauspost/compress#186. This necessitated some related changes:

* Mark "RET" as a terminal instruction
* printer refactor to maintain compatibility with asmfmt
* Tweaks to other regression tests to ensure they are run correctly in CI

Updates #100 #65 #8
2020-01-27 21:05:33 -08:00
Michael McLoughlin
f40d602170 reg,pass: refactor allocation of aliased registers (#121)
Issue #100 demonstrated that register allocation for aliased registers is
fundamentally broken. The root of the issue is that currently accesses to the
same virtual register with different masks are treated as different registers.
This PR takes a different approach:

* Liveness analysis is masked: we now properly consider which parts of a register are live
* Register allocation produces a mapping from virtual to physical ID, and aliasing is applied later

In addition, a new pass ZeroExtend32BitOutputs accounts for the fact that 32-bit writes in 64-bit mode should actually be treated as 64-bit writes (the result is zero-extended).

Closes #100
2020-01-22 22:50:40 -08:00
Michael McLoughlin
cde7e9483b pass,printer: display required ISA features (#120)
Fixes #119
2020-01-19 16:45:09 -08:00
Michael McLoughlin
c8004ba627 ir,build: pragma support (#97)
Adds support for arbitrary compiler directives.

Fixes #15
2019-09-16 11:01:48 -07:00
Michael McLoughlin
d43efabdbe inst,ir: cancelling inputs (#92)
Adds support for a `CancellingInputs` instruction flag, to indicate cases like `XORQ R10, R10` where the instruction actually does not depend on the value of `R10` at all.

Closes #89
2019-07-28 17:58:49 -07:00
Michael McLoughlin
2e7d06bc7a pass: remove redundant jumps and dangling labels (#81)
In jump-table-like constructs, the natural way of writing the code can sometimes produce redundant jumps or labels. Therefore some basic cleanup steps have been proposed. This diff adds two transforms:

1. Remove unconditional jumps to a label immediately following.
2. Remove labels with no references at all.

Fixes #75
2019-04-15 19:42:11 -07:00
Michael McLoughlin
284ee13ada build: support comments in functions (#41) 2019-01-11 10:33:41 -08:00
Michael McLoughlin
0f63e0906d ast: move "ast" types from root to ir sub-package
Closes #32
2019-01-06 14:21:10 -08:00