Commit 759be3dad9 bumped our Go
requirement to 1.18, which allows us to drop support for old-style
`+build` tags. This change runs `go fix ./...` to remove them, and
updates some remaining code generators that produced `+build` lines.
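To illustrate the relationship between the two tag styles, here is a minimal sketch using the standard library's `go/build/constraint` package. It is not how `go fix` is implemented; the helper name `toGoBuild` is hypothetical and only shows how a `+build` line maps to its `//go:build` equivalent.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// toGoBuild is a hypothetical helper: it parses an old-style "// +build"
// line and returns the equivalent "//go:build" line.
func toGoBuild(line string) (string, error) {
	if !constraint.IsPlusBuild(line) {
		return "", fmt.Errorf("not a +build line: %q", line)
	}
	expr, err := constraint.Parse(line)
	if err != nil {
		return "", err
	}
	return "//go:build " + expr.String(), nil
}

func main() {
	out, err := toGoBuild("// +build linux,amd64")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

In old-style tags a comma means AND and a space means OR, which is why the new syntax with explicit `&&` and `||` is easier to read.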
Adds the "Vector Bit Manipulation Instructions 2" instruction set.
These new instructions are added via the `opcodesextra` mechanism #345, since
they're missing from the opcodes database.
Contributed by @vsivsi. Extracted from #349 with simplifications.
Specifically, as prompted by the `dupl` linter we extract some common forms
lists into a helper `forms.go` file.
Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>
Adds the AVX-512 Bit Algorithms instruction set.
These new instructions are added via the `opcodesextra` mechanism #345, since
they're missing from the opcodes database.
Contributed by @vsivsi. Extracted from #234 with simplifications for AVX-512
form expansion.
Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>
Adds the VPOPCNTDQ instruction set, providing packed population count for
doubleword and quadword integers.
These are added via the `opcodesextra` mechanism #345, since they're missing
from the opcodes database. In this case the 512-bit non-AVX512VL forms
appear both here and in the opcodes database, but they're deduplicated later.
Contributed by @vsivsi. Extracted from #234 with simplifications for AVX-512
form expansion.
Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>
Adds VEX and EVEX encoded versions of the `PCLMULQDQ` carry-less quadword
multiplication instruction.
These are added via the `opcodesextra` mechanism #345, since they're missing
from the opcodes database.
Contributed by @vsivsi. Extracted from #349 with minor tweaks.
Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>
Adds the "Vector Neural Network Instructions" instruction set.
These are added via the `opcodesextra` mechanism #345, since they're missing
from the opcodes database.
Contributed by @vsivsi. Extracted from #349 with some tweaks.
Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>
Adds the "Vector Advanced Encryption Standard" instruction set.
These are added via the `opcodesextra` mechanism #345, since they're missing
from the opcodes database.
Contributed by @vsivsi. Extracted from #349 with minor tweaks.
Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>
Adds support for the GFNI "Galois Field New Instructions" instruction set.
These instructions are not included in the Opcodes database, therefore they're
added using the "extras" mechanism introduced in #345.
For simplicity, the loading phase is updated slightly so that AVX-512 form
expansion rules are applied after extras are added to the list. This greatly
reduces the number of forms that have to be specified by hand.
Based on #343
Fixes #335
Co-authored-by: Klaus Post <klauspost@gmail.com>
* Bump CI to Go 1.19
* Update golang/go edwards25519 test
* Apply formatting to printer stubs output (to get correct comment formatting)
* Bump gofumpt version
Extends avo to support most AVX-512 instruction sets.
The instruction type is extended to support suffixes. The K family of opmask
registers is added to the register package, and the operand package is updated
to support the new operand types. Move instruction deduction in `Load` and
`Store` is extended to support KMOV* and VMOV* forms.
Internal code generation packages were overhauled. Instruction database loading
required various messy changes to account for the additional complexities of the
AVX-512 instruction sets. The internal/api package was added to introduce a
separation between instruction forms in the database, and the functions avo
provides to create them. This was required since with instruction suffixes there
is no longer a one-to-one mapping between instruction constructors and opcodes.
AVX-512 bloated generated source code size substantially, initially increasing
compilation and CI test times to an unacceptable level. Two changes were made to
address this:
1. Instruction constructors in the `x86` package moved to an optab-based
approach. This compiles substantially faster than the verbose code
generation we had before.
2. The most verbose code-generated tests are moved under build tags and
limited to a stress test mode. Stress test builds are run on a
schedule but not in regular CI.
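As a rough illustration of the first point, a table-driven constructor might look like the sketch below. The types, the table contents and the `build` function are hypothetical and far simpler than the generated `x86` package; the point is only that one generic lookup replaces a hand-expanded function body per instruction.

```go
package main

import (
	"errors"
	"fmt"
)

// Instruction is a hypothetical, simplified instruction node.
type Instruction struct {
	Opcode   string
	Operands []string
}

// form is one row of the hypothetical operand table: an opcode together
// with the operand types it accepts.
type form struct {
	opcode string
	types  []string
}

// optab holds all accepted forms as data rather than as generated code.
var optab = []form{
	{"ADDQ", []string{"imm32", "r64"}},
	{"ADDQ", []string{"r64", "r64"}},
	{"XORQ", []string{"r64", "r64"}},
}

// equal reports whether two operand type lists match exactly.
func equal(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// build looks up the requested form in the table and constructs the
// instruction if a matching row exists.
func build(opcode string, types ...string) (*Instruction, error) {
	for _, f := range optab {
		if f.opcode == opcode && equal(f.types, types) {
			return &Instruction{Opcode: opcode, Operands: types}, nil
		}
	}
	return nil, errors.New("bad operands")
}

func main() {
	inst, err := build("ADDQ", "r64", "r64")
	fmt.Println(inst, err)
}
```

Because the forms live in a data table, adding thousands of AVX-512 forms grows a slice literal rather than the amount of code the compiler has to type-check and optimize.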
An example of AVX-512 accelerated 16-lane MD5 is provided to demonstrate and
test the new functionality.
Updates #20 #163 #229
Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>
Adds a regression test based on klauspost/compress#186. This necessitated some related changes:
* Mark "RET" as a terminal instruction
* Refactor printer to maintain compatibility with asmfmt
* Tweaks to other regression tests to ensure they are run correctly in CI
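To show why the terminal flag matters for control-flow analysis, here is a minimal sketch under simplifying assumptions: a straight-line program with no branches, and hypothetical `Inst` and `successors` names that are not avo's API.

```go
package main

import "fmt"

// Inst is a hypothetical, simplified instruction.
type Inst struct {
	Opcode   string
	Terminal bool // true if control never falls through, e.g. RET
}

// successors returns, for each instruction index, the indices that may
// execute next. Branches are ignored in this sketch, so the only possible
// successor is the fallthrough, which terminal instructions do not have.
func successors(prog []Inst) [][]int {
	succ := make([][]int, len(prog))
	for i, inst := range prog {
		if inst.Terminal || i+1 == len(prog) {
			continue
		}
		succ[i] = []int{i + 1}
	}
	return succ
}

func main() {
	prog := []Inst{
		{Opcode: "MOVQ"},
		{Opcode: "RET", Terminal: true},
		{Opcode: "ADDQ"},
	}
	fmt.Println(successors(prog))
}
```

Without the flag, the instruction after `RET` would be treated as reachable from it, which can extend register live ranges across the return.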
Updates #100 #65 #8
Adds support for a `CancellingInputs` instruction flag, to indicate cases like `XORQ R10, R10` where the instruction actually does not depend on the value of `R10` at all.
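A minimal sketch of how such a flag could feed into dependency analysis, assuming a simplified two-operand instruction model. The `Inst` type and `InputRegisters` method here are hypothetical illustrations, not avo's actual definitions.

```go
package main

import "fmt"

// Inst is a hypothetical, simplified instruction.
type Inst struct {
	Opcode string
	Inputs []string // registers read by the instruction
	// CancellingInputs marks instructions whose result does not depend on
	// their inputs when both input operands are the same register.
	CancellingInputs bool
}

// InputRegisters returns the registers the instruction truly depends on.
// For a flagged instruction with identical operands (e.g. XORQ R10, R10)
// the inputs cancel out and no dependency is reported.
func (i Inst) InputRegisters() []string {
	if i.CancellingInputs && len(i.Inputs) == 2 && i.Inputs[0] == i.Inputs[1] {
		return nil
	}
	return i.Inputs
}

func main() {
	zero := Inst{Opcode: "XORQ", Inputs: []string{"R10", "R10"}, CancellingInputs: true}
	mix := Inst{Opcode: "XORQ", Inputs: []string{"R10", "R11"}, CancellingInputs: true}
	fmt.Println(len(zero.InputRegisters()), len(mix.InputRegisters()))
}
```

Treating the zeroing idiom as input-free means `R10` does not need to be live before the instruction.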
Closes #89
The Go assembler merges MOVD/MOVQ instruction forms. The logic in the
avo instruction loader was discarding the MOVD forms. This diff should
merge them correctly.
Updates #50