Extends avo to support most AVX-512 instruction sets.
The instruction type is extended to support suffixes. The K family of opmask
registers is added to the register package, and the operand package is updated
to support the new operand types. Move instruction deduction in `Load` and
`Store` is extended to support KMOV* and VMOV* forms.
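As a rough illustration of size-based move deduction for opmask values, the sketch below picks a KMOV* mnemonic from an operand width. The helper name `deduceKMOV` is hypothetical and is not avo's API; only the pairing of KMOVB, KMOVW, KMOVD and KMOVQ with 8, 16, 32 and 64-bit operands comes from the instruction set itself.

```go
package main

import "fmt"

// deduceKMOV is a hypothetical helper (not part of avo) that maps an
// operand size in bytes to the opmask move mnemonic of matching width.
func deduceKMOV(sizeBytes int) (string, error) {
	switch sizeBytes {
	case 1:
		return "KMOVB", nil
	case 2:
		return "KMOVW", nil
	case 4:
		return "KMOVD", nil
	case 8:
		return "KMOVQ", nil
	}
	return "", fmt.Errorf("no KMOV form for %d-byte operand", sizeBytes)
}

func main() {
	for _, n := range []int{1, 2, 4, 8, 16} {
		m, err := deduceKMOV(n)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%d bytes -> %s\n", n, m)
	}
}
```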
Internal code generation packages were overhauled. Instruction database loading
required various messy changes to account for the additional complexities of the
AVX-512 instruction sets. The internal/api package was added to introduce a
separation between instruction forms in the database and the functions avo
provides to create them. This was required because, with instruction suffixes,
there is no longer a one-to-one mapping between instruction constructors and
opcodes.
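To make the one-to-many relationship concrete, here is a minimal sketch in which a single opcode expands into several constructor names, one per suffix combination. The `form` type, the `constructorNames` function and the underscore naming scheme are assumptions made for illustration, not the actual internal/api package.

```go
package main

import (
	"fmt"
	"strings"
)

// form is a hypothetical stand-in for a database entry: one opcode plus
// the suffix combinations it accepts.
type form struct {
	Opcode   string
	Suffixes [][]string
}

// constructorNames returns one constructor name per suffix combination,
// joining the opcode and its suffixes with underscores (assumed scheme).
func constructorNames(f form) []string {
	names := make([]string, 0, len(f.Suffixes))
	for _, s := range f.Suffixes {
		parts := append([]string{f.Opcode}, s...)
		names = append(names, strings.Join(parts, "_"))
	}
	return names
}

func main() {
	f := form{
		Opcode:   "VADDPD",
		Suffixes: [][]string{{}, {"Z"}, {"BCST"}, {"BCST", "Z"}},
	}
	// One opcode yields four distinct constructor names.
	for _, name := range constructorNames(f) {
		fmt.Println(name)
	}
}
```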
AVX-512 bloated generated source code size substantially, initially increasing
compilation and CI test times to an unacceptable level. Two changes were made to
address this:

1. Instruction constructors in the `x86` package moved to an optab-based
   approach. This compiles substantially faster than the verbose code
   generation used previously.
2. The most verbose code-generated tests are moved under build tags and
   limited to a stress test mode. Stress test builds are run on a
   schedule but not in regular CI.
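The general idea behind a table-driven ("optab") constructor can be sketched as follows: instead of emitting a large hand-unrolled function per instruction, one generic routine scans a compact table of allowed operand signatures. Every name here (`operandKind`, `entry`, `optab`, `build`) is hypothetical and simplified; this is not the layout used in the `x86` package.

```go
package main

import (
	"errors"
	"fmt"
)

// operandKind is a hypothetical, simplified operand classification.
type operandKind uint8

const (
	kindReg64 operandKind = iota
	kindImm8
	kindMem64
)

// entry is one row of the illustrative table: an opcode together with
// one operand signature it accepts.
type entry struct {
	opcode string
	kinds  []operandKind
}

// optab lists the accepted forms as data rather than as generated code.
var optab = []entry{
	{"ADDQ", []operandKind{kindImm8, kindReg64}},
	{"ADDQ", []operandKind{kindReg64, kindReg64}},
	{"ADDQ", []operandKind{kindMem64, kindReg64}},
	{"INCQ", []operandKind{kindReg64}},
}

var errBadOperands = errors.New("bad operands")

// sameKinds reports whether two operand signatures are identical.
func sameKinds(a, b []operandKind) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// build is the single generic constructor: it returns the first table
// row matching the opcode and operand signature, or an error.
func build(opcode string, ops ...operandKind) (entry, error) {
	for _, e := range optab {
		if e.opcode == opcode && sameKinds(e.kinds, ops) {
			return e, nil
		}
	}
	return entry{}, errBadOperands
}

func main() {
	if _, err := build("ADDQ", kindImm8, kindReg64); err == nil {
		fmt.Println("ADDQ imm8, r64: accepted")
	}
	if _, err := build("INCQ", kindImm8); err != nil {
		fmt.Println("INCQ imm8: rejected:", err)
	}
}
```

With this shape, each per-instruction constructor can be a thin wrapper around `build`, so the compiler processes a small amount of code plus a data table rather than thousands of large generated function bodies.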
An example of AVX-512 accelerated 16-lane MD5 is provided to demonstrate and
test the new functionality.
Updates #20 #163 #229
Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>
30 lines
421 B
Go
package zeroing

import (
	"testing"

	"golang.org/x/sys/cpu"
)

//go:generate go run asm.go -out zeroing.s -stubs stub.go

func TestZeroing(t *testing.T) {
	const (
		n      = 32
		expect = n * (n + 1) / 2
	)

	if !cpu.X86.HasAVX512F {
		t.Skip("require AVX512F")
	}

	var got [8]uint64
	Zeroing(&got)

	for i := 0; i < 8; i++ {
		if got[i] != expect {
			t.Errorf("got[%d] = %d; expect %d", i, got[i], expect)
		}
	}
}