Files

10 lines
456 B
Go
Raw Permalink Normal View History

all: AVX-512 (#217) Extends avo to support most AVX-512 instruction sets. The instruction type is extended to support suffixes. The K family of opmask registers is added to the register package, and the operand package is updated to support the new operand types. Move instruction deduction in `Load` and `Store` is extended to support KMOV* and VMOV* forms. Internal code generation packages were overhauled. Instruction database loading required various messy changes to account for the additional complexities of the AVX-512 instruction sets. The internal/api package was added to introduce a separation between instruction forms in the database, and the functions avo provides to create them. This was required since with instruction suffixes there is no longer a one-to-one mapping between instruction constructors and opcodes. AVX-512 bloated generated source code size substantially, initially increasing compilation and CI test times to an unacceptable level. Two changes were made to address this: 1. Instruction constructors in the `x86` package moved to an optab-based approach. This compiles substantially faster than the verbose code generation we had before. 2. The most verbose code-generated tests are moved under build tags and limited to a stress test mode. Stress test builds are run on schedule but not in regular CI. An example of AVX-512 accelerated 16-lane MD5 is provided to demonstrate and test the new functionality. Updates #20 #163 #229 Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>
2021-11-12 18:35:36 -08:00
// Code generated by command: go run asm.go -out md5x16.s -stubs stub.go. DO NOT EDIT.
package md5x16
// block MD5 hashes 16 messages into the running hash states h. Messages are
// at the given offsets from the base pointer. The 16-bit mask specifies
// which lanes are active: when bit i is not set loads will be disabled and
// the value of the resulting hash is undefined.
func block(h *[4][16]uint32, base uintptr, offsets *[16]uint32, mask uint16)