Extends avo to support most AVX-512 instruction sets.

The instruction type is extended to support suffixes. The K family of opmask
registers is added to the register package, and the operand package is updated
to support the new operand types. Move instruction deduction in `Load` and
`Store` is extended to support KMOV* and VMOV* forms.
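
As a rough illustration of what move deduction means here, the sketch below picks a move mnemonic from a register class and a value size. The `Kind` type, the `moves` table and `deduceMove` are hypothetical names invented for this example; they are not avo's API, and avo's real logic also handles vector registers and the VMOV* forms.

```go
package main

import "fmt"

// Kind is a hypothetical register class used only in this sketch.
type Kind int

const (
	GP Kind = iota // general-purpose register
	K              // AVX-512 opmask register
)

// moves maps a register class and a size in bytes to a move mnemonic.
// Note the 4-byte forms differ: MOVL for general-purpose, KMOVD for opmask.
var moves = map[Kind]map[int]string{
	GP: {1: "MOVB", 2: "MOVW", 4: "MOVL", 8: "MOVQ"},
	K:  {1: "KMOVB", 2: "KMOVW", 4: "KMOVD", 8: "KMOVQ"},
}

// deduceMove returns the mnemonic for the given class and size, or an
// error when the table has no matching entry.
func deduceMove(kind Kind, size int) (string, error) {
	m, ok := moves[kind][size]
	if !ok {
		return "", fmt.Errorf("no move for kind %d and size %d", kind, size)
	}
	return m, nil
}

func main() {
	m, err := deduceMove(K, 2)
	fmt.Println(m, err)
}
```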

Internal code generation packages were overhauled. Loading the instruction
database required various messy changes to account for the additional
complexity of the AVX-512 instruction sets. The `internal/api` package was
added to separate instruction forms in the database from the functions avo
provides to create them. This was required because, with instruction suffixes,
there is no longer a one-to-one mapping between instruction constructors and
opcodes.
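
To make the one-to-many relationship concrete, here is a minimal sketch in which a single database entry expands into several constructor names, one per allowed suffix combination. The `Form` type, the `Constructors` function and the underscore naming are assumptions made for illustration, not the contents of `internal/api`.

```go
package main

import (
	"fmt"
	"strings"
)

// Form is a hypothetical database entry: one opcode together with the
// suffix combinations it accepts. An empty combination is the plain form.
type Form struct {
	Opcode   string
	Suffixes [][]string
}

// Constructors returns one name per suffix combination, joining the opcode
// and its suffixes with underscores (an assumed naming scheme).
func Constructors(f Form) []string {
	names := make([]string, 0, len(f.Suffixes))
	for _, s := range f.Suffixes {
		parts := append([]string{f.Opcode}, s...)
		names = append(names, strings.Join(parts, "_"))
	}
	return names
}

func main() {
	f := Form{
		Opcode:   "VADDPD",
		Suffixes: [][]string{{}, {"Z"}, {"BCST"}, {"BCST", "Z"}},
	}
	// One opcode, four constructor names.
	fmt.Println(Constructors(f))
}
```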

AVX-512 substantially bloated the generated source code, initially increasing
compilation and CI test times to an unacceptable level. Two changes were made to
address this:

1. Instruction constructors in the `x86` package moved to an optab-based
   approach. This compiles substantially faster than the verbose code
   generation we had before.

2. The most verbose code-generated tests are moved under build tags and
   limited to a stress test mode. Stress test builds are run on a
   schedule but not in regular CI.
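
The general idea of a table-driven ("optab") constructor can be sketched as follows: instruction forms live in a data table and one generic function matches operands against it, so adding instructions grows the table rather than the amount of code to compile. The types, the table contents and `build` below are hypothetical and do not reflect the actual layout used in the `x86` package.

```go
package main

import "fmt"

// operandKind is a hypothetical operand class used only in this sketch.
type operandKind uint8

const (
	kindReg operandKind = iota
	kindMem
	kindImm
)

// form is one table row: an opcode and the operand kinds it accepts.
type form struct {
	opcode string
	args   []operandKind
}

// optab holds the instruction forms as data.
var optab = []form{
	{"ADDQ", []operandKind{kindImm, kindReg}},
	{"ADDQ", []operandKind{kindReg, kindReg}},
	{"ADDQ", []operandKind{kindMem, kindReg}},
	{"KMOVW", []operandKind{kindReg, kindReg}},
}

// sameKinds reports whether two operand kind lists are identical.
func sameKinds(a, b []operandKind) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// build is the single generic constructor. It returns the index of the
// matching table row, or an error if no form accepts the operands.
func build(opcode string, args ...operandKind) (int, error) {
	for i, f := range optab {
		if f.opcode == opcode && sameKinds(f.args, args) {
			return i, nil
		}
	}
	return -1, fmt.Errorf("%s: no form matches the given operands", opcode)
}

func main() {
	idx, err := build("ADDQ", kindReg, kindReg)
	fmt.Println(idx, err)
}
```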

An example of AVX-512 accelerated 16-lane MD5 is provided to demonstrate and
test the new functionality.

Updates #20 #163 #229
Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>
# Examples

Simple functions:

* **[add](add):** Add two numbers. The "Hello World!" of `avo`.
* **[sum](sum):** Sum an array of numbers.

Features:

* **[args](args):** Loading function arguments.
* **[returns](returns):** Building return values.
* **[complex](complex):** Working with `complex{64,128}` types.
* **[data](data):** Defining `DATA` sections.
* **[ext](ext):** Interacting with types from external packages.
* **[pragma](pragma):** Apply compiler directives to generated functions.

"Real" examples:

* **[fnv1a](fnv1a):** [FNV-1a](https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function#FNV-1a_hash) hash function.
* **[dot](dot):** Vector dot product.
* **[geohash](geohash):** Integer [geohash](https://en.wikipedia.org/wiki/Geohash) encoding.
* **[md5x16](md5x16):** AVX-512 accelerated [MD5](https://en.wikipedia.org/wiki/MD5).
* **[sha1](sha1):** [SHA-1](https://en.wikipedia.org/wiki/SHA-1) cryptographic hash.
* **[stadtx](stadtx):** [`StadtX` hash](https://github.com/demerphq/BeagleHash) port from [dgryski/go-stadtx](https://github.com/dgryski/go-stadtx).