code/avo

Author	SHA1	Message	Date
Michael McLoughlin	05ed388d0f	all: BITALG instructions (#362) Adds the AVX-512 Bit Algorithms instruction set. These new instructions are added via the `opcodesextra` mechanism #345, since they're missing from the opcodes database. Contributed by @vsivsi. Extracted from #234 with simplifications for AVX-512 form expansion. Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>	2023-01-10 18:55:12 -08:00
Michael McLoughlin	a42c8ae281	all: VPOPCNTDQ instructions (#361) Adds the VPOPCNTDQ instruction set, providing packed population count for double and quadword integers. These are added via the `opcodesextra` mechanism #345, since they're missing from the opcodes database. In this case the 512-bit non-AVX512VL forms are added here as well as the opcodes database, but they're deduplicated later. Contributed by @vsivsi. Extracted from #234 with simplifications for AVX-512 form expansion. Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>	2023-01-09 22:36:27 -08:00
Michael McLoughlin	b893b32213	all: VNNI instructions (#359) Adds "Vector Neural Network Instructions" instruction set. These are added via the `opcodesextra` mechanism #345, since they're missing from the opcodes database. Contributed by @vsivsi. Extracted from #349 with some tweaks. Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>	2023-01-08 11:42:48 -08:00
Michael McLoughlin	946323570a	all: add GFNI instructions (#344) Adds support for the GFNI "Galois Field New Instructions" instruction set. These instructions are not included in the Opcodes database, therefore they're added using the "extras" mechanism introduced in #345. For simplicity, the loading phase is updated slightly so that AVX-512 form expansion rules are applied after extras are added to the list. This greatly reduces the number of forms that have to be specified by hand. Based on #343 Fixes #335 Co-authored-by: Klaus Post <klauspost@gmail.com>	2022-11-27 18:53:46 -08:00
Michael McLoughlin	b76e849b5c	all: AVX-512 (#217) Extends avo to support most AVX-512 instruction sets. The instruction type is extended to support suffixes. The K family of opmask registers is added to the register package, and the operand package is updated to support the new operand types. Move instruction deduction in `Load` and `Store` is extended to support KMOV* and VMOV* forms. Internal code generation packages were overhauled. Instruction database loading required various messy changes to account for the additional complexities of the AVX-512 instruction sets. The internal/api package was added to introduce a separation between instruction forms in the database, and the functions avo provides to create them. This was required since with instruction suffixes there is no longer a one-to-one mapping between instruction constructors and opcodes. AVX-512 bloated generated source code size substantially, initially increasing compilation and CI test times to an unacceptable level. Two changes were made to address this: 1. Instruction constructors in the `x86` package moved to an optab-based approach. This compiles substantially faster than the verbose code generation we had before. 2. The most verbose code-generated tests are moved under build tags and limited to a stress test mode. Stress test builds are run on schedule but not in regular CI. An example of AVX-512 accelerated 16-lane MD5 is provided to demonstrate and test the new functionality. Updates #20 #163 #229 Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>	2021-11-12 19:02:39 -08:00

5 Commits