Commit Graph

11 Commits

Author SHA1 Message Date
Vaughn Iverson
fc7bbb86e5 internal/opcodesextra: fix actions for VNNI and VBMI2 shifts (#372)
Fix the destination register actions for VNNI and VBMI2 concatenate and
variable shift instructions `VPSH{L,R}DV{W,D,Q}`. In the case of mask merging,
the action should be `RW` not `W`.

Prior to this change, the bug manifests as incorrect vector register
scheduling when `avo` doesn't recognize that these instructions have a data
dependency on the destination register.

See:
https://www.felixcloutier.com/x86/vpdpbusd
https://www.felixcloutier.com/x86/vpdpbusds
https://www.felixcloutier.com/x86/vpshldv
https://www.felixcloutier.com/x86/vpshrdv
2023-03-06 22:52:03 -08:00
Michael McLoughlin
b01bfffcdd internal/opcodesextra: minor comment tweaks (#366) 2023-01-14 16:12:42 -08:00
Michael McLoughlin
9e56971ed6 internal/opcodesextra: _yvblendmpd forms helper (#365)
Use helper function for instructions sharing the `_yvblendmpd` family of forms.
2023-01-14 13:55:36 -08:00
Michael McLoughlin
e2c0a40f50 all: VBMI2 instructions (#363)
Adds the "Vector Bit Manipulation Instructions 2" instruction set.

These new instructions are added via the `opcodesextra` mechanism #345, since
they're missing from the opcodes database.

Contributed by @vsivsi. Extracted from #349 with simplifications.
Specifically, as prompted by the `dupl` linter we extract some common forms
lists into a helper `forms.go` file.

Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>
2023-01-14 13:25:44 -08:00
Michael McLoughlin
05ed388d0f all: BITALG instructions (#362)
Adds the AVX-512 Bit Algorithms instruction set.

These new instructions are added via the `opcodesextra` mechanism #345, since
they're missing from the opcodes database.

Contributed by @vsivsi. Extracted from #234 with simplifications for AVX-512
form expansion.

Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>
2023-01-10 18:55:12 -08:00
Michael McLoughlin
a42c8ae281 all: VPOPCNTDQ instructions (#361)
Adds the VPOPCNTDQ instruction set, providing packed population count for
double and quadword integers.

These are added via the `opcodesextra` mechanism #345, since they're missing
from the opcodes database. In this case the 512-bit non-AVX512VL forms are
added here as well as the opcodes database, but they're deduplicated later.

Contributed by @vsivsi. Extracted from #234 with simplifications for AVX-512
form expansion.

Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>
2023-01-09 22:36:27 -08:00
Michael McLoughlin
7dac51aabf all: VPCLMULQDQ instruction (#360)
Adds VEX and EVEX encoded versions of the `PCLMULQDQ` carry-less quadword
multiplication instruction.

These are added via the `opcodesextra` mechanism #345, since they're missing
from the opcodes database.

Contributed by @vsivsi. Extracted from #349 with minor tweaks.

Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>
2023-01-09 21:38:43 -08:00
Michael McLoughlin
b893b32213 all: VNNI instructions (#359)
Adds "Vector Neural Network Instructions" instruction set.

These are added via the `opcodesextra` mechanism #345, since they're missing
from the opcodes database.

Contributed by @vsivsi. Extracted from #349 with some tweaks.

Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>
2023-01-08 11:42:48 -08:00
Michael McLoughlin
0569748e19 all: VAES instructions (#358)
Adds "Vector Advanced Encryption Standard" instruction set.

These are added via the `opcodesextra` mechanism #345, since they're missing
from the opcodes database.

Contributed by @vsivsi. Extracted from #349 with minor tweaks.

Co-authored-by: Vaughn Iverson <vsivsi@yahoo.com>
2023-01-07 21:55:36 -08:00
Michael McLoughlin
946323570a all: add GFNI instructions (#344)
Adds support for the GFNI "Galois Field New Instructions" instruction set.

These instructions are not included in the Opcodes database, therefore they're
added using the "extras" mechanism introduced in #345.

For simplicity, the loading phase is updated slightly so that AVX-512 form
expansion rules are applied after extras are added to the list. This greatly
reduces the number of forms that have to be specified by hand.

Based on #343
Fixes #335

Co-authored-by: Klaus Post <klauspost@gmail.com>
2022-11-27 18:53:46 -08:00
Michael McLoughlin
a0ea0f3e6f internal/opcodesextra: curated extra instructions (#345)
Supporting extra instructions not included in the Opcodes database is
currently a challenge. Short of migrating to an entirely different source
(such as #23), the options are either to patch the XML data file or to append
additional instructions at the loading phase.

An example of patching the XML was shown in the as-yet unlanded PR #234. This
shows the XML patching approach is unwieldy and requires more information than
we actually need (for example instruction form encodings).

In #335 we discussed the alternative of adding extra instructions during
loading. This has the advantage of using avo's simpler internal data
structure.

This PR prepares for using that approach by adding an `internal/opcodesextra`
package, intended to contain manually curated lists of extra instructions to
add to the instruction database during loading. At the moment, the only
instruction added here is the `MOVLQZX` instruction that's already handled
this way.

Updates #335 #234 #23
2022-11-27 18:32:31 -08:00