<p align="center">
<img src="logo.svg" width="40%" border="0" alt="avo" />
<br />
<img src="https://img.shields.io/github/actions/workflow/status/mmcloughlin/avo/ci.yml?style=flat-square" alt="Build Status" />
<a href="https://pkg.go.dev/github.com/mmcloughlin/avo"><img src="https://img.shields.io/badge/doc-reference-007d9b?logo=go&style=flat-square" alt="go.dev" /></a>
<a href="https://goreportcard.com/report/github.com/mmcloughlin/avo"><img src="https://goreportcard.com/badge/github.com/mmcloughlin/avo?style=flat-square" alt="Go Report Card" /></a>
</p>

<p align="center">Generate x86 Assembly with Go</p>

`avo` makes high-performance Go assembly easier to write, review and maintain. The `avo` package presents a familiar assembly-like interface that simplifies development without sacrificing performance:
* **Use Go control structures** for assembly generation; `avo` programs _are_ Go programs
* **Register allocation**: write functions with virtual registers and `avo` assigns physical registers for you
* **Automatically load arguments and store return values**: ensure memory offsets are correct for complex structures
* **Generation of stub files** to interface with your Go package

For more about `avo`:
* Introductory talk ["Better `x86` Assembly Generation with Go"](https://www.youtube.com/watch?v=6Y5CZ7_tyA4) at [dotGo 2019](https://2019.dotgo.eu/) ([slides](https://speakerdeck.com/mmcloughlin/better-x86-assembly-generation-with-go))
* [Longer tutorial at Gophercon 2019](https://www.youtube.com/watch?v=WaD8sNqroAw) showing a highly-optimized dot product ([slides](https://speakerdeck.com/mmcloughlin/better-x86-assembly-generation-with-go-gophercon-2019))
* Watch [Filippo Valsorda](https://filippo.io/) live code the [rewrite of `filippo.io/edwards25519` assembly with `avo`](https://vimeo.com/679848853)
* Explore [projects using `avo`](doc/adopters.md)
* Discuss `avo` and general Go assembly topics in the [#assembly](https://gophers.slack.com/archives/C6WDZJ70S) channel of [Gophers Slack](https://invite.slack.golangbridge.org/)

_Note: APIs subject to change while `avo` is still in an experimental phase. You can use it to build [real things](examples) but we suggest you pin a version with your package manager of choice._
## Quick Start
Install `avo` with `go get`:
```
$ go get -u github.com/mmcloughlin/avo
```
`avo` assembly generators are pure Go programs. Here's a function that adds two `uint64` values:
```go
//go:build ignore

package main

import . "github.com/mmcloughlin/avo/build"

func main() {
	TEXT("Add", NOSPLIT, "func(x, y uint64) uint64")
	Doc("Add adds x and y.")
	x := Load(Param("x"), GP64())
	y := Load(Param("y"), GP64())
	ADDQ(x, y)
	Store(y, ReturnIndex(0))
	RET()
	Generate()
}
```
`go run` this code to see the assembly output. To integrate this into the rest of your Go package we recommend a [`go:generate`](https://blog.golang.org/generate) line to produce the assembly and the corresponding Go stub file.
```go
//go:generate go run asm.go -out add.s -stubs stub.go
```
After running `go generate` the [`add.s`](examples/add/add.s) file will contain the Go assembly.
```s
// Code generated by command: go run asm.go -out add.s -stubs stub.go. DO NOT EDIT.

#include "textflag.h"

// func Add(x uint64, y uint64) uint64
TEXT ·Add(SB), NOSPLIT, $0-24
	MOVQ x+0(FP), AX
	MOVQ y+8(FP), CX
	ADDQ AX, CX
	MOVQ CX, ret+16(FP)
	RET
```
The same call will produce the stub file [`stub.go`](examples/add/stub.go) which will enable the function to be called from your Go code.
```go
// Code generated by command: go run asm.go -out add.s -stubs stub.go. DO NOT EDIT.

package add

// Add adds x and y.
func Add(x uint64, y uint64) uint64
```
See the [`examples/add`](examples/add) directory for the complete working example.
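
In the generated `add.s`, the `$0-24` on the `TEXT` line declares a zero-byte local frame and 24 bytes of arguments and results, which is where the `x+0(FP)`, `y+8(FP)` and `ret+16(FP)` offsets come from. As a rough illustration only, the sketch below models that layout with a plain Go struct. The `addFrame` type and `addOffsets` helper are hypothetical and not part of `avo`, which derives these offsets from the function signature for you.

```go
package main

import (
	"fmt"
	"unsafe"
)

// addFrame is a hypothetical model of the argument frame for
// func(x, y uint64) uint64 as seen by the generated assembly.
type addFrame struct {
	x   uint64 // x+0(FP)
	y   uint64 // y+8(FP)
	ret uint64 // ret+16(FP)
}

// addOffsets returns the byte offsets of x, y and ret within the
// modelled frame, followed by the total frame size in bytes.
func addOffsets() (x, y, ret, size uintptr) {
	var f addFrame
	return unsafe.Offsetof(f.x), unsafe.Offsetof(f.y), unsafe.Offsetof(f.ret), unsafe.Sizeof(f)
}

func main() {
	x, y, ret, size := addOffsets()
	fmt.Printf("x+%d y+%d ret+%d args=%d\n", x, y, ret, size)
}
```

Running this prints `x+0 y+8 ret+16 args=24`, matching the offsets in the assembly above.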
## Examples
See [`examples`](examples) for the full suite of examples.
### Slice Sum
Sum a slice of `uint64`s:
```go
func main() {
	TEXT("Sum", NOSPLIT, "func(xs []uint64) uint64")
	Doc("Sum returns the sum of the elements in xs.")
	ptr := Load(Param("xs").Base(), GP64())
	n := Load(Param("xs").Len(), GP64())

	Comment("Initialize sum register to zero.")
	s := GP64()
	XORQ(s, s)

	Label("loop")
	Comment("Loop until zero bytes remain.")
	CMPQ(n, Imm(0))
	JE(LabelRef("done"))

	Comment("Load from pointer and add to running sum.")
	ADDQ(Mem{Base: ptr}, s)

	Comment("Advance pointer, decrement byte count.")
	ADDQ(Imm(8), ptr)
	DECQ(n)
	JMP(LabelRef("loop"))

	Label("done")
	Comment("Store sum to return value.")
	Store(s, ReturnIndex(0))
	RET()
	Generate()
}
```
The result from this code generator is:
```s
// Code generated by command: go run asm.go -out sum.s -stubs stub.go. DO NOT EDIT.

#include "textflag.h"

// func Sum(xs []uint64) uint64
TEXT ·Sum(SB), NOSPLIT, $0-32
	MOVQ xs_base+0(FP), AX
	MOVQ xs_len+8(FP), CX

	// Initialize sum register to zero.
	XORQ DX, DX

loop:
	// Loop until zero bytes remain.
	CMPQ CX, $0x00
	JE   done

	// Load from pointer and add to running sum.
	ADDQ (AX), DX

	// Advance pointer, decrement byte count.
	ADDQ $0x08, AX
	DECQ CX
	JMP  loop

done:
	// Store sum to return value.
	MOVQ DX, ret+24(FP)
	RET
```
Full example at [`examples/sum`](examples/sum).
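
When testing generated assembly it often helps to compare it against a plain Go reference. The following is a minimal sketch that mirrors the loop structure of the generator above. `sumRef` is a hypothetical name chosen for illustration and is not taken from the `avo` examples.

```go
package main

import "fmt"

// sumRef is a hypothetical pure-Go reference for the generated Sum function.
// It follows the same steps as the assembly loop: zero the accumulator, then
// add one element per iteration until the remaining count reaches zero.
func sumRef(xs []uint64) uint64 {
	var s uint64 // corresponds to XORQ s, s
	// n plays the role of the count register: compared with zero, then decremented.
	for n := len(xs); n != 0; n-- {
		// The index advances by one element, as the pointer advances by 8 bytes.
		s += xs[len(xs)-n]
	}
	return s
}

func main() {
	fmt.Println(sumRef([]uint64{1, 2, 3}))       // 6
	fmt.Println(sumRef(nil))                     // 0
	fmt.Println(sumRef([]uint64{^uint64(0), 2})) // 1, unsigned addition wraps around
}
```

In a package test, the assembly-backed `Sum` could be checked against such a reference on empty, short and overflowing inputs.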
### Features
For demonstrations of `avo` features:
* **[args](examples/args):** Loading function arguments.
* **[returns](examples/returns):** Building return values.
* **[complex](examples/complex):** Working with `complex{64,128}` types.
* **[data](examples/data):** Defining `DATA` sections.
* **[ext](examples/ext):** Interacting with types from external packages.
* **[pragma](examples/pragma):** Applying compiler directives to generated functions.
### Real Examples
Implementations of full algorithms:
* **[sha1](examples/sha1):** [SHA-1](https://en.wikipedia.org/wiki/SHA-1) cryptographic hash.
* **[fnv1a](examples/fnv1a):** [FNV-1a](https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function#FNV-1a_hash) hash function.
* **[dot](examples/dot):** Vector dot product.
* **[md5x16](examples/md5x16):** AVX-512 accelerated [MD5](https://en.wikipedia.org/wiki/MD5).
* **[geohash](examples/geohash):** Integer [geohash](https://en.wikipedia.org/wiki/Geohash) encoding.
* **[stadtx](examples/stadtx):** [`StadtX` hash](https://github.com/demerphq/BeagleHash) port from [dgryski/go-stadtx](https://github.com/dgryski/go-stadtx).
## Adopters
Popular projects[^projects] using `avo`:

[^projects]: Projects drawn from the `avo` third-party test suite. Popularity
estimated from GitHub star count collected on Mar 1, 2025.

<img src="https://images.weserv.nl?fit=cover&h=24&mask=circle&maxage=7d&url=https%3A%2F%2Fgithub.com%2Fgolang.png&w=24" width="24" height="24" hspace="4" valign="middle" /> [golang / **go**](https://github.com/golang/go)
:star: 126.1k
> The Go programming language

<img src="https://images.weserv.nl?fit=cover&h=24&mask=circle&maxage=7d&url=https%3A%2F%2Fgithub.com%2Fklauspost.png&w=24" width="24" height="24" hspace="4" valign="middle" /> [klauspost / **compress**](https://github.com/klauspost/compress)
:star: 4.9k
> Optimized Go Compression Packages

<img src="https://images.weserv.nl?fit=cover&h=24&mask=circle&maxage=7d&url=https%3A%2F%2Fgithub.com%2Fgolang.png&w=24" width="24" height="24" hspace="4" valign="middle" /> [golang / **crypto**](https://github.com/golang/crypto)
:star: 3.1k
> [mirror] Go supplementary cryptography libraries

<img src="https://images.weserv.nl?fit=cover&h=24&mask=circle&maxage=7d&url=https%3A%2F%2Fgithub.com%2Fklauspost.png&w=24" width="24" height="24" hspace="4" valign="middle" /> [klauspost / **reedsolomon**](https://github.com/klauspost/reedsolomon)
:star: 1.9k
> Reed-Solomon Erasure Coding in Go

<img src="https://images.weserv.nl?fit=cover&h=24&mask=circle&maxage=7d&url=https%3A%2F%2Fgithub.com%2Fbytedance.png&w=24" width="24" height="24" hspace="4" valign="middle" /> [bytedance / **gopkg**](https://github.com/bytedance/gopkg)
:star: 1.8k
> Universal Utilities for Go

<img src="https://images.weserv.nl?fit=cover&h=24&mask=circle&maxage=7d&url=https%3A%2F%2Fgithub.com%2Fcloudflare.png&w=24" width="24" height="24" hspace="4" valign="middle" /> [cloudflare / **circl**](https://github.com/cloudflare/circl)
:star: 1.4k
> CIRCL: Cloudflare Interoperable Reusable Cryptographic Library

<img src="https://images.weserv.nl?fit=cover&h=24&mask=circle&maxage=7d&url=https%3A%2F%2Fgithub.com%2Fsegmentio.png&w=24" width="24" height="24" hspace="4" valign="middle" /> [segmentio / **asm**](https://github.com/segmentio/asm)
:star: 882
> Go library providing algorithms optimized to leverage the characteristics of modern CPUs

<img src="https://images.weserv.nl?fit=cover&h=24&mask=circle&maxage=7d&url=https%3A%2F%2Fgithub.com%2Fzeebo.png&w=24" width="24" height="24" hspace="4" valign="middle" /> [zeebo / **xxh3**](https://github.com/zeebo/xxh3)
:star: 429
> XXH3 algorithm in Go

<img src="https://images.weserv.nl?fit=cover&h=24&mask=circle&maxage=7d&url=https%3A%2F%2Fgithub.com%2Fzeebo.png&w=24" width="24" height="24" hspace="4" valign="middle" /> [zeebo / **blake3**](https://github.com/zeebo/blake3)
:star: 414
> Pure Go implementation of BLAKE3 with AVX2 and SSE4.1 acceleration

<img src="https://images.weserv.nl?fit=cover&h=24&mask=circle&maxage=7d&url=https%3A%2F%2Fgithub.com%2Flukechampine.png&w=24" width="24" height="24" hspace="4" valign="middle" /> [lukechampine / **blake3**](https://github.com/lukechampine/blake3)
:star: 368
> An AVX-512 accelerated implementation of the BLAKE3 cryptographic hash function

See the [full list of projects using `avo`](doc/adopters.md).
## Contributing
Contributions to `avo` are welcome:
* Feedback from using `avo` in a real project is incredibly valuable. Consider [porting an existing project to `avo`](https://github.com/mmcloughlin/avo/issues/40).
* [Submit bug reports](https://github.com/mmcloughlin/avo/issues/new) to the issues page.
* Pull requests accepted. Take a look at outstanding [issues](https://github.com/mmcloughlin/avo/issues) for ideas (especially the ["good first issue"](https://github.com/mmcloughlin/avo/labels/good%20first%20issue) label).
* Join us in the [#assembly](https://gophers.slack.com/archives/C6WDZJ70S) channel of [Gophers Slack](https://invite.slack.golangbridge.org/).
## Credits
Inspired by the [PeachPy](https://github.com/Maratyszcza/PeachPy) and [asmjit](https://github.com/asmjit/asmjit) projects. Thanks to [Damian Gryski](https://github.com/dgryski) for advice, and his [extensive library of PeachPy Go projects](https://github.com/mmcloughlin/avo/issues/40).
## License
`avo` is available under the [BSD 3-Clause License](LICENSE).