<p align="center">
  <img src="logo.svg" width="40%" border="0" alt="avo" />
  <br />
  <a href="https://app.shippable.com/github/mmcloughlin/avo/dashboard"><img src="https://api.shippable.com/projects/5bf9e8f059e32e0700ec360f/badge?branch=master" alt="Build Status" /></a>
  <a href="http://godoc.org/github.com/mmcloughlin/avo"><img src="http://img.shields.io/badge/godoc-reference-5272B4.svg" alt="GoDoc" /></a>
</p>

<p align="center">High-level Golang x86 Assembly Generator</p>

`avo` aims to make high-performance Go assembly easier to write, review and maintain. It's a Go package that presents a familiar assembly-like interface, together with features to simplify development without sacrificing performance:

* `avo` programs _are_ Go programs: use **control structures** for assembly generation
* **Register allocation**: write your kernels with **virtual registers** and `avo` assigns physical registers for you
* Automatic **parameter load/stores**: ensure memory offsets are always correct even for complex data structures
* Generation of **stub files** to interface with your Go package

Inspired by the [PeachPy](https://github.com/Maratyszcza/PeachPy) and [asmjit](https://github.com/asmjit/asmjit) projects.

_Note: APIs subject to change while `avo` is still in an experimental phase. You can use it to build [real things](examples) but we suggest you pin a version with your package manager of choice._

## Install

Install `avo` with `go get`:

```
$ go get -u github.com/mmcloughlin/avo
```

## Quick Start

`avo` assembly generators are pure Go programs. Let's get started with a function that adds two `uint64` values.

[embedmd]:# (examples/add/asm.go)
```go
// +build ignore

package main

import (
	. "github.com/mmcloughlin/avo/build"
)

func main() {
	TEXT("Add", "func(x, y uint64) uint64")
	Doc("Add adds x and y.")
	x := Load(Param("x"), GP64())
	y := Load(Param("y"), GP64())
	ADDQ(x, y)
	Store(y, ReturnIndex(0))
	RET()
	Generate()
}
```

You can `go run` this code to see the assembly output. To integrate this into the rest of your Go package, we recommend a [`go:generate`](https://blog.golang.org/generate) line that produces both the assembly and the corresponding Go stub file.

[embedmd]:# (examples/add/add_test.go go /.*go:generate.*/)
```go
//go:generate go run asm.go -out add.s -stubs stub.go
```

After running `go generate` the [`add.s`](examples/add/add.s) file will contain the Go assembly.

[embedmd]:# (examples/add/add.s)
```s
// Code generated by command: go run asm.go -out add.s -stubs stub.go. DO NOT EDIT.

// func Add(x uint64, y uint64) uint64
TEXT ·Add(SB), $0-24
	MOVQ x(FP), AX
	MOVQ y+8(FP), CX
	ADDQ AX, CX
	MOVQ CX, ret+16(FP)
	RET
```

The same call will produce the stub file [`stub.go`](examples/add/stub.go), which lets you call the function from your Go code.

[embedmd]:# (examples/add/stub.go)
```go
// Code generated by command: go run asm.go -out add.s -stubs stub.go. DO NOT EDIT.

package add

// Add adds x and y.
func Add(x uint64, y uint64) uint64
```

See the [`examples/add`](examples/add) directory for the complete working example.

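Once the stub is in place, one way to gain confidence in the generated function is a property test against a pure Go reference. The following is a minimal sketch using `testing/quick` from the standard library. The `checkAdd` helper is a hypothetical name, and `Add` is defined here in plain Go only so that the sketch runs on its own; in a real package the declaration would come from `stub.go` and the body from `add.s`.

```go
package main

import (
	"fmt"
	"testing/quick"
)

// Add is a pure Go stand-in (an assumption for this sketch).
// In the real package it is declared in stub.go and implemented in add.s.
func Add(x, y uint64) uint64 { return x + y }

// checkAdd compares Add against a reference implementation on random inputs.
// It returns nil when no mismatch was found.
func checkAdd() error {
	expect := func(x, y uint64) uint64 { return x + y }
	return quick.CheckEqual(Add, expect, nil)
}

func main() {
	if err := checkAdd(); err != nil {
		fmt.Println("mismatch:", err)
		return
	}
	fmt.Println("Add agrees with the reference")
}
```

In a test file the same call would typically live inside a `func TestAdd(t *testing.T)` and report the error with `t.Fatal`.
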
## Examples

See [`examples`](examples) for the full suite of examples.

### Slice Sum

Sum a slice of `uint64`s:

[embedmd]:# (examples/sum/asm.go /func main/ /^}/)
```go
func main() {
	TEXT("Sum", "func(xs []uint64) uint64")
	Doc("Sum returns the sum of the elements in xs.")
	ptr := Load(Param("xs").Base(), GP64())
	n := Load(Param("xs").Len(), GP64())
	s := GP64()
	XORQ(s, s)
	Label("loop")
	CMPQ(n, Imm(0))
	JE(LabelRef("done"))
	ADDQ(Mem{Base: ptr}, s)
	ADDQ(Imm(8), ptr)
	DECQ(n)
	JMP(LabelRef("loop"))
	Label("done")
	Store(s, ReturnIndex(0))
	RET()
	Generate()
}
```

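For readers less used to reading assembly loops, the control flow of the generator above can be written as a plain Go function. This is only an illustrative sketch: `sumRef` is a hypothetical name and is not part of the `avo` examples. Each statement is annotated with the instruction it stands for.

```go
package main

import "fmt"

// sumRef mirrors the loop emitted by the Sum generator, one statement per
// instruction. It is a reference sketch, not avo output.
func sumRef(xs []uint64) uint64 {
	var s uint64 // XORQ(s, s): clear the accumulator
	n := len(xs) // Load(Param("xs").Len(), ...)
	i := 0       // stands in for ptr, which advances 8 bytes per element
	for n != 0 { // CMPQ(n, Imm(0)); JE(LabelRef("done"))
		s += xs[i] // ADDQ(Mem{Base: ptr}, s)
		i++        // ADDQ(Imm(8), ptr)
		n--        // DECQ(n)
	} // JMP(LabelRef("loop"))
	return s // Store(s, ReturnIndex(0)); RET()
}

func main() {
	fmt.Println(sumRef([]uint64{1, 2, 3}))
}
```

A function like this also makes a convenient oracle when testing the generated `Sum`. Note that the addition wraps on overflow, just as `ADDQ` does.
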
### Parameter Load/Store

`avo` provides deconstruction of complex data types into components. For example, load the length of a string argument with:

[embedmd]:# (examples/args/asm.go go /.*TEXT.*StringLen/ /Load.*/)
```go
	TEXT("StringLen", "func(s string) int")
	strlen := Load(Param("s").Len(), GP64())
```

Index an array:

[embedmd]:# (examples/args/asm.go go /.*TEXT.*ArrayThree/ /Load.*/)
```go
	TEXT("ArrayThree", "func(a [7]uint64) uint64")
	a3 := Load(Param("a").Index(3), GP64())
```

Access a struct field (provided you have loaded your package with the `Package` function):

[embedmd]:# (examples/args/asm.go go /.*TEXT.*FieldFloat64/ /Load.*/)
```go
	TEXT("FieldFloat64", "func(s Struct) float64")
	f64 := Load(Param("s").Field("Float64"), XMM())
```

Component accesses can be arbitrarily nested:

[embedmd]:# (examples/args/asm.go go /.*TEXT.*FieldArrayTwoBTwo/ /Load.*/)
```go
	TEXT("FieldArrayTwoBTwo", "func(s Struct) byte")
	b2 := Load(Param("s").Field("Array").Index(2).Field("B").Index(2), GP8())
```

Very similar techniques apply to writing return values. See [`examples/args`](examples/args) and [`examples/returns`](examples/returns) for more.

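To get a feel for the bookkeeping that nested component accesses save you, the sketch below computes the byte offset of a nested element by hand with the `unsafe` package. The `Struct` and `Inner` types here are hypothetical stand-ins, not the types defined in `examples/args`, and the sketch only covers in-memory layout; it does not model how `avo` derives argument frame offsets.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Inner and Struct are hypothetical types for illustration only.
type Inner struct {
	A uint16
	B [3]byte
}

type Struct struct {
	Float64 float64
	Array   [4]Inner
}

// nestedOffset adds up the offset of s.Array[2].B[2] component by component,
// the same walk as Field("Array").Index(2).Field("B").Index(2).
func nestedOffset() uintptr {
	var s Struct
	return unsafe.Offsetof(s.Array) + // Field("Array")
		2*unsafe.Sizeof(s.Array[0]) + // Index(2)
		unsafe.Offsetof(s.Array[0].B) + // Field("B")
		2*unsafe.Sizeof(s.Array[0].B[0]) // Index(2)
}

// directOffset measures the same offset from real addresses, as a cross-check.
func directOffset() uintptr {
	var s Struct
	return uintptr(unsafe.Pointer(&s.Array[2].B[2])) - uintptr(unsafe.Pointer(&s))
}

// arrayIndexOffset is the offset of a[3] within a [7]uint64.
func arrayIndexOffset() uintptr {
	var a [7]uint64
	return 3 * unsafe.Sizeof(a[0])
}

func main() {
	fmt.Println(nestedOffset() == directOffset())
	fmt.Println(arrayIndexOffset())
}
```

Getting any one of these terms wrong in hand-written assembly reads the wrong bytes without any error, which is the class of mistake the component API is meant to rule out.
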
### SHA-1

[SHA-1](https://en.wikipedia.org/wiki/SHA-1) is an excellent example of how powerful this kind of technique can be. The following is a (hopefully) clearly structured implementation of SHA-1 in `avo`, which ultimately generates a [1000+ line impenetrable assembly file](examples/sha1/sha1.s).

[embedmd]:# (examples/sha1/asm.go /func main/ /^}/)
```go
func main() {
	TEXT("block", "func(h *[5]uint32, m []byte)")
	Doc("block SHA-1 hashes the 64-byte message m into the running state h.")
	h := Mem{Base: Load(Param("h"), GP64())}
	m := Mem{Base: Load(Param("m").Base(), GP64())}

	// Store message values on the stack.
	w := AllocLocal(64)
	W := func(r int) Mem { return w.Offset((r % 16) * 4) }

	// Load initial hash.
	h0, h1, h2, h3, h4 := GP32(), GP32(), GP32(), GP32(), GP32()

	MOVL(h.Offset(0), h0)
	MOVL(h.Offset(4), h1)
	MOVL(h.Offset(8), h2)
	MOVL(h.Offset(12), h3)
	MOVL(h.Offset(16), h4)

	// Initialize registers.
	a, b, c, d, e := GP32(), GP32(), GP32(), GP32(), GP32()

	MOVL(h0, a)
	MOVL(h1, b)
	MOVL(h2, c)
	MOVL(h3, d)
	MOVL(h4, e)

	// Generate round updates.
	quarter := []struct {
		F func(Register, Register, Register) Register
		K uint32
	}{
		{choose, 0x5a827999},
		{xor, 0x6ed9eba1},
		{majority, 0x8f1bbcdc},
		{xor, 0xca62c1d6},
	}

	for r := 0; r < 80; r++ {
		q := quarter[r/20]

		// Load message value.
		u := GP32()
		if r < 16 {
			MOVL(m.Offset(4*r), u)
			BSWAPL(u)
		} else {
			MOVL(W(r-3), u)
			XORL(W(r-8), u)
			XORL(W(r-14), u)
			XORL(W(r-16), u)
			ROLL(U8(1), u)
		}
		MOVL(u, W(r))

		// Compute the next state register.
		t := GP32()
		MOVL(a, t)
		ROLL(U8(5), t)
		ADDL(q.F(b, c, d), t)
		ADDL(e, t)
		ADDL(U32(q.K), t)
		ADDL(u, t)

		// Update registers.
		ROLL(Imm(30), b)
		a, b, c, d, e = t, a, b, c, d
	}

	// Final add.
	ADDL(a, h0)
	ADDL(b, h1)
	ADDL(c, h2)
	ADDL(d, h3)
	ADDL(e, h4)

	// Store results back.
	MOVL(h0, h.Offset(0))
	MOVL(h1, h.Offset(4))
	MOVL(h2, h.Offset(8))
	MOVL(h3, h.Offset(12))
	MOVL(h4, h.Offset(16))
	RET()

	Generate()
}
```

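It can help to read the generator next to a plain Go version of the same rounds. The sketch below is not taken from the `avo` repository: `blockRef` and `matchesStdlib` are hypothetical names, and the round functions are written inline as ordinary expressions. It keeps the same 16-word ring buffer for the message schedule (the role played by `W` above) and checks a single padded block against `crypto/sha1`.

```go
package main

import (
	"crypto/sha1"
	"encoding/binary"
	"fmt"
	"math/bits"
)

// blockRef is a pure Go rendering of the 80 rounds the generator unrolls.
// It is a reference sketch, not avo output.
func blockRef(h *[5]uint32, m []byte) {
	var w [16]uint32 // ring buffer, like the 64 bytes from AllocLocal(64)

	quarter := []struct {
		F func(b, c, d uint32) uint32
		K uint32
	}{
		{func(b, c, d uint32) uint32 { return ((c ^ d) & b) ^ d }, 0x5a827999},            // choose
		{func(b, c, d uint32) uint32 { return b ^ c ^ d }, 0x6ed9eba1},                    // xor
		{func(b, c, d uint32) uint32 { return (b & c) | (b & d) | (c & d) }, 0x8f1bbcdc}, // majority
		{func(b, c, d uint32) uint32 { return b ^ c ^ d }, 0xca62c1d6},                    // xor
	}

	a, b, c, d, e := h[0], h[1], h[2], h[3], h[4]
	for r := 0; r < 80; r++ {
		q := quarter[r/20]

		// Message schedule: big-endian load for the first 16 rounds, then the
		// rotated XOR of four earlier words, all indexed modulo 16.
		var u uint32
		if r < 16 {
			u = binary.BigEndian.Uint32(m[4*r:])
		} else {
			u = bits.RotateLeft32(w[(r-3)%16]^w[(r-8)%16]^w[(r-14)%16]^w[(r-16)%16], 1)
		}
		w[r%16] = u

		// Next state register, then rotate b and shift the registers along.
		t := bits.RotateLeft32(a, 5) + q.F(b, c, d) + e + q.K + u
		a, b, c, d, e = t, a, bits.RotateLeft32(b, 30), c, d
	}

	h[0] += a
	h[1] += b
	h[2] += c
	h[3] += d
	h[4] += e
}

// matchesStdlib pads a short message into one 64-byte block, hashes it with
// blockRef and compares the digest with crypto/sha1. Messages longer than
// 55 bytes do not fit in one block and are reported as false.
func matchesStdlib(msg []byte) bool {
	if len(msg) > 55 {
		return false
	}
	var m [64]byte
	copy(m[:], msg)
	m[len(msg)] = 0x80
	binary.BigEndian.PutUint64(m[56:], uint64(len(msg))*8)

	h := [5]uint32{0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476, 0xc3d2e1f0}
	blockRef(&h, m[:])

	var got [sha1.Size]byte
	for i, v := range h {
		binary.BigEndian.PutUint32(got[4*i:], v)
	}
	return got == sha1.Sum(msg)
}

func main() {
	fmt.Println(matchesStdlib([]byte("abc")))
}
```
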
This relies on the bitwise functions, which are defined as subroutines. For example, here is bitwise `choose`; the others are similar.

[embedmd]:# (examples/sha1/asm.go /func choose/ /^}/)
```go
func choose(b, c, d Register) Register {
	r := GP32()
	MOVL(d, r)
	XORL(c, r)
	ANDL(b, r)
	XORL(d, r)
	return r
}
```

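The XOR/AND/XOR sequence in `choose` may not obviously compute "pick bits of `c` where `b` is set, otherwise bits of `d`". The following sketch checks that equivalence in plain Go. The names `chooseXor`, `chooseDef` and `agreeOnAllBits` are hypothetical and exist only for this illustration. Because every operation involved acts on each bit independently, checking the eight possible combinations of one bit from each input covers all cases.

```go
package main

import "fmt"

// chooseXor follows the instruction sequence in choose:
// MOVL d, r; XORL c, r; ANDL b, r; XORL d, r.
func chooseXor(b, c, d uint32) uint32 { return ((d ^ c) & b) ^ d }

// chooseDef is the textbook form: bits of c where b is 1, bits of d elsewhere.
func chooseDef(b, c, d uint32) uint32 { return (b & c) | (^b & d) }

// agreeOnAllBits tries every combination of a single bit of b, c and d.
func agreeOnAllBits() bool {
	for i := uint32(0); i < 8; i++ {
		b, c, d := i>>2&1, i>>1&1, i&1
		if chooseXor(b, c, d) != chooseDef(b, c, d) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(agreeOnAllBits())
}
```
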
See the complete code at [`examples/sha1`](examples/sha1).

### Real Examples

* **[fnv1a](examples/fnv1a):** [FNV-1a](https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function#FNV-1a_hash) hash function.
* **[dot](examples/dot):** Vector dot product.
* **[geohash](examples/geohash):** Integer [geohash](https://en.wikipedia.org/wiki/Geohash) encoding.
* **[stadtx](examples/stadtx):** [`StadtX` hash](https://github.com/demerphq/BeagleHash) port from [dgryski/go-stadtx](https://github.com/dgryski/go-stadtx).

## Contributing

Contributions to `avo` are welcome:

* Feedback from using `avo` in a real project is incredibly valuable.
* [Submit bug reports](https://github.com/mmcloughlin/avo/issues/new) to the issues page.
* Pull requests accepted. Take a look at outstanding [issues](https://github.com/mmcloughlin/avo/issues) for ideas (especially the ["good first issue"](https://github.com/mmcloughlin/avo/labels/good%20first%20issue) label).

## License

`avo` is available under the [BSD 3-Clause License](LICENSE).