Compare commits

8 Commits

Author SHA1 Message Date
Austin Clements
096a80ff51 [dev.simd] simd/archsimd: 128- and 256-bit FMA operations do not require AVX-512
Currently, all FMA operations are marked as requiring AVX512, even on
smaller vector widths. This is happening because the narrower FMA
operations are marked as extension "FMA" in the XED. Since this
extension doesn't start with "AVX", we filter them out very early in
the XED process. However, this is just a quirk of naming: the FMA
feature depends on the AVX feature, so it is part of AVX, even if it
doesn't say so on the tin.

Fix this by accepting the FMA extension and adding FMA to the table of
CPU features. We also tweak internal/cpu slightly so it correctly
enforces that the logical FMA feature depends on both the FMA and AVX
CPUID flags.

This actually *deletes* a lot of generated code because we no longer
need the AVX-512 encoding of these 128- and 256-bit operations.

Change-Id: I744a18d0be888f536ac034fe88b110347622be7e
Reviewed-on: https://go-review.googlesource.com/c/go/+/736160
Auto-Submit: Austin Clements <austin@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-01-13 11:45:23 -08:00
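The dependency described in this commit message (the logical FMA feature needs both the FMA and AVX CPUID flags) can be sketched in a few lines. This is only an illustration: `rawCPUID` and `hasLogicalFMA` are hypothetical names, not the actual internal/cpu types or functions.

```go
package main

import "fmt"

// rawCPUID holds raw CPUID flag bits. The type and its fields are
// hypothetical and do not mirror the real internal/cpu layout.
type rawCPUID struct {
	AVX bool
	FMA bool
}

// hasLogicalFMA reports whether the logical FMA feature is available.
// Following the commit message, it requires both the FMA and AVX flags.
func hasLogicalFMA(r rawCPUID) bool {
	return r.AVX && r.FMA
}

func main() {
	cases := []rawCPUID{
		{AVX: true, FMA: true},
		{AVX: false, FMA: true},
		{AVX: true, FMA: false},
	}
	for _, r := range cases {
		fmt.Printf("AVX=%v FMA=%v -> logical FMA=%v\n", r.AVX, r.FMA, hasLogicalFMA(r))
	}
}
```

Deriving the logical flag in one place means callers need a single check and cannot accidentally test the FMA bit without the AVX bit.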
Austin Clements
d8720e1c29 [dev.simd] simd/archsimd/_gen/simdgen: feature implications
This simplifies our handling of XED features, adds a table of which
features imply which other features, and adds this information to the
documentation of the CPU features APIs.

As part of this we fix an issue around the "AVXAES" feature. AVXAES is
defined as the combination of the AVX and AES CPUID flags. Several
other features also work like this, but have hand-written logic in
internal/cpu to compute logical feature flags from the underlying
CPUID bits. For these, we expose a single feature check function from
the SIMD API.

AVXAES currently doesn't work like this: it requires the user to check
both features. However, this forces the SIMD API to expose an "AES"
feature check, which really has nothing to do with SIMD. To make this
consistent, we introduce an AVXAES feature check function and use it
in feature requirement docs. Unlike the other combo features, this is
implemented in the simd package, but the difference is invisible to
the user.

Change-Id: I2985ebd361f0ecd45fd428903efe4c981a5ec65d
Reviewed-on: https://go-review.googlesource.com/c/go/+/736100
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
2026-01-13 09:25:01 -08:00
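A table of "which features imply which other features" can be modeled as a map plus a transitive closure, as in the rough sketch below. The table contents and the names `implies` and `closure` are assumptions made for illustration. They are not the table that simdgen generates.

```go
package main

import (
	"fmt"
	"sort"
)

// implies maps a feature to the features it directly implies.
// The entries are illustrative only; the real table lives in simdgen.
var implies = map[string][]string{
	"AVX2":   {"AVX"},
	"FMA":    {"AVX"},
	"AVXAES": {"AVX"},
	"AVX512": {"AVX2", "FMA"}, // hypothetical entry to show transitivity
}

// closure returns, in sorted order, every feature reachable from
// feature through the implies table, excluding feature itself.
func closure(feature string) []string {
	seen := map[string]bool{}
	var visit func(f string)
	visit = func(f string) {
		for _, g := range implies[f] {
			if !seen[g] {
				seen[g] = true
				visit(g)
			}
		}
	}
	visit(feature)
	out := make([]string, 0, len(seen))
	for f := range seen {
		out = append(out, f)
	}
	sort.Strings(out)
	return out
}

func main() {
	for _, f := range []string{"AVXAES", "AVX512"} {
		fmt.Println(f, "implies", closure(f))
	}
}
```

With a closure like this, documentation for an operation only has to name its strongest required feature, and the implied ones can be listed automatically.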
Cherry Mui
4a702376b7 [dev.simd] all: merge master (e84983f) into dev.simd
Merge List:

+ 2026-01-02 e84983fa40 cmd/compile: optimize SIMD IsNaN.Or(IsNaN)
+ 2026-01-02 8244b85677 simd/archsimd: add tests for IsNaN
+ 2026-01-02 13440fb518 simd/archsimd: make IsNaN unary
+ 2026-01-02 c3550b3352 simd/archsimd: correct documentation of Mask types
+ 2026-01-02 34ad26341d net/rpc: correct comment for isExportedOrBuiltinType function
+ 2025-12-30 b28808d838 cmd/go/internal/modindex: fix obvious bug using failed type assertion
+ 2025-12-30 d64add4d60 simd/archsimd: adjust documentations slightly
+ 2025-12-30 1843cfbcd6 runtime/secret: make tests more sturdy
+ 2025-12-30 fd45d70799 all: fix some minor grammatical issues in the comments
+ 2025-12-30 df4e08ac65 test/codegen: fix a tab in comparisons.go to ensure pattern works
+ 2025-12-30 cd668d744f cmd/compile: disable inlining for functions using runtime.deferrangefunc
+ 2025-12-29 06eff0f7c3 simd/archsimd: add tests for Saturate-Concat operations
+ 2025-12-29 110aaf7137 simd/archsimd: add tests for Saturate operations
+ 2025-12-29 22e7b94e7f simd/archsimd: add tests for ExtendLo operations
+ 2025-12-29 76dddce293 simd/archsimd: remove redundant suffix of ExtendLo operations
+ 2025-12-29 6ecdd2fc6e simd/archsimd: add more tests for Convert operations
+ 2025-12-29 e0c99fe285 simd/archsimd: add more tests for Truncate operations
+ 2025-12-29 08369369e5 reflect: document Call/CallSlice panic when v is unexported field
+ 2025-12-29 ca8effbde1 internal/coverage/decodemeta: correct wording in unknown version error
+ 2025-12-29 0b06b68e21 encoding/gob: clarify docs about pointers to zero values not being sent
+ 2025-12-29 9cb3edbfe9 regexp: standardize error message format in find_test.go
+ 2025-12-29 b3ed0627ce tests: improve consistency and clarity of test diagnostics
+ 2025-12-29 3dcb48d298 test: follow got/want convention in uintptrescapes test
+ 2025-12-29 f7b7e94b0a test: clarify log message for surrogate UTF-8 check
+ 2025-12-29 e790d59674 simd/archsimd: add tests for Truncate operations
+ 2025-12-27 f4cec7917c cmd: fix unused errors reported by ineffassign
+ 2025-12-27 ca13fe02c4 simd/archsimd: add more tests for Convert operations
+ 2025-12-27 037c047f2c simd/archsimd: add more tests for Extend operations
+ 2025-12-26 7971fcdf53 test/codegen: tidy tests for bits
+ 2025-12-24 0f620776d7 simd/archsimd: fix "go generate" command
+ 2025-12-24 a5fe8c07ae simd/archsimd: guard test helpers with amd64 tag
+ 2025-12-23 a23d1a4ebe bytes: improve consistency in split test messages
+ 2025-12-23 866e461b96 cmd/go: update pkgsite doc command to v0.0.0-20251223195805-1a3bd3c788fe
+ 2025-12-23 08dc8393d7 time: skip test that will fail with GO111MODULE=off
+ 2025-12-23 43ebed88cc runtime: improve a log message in TestCleanupLost
+ 2025-12-23 81283ad339 runtime: fix nGsyscallNoP accounting
+ 2025-12-23 3e0e1667f6 test/codegen: codify bit related code generation for riscv64
+ 2025-12-23 3faf988f21 errors: add a test verifying join does not flatten errors
+ 2025-12-23 2485a0bc2c cmd/asm/internal/asm: run riscv64 end-to-end tests for each profile
+ 2025-12-23 8254d66eab cmd/asm/internal/asm: abort end to end test if assembly failed
+ 2025-12-23 1b3db48db7 Revert "errors: optimize errors.Join for single unwrappable errors"
+ 2025-12-23 b6b8b2fe6e cmd/compile: handle propagating an out-of-range jump table index
+ 2025-12-22 2cd0371a0a debug/pe: avoid panic in File.ImportedSymbols
+ 2025-12-22 91435be153 runtime: revert entry point on freebsd/arm64
+ 2025-12-22 c1efada1d2 simd/archsimd: correct documentation for pairwise operations
+ 2025-12-22 3d77a0b15e os/exec: second call to Cmd.Start is always an error
+ 2025-12-20 7ecb1f36ac simd/archsimd: add HasAVX2() guards to tests that need them
+ 2025-12-19 70c22e0ad7 simd/archsimd: delete DotProductQuadruple methods for now
+ 2025-12-19 42cda7c1df simd/archsimd: add Grouped for 256- and 512-bit SaturateTo(U)Int16Concat, and fix type
+ 2025-12-19 baa0ae3aaa simd/archsimd: correct type and instruction for SaturateToUint8
+ 2025-12-19 d46c58debb go/doc: link to struct fields in the same package
+ 2025-12-19 25ed6c7f9b cmd/go/internal/doc: update pkgsite version
+ 2025-12-19 4411edf972 simd/archsimd: reword documentation for some operations
+ 2025-12-19 7d9418a19c simd/archsimd: reword documentation of comparison operations
+ 2025-12-18 d00e96d3ae internal/cpu: repair VNNI feature check

Change-Id: Iae8572e60076e52ef0121408041c1959a97515ea
2026-01-02 19:42:53 -05:00
Cherry Mui
3132179209 [dev.simd] all: merge master (cfc024d) into dev.simd
Merge List:

+ 2025-12-18 cfc024daeb simd/archsimd: reword documentation for conversion ops
+ 2025-12-17 ad91f5d241 simd/archsimd: reword documentation of shfit operations
+ 2025-12-17 b8c4cc63e7 runtime: keep track of secret allocation size
+ 2025-12-17 8564fede89 cmd/go: remove reference to no longer existing -i flag
+ 2025-12-17 eecdb61eeb crypto: rename fips140v2.0 to fips140v1.26
+ 2025-12-17 05e41225f6 simd/archsimd: reword documentation of As methods
+ 2025-12-17 516699848b runtime/secret: warn users about allocations, loosen guarantees
+ 2025-12-16 8c28ab936a cmd/cgo: don't emit C local if it is not used
+ 2025-12-16 65b71c11d4 crypto/internal/fips140only: test fips140=only mode
+ 2025-12-16 ea1aa76554 go/doc: exclude examples with results
+ 2025-12-16 5046bdf8a6 crypto/tls: reject trailing messages after client/server hello
+ 2025-12-16 3f6eabdf09 cmd/compile: use unsigned constant when folding loads for SIMD ops with constants
+ 2025-12-16 a4b5b92055 cmd/dist: preserve existing GOEXPERIMENTs when running tests with additional experiments
+ 2025-12-15 d14b6427cf cmd/link: set canUsePlugins only on platforms that support plugin
+ 2025-12-15 9ba0948172 cmd/link: don't create __x86.get_pc_thunk symbol if it already exists
+ 2025-12-15 b7944a5f80 text/template: fix slice builtin for pointers to arrays
+ 2025-12-15 6713f46426 archive/tar, compress/bzip2: base64 some troublesome testdata files
+ 2025-12-15 388eb10f50 cmd/go: show comparable in go doc interface output
+ 2025-12-13 1b291b70df simd/archsimd: skip tests if AVX is not available
+ 2025-12-12 d30884ba1f cmd/dist: test GOEXPERIMENT=simd on AMD64
+ 2025-12-12 ee0275d15b runtime, cmd/link: tighten search for stackObjectRecord
+ 2025-12-12 63fced531c runtime/secret: restore goroutine behavior to proposal
+ 2025-12-12 6455afbc6f runtime: dropg after emitting trace event in preemptPark
+ 2025-12-12 8f45611e78 runtime/pprof: deflake TestGoroutineLeakProfileConcurrency
+ 2025-12-12 34af879dde cmd/link: add new linknames to blocked linkname list
+ 2025-12-12 8d31244562 runtime/secret: guard files with goexperiment
+ 2025-12-12 e8a83788a4 go/doc: reuse import name logic for examples
+ 2025-12-11 927c89bbc5 cmd/compile: update ABI document for riscv64
+ 2025-12-11 245bcdd478 runtime: add extra subtest layer to TestFinalizerOrCleanupDeadlock
+ 2025-12-11 ae62a1bd36 Revert "database/sql: allow drivers to override Scan behavior"
+ 2025-12-11 89614ad264 runtime/trace: fix broken TestSubscribers
+ 2025-12-11 bb2337f24c cmd/go: set GOOS in vet_asm test
+ 2025-12-11 2622d2955b go/types, types2: remove indirection of Named.finite
+ 2025-12-11 5818c9d714 encoding/json/jsontext: add symbolic Kind constants
+ 2025-12-11 9de6468701 json/jsontext: normalize all invalid Kinds to 0
+ 2025-12-11 00642ee23b encoding/json: report true from v2 Decoder.More when an error is pending
+ 2025-12-11 7b60d06739 lib/time: update to 2025c/2025c
+ 2025-12-11 1de9585be2 runtime: prevent calls to GOMAXPROCS while clearing P trace state
+ 2025-12-11 e38c38f0e5 internal/trace: correctly handle GoUndetermined for GoroutineSummary
+ 2025-12-11 c0ba519764 simd/archsimd: rename Mask.AsIntMxN to Mask.ToIntMxN
+ 2025-12-11 f110ba540c simd/archsimd: define ToMask only on integer vectors
+ 2025-12-11 1da0c29c2a simd/archsimd: add package doc
+ 2025-12-11 f105dfd048 lib/fips140: freeze v1.1.0-rc1 FIPS 140 module zip file
+ 2025-12-11 af14f67911 runtime: make goroutines inherit DIT state, don't lock to OS thread
+ 2025-12-11 72c83bcc80 go/types, types2: put Named.finite behind Named.mu
+ 2025-12-10 b2a697bd06 all: update to x/crypto@7dacc380ba
+ 2025-12-10 fc66a5655b crypto: clean up subprocess-spawning tests
+ 2025-12-10 b130dab792 crypto/hpke: apply fips140.WithoutEnforcement to ML-KEM+X25519 hybrid
+ 2025-12-10 c39fe18fea crypto/mlkem/mlkemtest: error out in fips140=only mode
+ 2025-12-10 db0ab834d6 crypto/hpke: don't corrupt enc's excess capacity in DHKEM decap
+ 2025-12-10 cd873cf7e9 crypto/internal/fips140/aes/gcm: don't panic on bad nonces out of FIPS 140-3 mode
+ 2025-12-10 550c0c898b crypto/hpke: use new gcm.NewGCMForHPKE for FIPS 140-3 compliance
+ 2025-12-10 d349854de6 crypto/internal: ACVP test data migrated to Geomys repo
+ 2025-12-10 cd9319ff8e runtime: use correct function name in methodValueCallFrameObjs comment
+ 2025-12-10 0d71bd57c9 runtime: VZEROUPPER at the end of FilterNilAVX512
+ 2025-12-09 36bca3166e cmd: fix some issues in the comments
+ 2025-12-09 b9693a2d9a runtime: on AIX check isarchive before calling libpreinit
+ 2025-12-09 1274d58dac go/types, types2: add check for finite size at value observance
+ 2025-12-08 9e09812308 all: REVERSE MERGE dev.simd (c456ab7) into master

Change-Id: I66a46d710bf4738bf2ceaad20af4211fa5b67863
2025-12-18 12:06:59 -05:00
Cherry Mui
29842d6b23 [dev.simd] simd/archsimd: rename Mask.AsIntMxN to Mask.ToIntMxN
To be more consistent with vector.ToMask and mask.ToBits.

Change-Id: I47f9b7c66efd622b97da9025f667ad32415ef750
Reviewed-on: https://go-review.googlesource.com/c/go/+/729022
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-12-10 12:08:35 -08:00
Cherry Mui
63d1eec2bb [dev.simd] simd/archsimd: define ToMask only on integer vectors
The ToMask method is for converting an AVX2-style mask
represented in a vector to the Mask type. The AVX2-style mask is
a (signed) integer, so define ToMask only on integer vectors.

Change-Id: Id0976c89de4ef02fef61470a5acbd71fd1b5d4ee
Reviewed-on: https://go-review.googlesource.com/c/go/+/729020
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
2025-12-10 12:08:06 -08:00
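An AVX2-style mask, as this commit message describes it, is a vector of signed integers, so the sign bit of each lane carries the truth value. The scalar model below shows one way to read such a vector out as packed bits. It assumes eight 32-bit lanes and uses a hypothetical helper `signBits`. It does not use the archsimd API.

```go
package main

import "fmt"

// signBits packs the sign bit of each lane into an integer,
// with lane 0 in bit 0. A negative lane counts as "true".
func signBits(lanes [8]int32) uint8 {
	var bits uint8
	for i, lane := range lanes {
		if lane < 0 {
			bits |= 1 << uint(i)
		}
	}
	return bits
}

func main() {
	// Lanes 0, 2, 3 and 7 are all ones (-1); the rest are zero.
	m := [8]int32{-1, 0, -1, -1, 0, 0, 0, -1}
	fmt.Printf("%08b\n", signBits(m))
}
```

Because only the sign bit is inspected, the interpretation is natural for signed integer lanes, which matches the rationale for defining ToMask only on integer vectors.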
Cherry Mui
636119f3d0 [dev.simd] simd/archsimd: add package doc
Change-Id: Ic5ce8d3e0b8a690525b93d3f269feba93a812175
Reviewed-on: https://go-review.googlesource.com/c/go/+/728560
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-12-09 13:35:55 -08:00
Cherry Mui
c87e344f7a [dev.simd] internal/buildcfg: turn GOEXPERIMENT=simd back on
Turn the experiment back on by default on the dev.simd branch, for
the ease of experimenting and development.

Change-Id: I31171c839d8359d50c1b66ca51a7945e665df289
Reviewed-on: https://go-review.googlesource.com/c/go/+/728461
TryBot-Bypass: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
2025-12-09 13:35:50 -08:00
63 changed files with 1394 additions and 2286 deletions

View File

@@ -1,2 +0,0 @@
go1.26rc2
time 2026-01-08T21:54:29Z

View File

@@ -1,2 +1,2 @@
branch: release-branch.go1.26
branch: dev.simd
parent-branch: master

View File

@@ -163,13 +163,6 @@ will fail early. The default value is `httpcookiemaxnum=3000`. Setting
number of cookies. To avoid denial of service attacks, this setting and default
was backported to Go 1.25.2 and Go 1.24.8.
Go 1.26 added a new `urlmaxqueryparams` setting that controls the maximum number
of query parameters that net/url will accept when parsing a URL-encoded query string.
If the number of parameters exceeds the number set in `urlmaxqueryparams`,
parsing will fail early. The default value is `urlmaxqueryparams=10000`.
Setting `urlmaxqueryparams=0` disables the limit. To avoid denial of service attacks,
this setting and default was backported to Go 1.25.4 and Go 1.24.10.
Go 1.26 added a new `urlstrictcolons` setting that controls whether `net/url.Parse`
allows malformed hostnames containing colons outside of a bracketed IPv6 address.
The default `urlstrictcolons=1` rejects URLs such as `http://localhost:1:2` or `http://::1/`.

View File

@@ -10,4 +10,4 @@
# go test cmd/go/internal/fips140 -update
#
v1.0.0-c2097c7c.zip daf3614e0406f67ae6323c902db3f953a1effb199142362a039e7526dfb9368b
v1.26.0.zip 9b28f847fdf1db4a36cb2b2f8ec09443c039383f085630a03ecfaddf6db7ea23
v1.1.0-rc1.zip ea94f8c3885294c9efe1bd8f9b6e86daeb25b6aff2aeb20707cd9a5101f6f54e

View File

@@ -834,16 +834,7 @@ func (r *Reader) initFileList() {
continue
}
dir := name
for {
if idx := strings.LastIndex(dir, "/"); idx < 0 {
break
} else {
dir = dir[:idx]
}
if dirs[dir] {
break
}
for dir := path.Dir(name); dir != "."; dir = path.Dir(dir) {
dirs[dir] = true
}
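The hunk above contains two ways of recording a file's ancestor directories. As a standalone sketch, the variant that stops at the first ancestor already recorded could look like the following. `markParents` and its step counter are hypothetical and exist only to make the early exit visible. The sketch assumes slash-separated relative names.

```go
package main

import (
	"fmt"
	"strings"
)

// markParents records every ancestor directory of name in dirs and
// returns how many new directories it added. It stops as soon as it
// reaches an ancestor that is already recorded, since that ancestor's
// own parents must have been recorded earlier.
func markParents(dirs map[string]bool, name string) int {
	added := 0
	dir := name
	for {
		idx := strings.LastIndex(dir, "/")
		if idx < 0 {
			break // no more parent components
		}
		dir = dir[:idx]
		if dirs[dir] {
			break // already known, so its ancestors are too
		}
		dirs[dir] = true
		added++
	}
	return added
}

func main() {
	dirs := map[string]bool{}
	fmt.Println(markParents(dirs, "a/b/c/data")) // adds a/b/c, a/b, a
	fmt.Println(markParents(dirs, "a/b/d/data")) // adds only a/b/d
}
```

For archives with many files under deep, shared directory trees, the early exit keeps the work per file proportional to the number of new directories rather than to the full path depth.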

View File

@@ -9,7 +9,6 @@ import (
"encoding/binary"
"encoding/hex"
"errors"
"fmt"
"internal/obscuretestdata"
"io"
"io/fs"
@@ -1875,83 +1874,3 @@ func TestBaseOffsetPlusOverflow(t *testing.T) {
// as the section reader offset & size were < 0.
NewReader(bytes.NewReader(data), int64(len(data))+1875)
}
func BenchmarkReaderOneDeepDir(b *testing.B) {
var buf bytes.Buffer
zw := NewWriter(&buf)
for i := range 4000 {
name := strings.Repeat("a/", i) + "data"
zw.CreateHeader(&FileHeader{
Name: name,
Method: Store,
})
}
if err := zw.Close(); err != nil {
b.Fatal(err)
}
data := buf.Bytes()
for b.Loop() {
zr, err := NewReader(bytes.NewReader(data), int64(len(data)))
if err != nil {
b.Fatal(err)
}
zr.Open("does-not-exist")
}
}
func BenchmarkReaderManyDeepDirs(b *testing.B) {
var buf bytes.Buffer
zw := NewWriter(&buf)
for i := range 2850 {
name := fmt.Sprintf("%x", i)
name = strings.Repeat("/"+name, i+1)[1:]
zw.CreateHeader(&FileHeader{
Name: name,
Method: Store,
})
}
if err := zw.Close(); err != nil {
b.Fatal(err)
}
data := buf.Bytes()
for b.Loop() {
zr, err := NewReader(bytes.NewReader(data), int64(len(data)))
if err != nil {
b.Fatal(err)
}
zr.Open("does-not-exist")
}
}
func BenchmarkReaderManyShallowFiles(b *testing.B) {
var buf bytes.Buffer
zw := NewWriter(&buf)
for i := range 310000 {
name := fmt.Sprintf("%v", i)
zw.CreateHeader(&FileHeader{
Name: name,
Method: Store,
})
}
if err := zw.Close(); err != nil {
b.Fatal(err)
}
data := buf.Bytes()
for b.Loop() {
zr, err := NewReader(bytes.NewReader(data), int64(len(data)))
if err != nil {
b.Fatal(err)
}
zr.Open("does-not-exist")
}
}

View File

@@ -301,12 +301,17 @@ func (f *File) saveExport(x any, context astContext) {
error_(c.Pos(), "export comment has wrong name %q, want %q", name, n.Name.Name)
}
doc := ""
for _, c1 := range n.Doc.List {
if c1 != c {
doc += c1.Text + "\n"
}
}
f.ExpFunc = append(f.ExpFunc, &ExpFunc{
Func: n,
ExpName: name,
// Caution: Do not set the Doc field on purpose
// to ensure that there are no unintended artifacts
// in the binary. See https://go.dev/issue/76697.
Doc: doc,
})
break
}

View File

@@ -25,23 +25,23 @@ func ssaGenSIMDValue(s *ssagen.State, v *ssa.Value) bool {
ssa.OpAMD64VPABSQ128,
ssa.OpAMD64VPABSQ256,
ssa.OpAMD64VPABSQ512,
ssa.OpAMD64VPBROADCASTQ128,
ssa.OpAMD64VBROADCASTSS128,
ssa.OpAMD64VBROADCASTSD256,
ssa.OpAMD64VPBROADCASTD128,
ssa.OpAMD64VPBROADCASTQ256,
ssa.OpAMD64VBROADCASTSS256,
ssa.OpAMD64VBROADCASTSD512,
ssa.OpAMD64VPBROADCASTW128,
ssa.OpAMD64VPBROADCASTD256,
ssa.OpAMD64VPBROADCASTQ512,
ssa.OpAMD64VBROADCASTSS512,
ssa.OpAMD64VPBROADCASTQ128,
ssa.OpAMD64VPBROADCASTB128,
ssa.OpAMD64VPBROADCASTW256,
ssa.OpAMD64VPBROADCASTD512,
ssa.OpAMD64VPBROADCASTW128,
ssa.OpAMD64VPBROADCASTD128,
ssa.OpAMD64VBROADCASTSS256,
ssa.OpAMD64VBROADCASTSD256,
ssa.OpAMD64VPBROADCASTB256,
ssa.OpAMD64VPBROADCASTW512,
ssa.OpAMD64VPBROADCASTW256,
ssa.OpAMD64VPBROADCASTD256,
ssa.OpAMD64VPBROADCASTQ256,
ssa.OpAMD64VBROADCASTSS512,
ssa.OpAMD64VBROADCASTSD512,
ssa.OpAMD64VPBROADCASTB512,
ssa.OpAMD64VPBROADCASTW512,
ssa.OpAMD64VPBROADCASTD512,
ssa.OpAMD64VPBROADCASTQ512,
ssa.OpAMD64VCVTPD2PSX128,
ssa.OpAMD64VCVTPD2PSY128,
ssa.OpAMD64VCVTPD2PS256,
@@ -832,23 +832,23 @@ func ssaGenSIMDValue(s *ssagen.State, v *ssa.Value) bool {
ssa.OpAMD64VPABSQMasked128,
ssa.OpAMD64VPABSQMasked256,
ssa.OpAMD64VPABSQMasked512,
ssa.OpAMD64VPBROADCASTQMasked128,
ssa.OpAMD64VBROADCASTSSMasked128,
ssa.OpAMD64VBROADCASTSDMasked256,
ssa.OpAMD64VPBROADCASTDMasked128,
ssa.OpAMD64VPBROADCASTQMasked256,
ssa.OpAMD64VBROADCASTSSMasked256,
ssa.OpAMD64VBROADCASTSDMasked512,
ssa.OpAMD64VPBROADCASTWMasked128,
ssa.OpAMD64VPBROADCASTDMasked256,
ssa.OpAMD64VPBROADCASTQMasked512,
ssa.OpAMD64VBROADCASTSSMasked512,
ssa.OpAMD64VPBROADCASTQMasked128,
ssa.OpAMD64VPBROADCASTBMasked128,
ssa.OpAMD64VPBROADCASTWMasked256,
ssa.OpAMD64VPBROADCASTDMasked512,
ssa.OpAMD64VPBROADCASTWMasked128,
ssa.OpAMD64VPBROADCASTDMasked128,
ssa.OpAMD64VBROADCASTSSMasked256,
ssa.OpAMD64VBROADCASTSDMasked256,
ssa.OpAMD64VPBROADCASTBMasked256,
ssa.OpAMD64VPBROADCASTWMasked512,
ssa.OpAMD64VPBROADCASTWMasked256,
ssa.OpAMD64VPBROADCASTDMasked256,
ssa.OpAMD64VPBROADCASTQMasked256,
ssa.OpAMD64VBROADCASTSSMasked512,
ssa.OpAMD64VBROADCASTSDMasked512,
ssa.OpAMD64VPBROADCASTBMasked512,
ssa.OpAMD64VPBROADCASTWMasked512,
ssa.OpAMD64VPBROADCASTDMasked512,
ssa.OpAMD64VPBROADCASTQMasked512,
ssa.OpAMD64VCOMPRESSPSMasked128,
ssa.OpAMD64VCOMPRESSPSMasked256,
ssa.OpAMD64VCOMPRESSPSMasked512,
@@ -1959,23 +1959,11 @@ func ssaGenSIMDValue(s *ssagen.State, v *ssa.Value) bool {
ssa.OpAMD64VPERMI2Q256load,
ssa.OpAMD64VPERMI2PD512load,
ssa.OpAMD64VPERMI2Q512load,
ssa.OpAMD64VFMADD213PS128load,
ssa.OpAMD64VFMADD213PS256load,
ssa.OpAMD64VFMADD213PS512load,
ssa.OpAMD64VFMADD213PD128load,
ssa.OpAMD64VFMADD213PD256load,
ssa.OpAMD64VFMADD213PD512load,
ssa.OpAMD64VFMADDSUB213PS128load,
ssa.OpAMD64VFMADDSUB213PS256load,
ssa.OpAMD64VFMADDSUB213PS512load,
ssa.OpAMD64VFMADDSUB213PD128load,
ssa.OpAMD64VFMADDSUB213PD256load,
ssa.OpAMD64VFMADDSUB213PD512load,
ssa.OpAMD64VFMSUBADD213PS128load,
ssa.OpAMD64VFMSUBADD213PS256load,
ssa.OpAMD64VFMSUBADD213PS512load,
ssa.OpAMD64VFMSUBADD213PD128load,
ssa.OpAMD64VFMSUBADD213PD256load,
ssa.OpAMD64VFMSUBADD213PD512load,
ssa.OpAMD64VPSHLDVD128load,
ssa.OpAMD64VPSHLDVD256load,
@@ -2460,23 +2448,23 @@ func ssaGenSIMDValue(s *ssagen.State, v *ssa.Value) bool {
ssa.OpAMD64VPABSQMasked128Merging,
ssa.OpAMD64VPABSQMasked256Merging,
ssa.OpAMD64VPABSQMasked512Merging,
ssa.OpAMD64VPBROADCASTQMasked128Merging,
ssa.OpAMD64VBROADCASTSSMasked128Merging,
ssa.OpAMD64VBROADCASTSDMasked256Merging,
ssa.OpAMD64VPBROADCASTDMasked128Merging,
ssa.OpAMD64VPBROADCASTQMasked256Merging,
ssa.OpAMD64VBROADCASTSSMasked256Merging,
ssa.OpAMD64VBROADCASTSDMasked512Merging,
ssa.OpAMD64VPBROADCASTWMasked128Merging,
ssa.OpAMD64VPBROADCASTDMasked256Merging,
ssa.OpAMD64VPBROADCASTQMasked512Merging,
ssa.OpAMD64VBROADCASTSSMasked512Merging,
ssa.OpAMD64VPBROADCASTQMasked128Merging,
ssa.OpAMD64VPBROADCASTBMasked128Merging,
ssa.OpAMD64VPBROADCASTWMasked256Merging,
ssa.OpAMD64VPBROADCASTDMasked512Merging,
ssa.OpAMD64VPBROADCASTWMasked128Merging,
ssa.OpAMD64VPBROADCASTDMasked128Merging,
ssa.OpAMD64VBROADCASTSSMasked256Merging,
ssa.OpAMD64VBROADCASTSDMasked256Merging,
ssa.OpAMD64VPBROADCASTBMasked256Merging,
ssa.OpAMD64VPBROADCASTWMasked512Merging,
ssa.OpAMD64VPBROADCASTWMasked256Merging,
ssa.OpAMD64VPBROADCASTDMasked256Merging,
ssa.OpAMD64VPBROADCASTQMasked256Merging,
ssa.OpAMD64VBROADCASTSSMasked512Merging,
ssa.OpAMD64VBROADCASTSDMasked512Merging,
ssa.OpAMD64VPBROADCASTBMasked512Merging,
ssa.OpAMD64VPBROADCASTWMasked512Merging,
ssa.OpAMD64VPBROADCASTDMasked512Merging,
ssa.OpAMD64VPBROADCASTQMasked512Merging,
ssa.OpAMD64VRNDSCALEPSMasked128Merging,
ssa.OpAMD64VRNDSCALEPSMasked256Merging,
ssa.OpAMD64VRNDSCALEPSMasked512Merging,
@@ -2817,23 +2805,23 @@ func ssaGenSIMDValue(s *ssagen.State, v *ssa.Value) bool {
ssa.OpAMD64VPAVGWMasked128,
ssa.OpAMD64VPAVGWMasked256,
ssa.OpAMD64VPAVGWMasked512,
ssa.OpAMD64VPBROADCASTQMasked128,
ssa.OpAMD64VBROADCASTSSMasked128,
ssa.OpAMD64VBROADCASTSDMasked256,
ssa.OpAMD64VPBROADCASTDMasked128,
ssa.OpAMD64VPBROADCASTQMasked256,
ssa.OpAMD64VBROADCASTSSMasked256,
ssa.OpAMD64VBROADCASTSDMasked512,
ssa.OpAMD64VPBROADCASTWMasked128,
ssa.OpAMD64VPBROADCASTDMasked256,
ssa.OpAMD64VPBROADCASTQMasked512,
ssa.OpAMD64VBROADCASTSSMasked512,
ssa.OpAMD64VPBROADCASTQMasked128,
ssa.OpAMD64VPBROADCASTBMasked128,
ssa.OpAMD64VPBROADCASTWMasked256,
ssa.OpAMD64VPBROADCASTDMasked512,
ssa.OpAMD64VPBROADCASTWMasked128,
ssa.OpAMD64VPBROADCASTDMasked128,
ssa.OpAMD64VBROADCASTSSMasked256,
ssa.OpAMD64VBROADCASTSDMasked256,
ssa.OpAMD64VPBROADCASTBMasked256,
ssa.OpAMD64VPBROADCASTWMasked512,
ssa.OpAMD64VPBROADCASTWMasked256,
ssa.OpAMD64VPBROADCASTDMasked256,
ssa.OpAMD64VPBROADCASTQMasked256,
ssa.OpAMD64VBROADCASTSSMasked512,
ssa.OpAMD64VBROADCASTSDMasked512,
ssa.OpAMD64VPBROADCASTBMasked512,
ssa.OpAMD64VPBROADCASTWMasked512,
ssa.OpAMD64VPBROADCASTDMasked512,
ssa.OpAMD64VPBROADCASTQMasked512,
ssa.OpAMD64VRNDSCALEPSMasked128,
ssa.OpAMD64VRNDSCALEPSMasked128load,
ssa.OpAMD64VRNDSCALEPSMasked256,

View File

@@ -1845,13 +1845,7 @@ func ssaGenValue(s *ssagen.State, v *ssa.Value) {
ssa.OpAMD64VPMOVVec32x16ToM,
ssa.OpAMD64VPMOVVec64x2ToM,
ssa.OpAMD64VPMOVVec64x4ToM,
ssa.OpAMD64VPMOVVec64x8ToM,
ssa.OpAMD64VPMOVMSKB128,
ssa.OpAMD64VPMOVMSKB256,
ssa.OpAMD64VMOVMSKPS128,
ssa.OpAMD64VMOVMSKPS256,
ssa.OpAMD64VMOVMSKPD128,
ssa.OpAMD64VMOVMSKPD256:
ssa.OpAMD64VPMOVVec64x8ToM:
p := s.Prog(v.Op.Asm())
p.From.Type = obj.TYPE_REG
p.From.Reg = simdReg(v.Args[0])

View File

@@ -1679,21 +1679,21 @@
(Cvt8toMask64x8 <t> x) => (VPMOVMToVec64x8 <types.TypeVec512> (KMOVBk <t> x))
// masks to integers
(CvtMask8x16to16 ...) => (VPMOVMSKB128 ...)
(CvtMask8x32to32 ...) => (VPMOVMSKB256 ...)
(CvtMask8x64to64 x) => (KMOVQi (VPMOVVec8x64ToM <types.TypeMask> x))
(CvtMask8x16to16 <t> x) => (KMOVWi <t> (VPMOVVec8x16ToM <types.TypeMask> x))
(CvtMask8x32to32 <t> x) => (KMOVDi <t> (VPMOVVec8x32ToM <types.TypeMask> x))
(CvtMask8x64to64 <t> x) => (KMOVQi <t> (VPMOVVec8x64ToM <types.TypeMask> x))
(CvtMask16x8to8 x) => (KMOVBi (VPMOVVec16x8ToM <types.TypeMask> x))
(CvtMask16x16to16 x) => (KMOVWi (VPMOVVec16x16ToM <types.TypeMask> x))
(CvtMask16x32to32 x) => (KMOVDi (VPMOVVec16x32ToM <types.TypeMask> x))
(CvtMask16x8to8 <t> x) => (KMOVBi <t> (VPMOVVec16x8ToM <types.TypeMask> x))
(CvtMask16x16to16 <t> x) => (KMOVWi <t> (VPMOVVec16x16ToM <types.TypeMask> x))
(CvtMask16x32to32 <t> x) => (KMOVDi <t> (VPMOVVec16x32ToM <types.TypeMask> x))
(CvtMask32x4to8 ...) => (VMOVMSKPS128 ...)
(CvtMask32x8to8 ...) => (VMOVMSKPS256 ...)
(CvtMask32x16to16 x) => (KMOVWi (VPMOVVec32x16ToM <types.TypeMask> x))
(CvtMask32x4to8 <t> x) => (KMOVBi <t> (VPMOVVec32x4ToM <types.TypeMask> x))
(CvtMask32x8to8 <t> x) => (KMOVBi <t> (VPMOVVec32x8ToM <types.TypeMask> x))
(CvtMask32x16to16 <t> x) => (KMOVWi <t> (VPMOVVec32x16ToM <types.TypeMask> x))
(CvtMask64x2to8 ...) => (VMOVMSKPD128 ...)
(CvtMask64x4to8 ...) => (VMOVMSKPD256 ...)
(CvtMask64x8to8 x) => (KMOVBi (VPMOVVec64x8ToM <types.TypeMask> x))
(CvtMask64x2to8 <t> x) => (KMOVBi <t> (VPMOVVec64x2ToM <types.TypeMask> x))
(CvtMask64x4to8 <t> x) => (KMOVBi <t> (VPMOVVec64x4ToM <types.TypeMask> x))
(CvtMask64x8to8 <t> x) => (KMOVBi <t> (VPMOVVec64x8ToM <types.TypeMask> x))
// optimizations
(MOVBstore [off] {sym} ptr (KMOVBi mask) mem) => (KMOVBstore [off] {sym} ptr mask mem)

View File

@@ -1368,7 +1368,6 @@ func init() {
{name: "VPMASK64load512", argLength: 3, reg: vloadk, asm: "VMOVDQU64", aux: "SymOff", faultOnNilArg0: true, symEffect: "Read"}, // load from arg0+auxint+aux, arg1=k mask, arg2 = mem
{name: "VPMASK64store512", argLength: 4, reg: vstorek, asm: "VMOVDQU64", aux: "SymOff", faultOnNilArg0: true, symEffect: "Write"}, // store, *(arg0+auxint+aux) = arg2, arg1=k mask, arg3 = mem
// AVX512 moves between int-vector and mask registers
{name: "VPMOVMToVec8x16", argLength: 1, reg: kv, asm: "VPMOVM2B"},
{name: "VPMOVMToVec8x32", argLength: 1, reg: kv, asm: "VPMOVM2B"},
{name: "VPMOVMToVec8x64", argLength: 1, reg: kw, asm: "VPMOVM2B"},
@@ -1401,14 +1400,6 @@ func init() {
{name: "VPMOVVec64x4ToM", argLength: 1, reg: vk, asm: "VPMOVQ2M"},
{name: "VPMOVVec64x8ToM", argLength: 1, reg: wk, asm: "VPMOVQ2M"},
// AVX1/2 moves from int-vector to bitmask (extracting sign bits)
{name: "VPMOVMSKB128", argLength: 1, reg: vgp, asm: "VPMOVMSKB"},
{name: "VPMOVMSKB256", argLength: 1, reg: vgp, asm: "VPMOVMSKB"},
{name: "VMOVMSKPS128", argLength: 1, reg: vgp, asm: "VMOVMSKPS"},
{name: "VMOVMSKPS256", argLength: 1, reg: vgp, asm: "VMOVMSKPS"},
{name: "VMOVMSKPD128", argLength: 1, reg: vgp, asm: "VMOVMSKPD"},
{name: "VMOVMSKPD256", argLength: 1, reg: vgp, asm: "VMOVMSKPD"},
// X15 is the zero register up to 128-bit. For larger values, we zero it on the fly.
{name: "Zero128", argLength: 0, reg: x15only, zeroWidth: true, fixedReg: true},
{name: "Zero256", argLength: 0, reg: v01, asm: "VPXOR"},

View File

@@ -140,36 +140,36 @@
(AverageUint16x8 ...) => (VPAVGW128 ...)
(AverageUint16x16 ...) => (VPAVGW256 ...)
(AverageUint16x32 ...) => (VPAVGW512 ...)
(Broadcast1To2Float64x2 ...) => (VPBROADCASTQ128 ...)
(Broadcast1To2Int64x2 ...) => (VPBROADCASTQ128 ...)
(Broadcast1To2Uint64x2 ...) => (VPBROADCASTQ128 ...)
(Broadcast1To4Float32x4 ...) => (VBROADCASTSS128 ...)
(Broadcast1To4Float64x2 ...) => (VBROADCASTSD256 ...)
(Broadcast1To4Int32x4 ...) => (VPBROADCASTD128 ...)
(Broadcast1To4Int64x2 ...) => (VPBROADCASTQ256 ...)
(Broadcast1To4Uint32x4 ...) => (VPBROADCASTD128 ...)
(Broadcast1To4Uint64x2 ...) => (VPBROADCASTQ256 ...)
(Broadcast1To8Float32x4 ...) => (VBROADCASTSS256 ...)
(Broadcast1To8Float64x2 ...) => (VBROADCASTSD512 ...)
(Broadcast1To8Int16x8 ...) => (VPBROADCASTW128 ...)
(Broadcast1To8Int32x4 ...) => (VPBROADCASTD256 ...)
(Broadcast1To8Int64x2 ...) => (VPBROADCASTQ512 ...)
(Broadcast1To8Uint16x8 ...) => (VPBROADCASTW128 ...)
(Broadcast1To8Uint32x4 ...) => (VPBROADCASTD256 ...)
(Broadcast1To8Uint64x2 ...) => (VPBROADCASTQ512 ...)
(Broadcast1To16Float32x4 ...) => (VBROADCASTSS512 ...)
(Broadcast1To16Int8x16 ...) => (VPBROADCASTB128 ...)
(Broadcast1To16Int16x8 ...) => (VPBROADCASTW256 ...)
(Broadcast1To16Int32x4 ...) => (VPBROADCASTD512 ...)
(Broadcast1To16Uint8x16 ...) => (VPBROADCASTB128 ...)
(Broadcast1To16Uint16x8 ...) => (VPBROADCASTW256 ...)
(Broadcast1To16Uint32x4 ...) => (VPBROADCASTD512 ...)
(Broadcast1To32Int8x16 ...) => (VPBROADCASTB256 ...)
(Broadcast1To32Int16x8 ...) => (VPBROADCASTW512 ...)
(Broadcast1To32Uint8x16 ...) => (VPBROADCASTB256 ...)
(Broadcast1To32Uint16x8 ...) => (VPBROADCASTW512 ...)
(Broadcast1To64Int8x16 ...) => (VPBROADCASTB512 ...)
(Broadcast1To64Uint8x16 ...) => (VPBROADCASTB512 ...)
(Broadcast128Float32x4 ...) => (VBROADCASTSS128 ...)
(Broadcast128Float64x2 ...) => (VPBROADCASTQ128 ...)
(Broadcast128Int8x16 ...) => (VPBROADCASTB128 ...)
(Broadcast128Int16x8 ...) => (VPBROADCASTW128 ...)
(Broadcast128Int32x4 ...) => (VPBROADCASTD128 ...)
(Broadcast128Int64x2 ...) => (VPBROADCASTQ128 ...)
(Broadcast128Uint8x16 ...) => (VPBROADCASTB128 ...)
(Broadcast128Uint16x8 ...) => (VPBROADCASTW128 ...)
(Broadcast128Uint32x4 ...) => (VPBROADCASTD128 ...)
(Broadcast128Uint64x2 ...) => (VPBROADCASTQ128 ...)
(Broadcast256Float32x4 ...) => (VBROADCASTSS256 ...)
(Broadcast256Float64x2 ...) => (VBROADCASTSD256 ...)
(Broadcast256Int8x16 ...) => (VPBROADCASTB256 ...)
(Broadcast256Int16x8 ...) => (VPBROADCASTW256 ...)
(Broadcast256Int32x4 ...) => (VPBROADCASTD256 ...)
(Broadcast256Int64x2 ...) => (VPBROADCASTQ256 ...)
(Broadcast256Uint8x16 ...) => (VPBROADCASTB256 ...)
(Broadcast256Uint16x8 ...) => (VPBROADCASTW256 ...)
(Broadcast256Uint32x4 ...) => (VPBROADCASTD256 ...)
(Broadcast256Uint64x2 ...) => (VPBROADCASTQ256 ...)
(Broadcast512Float32x4 ...) => (VBROADCASTSS512 ...)
(Broadcast512Float64x2 ...) => (VBROADCASTSD512 ...)
(Broadcast512Int8x16 ...) => (VPBROADCASTB512 ...)
(Broadcast512Int16x8 ...) => (VPBROADCASTW512 ...)
(Broadcast512Int32x4 ...) => (VPBROADCASTD512 ...)
(Broadcast512Int64x2 ...) => (VPBROADCASTQ512 ...)
(Broadcast512Uint8x16 ...) => (VPBROADCASTB512 ...)
(Broadcast512Uint16x8 ...) => (VPBROADCASTW512 ...)
(Broadcast512Uint32x4 ...) => (VPBROADCASTD512 ...)
(Broadcast512Uint64x2 ...) => (VPBROADCASTQ512 ...)
(CeilFloat32x4 x) => (VROUNDPS128 [2] x)
(CeilFloat32x8 x) => (VROUNDPS256 [2] x)
(CeilFloat64x2 x) => (VROUNDPD128 [2] x)
@@ -1424,23 +1424,23 @@
(VMOVDQU16Masked128 (VPAVGW128 x y) mask) => (VPAVGWMasked128 x y mask)
(VMOVDQU16Masked256 (VPAVGW256 x y) mask) => (VPAVGWMasked256 x y mask)
(VMOVDQU16Masked512 (VPAVGW512 x y) mask) => (VPAVGWMasked512 x y mask)
(VMOVDQU64Masked128 (VPBROADCASTQ128 x) mask) => (VPBROADCASTQMasked128 x mask)
(VMOVDQU32Masked128 (VBROADCASTSS128 x) mask) => (VBROADCASTSSMasked128 x mask)
(VMOVDQU64Masked256 (VBROADCASTSD256 x) mask) => (VBROADCASTSDMasked256 x mask)
(VMOVDQU32Masked128 (VPBROADCASTD128 x) mask) => (VPBROADCASTDMasked128 x mask)
(VMOVDQU64Masked256 (VPBROADCASTQ256 x) mask) => (VPBROADCASTQMasked256 x mask)
(VMOVDQU32Masked256 (VBROADCASTSS256 x) mask) => (VBROADCASTSSMasked256 x mask)
(VMOVDQU64Masked512 (VBROADCASTSD512 x) mask) => (VBROADCASTSDMasked512 x mask)
(VMOVDQU16Masked128 (VPBROADCASTW128 x) mask) => (VPBROADCASTWMasked128 x mask)
(VMOVDQU32Masked256 (VPBROADCASTD256 x) mask) => (VPBROADCASTDMasked256 x mask)
(VMOVDQU64Masked512 (VPBROADCASTQ512 x) mask) => (VPBROADCASTQMasked512 x mask)
(VMOVDQU32Masked512 (VBROADCASTSS512 x) mask) => (VBROADCASTSSMasked512 x mask)
(VMOVDQU64Masked128 (VPBROADCASTQ128 x) mask) => (VPBROADCASTQMasked128 x mask)
(VMOVDQU8Masked128 (VPBROADCASTB128 x) mask) => (VPBROADCASTBMasked128 x mask)
(VMOVDQU16Masked256 (VPBROADCASTW256 x) mask) => (VPBROADCASTWMasked256 x mask)
(VMOVDQU32Masked512 (VPBROADCASTD512 x) mask) => (VPBROADCASTDMasked512 x mask)
(VMOVDQU16Masked128 (VPBROADCASTW128 x) mask) => (VPBROADCASTWMasked128 x mask)
(VMOVDQU32Masked128 (VPBROADCASTD128 x) mask) => (VPBROADCASTDMasked128 x mask)
(VMOVDQU32Masked256 (VBROADCASTSS256 x) mask) => (VBROADCASTSSMasked256 x mask)
(VMOVDQU64Masked256 (VBROADCASTSD256 x) mask) => (VBROADCASTSDMasked256 x mask)
(VMOVDQU8Masked256 (VPBROADCASTB256 x) mask) => (VPBROADCASTBMasked256 x mask)
(VMOVDQU16Masked512 (VPBROADCASTW512 x) mask) => (VPBROADCASTWMasked512 x mask)
(VMOVDQU16Masked256 (VPBROADCASTW256 x) mask) => (VPBROADCASTWMasked256 x mask)
(VMOVDQU32Masked256 (VPBROADCASTD256 x) mask) => (VPBROADCASTDMasked256 x mask)
(VMOVDQU64Masked256 (VPBROADCASTQ256 x) mask) => (VPBROADCASTQMasked256 x mask)
(VMOVDQU32Masked512 (VBROADCASTSS512 x) mask) => (VBROADCASTSSMasked512 x mask)
(VMOVDQU64Masked512 (VBROADCASTSD512 x) mask) => (VBROADCASTSDMasked512 x mask)
(VMOVDQU8Masked512 (VPBROADCASTB512 x) mask) => (VPBROADCASTBMasked512 x mask)
(VMOVDQU16Masked512 (VPBROADCASTW512 x) mask) => (VPBROADCASTWMasked512 x mask)
(VMOVDQU32Masked512 (VPBROADCASTD512 x) mask) => (VPBROADCASTDMasked512 x mask)
(VMOVDQU64Masked512 (VPBROADCASTQ512 x) mask) => (VPBROADCASTQMasked512 x mask)
(VMOVDQU32Masked128 (VRNDSCALEPS128 [a] x) mask) => (VRNDSCALEPSMasked128 [a] x mask)
(VMOVDQU32Masked256 (VRNDSCALEPS256 [a] x) mask) => (VRNDSCALEPSMasked256 [a] x mask)
(VMOVDQU32Masked512 (VRNDSCALEPS512 [a] x) mask) => (VRNDSCALEPSMasked512 [a] x mask)
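The broadcast rules in the hunk above share one shape: a masked move wrapping an unmasked broadcast is replaced by the masked form of that broadcast. The following is only a toy sketch of that shape, not the compiler's generated rewrite code. The `Value` type and the `widthSuffix` and `foldMaskedMove` helpers are hypothetical names introduced for illustration, and the sketch ignores the element-size matching (32 vs. 64 bits, and so on) that the real rules encode in the op names.

```go
package main

import (
	"fmt"
	"strings"
)

// Value is a hypothetical, simplified stand-in for an SSA value.
type Value struct {
	Op   string
	Args []*Value
}

// widthSuffix returns the trailing vector width of an op name, if any.
func widthSuffix(op string) (string, bool) {
	for _, w := range []string{"128", "256", "512"} {
		if strings.HasSuffix(op, w) {
			return w, true
		}
	}
	return "", false
}

// foldMaskedMove mimics rules of the form
//
//	(VMOVDQU32Masked128 (VBROADCASTSS128 x) mask) => (VBROADCASTSSMasked128 x mask)
//
// It returns v unchanged when the pattern does not match.
func foldMaskedMove(v *Value) *Value {
	if !strings.HasPrefix(v.Op, "VMOVDQU") || !strings.Contains(v.Op, "Masked") || len(v.Args) != 2 {
		return v
	}
	inner, mask := v.Args[0], v.Args[1]
	outerW, ok1 := widthSuffix(v.Op)
	innerW, ok2 := widthSuffix(inner.Op)
	if !ok1 || !ok2 || outerW != innerW || !strings.Contains(inner.Op, "BROADCAST") {
		return v
	}
	// Insert "Masked" before the width suffix and append the mask argument.
	base := strings.TrimSuffix(inner.Op, innerW)
	args := append(append([]*Value{}, inner.Args...), mask)
	return &Value{Op: base + "Masked" + innerW, Args: args}
}

func main() {
	x := &Value{Op: "x"}
	mask := &Value{Op: "mask"}
	v := &Value{Op: "VMOVDQU32Masked128", Args: []*Value{{Op: "VBROADCASTSS128", Args: []*Value{x}}, mask}}
	fmt.Println(foldMaskedMove(v).Op)
}
```

In this sketch a width mismatch or a non-broadcast inner op leaves the value untouched, which loosely mirrors how a rule simply fails to fire when its pattern does not match.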
@@ -2771,11 +2771,7 @@
(VPMULLQ128 x l:(VMOVDQUload128 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VPMULLQ128load {sym} [off] x ptr mem)
(VPMULLQ256 x l:(VMOVDQUload256 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VPMULLQ256load {sym} [off] x ptr mem)
(VPMULLQ512 x l:(VMOVDQUload512 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VPMULLQ512load {sym} [off] x ptr mem)
(VFMADD213PS128 x y l:(VMOVDQUload128 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VFMADD213PS128load {sym} [off] x y ptr mem)
(VFMADD213PS256 x y l:(VMOVDQUload256 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VFMADD213PS256load {sym} [off] x y ptr mem)
(VFMADD213PS512 x y l:(VMOVDQUload512 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VFMADD213PS512load {sym} [off] x y ptr mem)
(VFMADD213PD128 x y l:(VMOVDQUload128 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VFMADD213PD128load {sym} [off] x y ptr mem)
(VFMADD213PD256 x y l:(VMOVDQUload256 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VFMADD213PD256load {sym} [off] x y ptr mem)
(VFMADD213PD512 x y l:(VMOVDQUload512 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VFMADD213PD512load {sym} [off] x y ptr mem)
(VFMADD213PSMasked128 x y l:(VMOVDQUload128 {sym} [off] ptr mem) mask) && canMergeLoad(v, l) && clobber(l) => (VFMADD213PSMasked128load {sym} [off] x y ptr mask mem)
(VFMADD213PSMasked256 x y l:(VMOVDQUload256 {sym} [off] ptr mem) mask) && canMergeLoad(v, l) && clobber(l) => (VFMADD213PSMasked256load {sym} [off] x y ptr mask mem)
@@ -2783,11 +2779,7 @@
(VFMADD213PDMasked128 x y l:(VMOVDQUload128 {sym} [off] ptr mem) mask) && canMergeLoad(v, l) && clobber(l) => (VFMADD213PDMasked128load {sym} [off] x y ptr mask mem)
(VFMADD213PDMasked256 x y l:(VMOVDQUload256 {sym} [off] ptr mem) mask) && canMergeLoad(v, l) && clobber(l) => (VFMADD213PDMasked256load {sym} [off] x y ptr mask mem)
(VFMADD213PDMasked512 x y l:(VMOVDQUload512 {sym} [off] ptr mem) mask) && canMergeLoad(v, l) && clobber(l) => (VFMADD213PDMasked512load {sym} [off] x y ptr mask mem)
(VFMADDSUB213PS128 x y l:(VMOVDQUload128 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VFMADDSUB213PS128load {sym} [off] x y ptr mem)
(VFMADDSUB213PS256 x y l:(VMOVDQUload256 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VFMADDSUB213PS256load {sym} [off] x y ptr mem)
(VFMADDSUB213PS512 x y l:(VMOVDQUload512 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VFMADDSUB213PS512load {sym} [off] x y ptr mem)
(VFMADDSUB213PD128 x y l:(VMOVDQUload128 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VFMADDSUB213PD128load {sym} [off] x y ptr mem)
(VFMADDSUB213PD256 x y l:(VMOVDQUload256 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VFMADDSUB213PD256load {sym} [off] x y ptr mem)
(VFMADDSUB213PD512 x y l:(VMOVDQUload512 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VFMADDSUB213PD512load {sym} [off] x y ptr mem)
(VFMADDSUB213PSMasked128 x y l:(VMOVDQUload128 {sym} [off] ptr mem) mask) && canMergeLoad(v, l) && clobber(l) => (VFMADDSUB213PSMasked128load {sym} [off] x y ptr mask mem)
(VFMADDSUB213PSMasked256 x y l:(VMOVDQUload256 {sym} [off] ptr mem) mask) && canMergeLoad(v, l) && clobber(l) => (VFMADDSUB213PSMasked256load {sym} [off] x y ptr mask mem)
@@ -2807,11 +2799,7 @@
(VPMULLQMasked128 x l:(VMOVDQUload128 {sym} [off] ptr mem) mask) && canMergeLoad(v, l) && clobber(l) => (VPMULLQMasked128load {sym} [off] x ptr mask mem)
(VPMULLQMasked256 x l:(VMOVDQUload256 {sym} [off] ptr mem) mask) && canMergeLoad(v, l) && clobber(l) => (VPMULLQMasked256load {sym} [off] x ptr mask mem)
(VPMULLQMasked512 x l:(VMOVDQUload512 {sym} [off] ptr mem) mask) && canMergeLoad(v, l) && clobber(l) => (VPMULLQMasked512load {sym} [off] x ptr mask mem)
(VFMSUBADD213PS128 x y l:(VMOVDQUload128 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VFMSUBADD213PS128load {sym} [off] x y ptr mem)
(VFMSUBADD213PS256 x y l:(VMOVDQUload256 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VFMSUBADD213PS256load {sym} [off] x y ptr mem)
(VFMSUBADD213PS512 x y l:(VMOVDQUload512 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VFMSUBADD213PS512load {sym} [off] x y ptr mem)
(VFMSUBADD213PD128 x y l:(VMOVDQUload128 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VFMSUBADD213PD128load {sym} [off] x y ptr mem)
(VFMSUBADD213PD256 x y l:(VMOVDQUload256 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VFMSUBADD213PD256load {sym} [off] x y ptr mem)
(VFMSUBADD213PD512 x y l:(VMOVDQUload512 {sym} [off] ptr mem)) && canMergeLoad(v, l) && clobber(l) => (VFMSUBADD213PD512load {sym} [off] x y ptr mem)
(VFMSUBADD213PSMasked128 x y l:(VMOVDQUload128 {sym} [off] ptr mem) mask) && canMergeLoad(v, l) && clobber(l) => (VFMSUBADD213PSMasked128load {sym} [off] x y ptr mask mem)
(VFMSUBADD213PSMasked256 x y l:(VMOVDQUload256 {sym} [off] ptr mem) mask) && canMergeLoad(v, l) && clobber(l) => (VFMSUBADD213PSMasked256load {sym} [off] x y ptr mask mem)

@@ -172,38 +172,38 @@ func simdAMD64Ops(v11, v21, v2k, vkv, v2kv, v2kk, v31, v3kv, vgpv, vgp, vfpv, vf
{name: "VEXPANDPSMasked128", argLength: 2, reg: wkw, asm: "VEXPANDPS", commutative: false, typ: "Vec128", resultInArg0: false},
{name: "VEXPANDPSMasked256", argLength: 2, reg: wkw, asm: "VEXPANDPS", commutative: false, typ: "Vec256", resultInArg0: false},
{name: "VEXPANDPSMasked512", argLength: 2, reg: wkw, asm: "VEXPANDPS", commutative: false, typ: "Vec512", resultInArg0: false},
{name: "VFMADD213PD128", argLength: 3, reg: w31, asm: "VFMADD213PD", commutative: false, typ: "Vec128", resultInArg0: true},
{name: "VFMADD213PD256", argLength: 3, reg: w31, asm: "VFMADD213PD", commutative: false, typ: "Vec256", resultInArg0: true},
{name: "VFMADD213PD128", argLength: 3, reg: v31, asm: "VFMADD213PD", commutative: false, typ: "Vec128", resultInArg0: true},
{name: "VFMADD213PD256", argLength: 3, reg: v31, asm: "VFMADD213PD", commutative: false, typ: "Vec256", resultInArg0: true},
{name: "VFMADD213PD512", argLength: 3, reg: w31, asm: "VFMADD213PD", commutative: false, typ: "Vec512", resultInArg0: true},
{name: "VFMADD213PDMasked128", argLength: 4, reg: w3kw, asm: "VFMADD213PD", commutative: false, typ: "Vec128", resultInArg0: true},
{name: "VFMADD213PDMasked256", argLength: 4, reg: w3kw, asm: "VFMADD213PD", commutative: false, typ: "Vec256", resultInArg0: true},
{name: "VFMADD213PDMasked512", argLength: 4, reg: w3kw, asm: "VFMADD213PD", commutative: false, typ: "Vec512", resultInArg0: true},
{name: "VFMADD213PS128", argLength: 3, reg: w31, asm: "VFMADD213PS", commutative: false, typ: "Vec128", resultInArg0: true},
{name: "VFMADD213PS256", argLength: 3, reg: w31, asm: "VFMADD213PS", commutative: false, typ: "Vec256", resultInArg0: true},
{name: "VFMADD213PS128", argLength: 3, reg: v31, asm: "VFMADD213PS", commutative: false, typ: "Vec128", resultInArg0: true},
{name: "VFMADD213PS256", argLength: 3, reg: v31, asm: "VFMADD213PS", commutative: false, typ: "Vec256", resultInArg0: true},
{name: "VFMADD213PS512", argLength: 3, reg: w31, asm: "VFMADD213PS", commutative: false, typ: "Vec512", resultInArg0: true},
{name: "VFMADD213PSMasked128", argLength: 4, reg: w3kw, asm: "VFMADD213PS", commutative: false, typ: "Vec128", resultInArg0: true},
{name: "VFMADD213PSMasked256", argLength: 4, reg: w3kw, asm: "VFMADD213PS", commutative: false, typ: "Vec256", resultInArg0: true},
{name: "VFMADD213PSMasked512", argLength: 4, reg: w3kw, asm: "VFMADD213PS", commutative: false, typ: "Vec512", resultInArg0: true},
{name: "VFMADDSUB213PD128", argLength: 3, reg: w31, asm: "VFMADDSUB213PD", commutative: false, typ: "Vec128", resultInArg0: true},
{name: "VFMADDSUB213PD256", argLength: 3, reg: w31, asm: "VFMADDSUB213PD", commutative: false, typ: "Vec256", resultInArg0: true},
{name: "VFMADDSUB213PD128", argLength: 3, reg: v31, asm: "VFMADDSUB213PD", commutative: false, typ: "Vec128", resultInArg0: true},
{name: "VFMADDSUB213PD256", argLength: 3, reg: v31, asm: "VFMADDSUB213PD", commutative: false, typ: "Vec256", resultInArg0: true},
{name: "VFMADDSUB213PD512", argLength: 3, reg: w31, asm: "VFMADDSUB213PD", commutative: false, typ: "Vec512", resultInArg0: true},
{name: "VFMADDSUB213PDMasked128", argLength: 4, reg: w3kw, asm: "VFMADDSUB213PD", commutative: false, typ: "Vec128", resultInArg0: true},
{name: "VFMADDSUB213PDMasked256", argLength: 4, reg: w3kw, asm: "VFMADDSUB213PD", commutative: false, typ: "Vec256", resultInArg0: true},
{name: "VFMADDSUB213PDMasked512", argLength: 4, reg: w3kw, asm: "VFMADDSUB213PD", commutative: false, typ: "Vec512", resultInArg0: true},
{name: "VFMADDSUB213PS128", argLength: 3, reg: w31, asm: "VFMADDSUB213PS", commutative: false, typ: "Vec128", resultInArg0: true},
{name: "VFMADDSUB213PS256", argLength: 3, reg: w31, asm: "VFMADDSUB213PS", commutative: false, typ: "Vec256", resultInArg0: true},
{name: "VFMADDSUB213PS128", argLength: 3, reg: v31, asm: "VFMADDSUB213PS", commutative: false, typ: "Vec128", resultInArg0: true},
{name: "VFMADDSUB213PS256", argLength: 3, reg: v31, asm: "VFMADDSUB213PS", commutative: false, typ: "Vec256", resultInArg0: true},
{name: "VFMADDSUB213PS512", argLength: 3, reg: w31, asm: "VFMADDSUB213PS", commutative: false, typ: "Vec512", resultInArg0: true},
{name: "VFMADDSUB213PSMasked128", argLength: 4, reg: w3kw, asm: "VFMADDSUB213PS", commutative: false, typ: "Vec128", resultInArg0: true},
{name: "VFMADDSUB213PSMasked256", argLength: 4, reg: w3kw, asm: "VFMADDSUB213PS", commutative: false, typ: "Vec256", resultInArg0: true},
{name: "VFMADDSUB213PSMasked512", argLength: 4, reg: w3kw, asm: "VFMADDSUB213PS", commutative: false, typ: "Vec512", resultInArg0: true},
{name: "VFMSUBADD213PD128", argLength: 3, reg: w31, asm: "VFMSUBADD213PD", commutative: false, typ: "Vec128", resultInArg0: true},
{name: "VFMSUBADD213PD256", argLength: 3, reg: w31, asm: "VFMSUBADD213PD", commutative: false, typ: "Vec256", resultInArg0: true},
{name: "VFMSUBADD213PD128", argLength: 3, reg: v31, asm: "VFMSUBADD213PD", commutative: false, typ: "Vec128", resultInArg0: true},
{name: "VFMSUBADD213PD256", argLength: 3, reg: v31, asm: "VFMSUBADD213PD", commutative: false, typ: "Vec256", resultInArg0: true},
{name: "VFMSUBADD213PD512", argLength: 3, reg: w31, asm: "VFMSUBADD213PD", commutative: false, typ: "Vec512", resultInArg0: true},
{name: "VFMSUBADD213PDMasked128", argLength: 4, reg: w3kw, asm: "VFMSUBADD213PD", commutative: false, typ: "Vec128", resultInArg0: true},
{name: "VFMSUBADD213PDMasked256", argLength: 4, reg: w3kw, asm: "VFMSUBADD213PD", commutative: false, typ: "Vec256", resultInArg0: true},
{name: "VFMSUBADD213PDMasked512", argLength: 4, reg: w3kw, asm: "VFMSUBADD213PD", commutative: false, typ: "Vec512", resultInArg0: true},
{name: "VFMSUBADD213PS128", argLength: 3, reg: w31, asm: "VFMSUBADD213PS", commutative: false, typ: "Vec128", resultInArg0: true},
{name: "VFMSUBADD213PS256", argLength: 3, reg: w31, asm: "VFMSUBADD213PS", commutative: false, typ: "Vec256", resultInArg0: true},
{name: "VFMSUBADD213PS128", argLength: 3, reg: v31, asm: "VFMSUBADD213PS", commutative: false, typ: "Vec128", resultInArg0: true},
{name: "VFMSUBADD213PS256", argLength: 3, reg: v31, asm: "VFMSUBADD213PS", commutative: false, typ: "Vec256", resultInArg0: true},
{name: "VFMSUBADD213PS512", argLength: 3, reg: w31, asm: "VFMSUBADD213PS", commutative: false, typ: "Vec512", resultInArg0: true},
{name: "VFMSUBADD213PSMasked128", argLength: 4, reg: w3kw, asm: "VFMSUBADD213PS", commutative: false, typ: "Vec128", resultInArg0: true},
{name: "VFMSUBADD213PSMasked256", argLength: 4, reg: w3kw, asm: "VFMSUBADD213PS", commutative: false, typ: "Vec256", resultInArg0: true},
@@ -1594,38 +1594,26 @@ func simdAMD64Ops(v11, v21, v2k, vkv, v2kv, v2kk, v31, v3kv, vgpv, vgp, vfpv, vf
{name: "VDIVPSMasked128load", argLength: 4, reg: w2kwload, asm: "VDIVPS", commutative: false, typ: "Vec128", aux: "SymOff", symEffect: "Read", resultInArg0: false},
{name: "VDIVPSMasked256load", argLength: 4, reg: w2kwload, asm: "VDIVPS", commutative: false, typ: "Vec256", aux: "SymOff", symEffect: "Read", resultInArg0: false},
{name: "VDIVPSMasked512load", argLength: 4, reg: w2kwload, asm: "VDIVPS", commutative: false, typ: "Vec512", aux: "SymOff", symEffect: "Read", resultInArg0: false},
{name: "VFMADD213PD128load", argLength: 4, reg: w31load, asm: "VFMADD213PD", commutative: false, typ: "Vec128", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADD213PD256load", argLength: 4, reg: w31load, asm: "VFMADD213PD", commutative: false, typ: "Vec256", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADD213PD512load", argLength: 4, reg: w31load, asm: "VFMADD213PD", commutative: false, typ: "Vec512", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADD213PDMasked128load", argLength: 5, reg: w3kwload, asm: "VFMADD213PD", commutative: false, typ: "Vec128", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADD213PDMasked256load", argLength: 5, reg: w3kwload, asm: "VFMADD213PD", commutative: false, typ: "Vec256", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADD213PDMasked512load", argLength: 5, reg: w3kwload, asm: "VFMADD213PD", commutative: false, typ: "Vec512", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADD213PS128load", argLength: 4, reg: w31load, asm: "VFMADD213PS", commutative: false, typ: "Vec128", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADD213PS256load", argLength: 4, reg: w31load, asm: "VFMADD213PS", commutative: false, typ: "Vec256", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADD213PS512load", argLength: 4, reg: w31load, asm: "VFMADD213PS", commutative: false, typ: "Vec512", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADD213PSMasked128load", argLength: 5, reg: w3kwload, asm: "VFMADD213PS", commutative: false, typ: "Vec128", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADD213PSMasked256load", argLength: 5, reg: w3kwload, asm: "VFMADD213PS", commutative: false, typ: "Vec256", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADD213PSMasked512load", argLength: 5, reg: w3kwload, asm: "VFMADD213PS", commutative: false, typ: "Vec512", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADDSUB213PD128load", argLength: 4, reg: w31load, asm: "VFMADDSUB213PD", commutative: false, typ: "Vec128", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADDSUB213PD256load", argLength: 4, reg: w31load, asm: "VFMADDSUB213PD", commutative: false, typ: "Vec256", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADDSUB213PD512load", argLength: 4, reg: w31load, asm: "VFMADDSUB213PD", commutative: false, typ: "Vec512", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADDSUB213PDMasked128load", argLength: 5, reg: w3kwload, asm: "VFMADDSUB213PD", commutative: false, typ: "Vec128", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADDSUB213PDMasked256load", argLength: 5, reg: w3kwload, asm: "VFMADDSUB213PD", commutative: false, typ: "Vec256", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADDSUB213PDMasked512load", argLength: 5, reg: w3kwload, asm: "VFMADDSUB213PD", commutative: false, typ: "Vec512", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADDSUB213PS128load", argLength: 4, reg: w31load, asm: "VFMADDSUB213PS", commutative: false, typ: "Vec128", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADDSUB213PS256load", argLength: 4, reg: w31load, asm: "VFMADDSUB213PS", commutative: false, typ: "Vec256", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADDSUB213PS512load", argLength: 4, reg: w31load, asm: "VFMADDSUB213PS", commutative: false, typ: "Vec512", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADDSUB213PSMasked128load", argLength: 5, reg: w3kwload, asm: "VFMADDSUB213PS", commutative: false, typ: "Vec128", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADDSUB213PSMasked256load", argLength: 5, reg: w3kwload, asm: "VFMADDSUB213PS", commutative: false, typ: "Vec256", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMADDSUB213PSMasked512load", argLength: 5, reg: w3kwload, asm: "VFMADDSUB213PS", commutative: false, typ: "Vec512", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMSUBADD213PD128load", argLength: 4, reg: w31load, asm: "VFMSUBADD213PD", commutative: false, typ: "Vec128", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMSUBADD213PD256load", argLength: 4, reg: w31load, asm: "VFMSUBADD213PD", commutative: false, typ: "Vec256", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMSUBADD213PD512load", argLength: 4, reg: w31load, asm: "VFMSUBADD213PD", commutative: false, typ: "Vec512", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMSUBADD213PDMasked128load", argLength: 5, reg: w3kwload, asm: "VFMSUBADD213PD", commutative: false, typ: "Vec128", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMSUBADD213PDMasked256load", argLength: 5, reg: w3kwload, asm: "VFMSUBADD213PD", commutative: false, typ: "Vec256", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMSUBADD213PDMasked512load", argLength: 5, reg: w3kwload, asm: "VFMSUBADD213PD", commutative: false, typ: "Vec512", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMSUBADD213PS128load", argLength: 4, reg: w31load, asm: "VFMSUBADD213PS", commutative: false, typ: "Vec128", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMSUBADD213PS256load", argLength: 4, reg: w31load, asm: "VFMSUBADD213PS", commutative: false, typ: "Vec256", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMSUBADD213PS512load", argLength: 4, reg: w31load, asm: "VFMSUBADD213PS", commutative: false, typ: "Vec512", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMSUBADD213PSMasked128load", argLength: 5, reg: w3kwload, asm: "VFMSUBADD213PS", commutative: false, typ: "Vec128", aux: "SymOff", symEffect: "Read", resultInArg0: true},
{name: "VFMSUBADD213PSMasked256load", argLength: 5, reg: w3kwload, asm: "VFMSUBADD213PS", commutative: false, typ: "Vec256", aux: "SymOff", symEffect: "Read", resultInArg0: true},

@@ -143,36 +143,36 @@ func simdGenericOps() []opData {
{name: "AverageUint16x8", argLength: 2, commutative: true},
{name: "AverageUint16x16", argLength: 2, commutative: true},
{name: "AverageUint16x32", argLength: 2, commutative: true},
{name: "Broadcast1To2Float64x2", argLength: 1, commutative: false},
{name: "Broadcast1To2Int64x2", argLength: 1, commutative: false},
{name: "Broadcast1To2Uint64x2", argLength: 1, commutative: false},
{name: "Broadcast1To4Float32x4", argLength: 1, commutative: false},
{name: "Broadcast1To4Float64x2", argLength: 1, commutative: false},
{name: "Broadcast1To4Int32x4", argLength: 1, commutative: false},
{name: "Broadcast1To4Int64x2", argLength: 1, commutative: false},
{name: "Broadcast1To4Uint32x4", argLength: 1, commutative: false},
{name: "Broadcast1To4Uint64x2", argLength: 1, commutative: false},
{name: "Broadcast1To8Float32x4", argLength: 1, commutative: false},
{name: "Broadcast1To8Float64x2", argLength: 1, commutative: false},
{name: "Broadcast1To8Int16x8", argLength: 1, commutative: false},
{name: "Broadcast1To8Int32x4", argLength: 1, commutative: false},
{name: "Broadcast1To8Int64x2", argLength: 1, commutative: false},
{name: "Broadcast1To8Uint16x8", argLength: 1, commutative: false},
{name: "Broadcast1To8Uint32x4", argLength: 1, commutative: false},
{name: "Broadcast1To8Uint64x2", argLength: 1, commutative: false},
{name: "Broadcast1To16Float32x4", argLength: 1, commutative: false},
{name: "Broadcast1To16Int8x16", argLength: 1, commutative: false},
{name: "Broadcast1To16Int16x8", argLength: 1, commutative: false},
{name: "Broadcast1To16Int32x4", argLength: 1, commutative: false},
{name: "Broadcast1To16Uint8x16", argLength: 1, commutative: false},
{name: "Broadcast1To16Uint16x8", argLength: 1, commutative: false},
{name: "Broadcast1To16Uint32x4", argLength: 1, commutative: false},
{name: "Broadcast1To32Int8x16", argLength: 1, commutative: false},
{name: "Broadcast1To32Int16x8", argLength: 1, commutative: false},
{name: "Broadcast1To32Uint8x16", argLength: 1, commutative: false},
{name: "Broadcast1To32Uint16x8", argLength: 1, commutative: false},
{name: "Broadcast1To64Int8x16", argLength: 1, commutative: false},
{name: "Broadcast1To64Uint8x16", argLength: 1, commutative: false},
{name: "Broadcast128Float32x4", argLength: 1, commutative: false},
{name: "Broadcast128Float64x2", argLength: 1, commutative: false},
{name: "Broadcast128Int8x16", argLength: 1, commutative: false},
{name: "Broadcast128Int16x8", argLength: 1, commutative: false},
{name: "Broadcast128Int32x4", argLength: 1, commutative: false},
{name: "Broadcast128Int64x2", argLength: 1, commutative: false},
{name: "Broadcast128Uint8x16", argLength: 1, commutative: false},
{name: "Broadcast128Uint16x8", argLength: 1, commutative: false},
{name: "Broadcast128Uint32x4", argLength: 1, commutative: false},
{name: "Broadcast128Uint64x2", argLength: 1, commutative: false},
{name: "Broadcast256Float32x4", argLength: 1, commutative: false},
{name: "Broadcast256Float64x2", argLength: 1, commutative: false},
{name: "Broadcast256Int8x16", argLength: 1, commutative: false},
{name: "Broadcast256Int16x8", argLength: 1, commutative: false},
{name: "Broadcast256Int32x4", argLength: 1, commutative: false},
{name: "Broadcast256Int64x2", argLength: 1, commutative: false},
{name: "Broadcast256Uint8x16", argLength: 1, commutative: false},
{name: "Broadcast256Uint16x8", argLength: 1, commutative: false},
{name: "Broadcast256Uint32x4", argLength: 1, commutative: false},
{name: "Broadcast256Uint64x2", argLength: 1, commutative: false},
{name: "Broadcast512Float32x4", argLength: 1, commutative: false},
{name: "Broadcast512Float64x2", argLength: 1, commutative: false},
{name: "Broadcast512Int8x16", argLength: 1, commutative: false},
{name: "Broadcast512Int16x8", argLength: 1, commutative: false},
{name: "Broadcast512Int32x4", argLength: 1, commutative: false},
{name: "Broadcast512Int64x2", argLength: 1, commutative: false},
{name: "Broadcast512Uint8x16", argLength: 1, commutative: false},
{name: "Broadcast512Uint16x8", argLength: 1, commutative: false},
{name: "Broadcast512Uint32x4", argLength: 1, commutative: false},
{name: "Broadcast512Uint64x2", argLength: 1, commutative: false},
{name: "CeilFloat32x4", argLength: 1, commutative: false},
{name: "CeilFloat32x8", argLength: 1, commutative: false},
{name: "CeilFloat64x2", argLength: 1, commutative: false},
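Reading the two runs of names in this hunk side by side, the count-based names (`Broadcast1ToN...`) and the width-based names (`Broadcast128...`, `Broadcast256...`, `Broadcast512...`) appear to correspond one to one: the width is the result element count multiplied by the element size in bits. The sketch below illustrates that arithmetic. The `widthName` helper and its regular expression are assumptions made for this example and are not part of the generator.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// oldName matches count-based names such as "Broadcast1To4Float32x4":
// group 1 = result element count, group 2 = element kind,
// group 3 = element size in bits, group 4 = source lane count.
var oldName = regexp.MustCompile(`^Broadcast1To(\d+)(Float|Int|Uint)(\d+)x(\d+)$`)

// widthName converts a count-based broadcast op name into the
// width-based form (hypothetical helper, for illustration only).
func widthName(name string) (string, error) {
	m := oldName.FindStringSubmatch(name)
	if m == nil {
		return "", fmt.Errorf("not a count-based broadcast name: %q", name)
	}
	count, err := strconv.Atoi(m[1])
	if err != nil {
		return "", err
	}
	bits, err := strconv.Atoi(m[3])
	if err != nil {
		return "", err
	}
	// Result width in bits = number of result elements * bits per element.
	return fmt.Sprintf("Broadcast%d%s%sx%s", count*bits, m[2], m[3], m[4]), nil
}

func main() {
	for _, n := range []string{"Broadcast1To4Float32x4", "Broadcast1To4Float64x2", "Broadcast1To64Uint8x16"} {
		w, err := widthName(n)
		if err != nil {
			panic(err)
		}
		fmt.Println(n, "=>", w)
	}
}
```

For example, 4 elements of 32 bits give 128, so `Broadcast1To4Float32x4` lines up with `Broadcast128Float32x4`, while 4 elements of 64 bits give 256, so `Broadcast1To4Float64x2` lines up with `Broadcast256Float64x2`.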

@@ -1214,12 +1214,6 @@ const (
OpAMD64VPMOVVec64x2ToM
OpAMD64VPMOVVec64x4ToM
OpAMD64VPMOVVec64x8ToM
OpAMD64VPMOVMSKB128
OpAMD64VPMOVMSKB256
OpAMD64VMOVMSKPS128
OpAMD64VMOVMSKPS256
OpAMD64VMOVMSKPD128
OpAMD64VMOVMSKPD256
OpAMD64Zero128
OpAMD64Zero256
OpAMD64Zero512
@@ -2841,38 +2835,26 @@ const (
OpAMD64VDIVPSMasked128load
OpAMD64VDIVPSMasked256load
OpAMD64VDIVPSMasked512load
OpAMD64VFMADD213PD128load
OpAMD64VFMADD213PD256load
OpAMD64VFMADD213PD512load
OpAMD64VFMADD213PDMasked128load
OpAMD64VFMADD213PDMasked256load
OpAMD64VFMADD213PDMasked512load
OpAMD64VFMADD213PS128load
OpAMD64VFMADD213PS256load
OpAMD64VFMADD213PS512load
OpAMD64VFMADD213PSMasked128load
OpAMD64VFMADD213PSMasked256load
OpAMD64VFMADD213PSMasked512load
OpAMD64VFMADDSUB213PD128load
OpAMD64VFMADDSUB213PD256load
OpAMD64VFMADDSUB213PD512load
OpAMD64VFMADDSUB213PDMasked128load
OpAMD64VFMADDSUB213PDMasked256load
OpAMD64VFMADDSUB213PDMasked512load
OpAMD64VFMADDSUB213PS128load
OpAMD64VFMADDSUB213PS256load
OpAMD64VFMADDSUB213PS512load
OpAMD64VFMADDSUB213PSMasked128load
OpAMD64VFMADDSUB213PSMasked256load
OpAMD64VFMADDSUB213PSMasked512load
OpAMD64VFMSUBADD213PD128load
OpAMD64VFMSUBADD213PD256load
OpAMD64VFMSUBADD213PD512load
OpAMD64VFMSUBADD213PDMasked128load
OpAMD64VFMSUBADD213PDMasked256load
OpAMD64VFMSUBADD213PDMasked512load
OpAMD64VFMSUBADD213PS128load
OpAMD64VFMSUBADD213PS256load
OpAMD64VFMSUBADD213PS512load
OpAMD64VFMSUBADD213PSMasked128load
OpAMD64VFMSUBADD213PSMasked256load
@@ -6309,36 +6291,36 @@ const (
OpAverageUint16x8
OpAverageUint16x16
OpAverageUint16x32
OpBroadcast1To2Float64x2
OpBroadcast1To2Int64x2
OpBroadcast1To2Uint64x2
OpBroadcast1To4Float32x4
OpBroadcast1To4Float64x2
OpBroadcast1To4Int32x4
OpBroadcast1To4Int64x2
OpBroadcast1To4Uint32x4
OpBroadcast1To4Uint64x2
OpBroadcast1To8Float32x4
OpBroadcast1To8Float64x2
OpBroadcast1To8Int16x8
OpBroadcast1To8Int32x4
OpBroadcast1To8Int64x2
OpBroadcast1To8Uint16x8
OpBroadcast1To8Uint32x4
OpBroadcast1To8Uint64x2
OpBroadcast1To16Float32x4
OpBroadcast1To16Int8x16
OpBroadcast1To16Int16x8
OpBroadcast1To16Int32x4
OpBroadcast1To16Uint8x16
OpBroadcast1To16Uint16x8
OpBroadcast1To16Uint32x4
OpBroadcast1To32Int8x16
OpBroadcast1To32Int16x8
OpBroadcast1To32Uint8x16
OpBroadcast1To32Uint16x8
OpBroadcast1To64Int8x16
OpBroadcast1To64Uint8x16
OpBroadcast128Float32x4
OpBroadcast128Float64x2
OpBroadcast128Int8x16
OpBroadcast128Int16x8
OpBroadcast128Int32x4
OpBroadcast128Int64x2
OpBroadcast128Uint8x16
OpBroadcast128Uint16x8
OpBroadcast128Uint32x4
OpBroadcast128Uint64x2
OpBroadcast256Float32x4
OpBroadcast256Float64x2
OpBroadcast256Int8x16
OpBroadcast256Int16x8
OpBroadcast256Int32x4
OpBroadcast256Int64x2
OpBroadcast256Uint8x16
OpBroadcast256Uint16x8
OpBroadcast256Uint32x4
OpBroadcast256Uint64x2
OpBroadcast512Float32x4
OpBroadcast512Float64x2
OpBroadcast512Int8x16
OpBroadcast512Int16x8
OpBroadcast512Int32x4
OpBroadcast512Int64x2
OpBroadcast512Uint8x16
OpBroadcast512Uint16x8
OpBroadcast512Uint32x4
OpBroadcast512Uint64x2
OpCeilFloat32x4
OpCeilFloat32x8
OpCeilFloat64x2
@@ -20357,84 +20339,6 @@ var opcodeTable = [...]opInfo{
},
},
},
{
name: "VPMOVMSKB128",
argLen: 1,
asm: x86.AVPMOVMSKB,
reg: regInfo{
inputs: []inputInfo{
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
},
outputs: []outputInfo{
{0, 49135}, // AX CX DX BX BP SI DI R8 R9 R10 R11 R12 R13 R15
},
},
},
{
name: "VPMOVMSKB256",
argLen: 1,
asm: x86.AVPMOVMSKB,
reg: regInfo{
inputs: []inputInfo{
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
},
outputs: []outputInfo{
{0, 49135}, // AX CX DX BX BP SI DI R8 R9 R10 R11 R12 R13 R15
},
},
},
{
name: "VMOVMSKPS128",
argLen: 1,
asm: x86.AVMOVMSKPS,
reg: regInfo{
inputs: []inputInfo{
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
},
outputs: []outputInfo{
{0, 49135}, // AX CX DX BX BP SI DI R8 R9 R10 R11 R12 R13 R15
},
},
},
{
name: "VMOVMSKPS256",
argLen: 1,
asm: x86.AVMOVMSKPS,
reg: regInfo{
inputs: []inputInfo{
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
},
outputs: []outputInfo{
{0, 49135}, // AX CX DX BX BP SI DI R8 R9 R10 R11 R12 R13 R15
},
},
},
{
name: "VMOVMSKPD128",
argLen: 1,
asm: x86.AVMOVMSKPD,
reg: regInfo{
inputs: []inputInfo{
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
},
outputs: []outputInfo{
{0, 49135}, // AX CX DX BX BP SI DI R8 R9 R10 R11 R12 R13 R15
},
},
},
{
name: "VMOVMSKPD256",
argLen: 1,
asm: x86.AVMOVMSKPD,
reg: regInfo{
inputs: []inputInfo{
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
},
outputs: []outputInfo{
{0, 49135}, // AX CX DX BX BP SI DI R8 R9 R10 R11 R12 R13 R15
},
},
},
{
name: "Zero128",
argLen: 0,
@@ -23179,12 +23083,12 @@ var opcodeTable = [...]opInfo{
asm: x86.AVFMADD213PD,
reg: regInfo{
inputs: []inputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{2, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
{1, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
{2, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
},
},
},
@@ -23195,12 +23099,12 @@ var opcodeTable = [...]opInfo{
asm: x86.AVFMADD213PD,
reg: regInfo{
inputs: []inputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{2, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
{1, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
{2, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
},
},
},
@@ -23278,12 +23182,12 @@ var opcodeTable = [...]opInfo{
asm: x86.AVFMADD213PS,
reg: regInfo{
inputs: []inputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{2, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
{1, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
{2, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
},
},
},
@@ -23294,12 +23198,12 @@ var opcodeTable = [...]opInfo{
asm: x86.AVFMADD213PS,
reg: regInfo{
inputs: []inputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{2, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
{1, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
{2, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
},
},
},
@@ -23377,12 +23281,12 @@ var opcodeTable = [...]opInfo{
asm: x86.AVFMADDSUB213PD,
reg: regInfo{
inputs: []inputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{2, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
{1, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
{2, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
},
},
},
@@ -23393,12 +23297,12 @@ var opcodeTable = [...]opInfo{
asm: x86.AVFMADDSUB213PD,
reg: regInfo{
inputs: []inputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{2, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
{1, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
{2, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
},
},
},
@@ -23476,12 +23380,12 @@ var opcodeTable = [...]opInfo{
asm: x86.AVFMADDSUB213PS,
reg: regInfo{
inputs: []inputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{2, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
{1, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
{2, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
},
},
},
@@ -23492,12 +23396,12 @@ var opcodeTable = [...]opInfo{
asm: x86.AVFMADDSUB213PS,
reg: regInfo{
inputs: []inputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{2, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
{1, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
{2, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
},
},
},
@@ -23575,12 +23479,12 @@ var opcodeTable = [...]opInfo{
asm: x86.AVFMSUBADD213PD,
reg: regInfo{
inputs: []inputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{2, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
{1, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
{2, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
},
},
},
@@ -23591,12 +23495,12 @@ var opcodeTable = [...]opInfo{
asm: x86.AVFMSUBADD213PD,
reg: regInfo{
inputs: []inputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{2, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
{1, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
{2, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
},
},
},
@@ -23674,12 +23578,12 @@ var opcodeTable = [...]opInfo{
asm: x86.AVFMSUBADD213PS,
reg: regInfo{
inputs: []inputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{2, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
{1, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
{2, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
},
},
},
@@ -23690,12 +23594,12 @@ var opcodeTable = [...]opInfo{
asm: x86.AVFMSUBADD213PS,
reg: regInfo{
inputs: []inputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{2, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
{1, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
{2, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{0, 2147418112}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14
},
},
},
@@ -44193,42 +44097,6 @@ var opcodeTable = [...]opInfo{
},
},
},
{
name: "VFMADD213PD128load",
auxType: auxSymOff,
argLen: 4,
resultInArg0: true,
symEffect: SymRead,
asm: x86.AVFMADD213PD,
reg: regInfo{
inputs: []inputInfo{
{2, 72057594037977087}, // AX CX DX BX SP BP SI DI R8 R9 R10 R11 R12 R13 R15 SB
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
},
},
{
name: "VFMADD213PD256load",
auxType: auxSymOff,
argLen: 4,
resultInArg0: true,
symEffect: SymRead,
asm: x86.AVFMADD213PD,
reg: regInfo{
inputs: []inputInfo{
{2, 72057594037977087}, // AX CX DX BX SP BP SI DI R8 R9 R10 R11 R12 R13 R15 SB
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
},
},
{
name: "VFMADD213PD512load",
auxType: auxSymOff,
@@ -44304,42 +44172,6 @@ var opcodeTable = [...]opInfo{
},
},
},
{
name: "VFMADD213PS128load",
auxType: auxSymOff,
argLen: 4,
resultInArg0: true,
symEffect: SymRead,
asm: x86.AVFMADD213PS,
reg: regInfo{
inputs: []inputInfo{
{2, 72057594037977087}, // AX CX DX BX SP BP SI DI R8 R9 R10 R11 R12 R13 R15 SB
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
},
},
{
name: "VFMADD213PS256load",
auxType: auxSymOff,
argLen: 4,
resultInArg0: true,
symEffect: SymRead,
asm: x86.AVFMADD213PS,
reg: regInfo{
inputs: []inputInfo{
{2, 72057594037977087}, // AX CX DX BX SP BP SI DI R8 R9 R10 R11 R12 R13 R15 SB
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
},
},
{
name: "VFMADD213PS512load",
auxType: auxSymOff,
@@ -44415,42 +44247,6 @@ var opcodeTable = [...]opInfo{
},
},
},
{
name: "VFMADDSUB213PD128load",
auxType: auxSymOff,
argLen: 4,
resultInArg0: true,
symEffect: SymRead,
asm: x86.AVFMADDSUB213PD,
reg: regInfo{
inputs: []inputInfo{
{2, 72057594037977087}, // AX CX DX BX SP BP SI DI R8 R9 R10 R11 R12 R13 R15 SB
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
},
},
{
name: "VFMADDSUB213PD256load",
auxType: auxSymOff,
argLen: 4,
resultInArg0: true,
symEffect: SymRead,
asm: x86.AVFMADDSUB213PD,
reg: regInfo{
inputs: []inputInfo{
{2, 72057594037977087}, // AX CX DX BX SP BP SI DI R8 R9 R10 R11 R12 R13 R15 SB
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
},
},
{
name: "VFMADDSUB213PD512load",
auxType: auxSymOff,
@@ -44526,42 +44322,6 @@ var opcodeTable = [...]opInfo{
},
},
},
{
name: "VFMADDSUB213PS128load",
auxType: auxSymOff,
argLen: 4,
resultInArg0: true,
symEffect: SymRead,
asm: x86.AVFMADDSUB213PS,
reg: regInfo{
inputs: []inputInfo{
{2, 72057594037977087}, // AX CX DX BX SP BP SI DI R8 R9 R10 R11 R12 R13 R15 SB
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
},
},
{
name: "VFMADDSUB213PS256load",
auxType: auxSymOff,
argLen: 4,
resultInArg0: true,
symEffect: SymRead,
asm: x86.AVFMADDSUB213PS,
reg: regInfo{
inputs: []inputInfo{
{2, 72057594037977087}, // AX CX DX BX SP BP SI DI R8 R9 R10 R11 R12 R13 R15 SB
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
},
},
{
name: "VFMADDSUB213PS512load",
auxType: auxSymOff,
@@ -44637,42 +44397,6 @@ var opcodeTable = [...]opInfo{
},
},
},
{
name: "VFMSUBADD213PD128load",
auxType: auxSymOff,
argLen: 4,
resultInArg0: true,
symEffect: SymRead,
asm: x86.AVFMSUBADD213PD,
reg: regInfo{
inputs: []inputInfo{
{2, 72057594037977087}, // AX CX DX BX SP BP SI DI R8 R9 R10 R11 R12 R13 R15 SB
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
},
},
{
name: "VFMSUBADD213PD256load",
auxType: auxSymOff,
argLen: 4,
resultInArg0: true,
symEffect: SymRead,
asm: x86.AVFMSUBADD213PD,
reg: regInfo{
inputs: []inputInfo{
{2, 72057594037977087}, // AX CX DX BX SP BP SI DI R8 R9 R10 R11 R12 R13 R15 SB
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
},
},
{
name: "VFMSUBADD213PD512load",
auxType: auxSymOff,
@@ -44748,42 +44472,6 @@ var opcodeTable = [...]opInfo{
},
},
},
{
name: "VFMSUBADD213PS128load",
auxType: auxSymOff,
argLen: 4,
resultInArg0: true,
symEffect: SymRead,
asm: x86.AVFMSUBADD213PS,
reg: regInfo{
inputs: []inputInfo{
{2, 72057594037977087}, // AX CX DX BX SP BP SI DI R8 R9 R10 R11 R12 R13 R15 SB
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
},
},
{
name: "VFMSUBADD213PS256load",
auxType: auxSymOff,
argLen: 4,
resultInArg0: true,
symEffect: SymRead,
asm: x86.AVFMSUBADD213PS,
reg: regInfo{
inputs: []inputInfo{
{2, 72057594037977087}, // AX CX DX BX SP BP SI DI R8 R9 R10 R11 R12 R13 R15 SB
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
{1, 281474976645120}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
outputs: []outputInfo{
{0, 281472829161472}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31
},
},
},
{
name: "VFMSUBADD213PS512load",
auxType: auxSymOff,
@@ -89875,152 +89563,152 @@ var opcodeTable = [...]opInfo{
generic: true,
},
{
name: "Broadcast1To2Float64x2",
name: "Broadcast128Float32x4",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To2Int64x2",
name: "Broadcast128Float64x2",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To2Uint64x2",
name: "Broadcast128Int8x16",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To4Float32x4",
name: "Broadcast128Int16x8",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To4Float64x2",
name: "Broadcast128Int32x4",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To4Int32x4",
name: "Broadcast128Int64x2",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To4Int64x2",
name: "Broadcast128Uint8x16",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To4Uint32x4",
name: "Broadcast128Uint16x8",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To4Uint64x2",
name: "Broadcast128Uint32x4",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To8Float32x4",
name: "Broadcast128Uint64x2",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To8Float64x2",
name: "Broadcast256Float32x4",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To8Int16x8",
name: "Broadcast256Float64x2",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To8Int32x4",
name: "Broadcast256Int8x16",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To8Int64x2",
name: "Broadcast256Int16x8",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To8Uint16x8",
name: "Broadcast256Int32x4",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To8Uint32x4",
name: "Broadcast256Int64x2",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To8Uint64x2",
name: "Broadcast256Uint8x16",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To16Float32x4",
name: "Broadcast256Uint16x8",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To16Int8x16",
name: "Broadcast256Uint32x4",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To16Int16x8",
name: "Broadcast256Uint64x2",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To16Int32x4",
name: "Broadcast512Float32x4",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To16Uint8x16",
name: "Broadcast512Float64x2",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To16Uint16x8",
name: "Broadcast512Int8x16",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To16Uint32x4",
name: "Broadcast512Int16x8",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To32Int8x16",
name: "Broadcast512Int32x4",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To32Int16x8",
name: "Broadcast512Int64x2",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To32Uint8x16",
name: "Broadcast512Uint8x16",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To32Uint16x8",
name: "Broadcast512Uint16x8",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To64Int8x16",
name: "Broadcast512Uint32x4",
argLen: 1,
generic: true,
},
{
name: "Broadcast1To64Uint8x16",
name: "Broadcast512Uint64x2",
argLen: 1,
generic: true,
},
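Note on the Broadcast renames above: judging by the lowering targets in this diff, the old names encode the number of result elements (Broadcast1To16Int16x8 lowers to VPBROADCASTW256), while the new names encode the result width in bits (Broadcast256Int16x8 lowers to the same instruction). The sketch below shows that relationship, assuming width = element count × element size; the helper names are hypothetical and are not part of the generator.

```go
package main

import "fmt"

// oldBroadcastName builds the count-based name used before the rename.
// Hypothetical helper for illustration only.
func oldBroadcastName(count int, srcType string) string {
	return fmt.Sprintf("Broadcast1To%d%s", count, srcType)
}

// newBroadcastName builds the width-based name, assuming the result width
// in bits is the element count times the element size in bits.
func newBroadcastName(count, elemBits int, srcType string) string {
	return fmt.Sprintf("Broadcast%d%s", count*elemBits, srcType)
}

func main() {
	examples := []struct {
		count, elemBits int
		srcType         string
	}{
		{4, 32, "Float32x4"},
		{16, 16, "Int16x8"},
		{8, 64, "Uint64x2"},
	}
	for _, e := range examples {
		fmt.Printf("%s -> %s\n",
			oldBroadcastName(e.count, e.srcType),
			newBroadcastName(e.count, e.elemBits, e.srcType))
	}
}
```

The new scheme also yields names for every source type at every width, which appears to be why entries such as Broadcast128Int8x16 and Broadcast512Float64x2 show up in the renamed list.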

@@ -782,10 +782,6 @@ func rewriteValueAMD64(v *Value) bool {
return rewriteValueAMD64_OpAMD64VDIVPSMasked256(v)
case OpAMD64VDIVPSMasked512:
return rewriteValueAMD64_OpAMD64VDIVPSMasked512(v)
case OpAMD64VFMADD213PD128:
return rewriteValueAMD64_OpAMD64VFMADD213PD128(v)
case OpAMD64VFMADD213PD256:
return rewriteValueAMD64_OpAMD64VFMADD213PD256(v)
case OpAMD64VFMADD213PD512:
return rewriteValueAMD64_OpAMD64VFMADD213PD512(v)
case OpAMD64VFMADD213PDMasked128:
@@ -794,10 +790,6 @@ func rewriteValueAMD64(v *Value) bool {
return rewriteValueAMD64_OpAMD64VFMADD213PDMasked256(v)
case OpAMD64VFMADD213PDMasked512:
return rewriteValueAMD64_OpAMD64VFMADD213PDMasked512(v)
case OpAMD64VFMADD213PS128:
return rewriteValueAMD64_OpAMD64VFMADD213PS128(v)
case OpAMD64VFMADD213PS256:
return rewriteValueAMD64_OpAMD64VFMADD213PS256(v)
case OpAMD64VFMADD213PS512:
return rewriteValueAMD64_OpAMD64VFMADD213PS512(v)
case OpAMD64VFMADD213PSMasked128:
@@ -806,10 +798,6 @@ func rewriteValueAMD64(v *Value) bool {
return rewriteValueAMD64_OpAMD64VFMADD213PSMasked256(v)
case OpAMD64VFMADD213PSMasked512:
return rewriteValueAMD64_OpAMD64VFMADD213PSMasked512(v)
case OpAMD64VFMADDSUB213PD128:
return rewriteValueAMD64_OpAMD64VFMADDSUB213PD128(v)
case OpAMD64VFMADDSUB213PD256:
return rewriteValueAMD64_OpAMD64VFMADDSUB213PD256(v)
case OpAMD64VFMADDSUB213PD512:
return rewriteValueAMD64_OpAMD64VFMADDSUB213PD512(v)
case OpAMD64VFMADDSUB213PDMasked128:
@@ -818,10 +806,6 @@ func rewriteValueAMD64(v *Value) bool {
return rewriteValueAMD64_OpAMD64VFMADDSUB213PDMasked256(v)
case OpAMD64VFMADDSUB213PDMasked512:
return rewriteValueAMD64_OpAMD64VFMADDSUB213PDMasked512(v)
case OpAMD64VFMADDSUB213PS128:
return rewriteValueAMD64_OpAMD64VFMADDSUB213PS128(v)
case OpAMD64VFMADDSUB213PS256:
return rewriteValueAMD64_OpAMD64VFMADDSUB213PS256(v)
case OpAMD64VFMADDSUB213PS512:
return rewriteValueAMD64_OpAMD64VFMADDSUB213PS512(v)
case OpAMD64VFMADDSUB213PSMasked128:
@@ -830,10 +814,6 @@ func rewriteValueAMD64(v *Value) bool {
return rewriteValueAMD64_OpAMD64VFMADDSUB213PSMasked256(v)
case OpAMD64VFMADDSUB213PSMasked512:
return rewriteValueAMD64_OpAMD64VFMADDSUB213PSMasked512(v)
case OpAMD64VFMSUBADD213PD128:
return rewriteValueAMD64_OpAMD64VFMSUBADD213PD128(v)
case OpAMD64VFMSUBADD213PD256:
return rewriteValueAMD64_OpAMD64VFMSUBADD213PD256(v)
case OpAMD64VFMSUBADD213PD512:
return rewriteValueAMD64_OpAMD64VFMSUBADD213PD512(v)
case OpAMD64VFMSUBADD213PDMasked128:
@@ -842,10 +822,6 @@ func rewriteValueAMD64(v *Value) bool {
return rewriteValueAMD64_OpAMD64VFMSUBADD213PDMasked256(v)
case OpAMD64VFMSUBADD213PDMasked512:
return rewriteValueAMD64_OpAMD64VFMSUBADD213PDMasked512(v)
case OpAMD64VFMSUBADD213PS128:
return rewriteValueAMD64_OpAMD64VFMSUBADD213PS128(v)
case OpAMD64VFMSUBADD213PS256:
return rewriteValueAMD64_OpAMD64VFMSUBADD213PS256(v)
case OpAMD64VFMSUBADD213PS512:
return rewriteValueAMD64_OpAMD64VFMSUBADD213PS512(v)
case OpAMD64VFMSUBADD213PSMasked128:
@@ -2479,96 +2455,96 @@ func rewriteValueAMD64(v *Value) bool {
return rewriteValueAMD64_OpBitLen64(v)
case OpBitLen8:
return rewriteValueAMD64_OpBitLen8(v)
case OpBroadcast1To16Float32x4:
v.Op = OpAMD64VBROADCASTSS512
return true
case OpBroadcast1To16Int16x8:
v.Op = OpAMD64VPBROADCASTW256
return true
case OpBroadcast1To16Int32x4:
v.Op = OpAMD64VPBROADCASTD512
return true
case OpBroadcast1To16Int8x16:
v.Op = OpAMD64VPBROADCASTB128
return true
case OpBroadcast1To16Uint16x8:
v.Op = OpAMD64VPBROADCASTW256
return true
case OpBroadcast1To16Uint32x4:
v.Op = OpAMD64VPBROADCASTD512
return true
case OpBroadcast1To16Uint8x16:
v.Op = OpAMD64VPBROADCASTB128
return true
case OpBroadcast1To2Float64x2:
v.Op = OpAMD64VPBROADCASTQ128
return true
case OpBroadcast1To2Int64x2:
v.Op = OpAMD64VPBROADCASTQ128
return true
case OpBroadcast1To2Uint64x2:
v.Op = OpAMD64VPBROADCASTQ128
return true
case OpBroadcast1To32Int16x8:
v.Op = OpAMD64VPBROADCASTW512
return true
case OpBroadcast1To32Int8x16:
v.Op = OpAMD64VPBROADCASTB256
return true
case OpBroadcast1To32Uint16x8:
v.Op = OpAMD64VPBROADCASTW512
return true
case OpBroadcast1To32Uint8x16:
v.Op = OpAMD64VPBROADCASTB256
return true
case OpBroadcast1To4Float32x4:
case OpBroadcast128Float32x4:
v.Op = OpAMD64VBROADCASTSS128
return true
case OpBroadcast1To4Float64x2:
v.Op = OpAMD64VBROADCASTSD256
case OpBroadcast128Float64x2:
v.Op = OpAMD64VPBROADCASTQ128
return true
case OpBroadcast1To4Int32x4:
case OpBroadcast128Int16x8:
v.Op = OpAMD64VPBROADCASTW128
return true
case OpBroadcast128Int32x4:
v.Op = OpAMD64VPBROADCASTD128
return true
case OpBroadcast1To4Int64x2:
v.Op = OpAMD64VPBROADCASTQ256
case OpBroadcast128Int64x2:
v.Op = OpAMD64VPBROADCASTQ128
return true
case OpBroadcast1To4Uint32x4:
case OpBroadcast128Int8x16:
v.Op = OpAMD64VPBROADCASTB128
return true
case OpBroadcast128Uint16x8:
v.Op = OpAMD64VPBROADCASTW128
return true
case OpBroadcast128Uint32x4:
v.Op = OpAMD64VPBROADCASTD128
return true
case OpBroadcast1To4Uint64x2:
v.Op = OpAMD64VPBROADCASTQ256
case OpBroadcast128Uint64x2:
v.Op = OpAMD64VPBROADCASTQ128
return true
case OpBroadcast1To64Int8x16:
v.Op = OpAMD64VPBROADCASTB512
case OpBroadcast128Uint8x16:
v.Op = OpAMD64VPBROADCASTB128
return true
case OpBroadcast1To64Uint8x16:
v.Op = OpAMD64VPBROADCASTB512
return true
case OpBroadcast1To8Float32x4:
case OpBroadcast256Float32x4:
v.Op = OpAMD64VBROADCASTSS256
return true
case OpBroadcast1To8Float64x2:
case OpBroadcast256Float64x2:
v.Op = OpAMD64VBROADCASTSD256
return true
case OpBroadcast256Int16x8:
v.Op = OpAMD64VPBROADCASTW256
return true
case OpBroadcast256Int32x4:
v.Op = OpAMD64VPBROADCASTD256
return true
case OpBroadcast256Int64x2:
v.Op = OpAMD64VPBROADCASTQ256
return true
case OpBroadcast256Int8x16:
v.Op = OpAMD64VPBROADCASTB256
return true
case OpBroadcast256Uint16x8:
v.Op = OpAMD64VPBROADCASTW256
return true
case OpBroadcast256Uint32x4:
v.Op = OpAMD64VPBROADCASTD256
return true
case OpBroadcast256Uint64x2:
v.Op = OpAMD64VPBROADCASTQ256
return true
case OpBroadcast256Uint8x16:
v.Op = OpAMD64VPBROADCASTB256
return true
case OpBroadcast512Float32x4:
v.Op = OpAMD64VBROADCASTSS512
return true
case OpBroadcast512Float64x2:
v.Op = OpAMD64VBROADCASTSD512
return true
case OpBroadcast1To8Int16x8:
v.Op = OpAMD64VPBROADCASTW128
case OpBroadcast512Int16x8:
v.Op = OpAMD64VPBROADCASTW512
return true
case OpBroadcast1To8Int32x4:
v.Op = OpAMD64VPBROADCASTD256
case OpBroadcast512Int32x4:
v.Op = OpAMD64VPBROADCASTD512
return true
case OpBroadcast1To8Int64x2:
case OpBroadcast512Int64x2:
v.Op = OpAMD64VPBROADCASTQ512
return true
case OpBroadcast1To8Uint16x8:
v.Op = OpAMD64VPBROADCASTW128
case OpBroadcast512Int8x16:
v.Op = OpAMD64VPBROADCASTB512
return true
case OpBroadcast1To8Uint32x4:
v.Op = OpAMD64VPBROADCASTD256
case OpBroadcast512Uint16x8:
v.Op = OpAMD64VPBROADCASTW512
return true
case OpBroadcast1To8Uint64x2:
case OpBroadcast512Uint32x4:
v.Op = OpAMD64VPBROADCASTD512
return true
case OpBroadcast512Uint64x2:
v.Op = OpAMD64VPBROADCASTQ512
return true
case OpBroadcast512Uint8x16:
v.Op = OpAMD64VPBROADCASTB512
return true
case OpBswap16:
return rewriteValueAMD64_OpBswap16(v)
case OpBswap32:
@@ -3050,25 +3026,19 @@ func rewriteValueAMD64(v *Value) bool {
case OpCvtMask32x16to16:
return rewriteValueAMD64_OpCvtMask32x16to16(v)
case OpCvtMask32x4to8:
v.Op = OpAMD64VMOVMSKPS128
return true
return rewriteValueAMD64_OpCvtMask32x4to8(v)
case OpCvtMask32x8to8:
v.Op = OpAMD64VMOVMSKPS256
return true
return rewriteValueAMD64_OpCvtMask32x8to8(v)
case OpCvtMask64x2to8:
v.Op = OpAMD64VMOVMSKPD128
return true
return rewriteValueAMD64_OpCvtMask64x2to8(v)
case OpCvtMask64x4to8:
v.Op = OpAMD64VMOVMSKPD256
return true
return rewriteValueAMD64_OpCvtMask64x4to8(v)
case OpCvtMask64x8to8:
return rewriteValueAMD64_OpCvtMask64x8to8(v)
case OpCvtMask8x16to16:
v.Op = OpAMD64VPMOVMSKB128
return true
return rewriteValueAMD64_OpCvtMask8x16to16(v)
case OpCvtMask8x32to32:
v.Op = OpAMD64VPMOVMSKB256
return true
return rewriteValueAMD64_OpCvtMask8x32to32(v)
case OpCvtMask8x64to64:
return rewriteValueAMD64_OpCvtMask8x64to64(v)
case OpDiv128u:
@@ -31492,64 +31462,6 @@ func rewriteValueAMD64_OpAMD64VDIVPSMasked512(v *Value) bool {
}
return false
}
func rewriteValueAMD64_OpAMD64VFMADD213PD128(v *Value) bool {
v_2 := v.Args[2]
v_1 := v.Args[1]
v_0 := v.Args[0]
// match: (VFMADD213PD128 x y l:(VMOVDQUload128 {sym} [off] ptr mem))
// cond: canMergeLoad(v, l) && clobber(l)
// result: (VFMADD213PD128load {sym} [off] x y ptr mem)
for {
x := v_0
y := v_1
l := v_2
if l.Op != OpAMD64VMOVDQUload128 {
break
}
off := auxIntToInt32(l.AuxInt)
sym := auxToSym(l.Aux)
mem := l.Args[1]
ptr := l.Args[0]
if !(canMergeLoad(v, l) && clobber(l)) {
break
}
v.reset(OpAMD64VFMADD213PD128load)
v.AuxInt = int32ToAuxInt(off)
v.Aux = symToAux(sym)
v.AddArg4(x, y, ptr, mem)
return true
}
return false
}
func rewriteValueAMD64_OpAMD64VFMADD213PD256(v *Value) bool {
v_2 := v.Args[2]
v_1 := v.Args[1]
v_0 := v.Args[0]
// match: (VFMADD213PD256 x y l:(VMOVDQUload256 {sym} [off] ptr mem))
// cond: canMergeLoad(v, l) && clobber(l)
// result: (VFMADD213PD256load {sym} [off] x y ptr mem)
for {
x := v_0
y := v_1
l := v_2
if l.Op != OpAMD64VMOVDQUload256 {
break
}
off := auxIntToInt32(l.AuxInt)
sym := auxToSym(l.Aux)
mem := l.Args[1]
ptr := l.Args[0]
if !(canMergeLoad(v, l) && clobber(l)) {
break
}
v.reset(OpAMD64VFMADD213PD256load)
v.AuxInt = int32ToAuxInt(off)
v.Aux = symToAux(sym)
v.AddArg4(x, y, ptr, mem)
return true
}
return false
}
func rewriteValueAMD64_OpAMD64VFMADD213PD512(v *Value) bool {
v_2 := v.Args[2]
v_1 := v.Args[1]
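Note on the rewrite functions removed in the hunk above: each one matches an FMA whose last operand is a vector load and folds that load into a `...load` form of the op, as its match/cond/result comments describe. The sketch below models only the shape of that rule on a toy value type. The `value` struct, the `uses` counter, and the single-use check standing in for `canMergeLoad` are all simplifying assumptions; the real check in the compiler considers more than this.

```go
package main

import "fmt"

// value is a hypothetical, heavily simplified stand-in for an SSA value.
type value struct {
	op   string
	args []*value
	uses int // number of values that consume this one
}

// mergeLoad applies the shape of the rule
//   (op x y l:(load ptr mem)) => (opload x y ptr mem)
// and reports whether it rewrote v.
func mergeLoad(v *value, loadOp string) bool {
	if len(v.args) != 3 {
		return false
	}
	x, y, l := v.args[0], v.args[1], v.args[2]
	// Simplified stand-in for canMergeLoad: fold only a load with one user.
	if l.op != loadOp || l.uses != 1 || len(l.args) != 2 {
		return false
	}
	ptr, mem := l.args[0], l.args[1]
	v.op += "load"
	v.args = []*value{x, y, ptr, mem}
	return true
}

func main() {
	ptr, mem := &value{op: "ptr"}, &value{op: "mem"}
	load := &value{op: "VMOVDQUload128", args: []*value{ptr, mem}, uses: 1}
	fma := &value{
		op:   "VFMADD213PD128",
		args: []*value{{op: "x"}, {op: "y"}, load},
	}
	fmt.Println(mergeLoad(fma, "VMOVDQUload128"), fma.op, len(fma.args))
}
```

After a successful merge the value carries four arguments (x, y, ptr, mem), matching the `argLen: 4` on the `...load` table entries that this change deletes for the 128- and 256-bit forms.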
@@ -31672,64 +31584,6 @@ func rewriteValueAMD64_OpAMD64VFMADD213PDMasked512(v *Value) bool {
}
return false
}
func rewriteValueAMD64_OpAMD64VFMADD213PS128(v *Value) bool {
v_2 := v.Args[2]
v_1 := v.Args[1]
v_0 := v.Args[0]
// match: (VFMADD213PS128 x y l:(VMOVDQUload128 {sym} [off] ptr mem))
// cond: canMergeLoad(v, l) && clobber(l)
// result: (VFMADD213PS128load {sym} [off] x y ptr mem)
for {
x := v_0
y := v_1
l := v_2
if l.Op != OpAMD64VMOVDQUload128 {
break
}
off := auxIntToInt32(l.AuxInt)
sym := auxToSym(l.Aux)
mem := l.Args[1]
ptr := l.Args[0]
if !(canMergeLoad(v, l) && clobber(l)) {
break
}
v.reset(OpAMD64VFMADD213PS128load)
v.AuxInt = int32ToAuxInt(off)
v.Aux = symToAux(sym)
v.AddArg4(x, y, ptr, mem)
return true
}
return false
}
func rewriteValueAMD64_OpAMD64VFMADD213PS256(v *Value) bool {
v_2 := v.Args[2]
v_1 := v.Args[1]
v_0 := v.Args[0]
// match: (VFMADD213PS256 x y l:(VMOVDQUload256 {sym} [off] ptr mem))
// cond: canMergeLoad(v, l) && clobber(l)
// result: (VFMADD213PS256load {sym} [off] x y ptr mem)
for {
x := v_0
y := v_1
l := v_2
if l.Op != OpAMD64VMOVDQUload256 {
break
}
off := auxIntToInt32(l.AuxInt)
sym := auxToSym(l.Aux)
mem := l.Args[1]
ptr := l.Args[0]
if !(canMergeLoad(v, l) && clobber(l)) {
break
}
v.reset(OpAMD64VFMADD213PS256load)
v.AuxInt = int32ToAuxInt(off)
v.Aux = symToAux(sym)
v.AddArg4(x, y, ptr, mem)
return true
}
return false
}
func rewriteValueAMD64_OpAMD64VFMADD213PS512(v *Value) bool {
v_2 := v.Args[2]
v_1 := v.Args[1]
@@ -31852,64 +31706,6 @@ func rewriteValueAMD64_OpAMD64VFMADD213PSMasked512(v *Value) bool {
}
return false
}
func rewriteValueAMD64_OpAMD64VFMADDSUB213PD128(v *Value) bool {
v_2 := v.Args[2]
v_1 := v.Args[1]
v_0 := v.Args[0]
// match: (VFMADDSUB213PD128 x y l:(VMOVDQUload128 {sym} [off] ptr mem))
// cond: canMergeLoad(v, l) && clobber(l)
// result: (VFMADDSUB213PD128load {sym} [off] x y ptr mem)
for {
x := v_0
y := v_1
l := v_2
if l.Op != OpAMD64VMOVDQUload128 {
break
}
off := auxIntToInt32(l.AuxInt)
sym := auxToSym(l.Aux)
mem := l.Args[1]
ptr := l.Args[0]
if !(canMergeLoad(v, l) && clobber(l)) {
break
}
v.reset(OpAMD64VFMADDSUB213PD128load)
v.AuxInt = int32ToAuxInt(off)
v.Aux = symToAux(sym)
v.AddArg4(x, y, ptr, mem)
return true
}
return false
}
func rewriteValueAMD64_OpAMD64VFMADDSUB213PD256(v *Value) bool {
v_2 := v.Args[2]
v_1 := v.Args[1]
v_0 := v.Args[0]
// match: (VFMADDSUB213PD256 x y l:(VMOVDQUload256 {sym} [off] ptr mem))
// cond: canMergeLoad(v, l) && clobber(l)
// result: (VFMADDSUB213PD256load {sym} [off] x y ptr mem)
for {
x := v_0
y := v_1
l := v_2
if l.Op != OpAMD64VMOVDQUload256 {
break
}
off := auxIntToInt32(l.AuxInt)
sym := auxToSym(l.Aux)
mem := l.Args[1]
ptr := l.Args[0]
if !(canMergeLoad(v, l) && clobber(l)) {
break
}
v.reset(OpAMD64VFMADDSUB213PD256load)
v.AuxInt = int32ToAuxInt(off)
v.Aux = symToAux(sym)
v.AddArg4(x, y, ptr, mem)
return true
}
return false
}
func rewriteValueAMD64_OpAMD64VFMADDSUB213PD512(v *Value) bool {
v_2 := v.Args[2]
v_1 := v.Args[1]
@@ -32032,64 +31828,6 @@ func rewriteValueAMD64_OpAMD64VFMADDSUB213PDMasked512(v *Value) bool {
}
return false
}
func rewriteValueAMD64_OpAMD64VFMADDSUB213PS128(v *Value) bool {
v_2 := v.Args[2]
v_1 := v.Args[1]
v_0 := v.Args[0]
// match: (VFMADDSUB213PS128 x y l:(VMOVDQUload128 {sym} [off] ptr mem))
// cond: canMergeLoad(v, l) && clobber(l)
// result: (VFMADDSUB213PS128load {sym} [off] x y ptr mem)
for {
x := v_0
y := v_1
l := v_2
if l.Op != OpAMD64VMOVDQUload128 {
break
}
off := auxIntToInt32(l.AuxInt)
sym := auxToSym(l.Aux)
mem := l.Args[1]
ptr := l.Args[0]
if !(canMergeLoad(v, l) && clobber(l)) {
break
}
v.reset(OpAMD64VFMADDSUB213PS128load)
v.AuxInt = int32ToAuxInt(off)
v.Aux = symToAux(sym)
v.AddArg4(x, y, ptr, mem)
return true
}
return false
}
func rewriteValueAMD64_OpAMD64VFMADDSUB213PS256(v *Value) bool {
v_2 := v.Args[2]
v_1 := v.Args[1]
v_0 := v.Args[0]
// match: (VFMADDSUB213PS256 x y l:(VMOVDQUload256 {sym} [off] ptr mem))
// cond: canMergeLoad(v, l) && clobber(l)
// result: (VFMADDSUB213PS256load {sym} [off] x y ptr mem)
for {
x := v_0
y := v_1
l := v_2
if l.Op != OpAMD64VMOVDQUload256 {
break
}
off := auxIntToInt32(l.AuxInt)
sym := auxToSym(l.Aux)
mem := l.Args[1]
ptr := l.Args[0]
if !(canMergeLoad(v, l) && clobber(l)) {
break
}
v.reset(OpAMD64VFMADDSUB213PS256load)
v.AuxInt = int32ToAuxInt(off)
v.Aux = symToAux(sym)
v.AddArg4(x, y, ptr, mem)
return true
}
return false
}
func rewriteValueAMD64_OpAMD64VFMADDSUB213PS512(v *Value) bool {
v_2 := v.Args[2]
v_1 := v.Args[1]
@@ -32212,64 +31950,6 @@ func rewriteValueAMD64_OpAMD64VFMADDSUB213PSMasked512(v *Value) bool {
}
return false
}
func rewriteValueAMD64_OpAMD64VFMSUBADD213PD128(v *Value) bool {
v_2 := v.Args[2]
v_1 := v.Args[1]
v_0 := v.Args[0]
// match: (VFMSUBADD213PD128 x y l:(VMOVDQUload128 {sym} [off] ptr mem))
// cond: canMergeLoad(v, l) && clobber(l)
// result: (VFMSUBADD213PD128load {sym} [off] x y ptr mem)
for {
x := v_0
y := v_1
l := v_2
if l.Op != OpAMD64VMOVDQUload128 {
break
}
off := auxIntToInt32(l.AuxInt)
sym := auxToSym(l.Aux)
mem := l.Args[1]
ptr := l.Args[0]
if !(canMergeLoad(v, l) && clobber(l)) {
break
}
v.reset(OpAMD64VFMSUBADD213PD128load)
v.AuxInt = int32ToAuxInt(off)
v.Aux = symToAux(sym)
v.AddArg4(x, y, ptr, mem)
return true
}
return false
}
func rewriteValueAMD64_OpAMD64VFMSUBADD213PD256(v *Value) bool {
v_2 := v.Args[2]
v_1 := v.Args[1]
v_0 := v.Args[0]
// match: (VFMSUBADD213PD256 x y l:(VMOVDQUload256 {sym} [off] ptr mem))
// cond: canMergeLoad(v, l) && clobber(l)
// result: (VFMSUBADD213PD256load {sym} [off] x y ptr mem)
for {
x := v_0
y := v_1
l := v_2
if l.Op != OpAMD64VMOVDQUload256 {
break
}
off := auxIntToInt32(l.AuxInt)
sym := auxToSym(l.Aux)
mem := l.Args[1]
ptr := l.Args[0]
if !(canMergeLoad(v, l) && clobber(l)) {
break
}
v.reset(OpAMD64VFMSUBADD213PD256load)
v.AuxInt = int32ToAuxInt(off)
v.Aux = symToAux(sym)
v.AddArg4(x, y, ptr, mem)
return true
}
return false
}
func rewriteValueAMD64_OpAMD64VFMSUBADD213PD512(v *Value) bool {
v_2 := v.Args[2]
v_1 := v.Args[1]
@@ -32392,64 +32072,6 @@ func rewriteValueAMD64_OpAMD64VFMSUBADD213PDMasked512(v *Value) bool {
}
return false
}
func rewriteValueAMD64_OpAMD64VFMSUBADD213PS128(v *Value) bool {
v_2 := v.Args[2]
v_1 := v.Args[1]
v_0 := v.Args[0]
// match: (VFMSUBADD213PS128 x y l:(VMOVDQUload128 {sym} [off] ptr mem))
// cond: canMergeLoad(v, l) && clobber(l)
// result: (VFMSUBADD213PS128load {sym} [off] x y ptr mem)
for {
x := v_0
y := v_1
l := v_2
if l.Op != OpAMD64VMOVDQUload128 {
break
}
off := auxIntToInt32(l.AuxInt)
sym := auxToSym(l.Aux)
mem := l.Args[1]
ptr := l.Args[0]
if !(canMergeLoad(v, l) && clobber(l)) {
break
}
v.reset(OpAMD64VFMSUBADD213PS128load)
v.AuxInt = int32ToAuxInt(off)
v.Aux = symToAux(sym)
v.AddArg4(x, y, ptr, mem)
return true
}
return false
}
func rewriteValueAMD64_OpAMD64VFMSUBADD213PS256(v *Value) bool {
v_2 := v.Args[2]
v_1 := v.Args[1]
v_0 := v.Args[0]
// match: (VFMSUBADD213PS256 x y l:(VMOVDQUload256 {sym} [off] ptr mem))
// cond: canMergeLoad(v, l) && clobber(l)
// result: (VFMSUBADD213PS256load {sym} [off] x y ptr mem)
for {
x := v_0
y := v_1
l := v_2
if l.Op != OpAMD64VMOVDQUload256 {
break
}
off := auxIntToInt32(l.AuxInt)
sym := auxToSym(l.Aux)
mem := l.Args[1]
ptr := l.Args[0]
if !(canMergeLoad(v, l) && clobber(l)) {
break
}
v.reset(OpAMD64VFMSUBADD213PS256load)
v.AuxInt = int32ToAuxInt(off)
v.Aux = symToAux(sym)
v.AddArg4(x, y, ptr, mem)
return true
}
return false
}
func rewriteValueAMD64_OpAMD64VFMSUBADD213PS512(v *Value) bool {
v_2 := v.Args[2]
v_1 := v.Args[1]
@@ -68728,11 +68350,13 @@ func rewriteValueAMD64_OpCvt8toMask64x8(v *Value) bool {
func rewriteValueAMD64_OpCvtMask16x16to16(v *Value) bool {
v_0 := v.Args[0]
b := v.Block
// match: (CvtMask16x16to16 x)
// result: (KMOVWi (VPMOVVec16x16ToM <types.TypeMask> x))
// match: (CvtMask16x16to16 <t> x)
// result: (KMOVWi <t> (VPMOVVec16x16ToM <types.TypeMask> x))
for {
t := v.Type
x := v_0
v.reset(OpAMD64KMOVWi)
v.Type = t
v0 := b.NewValue0(v.Pos, OpAMD64VPMOVVec16x16ToM, types.TypeMask)
v0.AddArg(x)
v.AddArg(v0)
@@ -68742,11 +68366,13 @@ func rewriteValueAMD64_OpCvtMask16x16to16(v *Value) bool {
func rewriteValueAMD64_OpCvtMask16x32to32(v *Value) bool {
v_0 := v.Args[0]
b := v.Block
// match: (CvtMask16x32to32 x)
// result: (KMOVDi (VPMOVVec16x32ToM <types.TypeMask> x))
// match: (CvtMask16x32to32 <t> x)
// result: (KMOVDi <t> (VPMOVVec16x32ToM <types.TypeMask> x))
for {
t := v.Type
x := v_0
v.reset(OpAMD64KMOVDi)
v.Type = t
v0 := b.NewValue0(v.Pos, OpAMD64VPMOVVec16x32ToM, types.TypeMask)
v0.AddArg(x)
v.AddArg(v0)
@@ -68756,11 +68382,13 @@ func rewriteValueAMD64_OpCvtMask16x32to32(v *Value) bool {
func rewriteValueAMD64_OpCvtMask16x8to8(v *Value) bool {
v_0 := v.Args[0]
b := v.Block
// match: (CvtMask16x8to8 x)
// result: (KMOVBi (VPMOVVec16x8ToM <types.TypeMask> x))
// match: (CvtMask16x8to8 <t> x)
// result: (KMOVBi <t> (VPMOVVec16x8ToM <types.TypeMask> x))
for {
t := v.Type
x := v_0
v.reset(OpAMD64KMOVBi)
v.Type = t
v0 := b.NewValue0(v.Pos, OpAMD64VPMOVVec16x8ToM, types.TypeMask)
v0.AddArg(x)
v.AddArg(v0)
@@ -68770,39 +68398,141 @@ func rewriteValueAMD64_OpCvtMask16x8to8(v *Value) bool {
func rewriteValueAMD64_OpCvtMask32x16to16(v *Value) bool {
v_0 := v.Args[0]
b := v.Block
// match: (CvtMask32x16to16 x)
// result: (KMOVWi (VPMOVVec32x16ToM <types.TypeMask> x))
// match: (CvtMask32x16to16 <t> x)
// result: (KMOVWi <t> (VPMOVVec32x16ToM <types.TypeMask> x))
for {
t := v.Type
x := v_0
v.reset(OpAMD64KMOVWi)
v.Type = t
v0 := b.NewValue0(v.Pos, OpAMD64VPMOVVec32x16ToM, types.TypeMask)
v0.AddArg(x)
v.AddArg(v0)
return true
}
}
func rewriteValueAMD64_OpCvtMask32x4to8(v *Value) bool {
v_0 := v.Args[0]
b := v.Block
// match: (CvtMask32x4to8 <t> x)
// result: (KMOVBi <t> (VPMOVVec32x4ToM <types.TypeMask> x))
for {
t := v.Type
x := v_0
v.reset(OpAMD64KMOVBi)
v.Type = t
v0 := b.NewValue0(v.Pos, OpAMD64VPMOVVec32x4ToM, types.TypeMask)
v0.AddArg(x)
v.AddArg(v0)
return true
}
}
func rewriteValueAMD64_OpCvtMask32x8to8(v *Value) bool {
v_0 := v.Args[0]
b := v.Block
// match: (CvtMask32x8to8 <t> x)
// result: (KMOVBi <t> (VPMOVVec32x8ToM <types.TypeMask> x))
for {
t := v.Type
x := v_0
v.reset(OpAMD64KMOVBi)
v.Type = t
v0 := b.NewValue0(v.Pos, OpAMD64VPMOVVec32x8ToM, types.TypeMask)
v0.AddArg(x)
v.AddArg(v0)
return true
}
}
func rewriteValueAMD64_OpCvtMask64x2to8(v *Value) bool {
v_0 := v.Args[0]
b := v.Block
// match: (CvtMask64x2to8 <t> x)
// result: (KMOVBi <t> (VPMOVVec64x2ToM <types.TypeMask> x))
for {
t := v.Type
x := v_0
v.reset(OpAMD64KMOVBi)
v.Type = t
v0 := b.NewValue0(v.Pos, OpAMD64VPMOVVec64x2ToM, types.TypeMask)
v0.AddArg(x)
v.AddArg(v0)
return true
}
}
func rewriteValueAMD64_OpCvtMask64x4to8(v *Value) bool {
v_0 := v.Args[0]
b := v.Block
// match: (CvtMask64x4to8 <t> x)
// result: (KMOVBi <t> (VPMOVVec64x4ToM <types.TypeMask> x))
for {
t := v.Type
x := v_0
v.reset(OpAMD64KMOVBi)
v.Type = t
v0 := b.NewValue0(v.Pos, OpAMD64VPMOVVec64x4ToM, types.TypeMask)
v0.AddArg(x)
v.AddArg(v0)
return true
}
}
func rewriteValueAMD64_OpCvtMask64x8to8(v *Value) bool {
v_0 := v.Args[0]
b := v.Block
// match: (CvtMask64x8to8 x)
// result: (KMOVBi (VPMOVVec64x8ToM <types.TypeMask> x))
// match: (CvtMask64x8to8 <t> x)
// result: (KMOVBi <t> (VPMOVVec64x8ToM <types.TypeMask> x))
for {
t := v.Type
x := v_0
v.reset(OpAMD64KMOVBi)
v.Type = t
v0 := b.NewValue0(v.Pos, OpAMD64VPMOVVec64x8ToM, types.TypeMask)
v0.AddArg(x)
v.AddArg(v0)
return true
}
}
func rewriteValueAMD64_OpCvtMask8x16to16(v *Value) bool {
v_0 := v.Args[0]
b := v.Block
// match: (CvtMask8x16to16 <t> x)
// result: (KMOVWi <t> (VPMOVVec8x16ToM <types.TypeMask> x))
for {
t := v.Type
x := v_0
v.reset(OpAMD64KMOVWi)
v.Type = t
v0 := b.NewValue0(v.Pos, OpAMD64VPMOVVec8x16ToM, types.TypeMask)
v0.AddArg(x)
v.AddArg(v0)
return true
}
}
func rewriteValueAMD64_OpCvtMask8x32to32(v *Value) bool {
v_0 := v.Args[0]
b := v.Block
// match: (CvtMask8x32to32 <t> x)
// result: (KMOVDi <t> (VPMOVVec8x32ToM <types.TypeMask> x))
for {
t := v.Type
x := v_0
v.reset(OpAMD64KMOVDi)
v.Type = t
v0 := b.NewValue0(v.Pos, OpAMD64VPMOVVec8x32ToM, types.TypeMask)
v0.AddArg(x)
v.AddArg(v0)
return true
}
}
func rewriteValueAMD64_OpCvtMask8x64to64(v *Value) bool {
v_0 := v.Args[0]
b := v.Block
// match: (CvtMask8x64to64 x)
// result: (KMOVQi (VPMOVVec8x64ToM <types.TypeMask> x))
// match: (CvtMask8x64to64 <t> x)
// result: (KMOVQi <t> (VPMOVVec8x64ToM <types.TypeMask> x))
for {
t := v.Type
x := v_0
v.reset(OpAMD64KMOVQi)
v.Type = t
v0 := b.NewValue0(v.Pos, OpAMD64VPMOVVec8x64ToM, types.TypeMask)
v0.AddArg(x)
v.AddArg(v0)


@@ -152,36 +152,36 @@ func simdIntrinsics(addF func(pkg, fn string, b intrinsicBuilder, archFamilies .
addF(simdPackage, "Uint16x8.Average", opLen2(ssa.OpAverageUint16x8, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Uint16x16.Average", opLen2(ssa.OpAverageUint16x16, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Uint16x32.Average", opLen2(ssa.OpAverageUint16x32, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Float64x2.Broadcast1To2", opLen1(ssa.OpBroadcast1To2Float64x2, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Int64x2.Broadcast1To2", opLen1(ssa.OpBroadcast1To2Int64x2, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Uint64x2.Broadcast1To2", opLen1(ssa.OpBroadcast1To2Uint64x2, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Float32x4.Broadcast1To4", opLen1(ssa.OpBroadcast1To4Float32x4, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Float64x2.Broadcast1To4", opLen1(ssa.OpBroadcast1To4Float64x2, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Int32x4.Broadcast1To4", opLen1(ssa.OpBroadcast1To4Int32x4, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Int64x2.Broadcast1To4", opLen1(ssa.OpBroadcast1To4Int64x2, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Uint32x4.Broadcast1To4", opLen1(ssa.OpBroadcast1To4Uint32x4, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Uint64x2.Broadcast1To4", opLen1(ssa.OpBroadcast1To4Uint64x2, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Float32x4.Broadcast1To8", opLen1(ssa.OpBroadcast1To8Float32x4, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Float64x2.Broadcast1To8", opLen1(ssa.OpBroadcast1To8Float64x2, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Int16x8.Broadcast1To8", opLen1(ssa.OpBroadcast1To8Int16x8, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Int32x4.Broadcast1To8", opLen1(ssa.OpBroadcast1To8Int32x4, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Int64x2.Broadcast1To8", opLen1(ssa.OpBroadcast1To8Int64x2, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Uint16x8.Broadcast1To8", opLen1(ssa.OpBroadcast1To8Uint16x8, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Uint32x4.Broadcast1To8", opLen1(ssa.OpBroadcast1To8Uint32x4, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Uint64x2.Broadcast1To8", opLen1(ssa.OpBroadcast1To8Uint64x2, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Float32x4.Broadcast1To16", opLen1(ssa.OpBroadcast1To16Float32x4, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Int8x16.Broadcast1To16", opLen1(ssa.OpBroadcast1To16Int8x16, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Int16x8.Broadcast1To16", opLen1(ssa.OpBroadcast1To16Int16x8, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Int32x4.Broadcast1To16", opLen1(ssa.OpBroadcast1To16Int32x4, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Uint8x16.Broadcast1To16", opLen1(ssa.OpBroadcast1To16Uint8x16, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Uint16x8.Broadcast1To16", opLen1(ssa.OpBroadcast1To16Uint16x8, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Uint32x4.Broadcast1To16", opLen1(ssa.OpBroadcast1To16Uint32x4, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Int8x16.Broadcast1To32", opLen1(ssa.OpBroadcast1To32Int8x16, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Int16x8.Broadcast1To32", opLen1(ssa.OpBroadcast1To32Int16x8, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Uint8x16.Broadcast1To32", opLen1(ssa.OpBroadcast1To32Uint8x16, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Uint16x8.Broadcast1To32", opLen1(ssa.OpBroadcast1To32Uint16x8, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Int8x16.Broadcast1To64", opLen1(ssa.OpBroadcast1To64Int8x16, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Uint8x16.Broadcast1To64", opLen1(ssa.OpBroadcast1To64Uint8x16, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Float32x4.Broadcast128", opLen1(ssa.OpBroadcast128Float32x4, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Float64x2.Broadcast128", opLen1(ssa.OpBroadcast128Float64x2, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Int8x16.Broadcast128", opLen1(ssa.OpBroadcast128Int8x16, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Int16x8.Broadcast128", opLen1(ssa.OpBroadcast128Int16x8, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Int32x4.Broadcast128", opLen1(ssa.OpBroadcast128Int32x4, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Int64x2.Broadcast128", opLen1(ssa.OpBroadcast128Int64x2, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Uint8x16.Broadcast128", opLen1(ssa.OpBroadcast128Uint8x16, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Uint16x8.Broadcast128", opLen1(ssa.OpBroadcast128Uint16x8, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Uint32x4.Broadcast128", opLen1(ssa.OpBroadcast128Uint32x4, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Uint64x2.Broadcast128", opLen1(ssa.OpBroadcast128Uint64x2, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Float32x4.Broadcast256", opLen1(ssa.OpBroadcast256Float32x4, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Float64x2.Broadcast256", opLen1(ssa.OpBroadcast256Float64x2, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Int8x16.Broadcast256", opLen1(ssa.OpBroadcast256Int8x16, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Int16x8.Broadcast256", opLen1(ssa.OpBroadcast256Int16x8, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Int32x4.Broadcast256", opLen1(ssa.OpBroadcast256Int32x4, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Int64x2.Broadcast256", opLen1(ssa.OpBroadcast256Int64x2, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Uint8x16.Broadcast256", opLen1(ssa.OpBroadcast256Uint8x16, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Uint16x8.Broadcast256", opLen1(ssa.OpBroadcast256Uint16x8, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Uint32x4.Broadcast256", opLen1(ssa.OpBroadcast256Uint32x4, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Uint64x2.Broadcast256", opLen1(ssa.OpBroadcast256Uint64x2, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Float32x4.Broadcast512", opLen1(ssa.OpBroadcast512Float32x4, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Float64x2.Broadcast512", opLen1(ssa.OpBroadcast512Float64x2, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Int8x16.Broadcast512", opLen1(ssa.OpBroadcast512Int8x16, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Int16x8.Broadcast512", opLen1(ssa.OpBroadcast512Int16x8, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Int32x4.Broadcast512", opLen1(ssa.OpBroadcast512Int32x4, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Int64x2.Broadcast512", opLen1(ssa.OpBroadcast512Int64x2, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Uint8x16.Broadcast512", opLen1(ssa.OpBroadcast512Uint8x16, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Uint16x8.Broadcast512", opLen1(ssa.OpBroadcast512Uint16x8, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Uint32x4.Broadcast512", opLen1(ssa.OpBroadcast512Uint32x4, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Uint64x2.Broadcast512", opLen1(ssa.OpBroadcast512Uint64x2, types.TypeVec512), sys.AMD64)
addF(simdPackage, "Float32x4.Ceil", opLen1(ssa.OpCeilFloat32x4, types.TypeVec128), sys.AMD64)
addF(simdPackage, "Float32x8.Ceil", opLen1(ssa.OpCeilFloat32x8, types.TypeVec256), sys.AMD64)
addF(simdPackage, "Float64x2.Ceil", opLen1(ssa.OpCeilFloat64x2, types.TypeVec128), sys.AMD64)


@@ -11,7 +11,7 @@ require (
golang.org/x/sys v0.39.0
golang.org/x/telemetry v0.0.0-20251128220624-abf20d0e57ec
golang.org/x/term v0.38.0
golang.org/x/tools v0.39.1-0.20251230210517-d44be789a05c
golang.org/x/tools v0.39.1-0.20251205000126-062ef7b6ced2
)
require (


@@ -22,7 +22,7 @@ golang.org/x/term v0.38.0 h1:PQ5pkm/rLO6HnxFR7N2lJHOZX6Kez5Y1gDSJla6jo7Q=
golang.org/x/term v0.38.0/go.mod h1:bSEAKrOT1W+VSu9TSCMtoGEOUcKxOKgl3LE5QEF/xVg=
golang.org/x/text v0.32.0 h1:ZD01bjUt1FQ9WJ0ClOL5vxgxOI/sVCNgX1YtKwcY0mU=
golang.org/x/text v0.32.0/go.mod h1:o/rUWzghvpD5TXrTIBuJU77MTaN0ljMWE47kxGJQ7jY=
golang.org/x/tools v0.39.1-0.20251230210517-d44be789a05c h1:0pZej6BQOooNbOfjJEu4v5qx9hdwFX8HnvHCcNXcs2w=
golang.org/x/tools v0.39.1-0.20251230210517-d44be789a05c/go.mod h1:JnefbkDPyD8UU2kI5fuf8ZX4/yUeh9W877ZeBONxUqQ=
golang.org/x/tools v0.39.1-0.20251205000126-062ef7b6ced2 h1:2Qqv605Nus9iUp3ErvEU/q92Q3HAzeROztzl9pzAno8=
golang.org/x/tools v0.39.1-0.20251205000126-062ef7b6ced2/go.mod h1:JnefbkDPyD8UU2kI5fuf8ZX4/yUeh9W877ZeBONxUqQ=
rsc.io/markdown v0.0.0-20240306144322-0bf8f97ee8ef h1:mqLYrXCXYEZOop9/Dbo6RPX11539nwiCNBb1icVPmw8=
rsc.io/markdown v0.0.0-20240306144322-0bf8f97ee8ef/go.mod h1:8xcPgWmwlZONN1D9bjxtHEjrUtSEa3fakVF8iaewYKQ=


@@ -328,10 +328,7 @@ func runEdit(ctx context.Context, cmd *base.Command, args []string) {
// parsePathVersion parses -flag=arg expecting arg to be path@version.
func parsePathVersion(flag, arg string) (path, version string) {
before, after, found, err := modload.ParsePathVersion(arg)
if err != nil {
base.Fatalf("go: -%s=%s: %v", flag, arg, err)
}
before, after, found := strings.Cut(arg, "@")
if !found {
base.Fatalf("go: -%s=%s: need path@version", flag, arg)
}
@@ -365,10 +362,7 @@ func parsePathVersionOptional(adj, arg string, allowDirPath bool) (path, version
if allowDirPath && modfile.IsDirectoryPath(arg) {
return arg, "", nil
}
before, after, found, err := modload.ParsePathVersion(arg)
if err != nil {
return "", "", err
}
before, after, found := strings.Cut(arg, "@")
if !found {
path = arg
} else {


@@ -261,7 +261,7 @@ func (r *gitRepo) loadRefs(ctx context.Context) (map[string]string, error) {
r.refsErr = err
return
}
out, gitErr := r.runGit(ctx, "git", "ls-remote", "-q", "--end-of-options", r.remote)
out, gitErr := r.runGit(ctx, "git", "ls-remote", "-q", r.remote)
release()
if gitErr != nil {
@@ -530,7 +530,7 @@ func (r *gitRepo) stat(ctx context.Context, rev string) (info *RevInfo, err erro
if fromTag && !slices.Contains(info.Tags, tag) {
// The local repo includes the commit hash we want, but it is missing
// the corresponding tag. Add that tag and try again.
_, err := r.runGit(ctx, "git", "tag", "--end-of-options", tag, hash)
_, err := r.runGit(ctx, "git", "tag", tag, hash)
if err != nil {
return nil, err
}
@@ -579,7 +579,7 @@ func (r *gitRepo) stat(ctx context.Context, rev string) (info *RevInfo, err erro
// an apparent Git bug introduced in Git 2.21 (commit 61c771),
// which causes the handler for protocol version 1 to sometimes miss
// tags that point to the requested commit (see https://go.dev/issue/56881).
_, err = r.runGit(ctx, "git", "-c", "protocol.version=2", "fetch", "-f", "--depth=1", "--end-of-options", r.remote, refspec)
_, err = r.runGit(ctx, "git", "-c", "protocol.version=2", "fetch", "-f", "--depth=1", r.remote, refspec)
release()
if err == nil {
@@ -625,12 +625,12 @@ func (r *gitRepo) fetchRefsLocked(ctx context.Context) error {
}
defer release()
if _, err := r.runGit(ctx, "git", "fetch", "-f", "--end-of-options", r.remote, "refs/heads/*:refs/heads/*", "refs/tags/*:refs/tags/*"); err != nil {
if _, err := r.runGit(ctx, "git", "fetch", "-f", r.remote, "refs/heads/*:refs/heads/*", "refs/tags/*:refs/tags/*"); err != nil {
return err
}
if _, err := os.Stat(filepath.Join(r.dir, "shallow")); err == nil {
if _, err := r.runGit(ctx, "git", "fetch", "--unshallow", "-f", "--end-of-options", r.remote); err != nil {
if _, err := r.runGit(ctx, "git", "fetch", "--unshallow", "-f", r.remote); err != nil {
return err
}
}
@@ -643,7 +643,7 @@ func (r *gitRepo) fetchRefsLocked(ctx context.Context) error {
// statLocal returns a new RevInfo describing rev in the local git repository.
// It uses version as info.Version.
func (r *gitRepo) statLocal(ctx context.Context, version, rev string) (*RevInfo, error) {
out, err := r.runGit(ctx, "git", "-c", "log.showsignature=false", "log", "--no-decorate", "-n1", "--format=format:%H %ct %D", "--end-of-options", rev, "--")
out, err := r.runGit(ctx, "git", "-c", "log.showsignature=false", "log", "--no-decorate", "-n1", "--format=format:%H %ct %D", rev, "--")
if err != nil {
// Return info with Origin.RepoSum if possible to allow caching of negative lookup.
var info *RevInfo
@@ -733,7 +733,7 @@ func (r *gitRepo) ReadFile(ctx context.Context, rev, file string, maxSize int64)
if err != nil {
return nil, err
}
out, err := r.runGit(ctx, "git", "cat-file", "--end-of-options", "blob", info.Name+":"+file)
out, err := r.runGit(ctx, "git", "cat-file", "blob", info.Name+":"+file)
if err != nil {
return nil, fs.ErrNotExist
}
@@ -751,7 +751,7 @@ func (r *gitRepo) RecentTag(ctx context.Context, rev, prefix string, allowed fun
// result is definitive.
describe := func() (definitive bool) {
var out []byte
out, err = r.runGit(ctx, "git", "for-each-ref", "--format=%(refname)", "--merged="+rev)
out, err = r.runGit(ctx, "git", "for-each-ref", "--format", "%(refname)", "refs/tags", "--merged", rev)
if err != nil {
return true
}
@@ -903,7 +903,7 @@ func (r *gitRepo) ReadZip(ctx context.Context, rev, subdir string, maxSize int64
// TODO: Use maxSize or drop it.
args := []string{}
if subdir != "" {
args = append(args, subdir)
args = append(args, "--", subdir)
}
info, err := r.Stat(ctx, rev) // download rev into local git repo
if err != nil {
@@ -925,7 +925,7 @@ func (r *gitRepo) ReadZip(ctx context.Context, rev, subdir string, maxSize int64
// text file line endings. Setting -c core.autocrlf=input means only
// translate files on the way into the repo, not on the way out (archive).
// The -c core.eol=lf should be unnecessary but set it anyway.
archive, err := r.runGit(ctx, "git", "-c", "core.autocrlf=input", "-c", "core.eol=lf", "archive", "--format=zip", "--prefix=prefix/", "--end-of-options", info.Name, args)
archive, err := r.runGit(ctx, "git", "-c", "core.autocrlf=input", "-c", "core.eol=lf", "archive", "--format=zip", "--prefix=prefix/", info.Name, args)
if err != nil {
if bytes.Contains(err.(*RunError).Stderr, []byte("did not match any files")) {
return nil, fs.ErrNotExist
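
The git hunks above differ only in whether `--end-of-options` is passed before arguments such as `rev`, `tag`, or `r.remote`. A minimal sketch of what that separator changes in the argument vector, assuming a hypothetical helper `gitLogArgs` (not part of cmd/go) and without actually running git:

```go
package main

import "fmt"

// gitLogArgs is a hypothetical helper for illustration only.
// It builds an argv similar in shape to the statLocal call in the hunk above.
// When guard is true, "--end-of-options" is placed before rev, so git treats
// rev as a revision even if it begins with '-'.
func gitLogArgs(rev string, guard bool) []string {
	args := []string{"git", "log", "--no-decorate", "-n1"}
	if guard {
		args = append(args, "--end-of-options")
	}
	// The trailing "--" separates revisions from paths, as in the original call.
	return append(args, rev, "--")
}

func main() {
	rev := "--output=/tmp/x" // a "revision" string that looks like an option
	fmt.Println(gitLogArgs(rev, false))
	fmt.Println(gitLogArgs(rev, true))
}
```

In the unguarded form the option-like string sits where git still parses options; in the guarded form it can only be read as a revision name.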


@@ -188,7 +188,6 @@ var vcsCmds = map[string]*vcsCmd{
"hg",
"--config=extensions.goreposum=" + filepath.Join(cfg.GOROOT, "lib/hg/goreposum.py"),
"goreposum",
"--",
remote,
}
},
@@ -197,7 +196,6 @@ var vcsCmds = map[string]*vcsCmd{
"hg",
"--config=extensions.goreposum=" + filepath.Join(cfg.GOROOT, "lib/hg/goreposum.py"),
"golookup",
"--",
remote,
ref,
}
@@ -218,26 +216,26 @@ var vcsCmds = map[string]*vcsCmd{
branchRE: re(`(?m)^[^\n]+$`),
badLocalRevRE: re(`(?m)^(tip)$`),
statLocal: func(rev, remote string) []string {
return []string{"hg", "log", "-l1", fmt.Sprintf("--rev=%s", rev), "--template", "{node} {date|hgdate} {tags}"}
return []string{"hg", "log", "-l1", "-r", rev, "--template", "{node} {date|hgdate} {tags}"}
},
parseStat: hgParseStat,
fetch: []string{"hg", "pull", "-f"},
latest: "tip",
descendsFrom: func(rev, tag string) []string {
return []string{"hg", "log", "--rev=ancestors(" + rev + ") and " + tag}
return []string{"hg", "log", "-r", "ancestors(" + rev + ") and " + tag}
},
recentTags: func(rev string) []string {
return []string{"hg", "log", "--rev=ancestors(" + rev + ") and tag()", "--template", "{tags}\n"}
return []string{"hg", "log", "-r", "ancestors(" + rev + ") and tag()", "--template", "{tags}\n"}
},
readFile: func(rev, file, remote string) []string {
return []string{"hg", "cat", fmt.Sprintf("--rev=%s", rev), "--", file}
return []string{"hg", "cat", "-r", rev, file}
},
readZip: func(rev, subdir, remote, target string) []string {
pattern := []string{}
if subdir != "" {
pattern = []string{fmt.Sprintf("--include=%s", subdir+"/**")}
pattern = []string{"-I", subdir + "/**"}
}
return str.StringList("hg", "archive", "-t", "zip", "--no-decode", fmt.Sprintf("--rev=%s", rev), "--prefix=prefix/", pattern, "--", target)
return str.StringList("hg", "archive", "-t", "zip", "--no-decode", "-r", rev, "--prefix=prefix/", pattern, "--", target)
},
},
@@ -277,19 +275,19 @@ var vcsCmds = map[string]*vcsCmd{
tagRE: re(`(?m)^\S+`),
badLocalRevRE: re(`^revno:-`),
statLocal: func(rev, remote string) []string {
return []string{"bzr", "log", "-l1", "--long", "--show-ids", fmt.Sprintf("--revision=%s", rev)}
return []string{"bzr", "log", "-l1", "--long", "--show-ids", "-r", rev}
},
parseStat: bzrParseStat,
latest: "revno:-1",
readFile: func(rev, file, remote string) []string {
return []string{"bzr", "cat", fmt.Sprintf("--revision=%s", rev), "--", file}
return []string{"bzr", "cat", "-r", rev, file}
},
readZip: func(rev, subdir, remote, target string) []string {
extra := []string{}
if subdir != "" {
extra = []string{"./" + subdir}
}
return str.StringList("bzr", "export", "--format=zip", fmt.Sprintf("--revision=%s", rev), "--root=prefix/", "--", target, extra)
return str.StringList("bzr", "export", "--format=zip", "-r", rev, "--root=prefix/", "--", target, extra)
},
},
@@ -304,17 +302,17 @@ var vcsCmds = map[string]*vcsCmd{
},
tagRE: re(`XXXTODO`),
statLocal: func(rev, remote string) []string {
return []string{"fossil", "info", "-R", ".fossil", "--", rev}
return []string{"fossil", "info", "-R", ".fossil", rev}
},
parseStat: fossilParseStat,
latest: "trunk",
readFile: func(rev, file, remote string) []string {
return []string{"fossil", "cat", "-R", ".fossil", fmt.Sprintf("-r=%s", rev), "--", file}
return []string{"fossil", "cat", "-R", ".fossil", "-r", rev, file}
},
readZip: func(rev, subdir, remote, target string) []string {
extra := []string{}
if subdir != "" && !strings.ContainsAny(subdir, "*?[],") {
extra = []string{fmt.Sprintf("--include=%s", subdir)}
extra = []string{"--include", subdir}
}
// Note that vcsRepo.ReadZip below rewrites this command
// to run in a different directory, to work around a fossil bug.
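
The hg, bzr, and fossil hunks above alternate between a glued flag form (`fmt.Sprintf("--rev=%s", rev)` followed by `"--"`) and a split form (`"-r", rev`). A small sketch of the two argument vectors for the `hg cat` case, assuming a hypothetical helper `hgCatArgs` that only builds the slice and never invokes hg:

```go
package main

import "fmt"

// hgCatArgs is a hypothetical helper for illustration only.
// It mirrors the two readFile variants that appear in the hunk above.
func hgCatArgs(rev, file string, glued bool) []string {
	if glued {
		// The revision travels inside the same argv element as its flag,
		// and "--" marks everything after it as a file name.
		return []string{"hg", "cat", fmt.Sprintf("--rev=%s", rev), "--", file}
	}
	// The revision and file are separate argv elements with no separator.
	return []string{"hg", "cat", "-r", rev, file}
}

func main() {
	fmt.Printf("%q\n", hgCatArgs("-x", "-f", true))
	fmt.Printf("%q\n", hgCatArgs("-x", "-f", false))
}
```

The glued form does not depend on how the tool's parser treats a following element that starts with `-`; whether a given tool misreads the split form is outside what this sketch shows.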


@@ -140,10 +140,7 @@ func errSet(err error) pathSet { return pathSet{err: err} }
// newQuery returns a new query parsed from the raw argument,
// which must be either path or path@version.
func newQuery(loaderstate *modload.State, raw string) (*query, error) {
pattern, rawVers, found, err := modload.ParsePathVersion(raw)
if err != nil {
return nil, err
}
pattern, rawVers, found := strings.Cut(raw, "@")
if found && (strings.Contains(rawVers, "@") || rawVers == "") {
return nil, fmt.Errorf("invalid module version syntax %q", raw)
}


@@ -12,6 +12,7 @@ import (
"io/fs"
"os"
"path/filepath"
"strings"
"cmd/go/internal/base"
"cmd/go/internal/cfg"
@@ -87,16 +88,7 @@ func ModuleInfo(loaderstate *State, ctx context.Context, path string) *modinfo.M
return nil
}
path, vers, found, err := ParsePathVersion(path)
if err != nil {
return &modinfo.ModulePublic{
Path: path,
Error: &modinfo.ModuleError{
Err: err.Error(),
},
}
}
if found {
if path, vers, found := strings.Cut(path, "@"); found {
m := module.Version{Path: path, Version: vers}
return moduleInfo(loaderstate, ctx, nil, m, 0, nil)
}


@@ -150,11 +150,7 @@ func listModules(loaderstate *State, ctx context.Context, rs *Requirements, args
}
continue
}
path, vers, found, err := ParsePathVersion(arg)
if err != nil {
base.Fatalf("go: %v", err)
}
if found {
if path, vers, found := strings.Cut(arg, "@"); found {
if vers == "upgrade" || vers == "patch" {
if _, ok := rs.rootSelected(loaderstate, path); !ok || rs.pruning == unpruned {
needFullGraph = true
@@ -180,11 +176,7 @@ func listModules(loaderstate *State, ctx context.Context, rs *Requirements, args
matchedModule := map[module.Version]bool{}
for _, arg := range args {
path, vers, found, err := ParsePathVersion(arg)
if err != nil {
base.Fatalf("go: %v", err)
}
if found {
if path, vers, found := strings.Cut(arg, "@"); found {
var current string
if mg == nil {
current, _ = rs.rootSelected(loaderstate, path)
@@ -325,21 +317,3 @@ func modinfoError(path, vers string, err error) *modinfo.ModuleError {
return &modinfo.ModuleError{Err: err.Error()}
}
// ParsePathVersion parses arg expecting arg to be path@version. If there is no
// '@' in arg, found is false, vers is "", and path is arg. This mirrors the
// typical usage of strings.Cut. ParsePathVersion is meant to be a general
// replacement for strings.Cut in module version parsing. If the version is
// invalid, an error is returned. The version is considered invalid if it is
// prefixed with '-' or '/', which can cause security problems when constructing
// commands to execute that use the version.
func ParsePathVersion(arg string) (path, vers string, found bool, err error) {
path, vers, found = strings.Cut(arg, "@")
if !found {
return arg, "", false, nil
}
if len(vers) > 0 && (vers[0] == '-' || vers[0] == '/') {
return "", "", false, fmt.Errorf("invalid module version %q", vers)
}
return path, vers, true, nil
}
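
The behaviour described in the ParsePathVersion comment above can be exercised in isolation. The following is a sketch that copies the function body from this hunk into a standalone program under the lowercase name `parsePathVersion`; the sample arguments are made up for illustration:

```go
package main

import (
	"fmt"
	"strings"
)

// parsePathVersion is a standalone copy of the ParsePathVersion logic shown
// in the hunk above, renamed so it compiles outside cmd/go.
func parsePathVersion(arg string) (path, vers string, found bool, err error) {
	path, vers, found = strings.Cut(arg, "@")
	if !found {
		return arg, "", false, nil
	}
	// Reject versions that could be mistaken for a flag or an absolute path
	// when they are later placed on a command line.
	if len(vers) > 0 && (vers[0] == '-' || vers[0] == '/') {
		return "", "", false, fmt.Errorf("invalid module version %q", vers)
	}
	return path, vers, true, nil
}

func main() {
	// Hypothetical inputs: no version, a normal version, an option-like version.
	for _, arg := range []string{
		"example.com/m",
		"example.com/m@v1.2.3",
		"example.com/m@--upload-pack=x",
	} {
		path, vers, found, err := parsePathVersion(arg)
		fmt.Printf("%q: path=%q vers=%q found=%v err=%v\n", arg, path, vers, found, err)
	}
}
```

Callers that switch to plain `strings.Cut` get the same split but lose the check on the leading character of the version.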


@@ -667,10 +667,7 @@ func maybeSwitchForGoInstallVersion(loaderstate *modload.State, minVers string)
if !strings.Contains(pkgArg, "@") || build.IsLocalImport(pkgArg) || filepath.IsAbs(pkgArg) {
return
}
path, version, _, err := modload.ParsePathVersion(pkgArg)
if err != nil {
base.Fatalf("go: %v", err)
}
path, version, _ := strings.Cut(pkgArg, "@")
if path == "" || version == "" || gover.IsToolchain(path) {
return
}
@@ -705,7 +702,7 @@ func maybeSwitchForGoInstallVersion(loaderstate *modload.State, minVers string)
allowed = nil
}
noneSelected := func(path string) (version string) { return "none" }
_, err = modload.QueryPackages(loaderstate, ctx, path, version, noneSelected, allowed)
_, err := modload.QueryPackages(loaderstate, ctx, path, version, noneSelected, allowed)
if errors.Is(err, gover.ErrTooNew) {
// Run early switch, same one go install or go run would eventually do,
// if it understood all the command-line flags.


@@ -17,6 +17,7 @@ import (
"os"
"os/exec"
"path/filepath"
"regexp"
"strconv"
"strings"
"sync"
@@ -40,10 +41,20 @@ type Cmd struct {
Env []string // any environment values to set/override
RootNames []rootName // filename and mode indicating the root of a checkout directory
CreateCmd []string // commands to download a fresh copy of a repository
DownloadCmd []string // commands to download updates into an existing repository
TagCmd []tagCmd // commands to list tags
TagLookupCmd []tagCmd // commands to lookup tags before running tagSyncCmd
TagSyncCmd []string // commands to sync to specific tag
TagSyncDefault []string // commands to sync to default tag
Scheme []string
PingCmd string
Status func(v *Cmd, rootDir string) (Status, error)
RemoteRepo func(v *Cmd, rootDir string) (remoteRepo string, err error)
ResolveRepo func(v *Cmd, rootDir, remoteRepo string) (realRepo string, err error)
Status func(v *Cmd, rootDir string) (Status, error)
}
// Status is the current state of a local repository.
@@ -146,16 +157,40 @@ var vcsHg = &Cmd{
Name: "Mercurial",
Cmd: "hg",
// HGPLAIN=+strictflags turns off additional output that a user may have
// enabled via config options or certain extensions.
Env: []string{"HGPLAIN=+strictflags"},
// HGPLAIN=1 turns off additional output that a user may have enabled via
// config options or certain extensions.
Env: []string{"HGPLAIN=1"},
RootNames: []rootName{
{filename: ".hg", isDir: true},
},
Scheme: []string{"https", "http", "ssh"},
PingCmd: "identify -- {scheme}://{repo}",
Status: hgStatus,
CreateCmd: []string{"clone -U -- {repo} {dir}"},
DownloadCmd: []string{"pull"},
// We allow both tag and branch names as 'tags'
// for selecting a version. This lets people have
// a go.release.r60 branch and a go1 branch
// and make changes in both, without constantly
// editing .hgtags.
TagCmd: []tagCmd{
{"tags", `^(\S+)`},
{"branches", `^(\S+)`},
},
TagSyncCmd: []string{"update -r {tag}"},
TagSyncDefault: []string{"update default"},
Scheme: []string{"https", "http", "ssh"},
PingCmd: "identify -- {scheme}://{repo}",
RemoteRepo: hgRemoteRepo,
Status: hgStatus,
}
func hgRemoteRepo(vcsHg *Cmd, rootDir string) (remoteRepo string, err error) {
out, err := vcsHg.runOutput(rootDir, "paths default")
if err != nil {
return "", err
}
return strings.TrimSpace(string(out)), nil
}
func hgStatus(vcsHg *Cmd, rootDir string) (Status, error) {
@@ -218,6 +253,25 @@ var vcsGit = &Cmd{
{filename: ".git", isDir: true},
},
CreateCmd: []string{"clone -- {repo} {dir}", "-go-internal-cd {dir} submodule update --init --recursive"},
DownloadCmd: []string{"pull --ff-only", "submodule update --init --recursive"},
TagCmd: []tagCmd{
// tags/xxx matches a git tag named xxx
// origin/xxx matches a git branch named xxx on the default remote repository
{"show-ref", `(?:tags|origin)/(\S+)$`},
},
TagLookupCmd: []tagCmd{
{"show-ref tags/{tag} origin/{tag}", `((?:tags|origin)/\S+)$`},
},
TagSyncCmd: []string{"checkout {tag}", "submodule update --init --recursive"},
// both createCmd and downloadCmd update the working dir.
// No need to do more here. We used to 'checkout master'
// but that doesn't work if the default branch is not named master.
// DO NOT add 'checkout master' here.
// See golang.org/issue/9032.
TagSyncDefault: []string{"submodule update --init --recursive"},
Scheme: []string{"git", "https", "http", "git+ssh", "ssh"},
// Leave out the '--' separator in the ls-remote command: git 2.7.4 does not
@@ -226,7 +280,54 @@ var vcsGit = &Cmd{
// See golang.org/issue/33836.
PingCmd: "ls-remote {scheme}://{repo}",
Status: gitStatus,
RemoteRepo: gitRemoteRepo,
Status: gitStatus,
}
// scpSyntaxRe matches the SCP-like addresses used by Git to access
// repositories by SSH.
var scpSyntaxRe = lazyregexp.New(`^(\w+)@([\w.-]+):(.*)$`)
func gitRemoteRepo(vcsGit *Cmd, rootDir string) (remoteRepo string, err error) {
const cmd = "config remote.origin.url"
outb, err := vcsGit.run1(rootDir, cmd, nil, false)
if err != nil {
// if it doesn't output any message, it means the config argument is correct,
// but the config value itself doesn't exist
if outb != nil && len(outb) == 0 {
return "", errors.New("remote origin not found")
}
return "", err
}
out := strings.TrimSpace(string(outb))
var repoURL *urlpkg.URL
if m := scpSyntaxRe.FindStringSubmatch(out); m != nil {
// Match SCP-like syntax and convert it to a URL.
// Eg, "git@github.com:user/repo" becomes
// "ssh://git@github.com/user/repo".
repoURL = &urlpkg.URL{
Scheme: "ssh",
User: urlpkg.User(m[1]),
Host: m[2],
Path: m[3],
}
} else {
repoURL, err = urlpkg.Parse(out)
if err != nil {
return "", err
}
}
// Iterate over insecure schemes too, because this function simply
// reports the state of the repo. If we can't see insecure schemes then
// we can't report the actual repo URL.
for _, s := range vcsGit.Scheme {
if repoURL.Scheme == s {
return repoURL.String(), nil
}
}
return "", errors.New("unable to parse output of git " + cmd)
}
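
The SCP-like address conversion in gitRemoteRepo can be tried in isolation. The sketch below is an assumption-laden extract, not the cmd/go code: it uses the standard `regexp` package in place of the internal `lazyregexp`, and `scpToURL` is a hypothetical helper name introduced only for this example.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
)

// scpSyntaxRe uses the same pattern as the diff. Plain regexp stands in for
// the internal lazyregexp package, which is not importable outside the Go tree.
var scpSyntaxRe = regexp.MustCompile(`^(\w+)@([\w.-]+):(.*)$`)

// scpToURL (hypothetical helper) converts "user@host:path" into an ssh:// URL
// string. It reports false if addr is not in SCP-like form.
func scpToURL(addr string) (string, bool) {
	m := scpSyntaxRe.FindStringSubmatch(addr)
	if m == nil {
		return "", false
	}
	u := &url.URL{
		Scheme: "ssh",
		User:   url.User(m[1]),
		Host:   m[2],
		Path:   m[3],
	}
	// URL.String inserts the '/' between host and a relative path.
	return u.String(), true
}

func main() {
	for _, addr := range []string{
		"git@github.com:user/repo",
		"https://github.com/user/repo", // already a URL, so no match
	} {
		s, ok := scpToURL(addr)
		fmt.Printf("%q -> %q (scp-like: %v)\n", addr, s, ok)
	}
}
```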
func gitStatus(vcsGit *Cmd, rootDir string) (Status, error) {
@@ -266,9 +367,62 @@ var vcsBzr = &Cmd{
{filename: ".bzr", isDir: true},
},
Scheme: []string{"https", "http", "bzr", "bzr+ssh"},
PingCmd: "info -- {scheme}://{repo}",
Status: bzrStatus,
CreateCmd: []string{"branch -- {repo} {dir}"},
// Without --overwrite bzr will not pull tags that changed.
// Replace by --overwrite-tags after http://pad.lv/681792 goes in.
DownloadCmd: []string{"pull --overwrite"},
TagCmd: []tagCmd{{"tags", `^(\S+)`}},
TagSyncCmd: []string{"update -r {tag}"},
TagSyncDefault: []string{"update -r revno:-1"},
Scheme: []string{"https", "http", "bzr", "bzr+ssh"},
PingCmd: "info -- {scheme}://{repo}",
RemoteRepo: bzrRemoteRepo,
ResolveRepo: bzrResolveRepo,
Status: bzrStatus,
}
func bzrRemoteRepo(vcsBzr *Cmd, rootDir string) (remoteRepo string, err error) {
outb, err := vcsBzr.runOutput(rootDir, "config parent_location")
if err != nil {
return "", err
}
return strings.TrimSpace(string(outb)), nil
}
func bzrResolveRepo(vcsBzr *Cmd, rootDir, remoteRepo string) (realRepo string, err error) {
outb, err := vcsBzr.runOutput(rootDir, "info "+remoteRepo)
if err != nil {
return "", err
}
out := string(outb)
// Expect:
// ...
// (branch root|repository branch): <URL>
// ...
found := false
for _, prefix := range []string{"\n branch root: ", "\n repository branch: "} {
i := strings.Index(out, prefix)
if i >= 0 {
out = out[i+len(prefix):]
found = true
break
}
}
if !found {
return "", fmt.Errorf("unable to parse output of bzr info")
}
i := strings.Index(out, "\n")
if i < 0 {
return "", fmt.Errorf("unable to parse output of bzr info")
}
out = out[:i]
return strings.TrimSpace(out), nil
}
func bzrStatus(vcsBzr *Cmd, rootDir string) (Status, error) {
@@ -336,12 +490,46 @@ var vcsSvn = &Cmd{
{filename: ".svn", isDir: true},
},
CreateCmd: []string{"checkout -- {repo} {dir}"},
DownloadCmd: []string{"update"},
// There is no tag command in subversion.
// The branch information is all in the path names.
Scheme: []string{"https", "http", "svn", "svn+ssh"},
PingCmd: "info -- {scheme}://{repo}",
Status: svnStatus,
Scheme: []string{"https", "http", "svn", "svn+ssh"},
PingCmd: "info -- {scheme}://{repo}",
RemoteRepo: svnRemoteRepo,
Status: svnStatus,
}
func svnRemoteRepo(vcsSvn *Cmd, rootDir string) (remoteRepo string, err error) {
outb, err := vcsSvn.runOutput(rootDir, "info")
if err != nil {
return "", err
}
out := string(outb)
// Expect:
//
// ...
// URL: <URL>
// ...
//
// Note that we're not using the Repository Root line,
// because svn allows checking out subtrees.
// The URL will be the URL of the subtree (what we used with 'svn co')
// while the Repository Root may be a much higher parent.
i := strings.Index(out, "\nURL: ")
if i < 0 {
return "", fmt.Errorf("unable to parse output of svn info")
}
out = out[i+len("\nURL: "):]
i = strings.Index(out, "\n")
if i < 0 {
return "", fmt.Errorf("unable to parse output of svn info")
}
out = out[:i]
return strings.TrimSpace(out), nil
}
func svnStatus(vcsSvn *Cmd, rootDir string) (Status, error) {
@@ -386,8 +574,24 @@ var vcsFossil = &Cmd{
{filename: "_FOSSIL_", isDir: false},
},
Scheme: []string{"https", "http"},
Status: fossilStatus,
CreateCmd: []string{"-go-internal-mkdir {dir} clone -- {repo} " + filepath.Join("{dir}", fossilRepoName), "-go-internal-cd {dir} open .fossil"},
DownloadCmd: []string{"up"},
TagCmd: []tagCmd{{"tag ls", `(.*)`}},
TagSyncCmd: []string{"up tag:{tag}"},
TagSyncDefault: []string{"up trunk"},
Scheme: []string{"https", "http"},
RemoteRepo: fossilRemoteRepo,
Status: fossilStatus,
}
func fossilRemoteRepo(vcsFossil *Cmd, rootDir string) (remoteRepo string, err error) {
out, err := vcsFossil.runOutput(rootDir, "remote-url")
if err != nil {
return "", err
}
return strings.TrimSpace(string(out)), nil
}
var errFossilInfo = errors.New("unable to parse output of fossil info")
@@ -488,7 +692,7 @@ func (v *Cmd) run1(dir string, cmdline string, keyval []string, verbose bool) ([
args[i] = expand(m, arg)
}
if len(args) >= 2 && args[0] == "--go-internal-mkdir" {
if len(args) >= 2 && args[0] == "-go-internal-mkdir" {
var err error
if filepath.IsAbs(args[1]) {
err = os.Mkdir(args[1], fs.ModePerm)
@@ -501,7 +705,7 @@ func (v *Cmd) run1(dir string, cmdline string, keyval []string, verbose bool) ([
args = args[2:]
}
if len(args) >= 2 && args[0] == "--go-internal-cd" {
if len(args) >= 2 && args[0] == "-go-internal-cd" {
if filepath.IsAbs(args[1]) {
dir = args[1]
} else {
@@ -562,6 +766,99 @@ func (v *Cmd) Ping(scheme, repo string) error {
return v.runVerboseOnly(dir, v.PingCmd, "scheme", scheme, "repo", repo)
}
// Create creates a new copy of repo in dir.
// The parent of dir must exist; dir must not.
func (v *Cmd) Create(dir, repo string) error {
release, err := base.AcquireNet()
if err != nil {
return err
}
defer release()
for _, cmd := range v.CreateCmd {
if err := v.run(filepath.Dir(dir), cmd, "dir", dir, "repo", repo); err != nil {
return err
}
}
return nil
}
// Download downloads any new changes for the repo in dir.
func (v *Cmd) Download(dir string) error {
release, err := base.AcquireNet()
if err != nil {
return err
}
defer release()
for _, cmd := range v.DownloadCmd {
if err := v.run(dir, cmd); err != nil {
return err
}
}
return nil
}
// Tags returns the list of available tags for the repo in dir.
func (v *Cmd) Tags(dir string) ([]string, error) {
var tags []string
for _, tc := range v.TagCmd {
out, err := v.runOutput(dir, tc.cmd)
if err != nil {
return nil, err
}
re := regexp.MustCompile(`(?m-s)` + tc.pattern)
for _, m := range re.FindAllStringSubmatch(string(out), -1) {
tags = append(tags, m[1])
}
}
return tags, nil
}
// TagSync syncs the repo in dir to the named tag,
// which either is a tag returned by tags or is v.tagDefault.
func (v *Cmd) TagSync(dir, tag string) error {
if v.TagSyncCmd == nil {
return nil
}
if tag != "" {
for _, tc := range v.TagLookupCmd {
out, err := v.runOutput(dir, tc.cmd, "tag", tag)
if err != nil {
return err
}
re := regexp.MustCompile(`(?m-s)` + tc.pattern)
m := re.FindStringSubmatch(string(out))
if len(m) > 1 {
tag = m[1]
break
}
}
}
release, err := base.AcquireNet()
if err != nil {
return err
}
defer release()
if tag == "" && v.TagSyncDefault != nil {
for _, cmd := range v.TagSyncDefault {
if err := v.run(dir, cmd); err != nil {
return err
}
}
return nil
}
for _, cmd := range v.TagSyncCmd {
if err := v.run(dir, cmd, "tag", tag); err != nil {
return err
}
}
return nil
}
// A vcsPath describes how to convert an import path into a
// version control system and repository name.
type vcsPath struct {
@@ -1088,10 +1385,6 @@ func repoRootForImportDynamic(importPath string, mod ModuleMode, security web.Se
}
}
if err := validateRepoSubDir(mmi.SubDir); err != nil {
return nil, fmt.Errorf("%s: invalid subdirectory %q: %v", resp.URL, mmi.SubDir, err)
}
if err := validateRepoRoot(mmi.RepoRoot); err != nil {
return nil, fmt.Errorf("%s: invalid repo root %q: %v", resp.URL, mmi.RepoRoot, err)
}
@@ -1123,22 +1416,6 @@ func repoRootForImportDynamic(importPath string, mod ModuleMode, security web.Se
return rr, nil
}
// validateRepoSubDir returns an error if subdir is not a valid subdirectory path.
// We consider a subdirectory path to be valid as long as it doesn't have a leading
// slash (/) or hyphen (-).
func validateRepoSubDir(subdir string) error {
if subdir == "" {
return nil
}
if subdir[0] == '/' {
return errors.New("leading slash")
}
if subdir[0] == '-' {
return errors.New("leading hyphen")
}
return nil
}
// validateRepoRoot returns an error if repoRoot does not seem to be
// a valid URL with scheme.
func validateRepoRoot(repoRoot string) error {


@@ -507,42 +507,6 @@ func TestValidateRepoRoot(t *testing.T) {
}
}
func TestValidateRepoSubDir(t *testing.T) {
tests := []struct {
subdir string
ok bool
}{
{
subdir: "",
ok: true,
},
{
subdir: "sub/dir",
ok: true,
},
{
subdir: "/leading/slash",
ok: false,
},
{
subdir: "-leading/hyphen",
ok: false,
},
}
for _, test := range tests {
err := validateRepoSubDir(test.subdir)
ok := err == nil
if ok != test.ok {
want := "error"
if test.ok {
want = "nil"
}
t.Errorf("validateRepoSubDir(%q) = %q, want %s", test.subdir, err, want)
}
}
}
var govcsTests = []struct {
govcs string
path string


@@ -248,11 +248,6 @@ func (b *Builder) Do(ctx context.Context, root *Action) {
wg.Wait()
if tokens != totalTokens || concurrentProcesses != 0 {
base.Fatalf("internal error: tokens not restored at end of build: tokens: %d, totalTokens: %d, concurrentProcesses: %d",
tokens, totalTokens, concurrentProcesses)
}
// Write action graph again, this time with timing information.
writeActionGraph()
}
@@ -1788,14 +1783,6 @@ func (b *Builder) getPkgConfigFlags(a *Action, p *load.Package) (cflags, ldflags
return nil, nil, fmt.Errorf("invalid pkg-config package name: %s", pkg)
}
}
// Running 'pkg-config' can cause execution of
// arbitrary code using flags that are not in
// the safelist.
if err := checkCompilerFlags("CFLAGS", "pkg-config --cflags", pcflags); err != nil {
return nil, nil, err
}
var out []byte
out, err = sh.runOut(p.Dir, nil, b.PkgconfigCmd(), "--cflags", pcflags, "--", pkgs)
if err != nil {


@@ -217,17 +217,16 @@ func compilerConcurrency() (int, func()) {
concurrentProcesses++
// Set aside tokens so that we don't run out if we were running cfg.BuildP concurrent compiles.
// We'll set aside one token for each of the action goroutines that aren't currently running a compile.
setAside := (cfg.BuildP - concurrentProcesses) * minTokens
setAside := cfg.BuildP - concurrentProcesses
availableTokens := tokens - setAside
// Grab half the remaining tokens: but with a floor of at least minTokens token, and
// Grab half the remaining tokens: but with a floor of at least 1 token, and
// a ceiling of the max backend concurrency.
c := max(min(availableTokens/2, maxCompilerConcurrency), minTokens)
c := max(min(availableTokens/2, maxCompilerConcurrency), 1)
tokens -= c
// Successfully grabbed the tokens.
return c, func() {
tokensMu.Lock()
defer tokensMu.Unlock()
concurrentProcesses--
tokens += c
}
}
@@ -236,22 +235,17 @@ var maxCompilerConcurrency = runtime.GOMAXPROCS(0) // max value we will use for
var (
tokensMu sync.Mutex
totalTokens int // total number of tokens: this is used for checking that we get them all back in the end
tokens int // number of available tokens
concurrentProcesses int // number of currently running compiles
minTokens int // minimum number of tokens to give out
)
// initCompilerConcurrencyPool sets the number of tokens in the pool. It needs
// to be run after init, so that it can use the value of cfg.BuildP.
func initCompilerConcurrencyPool() {
// Size the pool to allow 2*maxCompilerConcurrency extra tokens to
// be distributed amongst the compile actions in addition to the minimum
// of min(4,GOMAXPROCS) tokens for each of the potentially cfg.BuildP
// concurrently running compile actions.
minTokens = min(4, maxCompilerConcurrency)
tokens = 2*maxCompilerConcurrency + minTokens*cfg.BuildP
totalTokens = tokens
// Size the pool so that the worst case total number of compiles is not more
// than what it was when we capped the concurrency to 4.
oldConcurrencyCap := min(4, maxCompilerConcurrency)
tokens = oldConcurrencyCap * cfg.BuildP
}
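
The token arithmetic in compilerConcurrency is easier to follow with concrete numbers. The sketch below is a simplified model under stated assumptions: `grab` is a hypothetical function that takes the pool state as parameters instead of reading the package-level variables, and it omits the mutex. The `floor` parameter covers both variants shown in the hunk: pass the minTokens value for one, or 1 for the other. It needs Go 1.21 or later for the built-in min and max.

```go
package main

import "fmt"

// grab (hypothetical) models how many tokens one compile action receives.
//   tokens  - tokens currently available in the pool
//   buildP  - number of action goroutines (cfg.BuildP in the diff)
//   running - compiles currently running, including the caller
//   maxConc - ceiling on backend concurrency (maxCompilerConcurrency)
//   floor   - minimum grant per compile (minTokens, or 1)
func grab(tokens, buildP, running, maxConc, floor int) int {
	// Reserve the floor for every action goroutine that is not compiling yet.
	setAside := (buildP - running) * floor
	available := tokens - setAside
	// Take half of what is left, clamped to [floor, maxConc].
	return max(min(available/2, maxConc), floor)
}

func main() {
	const maxConc, buildP = 8, 8

	// Variant with minTokens = min(4, maxConc) and a pool of
	// 2*maxConc + minTokens*buildP tokens.
	minTokens := min(4, maxConc)
	poolA := 2*maxConc + minTokens*buildP
	fmt.Println("pool:", poolA, "first grant:", grab(poolA, buildP, 1, maxConc, minTokens))

	// Variant with a floor of 1 and a pool of min(4, maxConc)*buildP tokens.
	poolB := min(4, maxConc) * buildP
	fmt.Println("pool:", poolB, "first grant:", grab(poolB, buildP, 1, maxConc, 1))
}
```

With these inputs the first compile receives the ceiling of 8 tokens in both variants; the two sizing schemes only diverge once many compiles are running and the pool is nearly drained.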
// trimpath returns the -trimpath argument to use


@@ -129,7 +129,6 @@ var validCompilerFlags = []*lazyregexp.Regexp{
re(`-pedantic(-errors)?`),
re(`-pipe`),
re(`-pthread`),
re(`--static`),
re(`-?-std=([^@\-].*)`),
re(`-?-stdlib=([^@\-].*)`),
re(`--sysroot=([^@\-].*)`),


@@ -279,10 +279,7 @@ func allowedVersionArg(arg string) bool {
// parsePathVersionOptional parses path[@version], using adj to
// describe any errors.
func parsePathVersionOptional(adj, arg string, allowDirPath bool) (path, version string, err error) {
before, after, found, err := modload.ParsePathVersion(arg)
if err != nil {
return "", "", err
}
before, after, found := strings.Cut(arg, "@")
if !found {
path = arg
} else {


@@ -6,7 +6,7 @@ env GIT_COMMITTER_NAME=$GIT_AUTHOR_NAME
env GIT_COMMITTER_EMAIL=$GIT_AUTHOR_EMAIL
git init
git checkout -b master
git branch -M master
at 2018-07-17T12:41:39-04:00
cp x_cf92c7b.go x.go


@@ -80,8 +80,6 @@ or b.ResetTimer within the same function will also be removed.
Caveats: The b.Loop() method is designed to prevent the compiler from
optimizing away the benchmark loop, which can occasionally result in
slower execution due to increased allocations in some specific cases.
Since its fix may change the performance of nanosecond-scale benchmarks,
bloop is disabled by default in the `go fix` analyzer suite; see golang/go#74967.
# Analyzer any


@@ -231,28 +231,9 @@ func mapsloop(pass *analysis.Pass) (any, error) {
// Have: for k, v := range x { lhs = rhs }
assign := rng.Body.List[0].(*ast.AssignStmt)
// usesKV reports whether e references vars k or v.
usesKV := func(e ast.Expr) bool {
k := info.Defs[rng.Key.(*ast.Ident)]
v := info.Defs[rng.Value.(*ast.Ident)]
for n := range ast.Preorder(e) {
if id, ok := n.(*ast.Ident); ok {
obj := info.Uses[id]
if obj != nil && // don't rely on k, v being non-nil
(obj == k || obj == v) {
return true
}
}
}
return false
}
if index, ok := assign.Lhs[0].(*ast.IndexExpr); ok &&
len(assign.Lhs) == 1 &&
astutil.EqualSyntax(rng.Key, index.Index) &&
astutil.EqualSyntax(rng.Value, assign.Rhs[0]) &&
!usesKV(index.X) { // reject (e.g.) f(k, v)[k] = v
astutil.EqualSyntax(rng.Value, assign.Rhs[0]) {
if tmap, ok := typeparams.CoreType(info.TypeOf(index.X)).(*types.Map); ok &&
types.Identical(info.TypeOf(index), info.TypeOf(rng.Value)) && // m[k], v
types.Identical(tmap.Key(), info.TypeOf(rng.Key)) {


@@ -34,7 +34,7 @@ var doc string
var Suite = []*analysis.Analyzer{
AnyAnalyzer,
// AppendClippedAnalyzer, // not nil-preserving!
// BLoopAnalyzer, // may skew benchmark results, see golang/go#74967
BLoopAnalyzer,
FmtAppendfAnalyzer,
ForVarAnalyzer,
MapsLoopAnalyzer,


@@ -73,7 +73,7 @@ golang.org/x/text/internal/tag
golang.org/x/text/language
golang.org/x/text/transform
golang.org/x/text/unicode/norm
# golang.org/x/tools v0.39.1-0.20251230210517-d44be789a05c
# golang.org/x/tools v0.39.1-0.20251205000126-062ef7b6ced2
## explicit; go 1.24.0
golang.org/x/tools/cmd/bisect
golang.org/x/tools/cover


@@ -980,10 +980,6 @@ const maxSessionTicketLifetime = 7 * 24 * time.Hour
// Clone returns a shallow clone of c or nil if c is nil. It is safe to clone a [Config] that is
// being used concurrently by a TLS client or server.
//
// If Config.SessionTicketKey is unpopulated, and Config.SetSessionTicketKeys has not been
// called, the clone will not share the same auto-rotated session ticket keys as the original
// Config in order to prevent sessions from being resumed across Configs.
func (c *Config) Clone() *Config {
if c == nil {
return nil
@@ -1024,8 +1020,7 @@ func (c *Config) Clone() *Config {
EncryptedClientHelloRejectionVerify: c.EncryptedClientHelloRejectionVerify,
EncryptedClientHelloKeys: c.EncryptedClientHelloKeys,
sessionTicketKeys: c.sessionTicketKeys,
// We explicitly do not copy autoSessionTicketKeys, so that Configs do
// not share the same auto-rotated keys.
autoSessionTicketKeys: c.autoSessionTicketKeys,
}
}


@@ -520,13 +520,8 @@ func (hs *serverHandshakeState) checkForResumption() error {
if sessionHasClientCerts && c.config.ClientAuth == NoClientCert {
return nil
}
if sessionHasClientCerts {
now := c.config.time()
for _, c := range sessionState.peerCertificates {
if now.After(c.NotAfter) {
return nil
}
}
if sessionHasClientCerts && c.config.time().After(sessionState.peerCertificates[0].NotAfter) {
return nil
}
if sessionHasClientCerts && c.config.ClientAuth >= VerifyClientCertIfGiven &&
len(sessionState.verifiedChains) == 0 {


@@ -13,7 +13,6 @@ import (
"crypto/rand"
"crypto/tls/internal/fips140tls"
"crypto/x509"
"crypto/x509/pkix"
"encoding/pem"
"errors"
"fmt"
@@ -2154,103 +2153,3 @@ func TestHandshakeContextHierarchy(t *testing.T) {
t.Errorf("Unexpected client error: %v", err)
}
}
func TestHandshakeChainExpiryResumptionTLS12(t *testing.T) {
t.Run("TLS1.2", func(t *testing.T) {
testHandshakeChainExpiryResumption(t, VersionTLS12)
})
t.Run("TLS1.3", func(t *testing.T) {
testHandshakeChainExpiryResumption(t, VersionTLS13)
})
}
func testHandshakeChainExpiryResumption(t *testing.T, version uint16) {
now := time.Now()
createChain := func(leafNotAfter, rootNotAfter time.Time) (certDER []byte, root *x509.Certificate) {
tmpl := &x509.Certificate{
Subject: pkix.Name{CommonName: "root"},
NotBefore: rootNotAfter.Add(-time.Hour * 24),
NotAfter: rootNotAfter,
IsCA: true,
BasicConstraintsValid: true,
}
rootDER, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &testECDSAPrivateKey.PublicKey, testECDSAPrivateKey)
if err != nil {
t.Fatalf("CreateCertificate: %v", err)
}
root, err = x509.ParseCertificate(rootDER)
if err != nil {
t.Fatalf("ParseCertificate: %v", err)
}
tmpl = &x509.Certificate{
Subject: pkix.Name{},
DNSNames: []string{"expired-resume.example.com"},
NotBefore: leafNotAfter.Add(-time.Hour * 24),
NotAfter: leafNotAfter,
KeyUsage: x509.KeyUsageDigitalSignature,
}
certDER, err = x509.CreateCertificate(rand.Reader, tmpl, root, &testECDSAPrivateKey.PublicKey, testECDSAPrivateKey)
if err != nil {
t.Fatalf("CreateCertificate: %v", err)
}
return certDER, root
}
initialLeafDER, initialRoot := createChain(now.Add(time.Hour), now.Add(2*time.Hour))
serverConfig := testConfig.Clone()
serverConfig.MaxVersion = version
serverConfig.Certificates = []Certificate{{
Certificate: [][]byte{initialLeafDER},
PrivateKey: testECDSAPrivateKey,
}}
serverConfig.ClientCAs = x509.NewCertPool()
serverConfig.ClientCAs.AddCert(initialRoot)
serverConfig.ClientAuth = RequireAndVerifyClientCert
serverConfig.Time = func() time.Time {
return now
}
clientConfig := testConfig.Clone()
clientConfig.MaxVersion = version
clientConfig.Certificates = []Certificate{{
Certificate: [][]byte{initialLeafDER},
PrivateKey: testECDSAPrivateKey,
}}
clientConfig.RootCAs = x509.NewCertPool()
clientConfig.RootCAs.AddCert(initialRoot)
clientConfig.ServerName = "expired-resume.example.com"
clientConfig.ClientSessionCache = NewLRUClientSessionCache(32)
testResume := func(t *testing.T, sc, cc *Config, expectResume bool) {
t.Helper()
ss, cs, err := testHandshake(t, cc, sc)
if err != nil {
t.Fatalf("handshake: %v", err)
}
if cs.DidResume != expectResume {
t.Fatalf("DidResume = %v; want %v", cs.DidResume, expectResume)
}
if ss.DidResume != expectResume {
t.Fatalf("DidResume = %v; want %v", cs.DidResume, expectResume)
}
}
testResume(t, serverConfig, clientConfig, false)
testResume(t, serverConfig, clientConfig, true)
freshLeafDER, freshRoot := createChain(now.Add(2*time.Hour), now.Add(3*time.Hour))
clientConfig.Certificates = []Certificate{{
Certificate: [][]byte{freshLeafDER},
PrivateKey: testECDSAPrivateKey,
}}
serverConfig.Time = func() time.Time {
return now.Add(1*time.Hour + 30*time.Minute)
}
serverConfig.ClientCAs = x509.NewCertPool()
serverConfig.ClientCAs.AddCert(freshRoot)
testResume(t, serverConfig, clientConfig, false)
}
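
The resumption hunks in this comparison contain two forms of the client-certificate expiry check: one inspects only peerCertificates[0], the other walks every certificate in the stored chain. The sketch below contrasts the two as free functions. `leafExpired`, `anyExpired`, and `sampleChain` are hypothetical names for illustration, and the certificates are bare structs with only NotAfter set, not parsed certificates.

```go
package main

import (
	"crypto/x509"
	"fmt"
	"time"
)

// leafExpired reports whether the first (leaf) certificate has expired.
func leafExpired(now time.Time, chain []*x509.Certificate) bool {
	return len(chain) > 0 && now.After(chain[0].NotAfter)
}

// anyExpired reports whether any certificate in the chain has expired.
func anyExpired(now time.Time, chain []*x509.Certificate) bool {
	for _, c := range chain {
		if now.After(c.NotAfter) {
			return true
		}
	}
	return false
}

// sampleChain returns made-up data: a leaf that is still valid and an
// intermediate that expired an hour before now.
func sampleChain() (now time.Time, chain []*x509.Certificate) {
	now = time.Date(2026, 1, 1, 12, 0, 0, 0, time.UTC)
	chain = []*x509.Certificate{
		{NotAfter: now.Add(time.Hour)},  // leaf
		{NotAfter: now.Add(-time.Hour)}, // intermediate
	}
	return now, chain
}

func main() {
	now, chain := sampleChain()
	fmt.Println("leaf-only check says expired:", leafExpired(now, chain))
	fmt.Println("whole-chain check says expired:", anyExpired(now, chain))
}
```

On this sample the leaf-only check would let the session resume while the whole-chain check would not.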


@@ -314,7 +314,6 @@ func (hs *serverHandshakeStateTLS13) checkForResumption() error {
return nil
}
pskIdentityLoop:
for i, identity := range hs.clientHello.pskIdentities {
if i >= maxClientPSKIdentities {
break
@@ -367,13 +366,8 @@ pskIdentityLoop:
if sessionHasClientCerts && c.config.ClientAuth == NoClientCert {
continue
}
if sessionHasClientCerts {
now := c.config.time()
for _, c := range sessionState.peerCertificates {
if now.After(c.NotAfter) {
continue pskIdentityLoop
}
}
if sessionHasClientCerts && c.config.time().After(sessionState.peerCertificates[0].NotAfter) {
continue
}
if sessionHasClientCerts && c.config.ClientAuth >= VerifyClientCertIfGiven &&
len(sessionState.verifiedChains) == 0 {


@@ -935,8 +935,8 @@ func TestCloneNonFuncFields(t *testing.T) {
}
}
// Set the unexported fields related to session ticket keys, which are copied with Clone().
c1.autoSessionTicketKeys = []ticketKey{c1.ticketKeyFromBytes(c1.SessionTicketKey)}
c1.sessionTicketKeys = []ticketKey{c1.ticketKeyFromBytes(c1.SessionTicketKey)}
// We explicitly don't copy autoSessionTicketKeys in Clone, so don't set it.
c2 := c1.Clone()
if !reflect.DeepEqual(&c1, c2) {
@@ -2461,12 +2461,3 @@ func (s messageOnlySigner) SignMessage(rand io.Reader, msg []byte, opts crypto.S
digest := h.Sum(nil)
return s.Signer.Sign(rand, digest, opts)
}
func TestConfigCloneAutoSessionTicketKeys(t *testing.T) {
orig := &Config{}
orig.ticketKeys(nil)
clone := orig.Clone()
if slices.Equal(orig.autoSessionTicketKeys, clone.autoSessionTicketKeys) {
t.Fatal("autoSessionTicketKeys slice copied in Clone")
}
}


@@ -84,6 +84,7 @@ func ParseGOEXPERIMENT(goos, goarch, goexp string) (*ExperimentFlags, error) {
RegabiWrappers: regabiSupported,
RegabiArgs: regabiSupported,
Dwarf5: dwarf5Supported,
SIMD: goarch == "amd64", // TODO: remove this (default to false) when dev.simd is merged
RandomizedHeapBase64: true,
SizeSpecializedMalloc: true,
GreenTeaGC: true,


@@ -136,12 +136,6 @@ func doinit() {
// e.g. setting the xsavedisable boot option on Windows 10.
X86.HasOSXSAVE = isSet(ecx1, cpuid_OSXSAVE)
// The FMA instruction set extension only has VEX prefixed instructions.
// VEX prefixed instructions require OSXSAVE to be enabled.
// See Intel 64 and IA-32 Architecture Software Developers Manual Volume 2
// Section 2.4 "AVX and SSE Instruction Exception Specification"
X86.HasFMA = isSet(ecx1, cpuid_FMA) && X86.HasOSXSAVE
osSupportsAVX := false
osSupportsAVX512 := false
// For XGETBV, OSXSAVE bit is required and sufficient.
@@ -159,6 +153,14 @@ func doinit() {
X86.HasAVX = isSet(ecx1, cpuid_AVX) && osSupportsAVX
// The FMA instruction set extension requires both the FMA and AVX flags.
//
// Furthermore, the FMA instructions are all VEX prefixed instructions.
// VEX prefixed instructions require OSXSAVE to be enabled.
// See Intel 64 and IA-32 Architecture Software Developers Manual Volume 2
// Section 2.4 "AVX and SSE Instruction Exception Specification"
X86.HasFMA = isSet(ecx1, cpuid_FMA) && X86.HasAVX && X86.HasOSXSAVE
if maxID < 7 {
osInit()
return
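
The dependency spelled out in the HasFMA comment (the FMA flag alone is not enough; AVX and OSXSAVE must hold too) can be modelled as a pure function over the CPUID leaf 1 ECX value. This is a sketch, not the internal/cpu code: `hasFMA` is a hypothetical helper, the OS-support result is passed in rather than obtained via XGETBV, and the bit positions are the documented CPUID.01H:ECX bits (FMA 12, OSXSAVE 27, AVX 28).

```go
package main

import "fmt"

// CPUID leaf 1, ECX feature bits.
const (
	cpuidFMA     = 1 << 12
	cpuidOSXSAVE = 1 << 27
	cpuidAVX     = 1 << 28
)

// hasFMA (hypothetical) reports whether FMA instructions are usable.
// osSupportsAVX stands for the result of the XGETBV check that the real
// code performs; here the caller simply supplies it.
func hasFMA(ecx1 uint32, osSupportsAVX bool) bool {
	hasOSXSAVE := ecx1&cpuidOSXSAVE != 0
	hasAVX := ecx1&cpuidAVX != 0 && osSupportsAVX
	return ecx1&cpuidFMA != 0 && hasAVX && hasOSXSAVE
}

func main() {
	all := uint32(cpuidFMA | cpuidOSXSAVE | cpuidAVX)
	fmt.Println("all flags, OS support:   ", hasFMA(all, true))
	fmt.Println("all flags, no OS support:", hasFMA(all, false))
	fmt.Println("FMA flag without AVX:    ", hasFMA(cpuidFMA|cpuidOSXSAVE, true))
}
```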


@@ -69,7 +69,6 @@ var All = []Info{
{Name: "tlssha1", Package: "crypto/tls", Changed: 25, Old: "1"},
{Name: "tlsunsafeekm", Package: "crypto/tls", Changed: 22, Old: "1"},
{Name: "updatemaxprocs", Package: "runtime", Changed: 25, Old: "0"},
{Name: "urlmaxqueryparams", Package: "net/url", Changed: 24, Old: "0"},
{Name: "urlstrictcolons", Package: "net/url", Changed: 26, Old: "0"},
{Name: "winreadlinkvolume", Package: "os", Changed: 23, Old: "0"},
{Name: "winsymlink", Package: "os", Changed: 23, Old: "0"},


@@ -1,71 +0,0 @@
// Copyright 2026 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package p
import "unsafe"
// Below are the pieces of syntax corresponding to functions which can produce a
// type T without first having a value of type T. Notice that each causes a
// value of type T to be passed to unsafe.Sizeof while T is incomplete.
// literal on type
type T0 /* ERROR "invalid recursive type" */ [unsafe.Sizeof(T0{})]int
// literal on value (not applicable)
// literal on pointer (not applicable)
// call on type
type T1 /* ERROR "invalid recursive type" */ [unsafe.Sizeof(T1(42))]int
// call on value
func f2() T2
type T2 /* ERROR "invalid recursive type" */ [unsafe.Sizeof(f2())]int
// call on pointer (not applicable)
// assert on type
var i3 interface{}
type T3 /* ERROR "invalid recursive type" */ [unsafe.Sizeof(i3.(T3))]int
// assert on value (not applicable)
// assert on pointer (not applicable)
// receive on type (not applicable)
// receive on value
func f4() <-chan T4
type T4 /* ERROR "invalid recursive type" */ [unsafe.Sizeof(<-f4())]int
// receive on pointer (not applicable)
// star on type (not applicable)
// star on value (not applicable)
// star on pointer
func f5() *T5
type T5 /* ERROR "invalid recursive type" */ [unsafe.Sizeof(*f5())]int
// Below is additional syntax which interacts with incomplete types. Notice that
// each of the below falls into 1 of 3 cases:
// 1. It cannot produce a value of (incomplete) type T.
// 2. It can, but only because it already has a value of type T.
// 3. It can, but only because it performs an implicit dereference.
// select on type (case 1)
// select on value (case 2)
type T6 /* ERROR "invalid recursive type" */ struct {
f T7
}
type T7 [unsafe.Sizeof(T6{}.f)]int
// select on pointer (case 3)
type T8 /* ERROR "invalid recursive type" */ struct {
f T9
}
type T9 [unsafe.Sizeof(new(T8).f)]int
// slice on type (not applicable)
// slice on value (case 2)
type T10 /* ERROR "invalid recursive type" */ [unsafe.Sizeof(T10{}[:])]int
// slice on pointer (case 3)
type T11 /* ERROR "invalid recursive type" */ [unsafe.Sizeof(new(T11)[:])]int
// index on type (case 1)
// index on value (case 2)
type T12 /* ERROR "invalid recursive type" */ [unsafe.Sizeof(T12{}[42])]int
// index on pointer (case 3)
type T13 /* ERROR "invalid recursive type" */ [unsafe.Sizeof(new(T13)[42])]int


@@ -929,30 +929,7 @@ func ParseQuery(query string) (Values, error) {
return m, err
}
var urlmaxqueryparams = godebug.New("urlmaxqueryparams")
const defaultMaxParams = 10000
func urlParamsWithinMax(params int) bool {
withinDefaultMax := params <= defaultMaxParams
if urlmaxqueryparams.Value() == "" {
return withinDefaultMax
}
customMax, err := strconv.Atoi(urlmaxqueryparams.Value())
if err != nil {
return withinDefaultMax
}
withinCustomMax := customMax == 0 || params < customMax
if withinDefaultMax != withinCustomMax {
urlmaxqueryparams.IncNonDefault()
}
return withinCustomMax
}
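
The limit check relies on a cheap estimate of the parameter count: the number of '&' separators plus one, computed before any parsing happens. The sketch below shows just that counting step with a simplified limit test. `countParams` and `withinLimit` are hypothetical names, and the GODEBUG lookup and metric increment from the hunk are deliberately left out.

```go
package main

import (
	"fmt"
	"strings"
)

// countParams estimates the number of query parameters the same way the hunk
// does: one more than the number of '&' separators. Note that an empty query
// therefore counts as one parameter.
func countParams(query string) int {
	return strings.Count(query, "&") + 1
}

// withinLimit (hypothetical, simplified) treats limit == 0 as "no limit".
func withinLimit(query string, limit int) bool {
	return limit == 0 || countParams(query) <= limit
}

func main() {
	q := "a=1&b=2&c=3"
	fmt.Println("params:", countParams(q))
	fmt.Println("within limit 2:", withinLimit(q, 2))
	fmt.Println("within limit 0:", withinLimit(q, 0))
}
```

Counting separators avoids allocating the Values map for an oversized query, at the cost of also counting empty segments such as the ones in "a&&b".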
func parseQuery(m Values, query string) (err error) {
if !urlParamsWithinMax(strings.Count(query, "&") + 1) {
return errors.New("number of URL query parameters exceeded limit")
}
for query != "" {
var key string
key, query, _ = strings.Cut(query, "&")


@@ -1521,54 +1521,6 @@ func TestParseQuery(t *testing.T) {
}
}
func TestParseQueryLimits(t *testing.T) {
for _, test := range []struct {
params int
godebug string
wantErr bool
}{{
params: 10,
wantErr: false,
}, {
params: defaultMaxParams,
wantErr: false,
}, {
params: defaultMaxParams + 1,
wantErr: true,
}, {
params: 10,
godebug: "urlmaxqueryparams=9",
wantErr: true,
}, {
params: defaultMaxParams + 1,
godebug: "urlmaxqueryparams=0",
wantErr: false,
}} {
t.Setenv("GODEBUG", test.godebug)
want := Values{}
var b strings.Builder
for i := range test.params {
if i > 0 {
b.WriteString("&")
}
p := fmt.Sprintf("p%v", i)
b.WriteString(p)
want[p] = []string{""}
}
query := b.String()
got, err := ParseQuery(query)
if gotErr, wantErr := err != nil, test.wantErr; gotErr != wantErr {
t.Errorf("GODEBUG=%v ParseQuery(%v params) = %v, want error: %v", test.godebug, test.params, err, wantErr)
}
if err != nil {
continue
}
if got, want := len(got), test.params; got != want {
t.Errorf("GODEBUG=%v ParseQuery(%v params): got %v params, want %v", test.godebug, test.params, got, want)
}
}
}
type RequestURITest struct {
url *URL
out string


@@ -357,9 +357,7 @@ type Cmd struct {
cachedLookExtensions struct{ in, out string }
// startCalled records that Start was attempted, regardless of outcome.
// (Until go.dev/issue/77075 is resolved, we use atomic.SwapInt32,
// not atomic.Bool.Swap, to avoid triggering the copylocks vet check.)
startCalled int32
startCalled atomic.Bool
}
// A ctxResult reports the result of watching the Context associated with a
@@ -642,7 +640,7 @@ func (c *Cmd) Start() error {
// Check for doubled Start calls before we defer failure cleanup. If the prior
// call to Start succeeded, we don't want to spuriously close its pipes.
// It is an error to call Start twice even if the first call did not create a process.
if atomic.SwapInt32(&c.startCalled, 1) != 0 {
if c.startCalled.Swap(true) {
return errors.New("exec: already started")
}
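Both forms of the guard in this hunk (`atomic.SwapInt32` on an `int32` field and `Swap` on an `atomic.Bool` field) express the same once-only check. The sketch below shows the `atomic.Bool` form in isolation; the `task` type is a hypothetical stand-in for `exec.Cmd`, not part of os/exec.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// task is a hypothetical type with a Start method that may run only once.
type task struct {
	// started records that Start was attempted, regardless of outcome.
	started atomic.Bool
}

// Start returns an error on every call after the first.
func (t *task) Start() error {
	// Swap stores true and returns the previous value in one atomic step,
	// so exactly one caller observes false, even under concurrent calls.
	if t.started.Swap(true) {
		return errors.New("task: already started")
	}
	return nil
}

func main() {
	var t task
	fmt.Println(t.Start()) // first call succeeds
	fmt.Println(t.Start()) // second call reports the error
}
```

Because `atomic.Bool` must not be copied after first use, a struct that embeds one should be passed by pointer, which is the concern the comment about the copylocks vet check refers to.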


@@ -404,11 +404,6 @@ Below is the full list of supported metrics, ordered lexicographically.
The number of non-default behaviors executed by the runtime
package due to a non-default GODEBUG=updatemaxprocs=... setting.
/godebug/non-default-behavior/urlmaxqueryparams:events
The number of non-default behaviors executed by the net/url
package due to a non-default GODEBUG=urlmaxqueryparams=...
setting.
/godebug/non-default-behavior/urlstrictcolons:events
The number of non-default behaviors executed by the net/url
package due to a non-default GODEBUG=urlstrictcolons=...


@@ -39,7 +39,7 @@ func (w *recorder) Write(b []byte) (n int, err error) {
w.headerReceived = true
}
if len(b) == n {
return n, nil
return 0, nil
}
ba, nb, err := readBatch(b[n:]) // Every write from the runtime is guaranteed to be a complete batch.
if err != nil {


@@ -93,33 +93,6 @@ func (x simdType) MaskedStoreDoc() string {
}
}
func (x simdType) ToBitsDoc() string {
if x.Size == 512 || x.ElemBits() == 16 {
return fmt.Sprintf("// Asm: KMOV%s, CPU Features: AVX512", x.IntelSizeSuffix())
}
// 128/256 bit vectors with 8, 32, 64 bit elements
var asm string
var feat string
switch x.ElemBits() {
case 8:
asm = "VPMOVMSKB"
if x.Size == 256 {
feat = "AVX2"
} else {
feat = "AVX"
}
case 32:
asm = "VMOVMSKPS"
feat = "AVX"
case 64:
asm = "VMOVMSKPD"
feat = "AVX"
default:
panic("unexpected ElemBits")
}
return fmt.Sprintf("// Asm: %s, CPU Features: %s", asm, feat)
}
func compareSimdTypes(x, y simdType) int {
// "vreg" then "mask"
if c := -compareNatural(x.Type, y.Type); c != 0 {
@@ -189,6 +162,7 @@ type X86Features struct {}
var X86 X86Features
{{range .}}
{{$f := .}}
{{- if eq .Feature "AVX512"}}
// {{.Feature}} returns whether the CPU supports the AVX512F+CD+BW+DQ+VL features.
//
@@ -199,11 +173,19 @@ var X86 X86Features
{{- else -}}
// {{.Feature}} returns whether the CPU supports the {{.Feature}} feature.
{{- end}}
{{- if ne .ImpliesAll ""}}
//
// If it returns true, then the CPU also supports {{.ImpliesAll}}.
{{- end}}
//
// {{.Feature}} is defined on all GOARCHes, but will only return true on
// GOARCH {{.GoArch}}.
func (X86Features) {{.Feature}}() bool {
return cpu.X86.Has{{.Feature}}
func ({{.FeatureVar}}Features) {{.Feature}}() bool {
{{- if .Virtual}}
return {{range $i, $dep := .Implies}}{{if $i}} && {{end}}cpu.{{$f.FeatureVar}}.Has{{$dep}}{{end}}
{{- else}}
return cpu.{{.FeatureVar}}.Has{{.Feature}}
{{- end}}
}
{{end}}
`
@@ -237,7 +219,7 @@ func {{.Name}}FromBits(y uint{{.LanesContainer}}) {{.Name}}
// Only the lower {{.Lanes}} bits of y are used.
{{- end}}
//
{{.ToBitsDoc}}
// Asm: KMOV{{.IntelSizeSuffix}}, CPU Features: AVX512
func (x {{.Name}}) ToBits() uint{{.LanesContainer}}
`
@@ -591,6 +573,65 @@ func writeSIMDTypes(typeMap simdTypeMap) *bytes.Buffer {
return buffer
}
type goarchFeatures struct {
// featureVar is the name of the exported feature-check variable for this
// architecture.
featureVar string
// features records per-feature information.
features map[string]featureInfo
}
type featureInfo struct {
// Implies is a list of other CPU features that are required for this
// feature. These are allowed to chain.
//
// For example, if the Frob feature lists "Baz", then if X.Frob() returns
// true, it must also be true that the CPU has feature Baz.
Implies []string
// Virtual means this feature is not represented directly in internal/cpu,
// but is instead the logical AND of the features in Implies.
Virtual bool
}
// goarchFeatureInfo maps from GOARCH to CPU feature to additional information
// about that feature. Not all features need to be in this map.
var goarchFeatureInfo = make(map[string]goarchFeatures)
func registerFeatureInfo(goArch string, features goarchFeatures) {
goarchFeatureInfo[goArch] = features
}
func featureImplies(goarch string, base string) string {
// Compute the transitive closure of base.
var list []string
var visit func(f string)
visit = func(f string) {
list = append(list, f)
for _, dep := range goarchFeatureInfo[goarch].features[f].Implies {
visit(dep)
}
}
visit(base)
// Drop base
list = list[1:]
// Put in "nice" order
slices.Reverse(list)
// Combine into a comment-ready form
switch len(list) {
case 0:
return ""
case 1:
return list[0]
case 2:
return list[0] + " and " + list[1]
default:
list[len(list)-1] = "and " + list[len(list)-1]
return strings.Join(list, ", ")
}
}
func writeSIMDFeatures(ops []Operation) *bytes.Buffer {
// Gather all features
type featureKey struct {
@@ -606,13 +647,36 @@ func writeSIMDFeatures(ops []Operation) *bytes.Buffer {
featureSet[featureKey{op.GoArch, feature}] = struct{}{}
}
}
features := slices.SortedFunc(maps.Keys(featureSet), func(a, b featureKey) int {
featureKeys := slices.SortedFunc(maps.Keys(featureSet), func(a, b featureKey) int {
if c := cmp.Compare(a.GoArch, b.GoArch); c != 0 {
return c
}
return compareNatural(a.Feature, b.Feature)
})
// TODO: internal/cpu doesn't enforce these at all. You can even do
// GODEBUG=cpu.avx=off and it will happily turn off AVX without turning off
// AVX2. We need to push these dependencies into it somehow.
type feature struct {
featureKey
FeatureVar string
Virtual bool
Implies []string
ImpliesAll string
}
var features []feature
for _, k := range featureKeys {
featureVar := goarchFeatureInfo[k.GoArch].featureVar
fi := goarchFeatureInfo[k.GoArch].features[k.Feature]
features = append(features, feature{
featureKey: k,
FeatureVar: featureVar,
Virtual: fi.Virtual,
Implies: fi.Implies,
ImpliesAll: featureImplies(k.GoArch, k.Feature),
})
}
// If we ever have the same feature name on more than one GOARCH, we'll have
// to be more careful about this.
t := templateOf(simdFeaturesTemplate, "features")
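The transitive-closure walk in `featureImplies` can be tried on its own. This is a minimal sketch under stated assumptions: the `implies` table is a hypothetical, trimmed-down copy of the per-feature data, and `impliedList` is an illustrative name for the same walk, drop, reverse, and join steps.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// implies is a hypothetical subset of the feature table: each key lists the
// features it directly requires.
var implies = map[string][]string{
	"AVX2":       {"AVX"},
	"AVX512":     {"AVX2"},
	"AVX512VNNI": {"AVX512"},
	"AVXAES":     {"AVX", "AES"},
}

// impliedList returns a comment-ready list of every feature implied by base.
func impliedList(base string) string {
	// Depth-first walk collecting base and everything reachable from it.
	var list []string
	var visit func(f string)
	visit = func(f string) {
		list = append(list, f)
		for _, dep := range implies[f] {
			visit(dep)
		}
	}
	visit(base)
	list = list[1:]      // drop base itself
	slices.Reverse(list) // most basic feature first
	switch len(list) {
	case 0:
		return ""
	case 1:
		return list[0]
	case 2:
		return list[0] + " and " + list[1]
	default:
		list[len(list)-1] = "and " + list[len(list)-1]
		return strings.Join(list, ", ")
	}
}

func main() {
	for _, f := range []string{"AVX", "AVX2", "AVX512", "AVX512VNNI", "AVXAES"} {
		fmt.Printf("%s implies %q\n", f, impliedList(f))
	}
}
```

The walk does not deduplicate, so a feature reachable along two different paths would be listed twice. The sketch's table has no such diamond, so the simple form is enough here.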


@@ -69,36 +69,21 @@
documentation: !string |-
// NAME expands a vector x whose elements are packed into its lower lanes.
// The elements are distributed to the lanes selected by mask, in order from the lowest mask element to the highest.
- go: Broadcast1To2
- go: Broadcast128
commutative: false
documentation: !string |-
// NAME copies the lowest element of its input to all 2 elements of
// the output vector.
- go: Broadcast1To4
// NAME copies element zero of its (128-bit) input to all elements of
// the 128-bit output vector.
- go: Broadcast256
commutative: false
documentation: !string |-
// NAME copies the lowest element of its input to all 4 elements of
// the output vector.
- go: Broadcast1To8
// NAME copies element zero of its (128-bit) input to all elements of
// the 256-bit output vector.
- go: Broadcast512
commutative: false
documentation: !string |-
// NAME copies the lowest element of its input to all 8 elements of
// the output vector.
- go: Broadcast1To16
commutative: false
documentation: !string |-
// NAME copies the lowest element of its input to all 16 elements of
// the output vector.
- go: Broadcast1To32
commutative: false
documentation: !string |-
// NAME copies the lowest element of its input to all 32 elements of
// the output vector.
- go: Broadcast1To64
commutative: false
documentation: !string |-
// NAME copies the lowest element of its input to all 64 elements of
// the output vector.
// NAME copies element zero of its (128-bit) input to all elements of
// the 512-bit output vector.
- go: PermuteOrZeroGrouped
commutative: false
documentation: !string |- # Detailed documentation will rely on the specific ops.


@@ -376,21 +376,21 @@
out:
- *any
- go: Broadcast1To2
asm: VPBROADCASTQ
- go: Broadcast128
asm: VPBROADCAST[BWDQ]
in:
- class: vreg
bits: 128
elemBits: 64
elemBits: $e
base: $b
out:
- class: vreg
bits: 128
elemBits: 64
elemBits: $e
base: $b
# weirdly, this one case on AVX2 is memory-operand-only
- go: Broadcast1To2
- go: Broadcast128
asm: VPBROADCASTQ
in:
- class: vreg
@@ -405,93 +405,70 @@
base: int
OverwriteBase: float
- go: Broadcast1To4
- go: Broadcast256
asm: VPBROADCAST[BWDQ]
in:
- class: vreg
bits: 128
elemBits: $e
base: $b
out:
- class: vreg
lanes: 4
bits: 256
elemBits: $e
base: $b
- go: Broadcast1To8
- go: Broadcast512
asm: VPBROADCAST[BWDQ]
in:
- class: vreg
bits: 128
elemBits: $e
base: $b
out:
- class: vreg
lanes: 8
bits: 512
elemBits: $e
base: $b
- go: Broadcast1To16
asm: VPBROADCAST[BWDQ]
in:
- class: vreg
bits: 128
base: $b
out:
- class: vreg
lanes: 16
base: $b
- go: Broadcast1To32
asm: VPBROADCAST[BWDQ]
in:
- class: vreg
bits: 128
base: $b
out:
- class: vreg
lanes: 32
base: $b
- go: Broadcast1To64
asm: VPBROADCASTB
in:
- class: vreg
bits: 128
base: $b
out:
- class: vreg
lanes: 64
base: $b
- go: Broadcast1To4
- go: Broadcast128
asm: VBROADCASTS[SD]
in:
- class: vreg
bits: 128
base: float
elemBits: $e
base: $b
out:
- class: vreg
lanes: 4
base: float
bits: 128
elemBits: $e
base: $b
- go: Broadcast1To8
- go: Broadcast256
asm: VBROADCASTS[SD]
in:
- class: vreg
bits: 128
base: float
elemBits: $e
base: $b
out:
- class: vreg
lanes: 8
base: float
bits: 256
elemBits: $e
base: $b
- go: Broadcast1To16
- go: Broadcast512
asm: VBROADCASTS[SD]
in:
- class: vreg
bits: 128
base: float
elemBits: $e
base: $b
out:
- class: vreg
lanes: 16
base: float
bits: 512
elemBits: $e
base: $b
# VPSHUFB for 128-bit byte shuffles will be picked with higher priority than VPERMB, given its lower CPU feature requirement. (It's AVX)
- go: PermuteOrZero


@@ -5,7 +5,6 @@
package main
import (
"cmp"
"fmt"
"log"
"maps"
@@ -78,7 +77,7 @@ func loadXED(xedPath string) []*unify.Value {
switch {
case inst.RealOpcode == "N":
return // Skip unstable instructions
case !(strings.HasPrefix(inst.Extension, "AVX") || strings.HasPrefix(inst.Extension, "SHA")):
case !(strings.HasPrefix(inst.Extension, "AVX") || strings.HasPrefix(inst.Extension, "SHA") || inst.Extension == "FMA"):
// We're only interested in AVX, SHA, and FMA instructions.
return
}
@@ -210,16 +209,9 @@ func loadXED(xedPath string) []*unify.Value {
}
log.Printf("%d unhandled CPU features for %d instructions (use -v for details)", len(unknownFeatures), nInst)
} else {
keys := slices.SortedFunc(maps.Keys(unknownFeatures), func(a, b cpuFeatureKey) int {
return cmp.Or(cmp.Compare(a.Extension, b.Extension),
cmp.Compare(a.ISASet, b.ISASet))
})
keys := slices.Sorted(maps.Keys(unknownFeatures))
for _, key := range keys {
if key.ISASet == "" || key.ISASet == key.Extension {
log.Printf("unhandled Extension %s", key.Extension)
} else {
log.Printf("unhandled Extension %s and ISASet %s", key.Extension, key.ISASet)
}
log.Printf("unhandled ISASet %s", key)
log.Printf(" opcodes: %s", slices.Sorted(maps.Keys(unknownFeatures[key])))
}
}
@@ -763,16 +755,24 @@ func instToUVal1(inst *xeddata.Inst, ops []operand, feature string, variant inst
// decodeCPUFeature returns the CPU feature name required by inst. These match
// the names of the "Has*" feature checks in the simd package.
func decodeCPUFeature(inst *xeddata.Inst) (string, bool) {
key := cpuFeatureKey{
Extension: inst.Extension,
ISASet: isaSetStrip.ReplaceAllLiteralString(inst.ISASet, ""),
isaSet := inst.ISASet
if isaSet == "" {
// Older instructions don't have an ISA set. Use their "extension"
// instead.
isaSet = inst.Extension
}
feat, ok := cpuFeatureMap[key]
// We require AVX512VL to use AVX512 at all, so strip off the vector length
// suffixes.
if strings.HasPrefix(isaSet, "AVX512") {
isaSet = isaSetVL.ReplaceAllLiteralString(isaSet, "")
}
feat, ok := cpuFeatureMap[isaSet]
if !ok {
imap := unknownFeatures[key]
imap := unknownFeatures[isaSet]
if imap == nil {
imap = make(map[string]struct{})
unknownFeatures[key] = imap
unknownFeatures[isaSet] = imap
}
imap[inst.Opcode()] = struct{}{}
return "", false
@@ -783,45 +783,76 @@ func decodeCPUFeature(inst *xeddata.Inst) (string, bool) {
return feat, true
}
var isaSetStrip = regexp.MustCompile("_(128N?|256N?|512)$")
var isaSetVL = regexp.MustCompile("_(128N?|256N?|512)$")
type cpuFeatureKey struct {
Extension, ISASet string
}
// cpuFeatureMap maps from XED's "EXTENSION" and "ISA_SET" to a CPU feature name
// that can be used in the SIMD API.
var cpuFeatureMap = map[cpuFeatureKey]string{
{"SHA", "SHA"}: "SHA",
{"AVX", ""}: "AVX",
{"AVX_VNNI", "AVX_VNNI"}: "AVXVNNI",
{"AVX2", ""}: "AVX2",
{"AVXAES", ""}: "AVX, AES",
// cpuFeatureMap maps from XED's "ISA_SET" (or "EXTENSION") to a CPU feature
// name to expose in the SIMD feature check API.
//
// See XED's datafiles/*/cpuid.xed.txt for how ISA set names map to CPUID flags.
var cpuFeatureMap = map[string]string{
"AVX": "AVX",
"AVX_VNNI": "AVXVNNI",
"AVX2": "AVX2",
"AVXAES": "AVXAES",
"SHA": "SHA",
"FMA": "FMA",
// AVX-512 foundational features. We combine all of these into one "AVX512" feature.
{"AVX512EVEX", "AVX512F"}: "AVX512",
{"AVX512EVEX", "AVX512CD"}: "AVX512",
{"AVX512EVEX", "AVX512BW"}: "AVX512",
{"AVX512EVEX", "AVX512DQ"}: "AVX512",
// AVX512VL doesn't appear explicitly in the ISASet. I guess it's implied by
// the vector length suffix.
"AVX512F": "AVX512",
"AVX512BW": "AVX512",
"AVX512CD": "AVX512",
"AVX512DQ": "AVX512",
// AVX512VL doesn't appear as its own ISASet; instead, the CPUID flag is
// required by the *_128 and *_256 ISASets. We fold it into "AVX512" anyway.
// AVX-512 extension features
{"AVX512EVEX", "AVX512_BITALG"}: "AVX512BITALG",
{"AVX512EVEX", "AVX512_GFNI"}: "AVX512GFNI",
{"AVX512EVEX", "AVX512_VBMI2"}: "AVX512VBMI2",
{"AVX512EVEX", "AVX512_VBMI"}: "AVX512VBMI",
{"AVX512EVEX", "AVX512_VNNI"}: "AVX512VNNI",
{"AVX512EVEX", "AVX512_VPOPCNTDQ"}: "AVX512VPOPCNTDQ",
{"AVX512EVEX", "AVX512_VAES"}: "AVX512VAES",
{"AVX512EVEX", "AVX512_VPCLMULQDQ"}: "AVX512VPCLMULQDQ",
"AVX512_BITALG": "AVX512BITALG",
"AVX512_GFNI": "AVX512GFNI",
"AVX512_VBMI": "AVX512VBMI",
"AVX512_VBMI2": "AVX512VBMI2",
"AVX512_VNNI": "AVX512VNNI",
"AVX512_VPOPCNTDQ": "AVX512VPOPCNTDQ",
"AVX512_VAES": "AVX512VAES",
"AVX512_VPCLMULQDQ": "AVX512VPCLMULQDQ",
// AVX 10.2 (not yet supported)
{"AVX512EVEX", "AVX10_2_RC"}: "ignore",
"AVX10_2_RC": "ignore",
}
var unknownFeatures = map[cpuFeatureKey]map[string]struct{}{}
func init() {
// TODO: In general, Intel doesn't guarantee which CPUID flags are set
// together, so our feature checks need to verify these implications
// themselves, just to be sure.
var features = map[string]featureInfo{
"AVX2": {Implies: []string{"AVX"}},
"AVX512": {Implies: []string{"AVX2"}},
"AVXAES": {Virtual: true, Implies: []string{"AVX", "AES"}},
"FMA": {Implies: []string{"AVX"}},
// AVX-512 subfeatures.
"AVX512BITALG": {Implies: []string{"AVX512"}},
"AVX512GFNI": {Implies: []string{"AVX512"}},
"AVX512VBMI": {Implies: []string{"AVX512"}},
"AVX512VBMI2": {Implies: []string{"AVX512"}},
"AVX512VNNI": {Implies: []string{"AVX512"}},
"AVX512VPOPCNTDQ": {Implies: []string{"AVX512"}},
"AVX512VAES": {Implies: []string{"AVX512"}},
// AVX-VNNI and AVX-IFMA are "backports" of the AVX512-VNNI/IFMA
// instructions to VEX encoding, limited to 256 bit vectors. They're
// intended for lower end CPUs that want to support VNNI/IFMA without
// supporting AVX-512. As such, they're built on AVX2's VEX encoding.
"AVXVNNI": {Implies: []string{"AVX2"}},
"AVXIFMA": {Implies: []string{"AVX2"}},
}
registerFeatureInfo("amd64", goarchFeatures{
featureVar: "X86",
features: features,
})
}
var unknownFeatures = map[string]map[string]struct{}{}
// hasOptionalMask returns whether there is an optional mask operand in ops.
func hasOptionalMask(ops []operand) bool {
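The lookup performed by `decodeCPUFeature` (fall back to the extension, strip the vector-length suffix, then consult the table) can be sketched in isolation. The `featureNames` table below is a hypothetical subset of `cpuFeatureMap`, `lookupFeature` is an illustrative name, and the ISA set strings in `main` are example inputs rather than values read from XED.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// vlSuffix matches the vector-length suffix on AVX-512 ISA set names.
var vlSuffix = regexp.MustCompile("_(128N?|256N?|512)$")

// featureNames is a hypothetical subset of the ISA-set-to-feature table.
var featureNames = map[string]string{
	"AVX":         "AVX",
	"AVX2":        "AVX2",
	"FMA":         "FMA",
	"AVX512F":     "AVX512",
	"AVX512BW":    "AVX512",
	"AVX512_VNNI": "AVX512VNNI",
}

// lookupFeature maps an instruction's ISA set (or, if that is empty, its
// extension) to a feature name. The bool is false for unknown ISA sets.
func lookupFeature(isaSet, extension string) (string, bool) {
	if isaSet == "" {
		// Older instructions have no ISA set; use the extension instead.
		isaSet = extension
	}
	if strings.HasPrefix(isaSet, "AVX512") {
		// Only AVX-512 names carry a vector-length suffix worth stripping.
		isaSet = vlSuffix.ReplaceAllLiteralString(isaSet, "")
	}
	feat, ok := featureNames[isaSet]
	return feat, ok
}

func main() {
	fmt.Println(lookupFeature("AVX512F_512", "AVX512EVEX"))
	fmt.Println(lookupFeature("AVX512_VNNI_128", "AVX512EVEX"))
	fmt.Println(lookupFeature("", "AVX2"))
	fmt.Println(lookupFeature("FMA", "FMA"))
}
```

Keying the table on a single string rather than an (extension, ISA set) pair is what lets the unknown-feature report be sorted with `slices.Sorted` directly.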


@@ -873,7 +873,7 @@ var broadcastTemplate = templateOf("Broadcast functions", `
// Emulated, CPU Feature: {{.CPUfeatureBC}}
func Broadcast{{.VType}}(x {{.Etype}}) {{.VType}} {
var z {{.As128BitVec }}
return z.SetElem(0, x).Broadcast1To{{.Count}}()
return z.SetElem(0, x).Broadcast{{.Vwidth}}()
}
`)
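The two naming schemes in this hunk differ in what they count: `Broadcast1To{{.Count}}` names the number of output lanes, while `Broadcast{{.Vwidth}}` names the output width in bits. A scalar model of the operation itself is below. It is a sketch only: `broadcast` is a hypothetical helper that uses slices in place of vector types and does not call the archsimd package.

```go
package main

import "fmt"

// broadcast models the vector operation with slices: it returns lanes
// elements, each a copy of element zero of x. x must be non-empty.
func broadcast[T any](x []T, lanes int) []T {
	out := make([]T, lanes)
	for i := range out {
		out[i] = x[0]
	}
	return out
}

func main() {
	in := []int32{7, 1, 2, 3} // models a 128-bit vector of four int32 lanes

	// With 32-bit elements, 128-, 256-, and 512-bit outputs have
	// 4, 8, and 16 lanes respectively.
	fmt.Println(broadcast(in, 4))
	fmt.Println(broadcast(in, 8))
	fmt.Println(broadcast(in, 16))
}
```

Under the width-based scheme the same method name applies to every element type, since the lane count follows from the element size.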


@@ -10,14 +10,6 @@ type X86Features struct{}
var X86 X86Features
// AES returns whether the CPU supports the AES feature.
//
// AES is defined on all GOARCHes, but will only return true on
// GOARCH amd64.
func (X86Features) AES() bool {
return cpu.X86.HasAES
}
// AVX returns whether the CPU supports the AVX feature.
//
// AVX is defined on all GOARCHes, but will only return true on
@@ -28,6 +20,8 @@ func (X86Features) AVX() bool {
// AVX2 returns whether the CPU supports the AVX2 feature.
//
// If it returns true, then the CPU also supports AVX.
//
// AVX2 is defined on all GOARCHes, but will only return true on
// GOARCH amd64.
func (X86Features) AVX2() bool {
@@ -41,6 +35,8 @@ func (X86Features) AVX2() bool {
// Nearly every CPU that has shipped with any support for AVX-512 has
// supported all five of these features.
//
// If it returns true, then the CPU also supports AVX and AVX2.
//
// AVX512 is defined on all GOARCHes, but will only return true on
// GOARCH amd64.
func (X86Features) AVX512() bool {
@@ -49,6 +45,8 @@ func (X86Features) AVX512() bool {
// AVX512BITALG returns whether the CPU supports the AVX512BITALG feature.
//
// If it returns true, then the CPU also supports AVX, AVX2, and AVX512.
//
// AVX512BITALG is defined on all GOARCHes, but will only return true on
// GOARCH amd64.
func (X86Features) AVX512BITALG() bool {
@@ -57,6 +55,8 @@ func (X86Features) AVX512BITALG() bool {
// AVX512GFNI returns whether the CPU supports the AVX512GFNI feature.
//
// If it returns true, then the CPU also supports AVX, AVX2, and AVX512.
//
// AVX512GFNI is defined on all GOARCHes, but will only return true on
// GOARCH amd64.
func (X86Features) AVX512GFNI() bool {
@@ -65,6 +65,8 @@ func (X86Features) AVX512GFNI() bool {
// AVX512VAES returns whether the CPU supports the AVX512VAES feature.
//
// If it returns true, then the CPU also supports AVX, AVX2, and AVX512.
//
// AVX512VAES is defined on all GOARCHes, but will only return true on
// GOARCH amd64.
func (X86Features) AVX512VAES() bool {
@@ -73,6 +75,8 @@ func (X86Features) AVX512VAES() bool {
// AVX512VBMI returns whether the CPU supports the AVX512VBMI feature.
//
// If it returns true, then the CPU also supports AVX, AVX2, and AVX512.
//
// AVX512VBMI is defined on all GOARCHes, but will only return true on
// GOARCH amd64.
func (X86Features) AVX512VBMI() bool {
@@ -81,6 +85,8 @@ func (X86Features) AVX512VBMI() bool {
// AVX512VBMI2 returns whether the CPU supports the AVX512VBMI2 feature.
//
// If it returns true, then the CPU also supports AVX, AVX2, and AVX512.
//
// AVX512VBMI2 is defined on all GOARCHes, but will only return true on
// GOARCH amd64.
func (X86Features) AVX512VBMI2() bool {
@@ -89,6 +95,8 @@ func (X86Features) AVX512VBMI2() bool {
// AVX512VNNI returns whether the CPU supports the AVX512VNNI feature.
//
// If it returns true, then the CPU also supports AVX, AVX2, and AVX512.
//
// AVX512VNNI is defined on all GOARCHes, but will only return true on
// GOARCH amd64.
func (X86Features) AVX512VNNI() bool {
@@ -105,20 +113,44 @@ func (X86Features) AVX512VPCLMULQDQ() bool {
// AVX512VPOPCNTDQ returns whether the CPU supports the AVX512VPOPCNTDQ feature.
//
// If it returns true, then the CPU also supports AVX, AVX2, and AVX512.
//
// AVX512VPOPCNTDQ is defined on all GOARCHes, but will only return true on
// GOARCH amd64.
func (X86Features) AVX512VPOPCNTDQ() bool {
return cpu.X86.HasAVX512VPOPCNTDQ
}
// AVXAES returns whether the CPU supports the AVXAES feature.
//
// If it returns true, then the CPU also supports AES and AVX.
//
// AVXAES is defined on all GOARCHes, but will only return true on
// GOARCH amd64.
func (X86Features) AVXAES() bool {
return cpu.X86.HasAVX && cpu.X86.HasAES
}
// AVXVNNI returns whether the CPU supports the AVXVNNI feature.
//
// If it returns true, then the CPU also supports AVX and AVX2.
//
// AVXVNNI is defined on all GOARCHes, but will only return true on
// GOARCH amd64.
func (X86Features) AVXVNNI() bool {
return cpu.X86.HasAVXVNNI
}
// FMA returns whether the CPU supports the FMA feature.
//
// If it returns true, then the CPU also supports AVX.
//
// FMA is defined on all GOARCHes, but will only return true on
// GOARCH amd64.
func (X86Features) FMA() bool {
return cpu.X86.HasFMA
}
// SHA returns whether the CPU supports the SHA feature.
//
// SHA is defined on all GOARCHes, but will only return true on


@@ -379,79 +379,12 @@ func TestBitMaskFromBitsLoad(t *testing.T) {
}
func TestBitMaskToBits(t *testing.T) {
int8s := []int8{
0, 1, 1, 0, 0, 1, 0, 1,
1, 0, 1, 1, 0, 0, 1, 0,
1, 0, 0, 1, 1, 0, 1, 0,
0, 1, 1, 0, 0, 1, 0, 1,
1, 0, 0, 1, 0, 1, 1, 0,
0, 1, 0, 1, 1, 0, 0, 1,
1, 0, 1, 0, 0, 1, 1, 0,
0, 1, 1, 0, 1, 0, 0, 1,
if !archsimd.X86.AVX512() {
t.Skip("Test requires X86.AVX512, not available on this hardware")
return
}
int16s := make([]int16, 32)
for i := range int16s {
int16s[i] = int16(int8s[i])
}
int32s := make([]int32, 16)
for i := range int32s {
int32s[i] = int32(int8s[i])
}
int64s := make([]int64, 8)
for i := range int64s {
int64s[i] = int64(int8s[i])
}
want64 := uint64(0)
for i := range int8s {
want64 |= uint64(int8s[i]) << i
}
want32 := uint32(want64)
want16 := uint16(want64)
want8 := uint8(want64)
want4 := want8 & 0b1111
want2 := want4 & 0b11
if v := archsimd.LoadInt8x16Slice(int8s[:16]).ToMask().ToBits(); v != want16 {
t.Errorf("want %b, got %b", want16, v)
}
if v := archsimd.LoadInt32x4Slice(int32s[:4]).ToMask().ToBits(); v != want4 {
t.Errorf("want %b, got %b", want4, v)
}
if v := archsimd.LoadInt32x8Slice(int32s[:8]).ToMask().ToBits(); v != want8 {
t.Errorf("want %b, got %b", want8, v)
}
if v := archsimd.LoadInt64x2Slice(int64s[:2]).ToMask().ToBits(); v != want2 {
t.Errorf("want %b, got %b", want2, v)
}
if v := archsimd.LoadInt64x4Slice(int64s[:4]).ToMask().ToBits(); v != want4 {
t.Errorf("want %b, got %b", want4, v)
}
if archsimd.X86.AVX2() {
if v := archsimd.LoadInt8x32Slice(int8s[:32]).ToMask().ToBits(); v != want32 {
t.Errorf("want %b, got %b", want32, v)
}
}
if archsimd.X86.AVX512() {
if v := archsimd.LoadInt8x64Slice(int8s).ToMask().ToBits(); v != want64 {
t.Errorf("want %b, got %b", want64, v)
}
if v := archsimd.LoadInt16x8Slice(int16s[:8]).ToMask().ToBits(); v != want8 {
t.Errorf("want %b, got %b", want8, v)
}
if v := archsimd.LoadInt16x16Slice(int16s[:16]).ToMask().ToBits(); v != want16 {
t.Errorf("want %b, got %b", want16, v)
}
if v := archsimd.LoadInt16x32Slice(int16s).ToMask().ToBits(); v != want32 {
t.Errorf("want %b, got %b", want32, v)
}
if v := archsimd.LoadInt32x16Slice(int32s).ToMask().ToBits(); v != want16 {
t.Errorf("want %b, got %b", want16, v)
}
if v := archsimd.LoadInt64x8Slice(int64s).ToMask().ToBits(); v != want8 {
t.Errorf("want %b, got %b", want8, v)
}
if v := archsimd.LoadInt16x8Slice([]int16{1, 0, 1, 0, 0, 0, 0, 0}).ToMask().ToBits(); v != 0b101 {
t.Errorf("Want 0b101, got %b", v)
}
}
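The expected values in this test are plain bit packing: lane i contributes bit i of the result. A minimal sketch of that computation follows, assuming every lane holds 0 or 1 as in the test data; `maskBits` is a hypothetical helper and does not use the archsimd package.

```go
package main

import "fmt"

// maskBits packs lanes into an integer with lane i in bit i. It assumes
// each lane is 0 or 1, matching the test data above.
func maskBits(lanes []int16) uint64 {
	var bits uint64
	for i, v := range lanes {
		bits |= uint64(v) << i
	}
	return bits
}

func main() {
	lanes := []int16{1, 0, 1, 0, 0, 0, 0, 0}
	fmt.Printf("%b\n", maskBits(lanes))
}
```

Narrower expectations then follow by truncation or masking, as the test does with `uint16(want64)` and `want8 & 0b1111`.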


@@ -11,7 +11,7 @@ package archsimd
// y is the chunk of dw array in use.
// result = AddRoundKey(InvShiftRows(InvSubBytes(x)), y)
//
// Asm: VAESDECLAST, CPU Feature: AVX, AES
// Asm: VAESDECLAST, CPU Feature: AVXAES
func (x Uint8x16) AESDecryptLastRound(y Uint32x4) Uint8x16
// AESDecryptLastRound performs a series of operations of the AES cipher algorithm defined in FIPS 197.
@@ -37,7 +37,7 @@ func (x Uint8x64) AESDecryptLastRound(y Uint32x16) Uint8x64
// y is the chunk of dw array in use.
// result = AddRoundKey(InvMixColumns(InvShiftRows(InvSubBytes(x))), y)
//
// Asm: VAESDEC, CPU Feature: AVX, AES
// Asm: VAESDEC, CPU Feature: AVXAES
func (x Uint8x16) AESDecryptOneRound(y Uint32x4) Uint8x16
// AESDecryptOneRound performs a series of operations of the AES cipher algorithm defined in FIPS 197.
@@ -63,7 +63,7 @@ func (x Uint8x64) AESDecryptOneRound(y Uint32x16) Uint8x64
// y is the chunk of w array in use.
// result = AddRoundKey((ShiftRows(SubBytes(x))), y)
//
// Asm: VAESENCLAST, CPU Feature: AVX, AES
// Asm: VAESENCLAST, CPU Feature: AVXAES
func (x Uint8x16) AESEncryptLastRound(y Uint32x4) Uint8x16
// AESEncryptLastRound performs a series of operations of the AES cipher algorithm defined in FIPS 197.
@@ -89,7 +89,7 @@ func (x Uint8x64) AESEncryptLastRound(y Uint32x16) Uint8x64
// y is the chunk of w array in use.
// result = AddRoundKey(MixColumns(ShiftRows(SubBytes(x))), y)
//
// Asm: VAESENC, CPU Feature: AVX, AES
// Asm: VAESENC, CPU Feature: AVXAES
func (x Uint8x16) AESEncryptOneRound(y Uint32x4) Uint8x16
// AESEncryptOneRound performs a series of operations of the AES cipher algorithm defined in FIPS 197.
@@ -114,7 +114,7 @@ func (x Uint8x64) AESEncryptOneRound(y Uint32x16) Uint8x64
// x is the chunk of w array in use.
// result = InvMixColumns(x)
//
// Asm: VAESIMC, CPU Feature: AVX, AES
// Asm: VAESIMC, CPU Feature: AVXAES
func (x Uint32x4) AESInvMixColumns() Uint32x4
/* AESRoundKeyGenAssist */
@@ -129,7 +129,7 @@ func (x Uint32x4) AESInvMixColumns() Uint32x4
//
// rconVal results in better performance when it's a constant; a non-constant value will be translated into a jump table.
//
// Asm: VAESKEYGENASSIST, CPU Feature: AVX, AES
// Asm: VAESKEYGENASSIST, CPU Feature: AVXAES
func (x Uint32x4) AESRoundKeyGenAssist(rconVal uint8) Uint32x4
/* Abs */
@@ -805,197 +805,191 @@ func (x Uint16x16) Average(y Uint16x16) Uint16x16
// Asm: VPAVGW, CPU Feature: AVX512
func (x Uint16x32) Average(y Uint16x32) Uint16x32
/* Broadcast1To2 */
/* Broadcast128 */
// Broadcast1To2 copies the lowest element of its input to all 2 elements of
// the output vector.
//
// Asm: VPBROADCASTQ, CPU Feature: AVX2
func (x Float64x2) Broadcast1To2() Float64x2
// Broadcast1To2 copies the lowest element of its input to all 2 elements of
// the output vector.
//
// Asm: VPBROADCASTQ, CPU Feature: AVX2
func (x Int64x2) Broadcast1To2() Int64x2
// Broadcast1To2 copies the lowest element of its input to all 2 elements of
// the output vector.
//
// Asm: VPBROADCASTQ, CPU Feature: AVX2
func (x Uint64x2) Broadcast1To2() Uint64x2
/* Broadcast1To4 */
// Broadcast1To4 copies the lowest element of its input to all 4 elements of
// the output vector.
// Broadcast128 copies element zero of its (128-bit) input to all elements of
// the 128-bit output vector.
//
// Asm: VBROADCASTSS, CPU Feature: AVX2
func (x Float32x4) Broadcast1To4() Float32x4
func (x Float32x4) Broadcast128() Float32x4
// Broadcast1To4 copies the lowest element of its input to all 4 elements of
// the output vector.
// Broadcast128 copies element zero of its (128-bit) input to all elements of
// the 128-bit output vector.
//
// Asm: VPBROADCASTQ, CPU Feature: AVX2
func (x Float64x2) Broadcast128() Float64x2
// Broadcast128 copies element zero of its (128-bit) input to all elements of
// the 128-bit output vector.
//
// Asm: VPBROADCASTB, CPU Feature: AVX2
func (x Int8x16) Broadcast128() Int8x16
// Broadcast128 copies element zero of its (128-bit) input to all elements of
// the 128-bit output vector.
//
// Asm: VPBROADCASTW, CPU Feature: AVX2
func (x Int16x8) Broadcast128() Int16x8
// Broadcast128 copies element zero of its (128-bit) input to all elements of
// the 128-bit output vector.
//
// Asm: VPBROADCASTD, CPU Feature: AVX2
func (x Int32x4) Broadcast128() Int32x4
// Broadcast128 copies element zero of its (128-bit) input to all elements of
// the 128-bit output vector.
//
// Asm: VPBROADCASTQ, CPU Feature: AVX2
func (x Int64x2) Broadcast128() Int64x2
// Broadcast128 copies element zero of its (128-bit) input to all elements of
// the 128-bit output vector.
//
// Asm: VPBROADCASTB, CPU Feature: AVX2
func (x Uint8x16) Broadcast128() Uint8x16
// Broadcast128 copies element zero of its (128-bit) input to all elements of
// the 128-bit output vector.
//
// Asm: VPBROADCASTW, CPU Feature: AVX2
func (x Uint16x8) Broadcast128() Uint16x8
// Broadcast128 copies element zero of its (128-bit) input to all elements of
// the 128-bit output vector.
//
// Asm: VPBROADCASTD, CPU Feature: AVX2
func (x Uint32x4) Broadcast128() Uint32x4
// Broadcast128 copies element zero of its (128-bit) input to all elements of
// the 128-bit output vector.
//
// Asm: VPBROADCASTQ, CPU Feature: AVX2
func (x Uint64x2) Broadcast128() Uint64x2
/* Broadcast256 */
// Broadcast256 copies element zero of its (128-bit) input to all elements of
// the 256-bit output vector.
//
// Asm: VBROADCASTSS, CPU Feature: AVX2
func (x Float32x4) Broadcast256() Float32x8
// Broadcast256 copies element zero of its (128-bit) input to all elements of
// the 256-bit output vector.
//
// Asm: VBROADCASTSD, CPU Feature: AVX2
func (x Float64x2) Broadcast1To4() Float64x4
func (x Float64x2) Broadcast256() Float64x4
// Broadcast1To4 copies the lowest element of its input to all 4 elements of
// the output vector.
// Broadcast256 copies element zero of its (128-bit) input to all elements of
// the 256-bit output vector.
//
// Asm: VPBROADCASTD, CPU Feature: AVX2
func (x Int32x4) Broadcast1To4() Int32x4
// Asm: VPBROADCASTB, CPU Feature: AVX2
func (x Int8x16) Broadcast256() Int8x32
// Broadcast1To4 copies the lowest element of its input to all 4 elements of
// the output vector.
//
// Asm: VPBROADCASTQ, CPU Feature: AVX2
func (x Int64x2) Broadcast1To4() Int64x4
// Broadcast1To4 copies the lowest element of its input to all 4 elements of
// the output vector.
//
// Asm: VPBROADCASTD, CPU Feature: AVX2
func (x Uint32x4) Broadcast1To4() Uint32x4
// Broadcast1To4 copies the lowest element of its input to all 4 elements of
// the output vector.
//
// Asm: VPBROADCASTQ, CPU Feature: AVX2
func (x Uint64x2) Broadcast1To4() Uint64x4
/* Broadcast1To8 */
// Broadcast1To8 copies the lowest element of its input to all 8 elements of
// the output vector.
//
// Asm: VBROADCASTSS, CPU Feature: AVX2
func (x Float32x4) Broadcast1To8() Float32x8
// Broadcast1To8 copies the lowest element of its input to all 8 elements of
// the output vector.
//
// Asm: VBROADCASTSD, CPU Feature: AVX512
func (x Float64x2) Broadcast1To8() Float64x8
// Broadcast1To8 copies the lowest element of its input to all 8 elements of
// the output vector.
// Broadcast256 copies element zero of its (128-bit) input to all elements of
// the 256-bit output vector.
//
// Asm: VPBROADCASTW, CPU Feature: AVX2
func (x Int16x8) Broadcast1To8() Int16x8
func (x Int16x8) Broadcast256() Int16x16
// Broadcast1To8 copies the lowest element of its input to all 8 elements of
// the output vector.
// Broadcast256 copies element zero of its (128-bit) input to all elements of
// the 256-bit output vector.
//
// Asm: VPBROADCASTD, CPU Feature: AVX2
func (x Int32x4) Broadcast1To8() Int32x8
func (x Int32x4) Broadcast256() Int32x8
// Broadcast1To8 copies the lowest element of its input to all 8 elements of
// the output vector.
// Broadcast256 copies element zero of its (128-bit) input to all elements of
// the 256-bit output vector.
//
// Asm: VPBROADCASTQ, CPU Feature: AVX512
func (x Int64x2) Broadcast1To8() Int64x8
// Asm: VPBROADCASTQ, CPU Feature: AVX2
func (x Int64x2) Broadcast256() Int64x4
// Broadcast1To8 copies the lowest element of its input to all 8 elements of
// the output vector.
// Broadcast256 copies element zero of its (128-bit) input to all elements of
// the 256-bit output vector.
//
// Asm: VPBROADCASTB, CPU Feature: AVX2
func (x Uint8x16) Broadcast256() Uint8x32
// Broadcast256 copies element zero of its (128-bit) input to all elements of
// the 256-bit output vector.
//
// Asm: VPBROADCASTW, CPU Feature: AVX2
func (x Uint16x8) Broadcast1To8() Uint16x8
func (x Uint16x8) Broadcast256() Uint16x16
// Broadcast1To8 copies the lowest element of its input to all 8 elements of
// the output vector.
// Broadcast256 copies element zero of its (128-bit) input to all elements of
// the 256-bit output vector.
//
// Asm: VPBROADCASTD, CPU Feature: AVX2
func (x Uint32x4) Broadcast1To8() Uint32x8
func (x Uint32x4) Broadcast256() Uint32x8
// Broadcast1To8 copies the lowest element of its input to all 8 elements of
// the output vector.
// Broadcast256 copies element zero of its (128-bit) input to all elements of
// the 256-bit output vector.
//
// Asm: VPBROADCASTQ, CPU Feature: AVX512
func (x Uint64x2) Broadcast1To8() Uint64x8
// Asm: VPBROADCASTQ, CPU Feature: AVX2
func (x Uint64x2) Broadcast256() Uint64x4
/* Broadcast1To16 */
/* Broadcast512 */
// Broadcast1To16 copies the lowest element of its input to all 16 elements of
// the output vector.
// Broadcast512 copies element zero of its (128-bit) input to all elements of
// the 512-bit output vector.
//
// Asm: VBROADCASTSS, CPU Feature: AVX512
func (x Float32x4) Broadcast1To16() Float32x16
func (x Float32x4) Broadcast512() Float32x16
// Broadcast1To16 copies the lowest element of its input to all 16 elements of
// the output vector.
// Broadcast512 copies element zero of its (128-bit) input to all elements of
// the 512-bit output vector.
//
// Asm: VPBROADCASTB, CPU Feature: AVX2
func (x Int8x16) Broadcast1To16() Int8x16
// Asm: VBROADCASTSD, CPU Feature: AVX512
func (x Float64x2) Broadcast512() Float64x8
// Broadcast1To16 copies the lowest element of its input to all 16 elements of
// the output vector.
//
// Asm: VPBROADCASTW, CPU Feature: AVX2
func (x Int16x8) Broadcast1To16() Int16x16
// Broadcast1To16 copies the lowest element of its input to all 16 elements of
// the output vector.
//
// Asm: VPBROADCASTD, CPU Feature: AVX512
func (x Int32x4) Broadcast1To16() Int32x16
// Broadcast1To16 copies the lowest element of its input to all 16 elements of
// the output vector.
//
// Asm: VPBROADCASTB, CPU Feature: AVX2
func (x Uint8x16) Broadcast1To16() Uint8x16
// Broadcast1To16 copies the lowest element of its input to all 16 elements of
// the output vector.
//
// Asm: VPBROADCASTW, CPU Feature: AVX2
func (x Uint16x8) Broadcast1To16() Uint16x16
// Broadcast1To16 copies the lowest element of its input to all 16 elements of
// the output vector.
//
// Asm: VPBROADCASTD, CPU Feature: AVX512
func (x Uint32x4) Broadcast1To16() Uint32x16
/* Broadcast1To32 */
// Broadcast1To32 copies the lowest element of its input to all 32 elements of
// the output vector.
//
// Asm: VPBROADCASTB, CPU Feature: AVX2
func (x Int8x16) Broadcast1To32() Int8x32
// Broadcast1To32 copies the lowest element of its input to all 32 elements of
// the output vector.
//
// Asm: VPBROADCASTW, CPU Feature: AVX512
func (x Int16x8) Broadcast1To32() Int16x32
// Broadcast1To32 copies the lowest element of its input to all 32 elements of
// the output vector.
//
// Asm: VPBROADCASTB, CPU Feature: AVX2
func (x Uint8x16) Broadcast1To32() Uint8x32
// Broadcast1To32 copies the lowest element of its input to all 32 elements of
// the output vector.
//
// Asm: VPBROADCASTW, CPU Feature: AVX512
func (x Uint16x8) Broadcast1To32() Uint16x32
/* Broadcast1To64 */
// Broadcast1To64 copies the lowest element of its input to all 64 elements of
// the output vector.
// Broadcast512 copies element zero of its (128-bit) input to all elements of
// the 512-bit output vector.
//
// Asm: VPBROADCASTB, CPU Feature: AVX512
func (x Int8x16) Broadcast1To64() Int8x64
func (x Int8x16) Broadcast512() Int8x64
// Broadcast1To64 copies the lowest element of its input to all 64 elements of
// the output vector.
// Broadcast512 copies element zero of its (128-bit) input to all elements of
// the 512-bit output vector.
//
// Asm: VPBROADCASTW, CPU Feature: AVX512
func (x Int16x8) Broadcast512() Int16x32
// Broadcast512 copies element zero of its (128-bit) input to all elements of
// the 512-bit output vector.
//
// Asm: VPBROADCASTD, CPU Feature: AVX512
func (x Int32x4) Broadcast512() Int32x16
// Broadcast512 copies element zero of its (128-bit) input to all elements of
// the 512-bit output vector.
//
// Asm: VPBROADCASTQ, CPU Feature: AVX512
func (x Int64x2) Broadcast512() Int64x8
// Broadcast512 copies element zero of its (128-bit) input to all elements of
// the 512-bit output vector.
//
// Asm: VPBROADCASTB, CPU Feature: AVX512
func (x Uint8x16) Broadcast1To64() Uint8x64
func (x Uint8x16) Broadcast512() Uint8x64
// Broadcast512 copies element zero of its (128-bit) input to all elements of
// the 512-bit output vector.
//
// Asm: VPBROADCASTW, CPU Feature: AVX512
func (x Uint16x8) Broadcast512() Uint16x32
// Broadcast512 copies element zero of its (128-bit) input to all elements of
// the 512-bit output vector.
//
// Asm: VPBROADCASTD, CPU Feature: AVX512
func (x Uint32x4) Broadcast512() Uint32x16
// Broadcast512 copies element zero of its (128-bit) input to all elements of
// the 512-bit output vector.
//
// Asm: VPBROADCASTQ, CPU Feature: AVX512
func (x Uint64x2) Broadcast512() Uint64x8
/* Ceil */
@@ -4088,12 +4082,12 @@ func (x Uint64x8) Mul(y Uint64x8) Uint64x8
// MulAdd performs a fused (x * y) + z.
//
// Asm: VFMADD213PS, CPU Feature: AVX512
// Asm: VFMADD213PS, CPU Feature: FMA
func (x Float32x4) MulAdd(y Float32x4, z Float32x4) Float32x4
// MulAdd performs a fused (x * y) + z.
//
// Asm: VFMADD213PS, CPU Feature: AVX512
// Asm: VFMADD213PS, CPU Feature: FMA
func (x Float32x8) MulAdd(y Float32x8, z Float32x8) Float32x8
// MulAdd performs a fused (x * y) + z.
@@ -4103,12 +4097,12 @@ func (x Float32x16) MulAdd(y Float32x16, z Float32x16) Float32x16
// MulAdd performs a fused (x * y) + z.
//
// Asm: VFMADD213PD, CPU Feature: AVX512
// Asm: VFMADD213PD, CPU Feature: FMA
func (x Float64x2) MulAdd(y Float64x2, z Float64x2) Float64x2
// MulAdd performs a fused (x * y) + z.
//
// Asm: VFMADD213PD, CPU Feature: AVX512
// Asm: VFMADD213PD, CPU Feature: FMA
func (x Float64x4) MulAdd(y Float64x4, z Float64x4) Float64x4
// MulAdd performs a fused (x * y) + z.
@@ -4120,12 +4114,12 @@ func (x Float64x8) MulAdd(y Float64x8, z Float64x8) Float64x8
// MulAddSub performs a fused (x * y) - z for odd-indexed elements, and (x * y) + z for even-indexed elements.
//
// Asm: VFMADDSUB213PS, CPU Feature: AVX512
// Asm: VFMADDSUB213PS, CPU Feature: FMA
func (x Float32x4) MulAddSub(y Float32x4, z Float32x4) Float32x4
// MulAddSub performs a fused (x * y) - z for odd-indexed elements, and (x * y) + z for even-indexed elements.
//
// Asm: VFMADDSUB213PS, CPU Feature: AVX512
// Asm: VFMADDSUB213PS, CPU Feature: FMA
func (x Float32x8) MulAddSub(y Float32x8, z Float32x8) Float32x8
// MulAddSub performs a fused (x * y) - z for odd-indexed elements, and (x * y) + z for even-indexed elements.
@@ -4135,12 +4129,12 @@ func (x Float32x16) MulAddSub(y Float32x16, z Float32x16) Float32x16
// MulAddSub performs a fused (x * y) - z for odd-indexed elements, and (x * y) + z for even-indexed elements.
//
// Asm: VFMADDSUB213PD, CPU Feature: AVX512
// Asm: VFMADDSUB213PD, CPU Feature: FMA
func (x Float64x2) MulAddSub(y Float64x2, z Float64x2) Float64x2
// MulAddSub performs a fused (x * y) - z for odd-indexed elements, and (x * y) + z for even-indexed elements.
//
// Asm: VFMADDSUB213PD, CPU Feature: AVX512
// Asm: VFMADDSUB213PD, CPU Feature: FMA
func (x Float64x4) MulAddSub(y Float64x4, z Float64x4) Float64x4
// MulAddSub performs a fused (x * y) - z for odd-indexed elements, and (x * y) + z for even-indexed elements.
@@ -4210,12 +4204,12 @@ func (x Uint16x32) MulHigh(y Uint16x32) Uint16x32
// MulSubAdd performs a fused (x * y) + z for odd-indexed elements, and (x * y) - z for even-indexed elements.
//
// Asm: VFMSUBADD213PS, CPU Feature: AVX512
// Asm: VFMSUBADD213PS, CPU Feature: FMA
func (x Float32x4) MulSubAdd(y Float32x4, z Float32x4) Float32x4
// MulSubAdd performs a fused (x * y) + z for odd-indexed elements, and (x * y) - z for even-indexed elements.
//
// Asm: VFMSUBADD213PS, CPU Feature: AVX512
// Asm: VFMSUBADD213PS, CPU Feature: FMA
func (x Float32x8) MulSubAdd(y Float32x8, z Float32x8) Float32x8
// MulSubAdd performs a fused (x * y) + z for odd-indexed elements, and (x * y) - z for even-indexed elements.
@@ -4225,12 +4219,12 @@ func (x Float32x16) MulSubAdd(y Float32x16, z Float32x16) Float32x16
// MulSubAdd performs a fused (x * y) + z for odd-indexed elements, and (x * y) - z for even-indexed elements.
//
// Asm: VFMSUBADD213PD, CPU Feature: AVX512
// Asm: VFMSUBADD213PD, CPU Feature: FMA
func (x Float64x2) MulSubAdd(y Float64x2, z Float64x2) Float64x2
// MulSubAdd performs a fused (x * y) + z for odd-indexed elements, and (x * y) - z for even-indexed elements.
//
// Asm: VFMSUBADD213PD, CPU Feature: AVX512
// Asm: VFMSUBADD213PD, CPU Feature: FMA
func (x Float64x4) MulSubAdd(y Float64x4, z Float64x4) Float64x4
// MulSubAdd performs a fused (x * y) + z for odd-indexed elements, and (x * y) - z for even-indexed elements.
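
The doc comments in this hunk describe three lane-wise fused operations. A minimal scalar sketch of those semantics follows, assuming 4-lane float64 vectors modeled as plain arrays. The helper names `mulAdd`, `mulAddSub`, and `mulSubAdd` are hypothetical and are not part of archsimd. The odd/even lane assignment follows the wording of the comments above and has not been checked against the instruction definitions.

```go
package main

import (
	"fmt"
	"math"
)

// mulAdd models MulAdd: every lane is a fused (x*y)+z with a single
// rounding, which is what math.FMA computes for scalars.
func mulAdd(x, y, z [4]float64) [4]float64 {
	var out [4]float64
	for i := range out {
		out[i] = math.FMA(x[i], y[i], z[i])
	}
	return out
}

// mulAddSub models MulAddSub as documented above:
// (x*y)-z in odd-indexed lanes, (x*y)+z in even-indexed lanes.
func mulAddSub(x, y, z [4]float64) [4]float64 {
	var out [4]float64
	for i := range out {
		if i%2 == 1 {
			out[i] = math.FMA(x[i], y[i], -z[i])
		} else {
			out[i] = math.FMA(x[i], y[i], z[i])
		}
	}
	return out
}

// mulSubAdd models MulSubAdd as documented above:
// (x*y)+z in odd-indexed lanes, (x*y)-z in even-indexed lanes.
func mulSubAdd(x, y, z [4]float64) [4]float64 {
	var out [4]float64
	for i := range out {
		if i%2 == 1 {
			out[i] = math.FMA(x[i], y[i], z[i])
		} else {
			out[i] = math.FMA(x[i], y[i], -z[i])
		}
	}
	return out
}

func main() {
	x := [4]float64{1, 2, 3, 4}
	y := [4]float64{5, 6, 7, 8}
	z := [4]float64{1, 1, 1, 1}
	fmt.Println(mulAdd(x, y, z))
	fmt.Println(mulAddSub(x, y, z))
	fmt.Println(mulSubAdd(x, y, z))
}
```

Negating `z` before the fused operation is exact, so `math.FMA(x, y, -z)` is a faithful scalar stand-in for a fused multiply-subtract.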


@@ -10,7 +10,7 @@ package archsimd
// Emulated, CPU Feature: AVX2
func BroadcastInt8x16(x int8) Int8x16 {
var z Int8x16
return z.SetElem(0, x).Broadcast1To16()
return z.SetElem(0, x).Broadcast128()
}
// BroadcastInt16x8 returns a vector with the input
@@ -19,7 +19,7 @@ func BroadcastInt8x16(x int8) Int8x16 {
// Emulated, CPU Feature: AVX2
func BroadcastInt16x8(x int16) Int16x8 {
var z Int16x8
return z.SetElem(0, x).Broadcast1To8()
return z.SetElem(0, x).Broadcast128()
}
// BroadcastInt32x4 returns a vector with the input
@@ -28,7 +28,7 @@ func BroadcastInt16x8(x int16) Int16x8 {
// Emulated, CPU Feature: AVX2
func BroadcastInt32x4(x int32) Int32x4 {
var z Int32x4
return z.SetElem(0, x).Broadcast1To4()
return z.SetElem(0, x).Broadcast128()
}
// BroadcastInt64x2 returns a vector with the input
@@ -37,7 +37,7 @@ func BroadcastInt32x4(x int32) Int32x4 {
// Emulated, CPU Feature: AVX2
func BroadcastInt64x2(x int64) Int64x2 {
var z Int64x2
return z.SetElem(0, x).Broadcast1To2()
return z.SetElem(0, x).Broadcast128()
}
// BroadcastUint8x16 returns a vector with the input
@@ -46,7 +46,7 @@ func BroadcastInt64x2(x int64) Int64x2 {
// Emulated, CPU Feature: AVX2
func BroadcastUint8x16(x uint8) Uint8x16 {
var z Uint8x16
return z.SetElem(0, x).Broadcast1To16()
return z.SetElem(0, x).Broadcast128()
}
// BroadcastUint16x8 returns a vector with the input
@@ -55,7 +55,7 @@ func BroadcastUint8x16(x uint8) Uint8x16 {
// Emulated, CPU Feature: AVX2
func BroadcastUint16x8(x uint16) Uint16x8 {
var z Uint16x8
return z.SetElem(0, x).Broadcast1To8()
return z.SetElem(0, x).Broadcast128()
}
// BroadcastUint32x4 returns a vector with the input
@@ -64,7 +64,7 @@ func BroadcastUint16x8(x uint16) Uint16x8 {
// Emulated, CPU Feature: AVX2
func BroadcastUint32x4(x uint32) Uint32x4 {
var z Uint32x4
return z.SetElem(0, x).Broadcast1To4()
return z.SetElem(0, x).Broadcast128()
}
// BroadcastUint64x2 returns a vector with the input
@@ -73,7 +73,7 @@ func BroadcastUint32x4(x uint32) Uint32x4 {
// Emulated, CPU Feature: AVX2
func BroadcastUint64x2(x uint64) Uint64x2 {
var z Uint64x2
return z.SetElem(0, x).Broadcast1To2()
return z.SetElem(0, x).Broadcast128()
}
// BroadcastFloat32x4 returns a vector with the input
@@ -82,7 +82,7 @@ func BroadcastUint64x2(x uint64) Uint64x2 {
// Emulated, CPU Feature: AVX2
func BroadcastFloat32x4(x float32) Float32x4 {
var z Float32x4
return z.SetElem(0, x).Broadcast1To4()
return z.SetElem(0, x).Broadcast128()
}
// BroadcastFloat64x2 returns a vector with the input
@@ -91,7 +91,7 @@ func BroadcastFloat32x4(x float32) Float32x4 {
// Emulated, CPU Feature: AVX2
func BroadcastFloat64x2(x float64) Float64x2 {
var z Float64x2
return z.SetElem(0, x).Broadcast1To2()
return z.SetElem(0, x).Broadcast128()
}
// BroadcastInt8x32 returns a vector with the input
@@ -100,7 +100,7 @@ func BroadcastFloat64x2(x float64) Float64x2 {
// Emulated, CPU Feature: AVX2
func BroadcastInt8x32(x int8) Int8x32 {
var z Int8x16
return z.SetElem(0, x).Broadcast1To32()
return z.SetElem(0, x).Broadcast256()
}
// BroadcastInt16x16 returns a vector with the input
@@ -109,7 +109,7 @@ func BroadcastInt8x32(x int8) Int8x32 {
// Emulated, CPU Feature: AVX2
func BroadcastInt16x16(x int16) Int16x16 {
var z Int16x8
return z.SetElem(0, x).Broadcast1To16()
return z.SetElem(0, x).Broadcast256()
}
// BroadcastInt32x8 returns a vector with the input
@@ -118,7 +118,7 @@ func BroadcastInt16x16(x int16) Int16x16 {
// Emulated, CPU Feature: AVX2
func BroadcastInt32x8(x int32) Int32x8 {
var z Int32x4
return z.SetElem(0, x).Broadcast1To8()
return z.SetElem(0, x).Broadcast256()
}
// BroadcastInt64x4 returns a vector with the input
@@ -127,7 +127,7 @@ func BroadcastInt32x8(x int32) Int32x8 {
// Emulated, CPU Feature: AVX2
func BroadcastInt64x4(x int64) Int64x4 {
var z Int64x2
return z.SetElem(0, x).Broadcast1To4()
return z.SetElem(0, x).Broadcast256()
}
// BroadcastUint8x32 returns a vector with the input
@@ -136,7 +136,7 @@ func BroadcastInt64x4(x int64) Int64x4 {
// Emulated, CPU Feature: AVX2
func BroadcastUint8x32(x uint8) Uint8x32 {
var z Uint8x16
return z.SetElem(0, x).Broadcast1To32()
return z.SetElem(0, x).Broadcast256()
}
// BroadcastUint16x16 returns a vector with the input
@@ -145,7 +145,7 @@ func BroadcastUint8x32(x uint8) Uint8x32 {
// Emulated, CPU Feature: AVX2
func BroadcastUint16x16(x uint16) Uint16x16 {
var z Uint16x8
return z.SetElem(0, x).Broadcast1To16()
return z.SetElem(0, x).Broadcast256()
}
// BroadcastUint32x8 returns a vector with the input
@@ -154,7 +154,7 @@ func BroadcastUint16x16(x uint16) Uint16x16 {
// Emulated, CPU Feature: AVX2
func BroadcastUint32x8(x uint32) Uint32x8 {
var z Uint32x4
return z.SetElem(0, x).Broadcast1To8()
return z.SetElem(0, x).Broadcast256()
}
// BroadcastUint64x4 returns a vector with the input
@@ -163,7 +163,7 @@ func BroadcastUint32x8(x uint32) Uint32x8 {
// Emulated, CPU Feature: AVX2
func BroadcastUint64x4(x uint64) Uint64x4 {
var z Uint64x2
return z.SetElem(0, x).Broadcast1To4()
return z.SetElem(0, x).Broadcast256()
}
// BroadcastFloat32x8 returns a vector with the input
@@ -172,7 +172,7 @@ func BroadcastUint64x4(x uint64) Uint64x4 {
// Emulated, CPU Feature: AVX2
func BroadcastFloat32x8(x float32) Float32x8 {
var z Float32x4
return z.SetElem(0, x).Broadcast1To8()
return z.SetElem(0, x).Broadcast256()
}
// BroadcastFloat64x4 returns a vector with the input
@@ -181,7 +181,7 @@ func BroadcastFloat32x8(x float32) Float32x8 {
// Emulated, CPU Feature: AVX2
func BroadcastFloat64x4(x float64) Float64x4 {
var z Float64x2
return z.SetElem(0, x).Broadcast1To4()
return z.SetElem(0, x).Broadcast256()
}
// BroadcastInt8x64 returns a vector with the input
@@ -190,7 +190,7 @@ func BroadcastFloat64x4(x float64) Float64x4 {
// Emulated, CPU Feature: AVX512BW
func BroadcastInt8x64(x int8) Int8x64 {
var z Int8x16
return z.SetElem(0, x).Broadcast1To64()
return z.SetElem(0, x).Broadcast512()
}
// BroadcastInt16x32 returns a vector with the input
@@ -199,7 +199,7 @@ func BroadcastInt8x64(x int8) Int8x64 {
// Emulated, CPU Feature: AVX512BW
func BroadcastInt16x32(x int16) Int16x32 {
var z Int16x8
return z.SetElem(0, x).Broadcast1To32()
return z.SetElem(0, x).Broadcast512()
}
// BroadcastInt32x16 returns a vector with the input
@@ -208,7 +208,7 @@ func BroadcastInt16x32(x int16) Int16x32 {
// Emulated, CPU Feature: AVX512F
func BroadcastInt32x16(x int32) Int32x16 {
var z Int32x4
return z.SetElem(0, x).Broadcast1To16()
return z.SetElem(0, x).Broadcast512()
}
// BroadcastInt64x8 returns a vector with the input
@@ -217,7 +217,7 @@ func BroadcastInt32x16(x int32) Int32x16 {
// Emulated, CPU Feature: AVX512F
func BroadcastInt64x8(x int64) Int64x8 {
var z Int64x2
return z.SetElem(0, x).Broadcast1To8()
return z.SetElem(0, x).Broadcast512()
}
// BroadcastUint8x64 returns a vector with the input
@@ -226,7 +226,7 @@ func BroadcastInt64x8(x int64) Int64x8 {
// Emulated, CPU Feature: AVX512BW
func BroadcastUint8x64(x uint8) Uint8x64 {
var z Uint8x16
return z.SetElem(0, x).Broadcast1To64()
return z.SetElem(0, x).Broadcast512()
}
// BroadcastUint16x32 returns a vector with the input
@@ -235,7 +235,7 @@ func BroadcastUint8x64(x uint8) Uint8x64 {
// Emulated, CPU Feature: AVX512BW
func BroadcastUint16x32(x uint16) Uint16x32 {
var z Uint16x8
return z.SetElem(0, x).Broadcast1To32()
return z.SetElem(0, x).Broadcast512()
}
// BroadcastUint32x16 returns a vector with the input
@@ -244,7 +244,7 @@ func BroadcastUint16x32(x uint16) Uint16x32 {
// Emulated, CPU Feature: AVX512F
func BroadcastUint32x16(x uint32) Uint32x16 {
var z Uint32x4
return z.SetElem(0, x).Broadcast1To16()
return z.SetElem(0, x).Broadcast512()
}
// BroadcastUint64x8 returns a vector with the input
@@ -253,7 +253,7 @@ func BroadcastUint32x16(x uint32) Uint32x16 {
// Emulated, CPU Feature: AVX512F
func BroadcastUint64x8(x uint64) Uint64x8 {
var z Uint64x2
return z.SetElem(0, x).Broadcast1To8()
return z.SetElem(0, x).Broadcast512()
}
// BroadcastFloat32x16 returns a vector with the input
@@ -262,7 +262,7 @@ func BroadcastUint64x8(x uint64) Uint64x8 {
// Emulated, CPU Feature: AVX512F
func BroadcastFloat32x16(x float32) Float32x16 {
var z Float32x4
return z.SetElem(0, x).Broadcast1To16()
return z.SetElem(0, x).Broadcast512()
}
// BroadcastFloat64x8 returns a vector with the input
@@ -271,7 +271,7 @@ func BroadcastFloat32x16(x float32) Float32x16 {
// Emulated, CPU Feature: AVX512F
func BroadcastFloat64x8(x float64) Float64x8 {
var z Float64x2
return z.SetElem(0, x).Broadcast1To8()
return z.SetElem(0, x).Broadcast512()
}
// ToMask converts from Int8x16 to Mask8x16, mask element is set to true when the corresponding vector element is non-zero.
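
Each emulated Broadcast function in this file follows the same pattern: place the scalar in element zero of a zeroed 128-bit vector, then widen with Broadcast128, Broadcast256, or Broadcast512. The sketch below models that pattern for the Int32x8 case using plain arrays. The helpers `setElem0`, `broadcast256`, and `broadcastInt32x8` are hypothetical stand-ins for illustration, not the archsimd methods themselves.

```go
package main

import "fmt"

// setElem0 models z.SetElem(0, x) on a 128-bit vector of four int32 lanes.
// Arrays are passed by value, so the caller's z is left unchanged.
func setElem0(z [4]int32, x int32) [4]int32 {
	z[0] = x
	return z
}

// broadcast256 models Broadcast256: element zero of the 128-bit input is
// copied to every lane of the 256-bit output.
func broadcast256(v [4]int32) [8]int32 {
	var out [8]int32
	for i := range out {
		out[i] = v[0]
	}
	return out
}

// broadcastInt32x8 mirrors the shape of BroadcastInt32x8 in the diff above.
func broadcastInt32x8(x int32) [8]int32 {
	var z [4]int32
	return broadcast256(setElem0(z, x))
}

func main() {
	fmt.Println(broadcastInt32x8(7))
}
```

Naming the method after the output width rather than the lane count lets every element type share one method name per width, which is why the old `Broadcast1To4`, `Broadcast1To8`, and similar calls all collapse to `Broadcast256` in the 256-bit functions.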


@@ -308,7 +308,7 @@ func Mask8x16FromBits(y uint16) Mask8x16
// ToBits constructs a bitmap from a Mask8x16, where 1 means set for the indexed element, 0 means unset.
//
// Asm: VPMOVMSKB, CPU Features: AVX
// Asm: KMOVB, CPU Features: AVX512
func (x Mask8x16) ToBits() uint16
// Mask16x8 is a mask for a SIMD vector of 8 16-bit elements.
@@ -342,7 +342,7 @@ func Mask32x4FromBits(y uint8) Mask32x4
// ToBits constructs a bitmap from a Mask32x4, where 1 means set for the indexed element, 0 means unset.
// Only the lower 4 bits of y are used.
//
// Asm: VMOVMSKPS, CPU Features: AVX
// Asm: KMOVD, CPU Features: AVX512
func (x Mask32x4) ToBits() uint8
// Mask64x2 is a mask for a SIMD vector of 2 64-bit elements.
@@ -360,7 +360,7 @@ func Mask64x2FromBits(y uint8) Mask64x2
// ToBits constructs a bitmap from a Mask64x2, where 1 means set for the indexed element, 0 means unset.
// Only the lower 2 bits of y are used.
//
// Asm: VMOVMSKPD, CPU Features: AVX
// Asm: KMOVQ, CPU Features: AVX512
func (x Mask64x2) ToBits() uint8
// v256 is a tag type that tells the compiler that this is really 256-bit SIMD
@@ -667,7 +667,7 @@ func Mask8x32FromBits(y uint32) Mask8x32
// ToBits constructs a bitmap from a Mask8x32, where 1 means set for the indexed element, 0 means unset.
//
// Asm: VPMOVMSKB, CPU Features: AVX2
// Asm: KMOVB, CPU Features: AVX512
func (x Mask8x32) ToBits() uint32
// Mask16x16 is a mask for a SIMD vector of 16 16-bit elements.
@@ -699,7 +699,7 @@ func Mask32x8FromBits(y uint8) Mask32x8
// ToBits constructs a bitmap from a Mask32x8, where 1 means set for the indexed element, 0 means unset.
//
// Asm: VMOVMSKPS, CPU Features: AVX
// Asm: KMOVD, CPU Features: AVX512
func (x Mask32x8) ToBits() uint8
// Mask64x4 is a mask for a SIMD vector of 4 64-bit elements.
@@ -717,7 +717,7 @@ func Mask64x4FromBits(y uint8) Mask64x4
// ToBits constructs a bitmap from a Mask64x4, where 1 means set for the indexed element, 0 means unset.
// Only the lower 4 bits of y are used.
//
// Asm: VMOVMSKPD, CPU Features: AVX
// Asm: KMOVQ, CPU Features: AVX512
func (x Mask64x4) ToBits() uint8
// v512 is a tag type that tells the compiler that this is really 512-bit SIMD
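
The ToBits comments above describe the result as a bitmap where bit i is 1 when element i of the mask is set. A minimal scalar sketch of that mapping for an 8-lane mask, with the mask modeled as a `[8]bool` array; the function name `toBits` is hypothetical and says nothing about which instruction the compiler selects.

```go
package main

import "fmt"

// toBits models ToBits for an 8-lane mask: bit i of the result is 1 when
// lane i of the mask is set, and 0 otherwise.
func toBits(mask [8]bool) uint8 {
	var b uint8
	for i, set := range mask {
		if set {
			b |= 1 << uint(i)
		}
	}
	return b
}

func main() {
	m := [8]bool{true, false, true, false, false, false, false, true}
	fmt.Printf("%08b\n", toBits(m))
}
```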


@@ -13,20 +13,17 @@ import "math/bits"
//
func bitsCheckConstLeftShiftU64(a uint64) (n int) {
// amd64:"BTQ [$]63,"
// arm64:"TBNZ [$]63,"
// amd64:"BTQ [$]63"
// riscv64:"MOV [$]" "AND" "BNEZ"
if a&(1<<63) != 0 {
return 1
}
// amd64:"BTQ [$]60,"
// arm64:"TBNZ [$]60,"
// amd64:"BTQ [$]60"
// riscv64:"MOV [$]" "AND" "BNEZ"
if a&(1<<60) != 0 {
return 1
}
// amd64:"BTL [$]0,"
// arm64:"TBZ [$]0,"
// amd64:"BTL [$]0"
// riscv64:"ANDI" "BEQZ"
if a&(1<<0) != 0 {
return 1
@@ -35,44 +32,37 @@ func bitsCheckConstLeftShiftU64(a uint64) (n int) {
}
func bitsCheckConstRightShiftU64(a [8]uint64) (n int) {
// amd64:"BTQ [$]63,"
// arm64:"LSR [$]63," "TBNZ [$]0,"
// amd64:"BTQ [$]63"
// riscv64:"SRLI" "ANDI" "BNEZ"
if (a[0]>>63)&1 != 0 {
return 1
}
// amd64:"BTQ [$]63,"
// arm64:"LSR [$]63," "CBNZ"
// amd64:"BTQ [$]63"
// riscv64:"SRLI" "BNEZ"
if a[1]>>63 != 0 {
return 1
}
// amd64:"BTQ [$]63,"
// arm64:"LSR [$]63," "CBZ"
// amd64:"BTQ [$]63"
// riscv64:"SRLI" "BEQZ"
if a[2]>>63 == 0 {
return 1
}
// amd64:"BTQ [$]60,"
// arm64:"LSR [$]60," "TBZ [$]0,"
// amd64:"BTQ [$]60"
// riscv64:"SRLI", "ANDI" "BEQZ"
if (a[3]>>60)&1 == 0 {
return 1
}
// amd64:"BTL [$]1,"
// arm64:"LSR [$]1," "TBZ [$]0,"
// amd64:"BTL [$]1"
// riscv64:"SRLI" "ANDI" "BEQZ"
if (a[4]>>1)&1 == 0 {
return 1
}
// amd64:"BTL [$]0,"
// arm64:"TBZ [$]0," -"LSR"
// amd64:"BTL [$]0"
// riscv64:"ANDI" "BEQZ" -"SRLI"
if (a[5]>>0)&1 == 0 {
return 1
}
// amd64:"BTL [$]7,"
// arm64:"LSR [$]5," "TBNZ [$]2,"
// amd64:"BTL [$]7"
// riscv64:"SRLI" "ANDI" "BNEZ"
if (a[6]>>5)&4 == 0 {
return 1
@@ -82,14 +72,12 @@ func bitsCheckConstRightShiftU64(a [8]uint64) (n int) {
func bitsCheckVarU64(a, b uint64) (n int) {
// amd64:"BTQ"
// arm64:"MOVD [$]1," "LSL" "TST"
// riscv64:"ANDI [$]63," "SLL " "AND "
if a&(1<<(b&63)) != 0 {
return 1
}
// amd64:"BTQ" -"BT. [$]0,"
// arm64:"LSR" "TBZ [$]0,"
// riscv64:"ANDI [$]63," "SRL" "ANDI [$]1,"
// amd64:"BTQ" -"BT. [$]0"
// riscv64:"ANDI [$]63," "SRL " "ANDI [$]1,"
if (b>>(a&63))&1 != 0 {
return 1
}
@@ -97,20 +85,17 @@ func bitsCheckVarU64(a, b uint64) (n int) {
}
func bitsCheckMaskU64(a uint64) (n int) {
// amd64:"BTQ [$]63,"
// arm64:"TBNZ [$]63,"
// amd64:"BTQ [$]63"
// riscv64:"MOV [$]" "AND" "BNEZ"
if a&0x8000000000000000 != 0 {
return 1
}
// amd64:"BTQ [$]59,"
// arm64:"TBNZ [$]59,"
// amd64:"BTQ [$]59"
// riscv64:"MOV [$]" "AND" "BNEZ"
if a&0x800000000000000 != 0 {
return 1
}
// amd64:"BTL [$]0,"
// arm64:"TBZ [$]0,"
// amd64:"BTL [$]0"
// riscv64:"ANDI" "BEQZ"
if a&0x1 != 0 {
return 1
@@ -120,22 +105,18 @@ func bitsCheckMaskU64(a uint64) (n int) {
func bitsSetU64(a, b uint64) (n uint64) {
// amd64:"BTSQ"
// arm64:"MOVD [$]1," "LSL" "ORR"
// riscv64:"ANDI" "SLL" "OR"
n += b | (1 << (a & 63))
// amd64:"BTSQ [$]63,"
// arm64:"ORR [$]-9223372036854775808,"
// amd64:"BTSQ [$]63"
// riscv64:"MOV [$]" "OR "
n += a | (1 << 63)
// amd64:"BTSQ [$]60,"
// arm64:"ORR [$]1152921504606846976,"
// amd64:"BTSQ [$]60"
// riscv64:"MOV [$]" "OR "
n += a | (1 << 60)
// amd64:"ORQ [$]1,"
// arm64:"ORR [$]1,"
// amd64:"ORQ [$]1"
// riscv64:"ORI"
n += a | (1 << 0)
@@ -144,22 +125,18 @@ func bitsSetU64(a, b uint64) (n uint64) {
func bitsClearU64(a, b uint64) (n uint64) {
// amd64:"BTRQ"
// arm64:"MOVD [$]1," "LSL" "BIC"
// riscv64:"ANDI" "SLL" "ANDN"
n += b &^ (1 << (a & 63))
// amd64:"BTRQ [$]63,"
// arm64:"AND [$]9223372036854775807,"
// amd64:"BTRQ [$]63"
// riscv64:"MOV [$]" "AND "
n += a &^ (1 << 63)
// amd64:"BTRQ [$]60,"
// arm64:"AND [$]-1152921504606846977,"
// amd64:"BTRQ [$]60"
// riscv64:"MOV [$]" "AND "
n += a &^ (1 << 60)
// amd64:"ANDQ [$]-2"
// arm64:"AND [$]-2"
// riscv64:"ANDI [$]-2"
n += a &^ (1 << 0)
@@ -167,14 +144,12 @@ func bitsClearU64(a, b uint64) (n uint64) {
}
func bitsClearLowest(x int64, y int32) (int64, int32) {
// amd64:"ANDQ [$]-2,"
// arm64:"AND [$]-2,"
// riscv64:"ANDI [$]-2,"
// amd64:"ANDQ [$]-2"
// riscv64:"ANDI [$]-2"
a := (x >> 1) << 1
// amd64:"ANDL [$]-2,"
// arm64:"AND [$]-2,"
// riscv64:"ANDI [$]-2,"
// amd64:"ANDL [$]-2"
// riscv64:"ANDI [$]-2"
b := (y >> 1) << 1
return a, b
@@ -182,23 +157,19 @@ func bitsClearLowest(x int64, y int32) (int64, int32) {
func bitsFlipU64(a, b uint64) (n uint64) {
// amd64:"BTCQ"
// arm64:"MOVD [$]1," "LSL" "EOR"
// riscv64:"ANDI" "SLL" "XOR "
n += b ^ (1 << (a & 63))
// amd64:"BTCQ [$]63,"
// arm64:"EOR [$]-9223372036854775808,"
// amd64:"BTCQ [$]63"
// riscv64:"MOV [$]" "XOR "
n += a ^ (1 << 63)
// amd64:"BTCQ [$]60,"
// arm64:"EOR [$]1152921504606846976,"
// amd64:"BTCQ [$]60"
// riscv64:"MOV [$]" "XOR "
n += a ^ (1 << 60)
// amd64:"XORQ [$]1,"
// arm64:"EOR [$]1,"
// riscv64:"XORI [$]1,"
// amd64:"XORQ [$]1"
// riscv64:"XORI [$]1"
n += a ^ (1 << 0)
return n
@@ -209,20 +180,17 @@ func bitsFlipU64(a, b uint64) (n uint64) {
//
func bitsCheckConstShiftLeftU32(a uint32) (n int) {
// amd64:"BTL [$]31,"
// arm64:"TBNZ [$]31,"
// amd64:"BTL [$]31"
// riscv64:"MOV [$]" "AND" "BNEZ"
if a&(1<<31) != 0 {
return 1
}
// amd64:"BTL [$]28,"
// arm64:"TBNZ [$]28,"
// amd64:"BTL [$]28"
// riscv64:"ANDI" "BNEZ"
if a&(1<<28) != 0 {
return 1
}
// amd64:"BTL [$]0,"
// arm64:"TBZ [$]0,"
// amd64:"BTL [$]0"
// riscv64:"ANDI" "BEQZ"
if a&(1<<0) != 0 {
return 1
@@ -231,44 +199,37 @@ func bitsCheckConstShiftLeftU32(a uint32) (n int) {
}
func bitsCheckConstRightShiftU32(a [8]uint32) (n int) {
// amd64:"BTL [$]31,"
// arm64:"UBFX [$]31," "CBNZW"
// amd64:"BTL [$]31"
// riscv64:"SRLI" "ANDI" "BNEZ"
if (a[0]>>31)&1 != 0 {
return 1
}
// amd64:"BTL [$]31,"
// arm64:"UBFX [$]31," "CBNZW"
// amd64:"BTL [$]31"
// riscv64:"SRLI" "BNEZ"
if a[1]>>31 != 0 {
return 1
}
// amd64:"BTL [$]31,"
// arm64:"UBFX [$]31," "CBZW"
// amd64:"BTL [$]31"
// riscv64:"SRLI" "BEQZ"
if a[2]>>31 == 0 {
return 1
}
// amd64:"BTL [$]28,"
// arm64:"UBFX [$]28," "TBZ"
// amd64:"BTL [$]28"
// riscv64:"SRLI" "ANDI" "BEQZ"
if (a[3]>>28)&1 == 0 {
return 1
}
// amd64:"BTL [$]1,"
// arm64:"UBFX [$]1," "TBZ"
// amd64:"BTL [$]1"
// riscv64:"SRLI" "ANDI" "BEQZ"
if (a[4]>>1)&1 == 0 {
return 1
}
// amd64:"BTL [$]0,"
// arm64:"TBZ" -"UBFX" -"SRL"
// riscv64:"ANDI" "BEQZ" -"SRLI "
// amd64:"BTL [$]0"
// riscv64:"ANDI" "BEQZ" -"SRLI"
if (a[5]>>0)&1 == 0 {
return 1
}
// amd64:"BTL [$]7,"
// arm64:"UBFX [$]5," "TBNZ"
// amd64:"BTL [$]7"
// riscv64:"SRLI" "ANDI" "BNEZ"
if (a[6]>>5)&4 == 0 {
return 1
@@ -278,13 +239,11 @@ func bitsCheckConstRightShiftU32(a [8]uint32) (n int) {
func bitsCheckVarU32(a, b uint32) (n int) {
// amd64:"BTL"
// arm64:"AND [$]31," "MOVD [$]1," "LSL" "TSTW"
// riscv64:"ANDI [$]31," "SLL " "AND "
if a&(1<<(b&31)) != 0 {
return 1
}
// amd64:"BTL" -"BT. [$]0"
// arm64:"AND [$]31," "LSR" "TBZ"
// riscv64:"ANDI [$]31," "SRLW " "ANDI [$]1,"
if (b>>(a&31))&1 != 0 {
return 1
@@ -293,20 +252,17 @@ func bitsCheckVarU32(a, b uint32) (n int) {
}
func bitsCheckMaskU32(a uint32) (n int) {
// amd64:"BTL [$]31,"
// arm64:"TBNZ [$]31,"
// amd64:"BTL [$]31"
// riscv64:"MOV [$]" "AND" "BNEZ"
if a&0x80000000 != 0 {
return 1
}
// amd64:"BTL [$]27,"
// arm64:"TBNZ [$]27,"
// amd64:"BTL [$]27"
// riscv64:"ANDI" "BNEZ"
if a&0x8000000 != 0 {
return 1
}
// amd64:"BTL [$]0,"
// arm64:"TBZ [$]0,"
// amd64:"BTL [$]0"
// riscv64:"ANDI" "BEQZ"
if a&0x1 != 0 {
return 1
@@ -316,23 +272,19 @@ func bitsCheckMaskU32(a uint32) (n int) {
func bitsSetU32(a, b uint32) (n uint32) {
// amd64:"BTSL"
// arm64:"AND [$]31," "MOVD [$]1," "LSL" "ORR"
// riscv64:"ANDI" "SLL" "OR"
n += b | (1 << (a & 31))
// amd64:"ORL [$]-2147483648,"
// arm64:"ORR [$]-2147483648,"
// riscv64:"ORI [$]-2147483648,"
// amd64:"ORL [$]-2147483648"
// riscv64:"ORI [$]-2147483648"
n += a | (1 << 31)
// amd64:"ORL [$]268435456,"
// arm64:"ORR [$]268435456,"
// riscv64:"ORI [$]268435456,"
// amd64:"ORL [$]268435456"
// riscv64:"ORI [$]268435456"
n += a | (1 << 28)
// amd64:"ORL [$]1,"
// arm64:"ORR [$]1,"
// riscv64:"ORI [$]1,"
// amd64:"ORL [$]1"
// riscv64:"ORI [$]1"
n += a | (1 << 0)
return n
@@ -340,23 +292,19 @@ func bitsSetU32(a, b uint32) (n uint32) {
func bitsClearU32(a, b uint32) (n uint32) {
// amd64:"BTRL"
// arm64:"AND [$]31," "MOVD [$]1," "LSL" "BIC"
// riscv64:"ANDI" "SLL" "ANDN"
n += b &^ (1 << (a & 31))
// amd64:"ANDL [$]2147483647,"
// arm64:"AND [$]2147483647,"
// riscv64:"ANDI [$]2147483647,"
// amd64:"ANDL [$]2147483647"
// riscv64:"ANDI [$]2147483647"
n += a &^ (1 << 31)
// amd64:"ANDL [$]-268435457,"
// arm64:"AND [$]-268435457,"
// riscv64:"ANDI [$]-268435457,"
// amd64:"ANDL [$]-268435457"
// riscv64:"ANDI [$]-268435457"
n += a &^ (1 << 28)
// amd64:"ANDL [$]-2,"
// arm64:"AND [$]-2,"
// riscv64:"ANDI [$]-2,"
// amd64:"ANDL [$]-2"
// riscv64:"ANDI [$]-2"
n += a &^ (1 << 0)
return n
@@ -364,23 +312,19 @@ func bitsClearU32(a, b uint32) (n uint32) {
func bitsFlipU32(a, b uint32) (n uint32) {
// amd64:"BTCL"
// arm64:"AND [$]31," "MOVD [$]1," "LSL" "EOR"
// riscv64:"ANDI" "SLL" "XOR "
n += b ^ (1 << (a & 31))
// amd64:"XORL [$]-2147483648,"
// arm64:"EOR [$]-2147483648,"
// riscv64:"XORI [$]-2147483648,"
// amd64:"XORL [$]-2147483648"
// riscv64:"XORI [$]-2147483648"
n += a ^ (1 << 31)
// amd64:"XORL [$]268435456,"
// arm64:"EOR [$]268435456,"
// riscv64:"XORI [$]268435456,"
// amd64:"XORL [$]268435456"
// riscv64:"XORI [$]268435456"
n += a ^ (1 << 28)
// amd64:"XORL [$]1,"
// arm64:"EOR [$]1,"
// riscv64:"XORI [$]1,"
// amd64:"XORL [$]1"
// riscv64:"XORI [$]1"
n += a ^ (1 << 0)
return n
@@ -399,7 +343,6 @@ func bitsOpOnMem(a []uint32, b, c, d uint32) {
func bitsCheckMostNegative(b uint8) bool {
// amd64:"TESTB"
// arm64:"TSTW" "CSET"
// riscv64:"ANDI [$]128," "SNEZ" -"ADDI"
return b&0x80 == 0x80
}
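
The codegen tests above exercise a small set of source-level idioms for testing, setting, clearing, and flipping a single bit. A compact sketch of those idioms follows, assuming 64-bit operands. The helper names are hypothetical, and the sketch makes no claim about which instructions are emitted beyond what the test comments themselves assert.

```go
package main

import "fmt"

// testBitMask checks bit i with the mask form, as in a&(1<<(b&63)) != 0.
func testBitMask(a uint64, i uint) bool { return a&(1<<(i&63)) != 0 }

// testBitShift checks bit i with the shift form, as in (b>>(a&63))&1 != 0.
func testBitShift(a uint64, i uint) bool { return (a>>(i&63))&1 != 0 }

// setBit, clearBit, and flipBit mirror the |, &^, and ^ forms used in
// bitsSetU64, bitsClearU64, and bitsFlipU64.
func setBit(a uint64, i uint) uint64   { return a | (1 << (i & 63)) }
func clearBit(a uint64, i uint) uint64 { return a &^ (1 << (i & 63)) }
func flipBit(a uint64, i uint) uint64  { return a ^ (1 << (i & 63)) }

func main() {
	a := setBit(0, 60)
	fmt.Println(testBitMask(a, 60), testBitShift(a, 60))
	fmt.Println(clearBit(a, 60) == 0, flipBit(flipBit(a, 3), 3) == a)
}
```

Masking the index with `i & 63` keeps the shift count in range, matching the `a & 63` and `b & 63` operands in the variable-index tests.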