Mirror of https://github.com/golang/go.git (synced 2026-01-29 07:02:05 +03:00)
Branch: master, 18 commits

9bbda7c99d
cmd/compile: make prove understand div, mod better
This CL introduces new divisible and divmod passes that rewrite divisibility checks and div, mod, and mul. These happen after prove, so that prove can make better sense of the code for deriving bounds, and they must run before decompose, so that 64-bit ops can be lowered to 32-bit ops on 32-bit systems. And then they need another generic pass as well, to optimize the generated code before decomposing. The three opt passes are "opt", "middle opt", and "late opt". (Perhaps instead they should be "generic", "opt", and "late opt"?) The "late opt" pass repeats the "middle opt" work on any new code that has been generated in the interim. There will not be new divs or mods, but there may be new muls.

The x%c==0 rewrite rules are much simpler now, since they can match before divs have been rewritten. This has the effect of applying them more consistently and making the rewrite rules independent of the exact div rewrites.

Prove is also now charged with marking signed div/mod as unsigned when the arguments call for it, allowing simpler code to be emitted in various cases. For example, t.Seconds()/2 and len(x)/2 are now recognized as unsigned, meaning they compile to a simple shift (unsigned division), avoiding the more complex fixup we need for signed values.

https://gist.github.com/rsc/99d9d3bd99cde87b6a1a390e3d85aa32 shows a diff of 'go build -a -gcflags=-d=ssa/prove/debug=1 std' output before and after. "Proved Rsh64x64 shifts to zero" is replaced by the higher-level "Proved Div64 is unsigned" (the shift was in the signed expansion of div by constant), but otherwise prove is only finding more things to prove.

One short example, in code that does x[i%len(x)]:

    < runtime/mfinal.go:131:34: Proved Rsh64x64 shifts to zero
    ---
    > runtime/mfinal.go:131:34: Proved Div64 is unsigned
    > runtime/mfinal.go:131:38: Proved IsInBounds

A longer example:

    < crypto/internal/fips140/sha3/shake.go:28:30: Proved Rsh64x64 shifts to zero
    < crypto/internal/fips140/sha3/shake.go:38:27: Proved Rsh64x64 shifts to zero
    < crypto/internal/fips140/sha3/shake.go:53:46: Proved Rsh64x64 shifts to zero
    < crypto/internal/fips140/sha3/shake.go:55:46: Proved Rsh64x64 shifts to zero
    ---
    > crypto/internal/fips140/sha3/shake.go:28:30: Proved Div64 is unsigned
    > crypto/internal/fips140/sha3/shake.go:28:30: Proved IsInBounds
    > crypto/internal/fips140/sha3/shake.go:28:30: Proved IsSliceInBounds
    > crypto/internal/fips140/sha3/shake.go:38:27: Proved Div64 is unsigned
    > crypto/internal/fips140/sha3/shake.go:45:7: Proved IsSliceInBounds
    > crypto/internal/fips140/sha3/shake.go:46:4: Proved IsInBounds
    > crypto/internal/fips140/sha3/shake.go:53:46: Proved Div64 is unsigned
    > crypto/internal/fips140/sha3/shake.go:53:46: Proved IsInBounds
    > crypto/internal/fips140/sha3/shake.go:53:46: Proved IsSliceInBounds
    > crypto/internal/fips140/sha3/shake.go:55:46: Proved Div64 is unsigned
    > crypto/internal/fips140/sha3/shake.go:55:46: Proved IsInBounds
    > crypto/internal/fips140/sha3/shake.go:55:46: Proved IsSliceInBounds

These diffs are due to the smaller opt being better and taking work away from prove:

    < image/jpeg/dct.go:307:5: Proved IsInBounds
    < image/jpeg/dct.go:308:5: Proved IsInBounds
    ...
    < image/jpeg/dct.go:442:5: Proved IsInBounds

In the old opt, Mul by 8 was rewritten to Lsh by 3 early. This CL delays that rule to help prove recognize mods, but it also helps opt constant-fold the slice x[8*i:8*i+8:8*i+8].
Specifically, computing the length, opt can now do:

    (Sub64 (Add (Mul 8 i) 8) (Mul 8 i))
    -> (Add 8 (Sub (Mul 8 i) (Mul 8 i)))
    -> (Add 8 (Mul 8 (Sub i i)))
    -> (Add 8 (Mul 8 0))
    -> (Add 8 0)
    -> 8

The key step is (Sub (Mul x y) (Mul x z)) -> (Mul x (Sub y z)). Leaving the multiply as Mul enables using that step; the old rewrite to Lsh blocked it, leaving prove to figure out the length and then remove the bounds checks. But now opt can evaluate the length down to a constant 8 and then constant-fold away the bounds checks 0 < 8, 1 < 8, and so on. After that, the compiler has nothing left to prove.

Benchmarks are noisy in general; I checked the assembly for the many large increases below, and the vast majority are unchanged and presumably hitting the caches differently in some way.

The divisibility optimizations were not reliably triggering before. This leads to a very large improvement in some cases, like DivisiblePow2constI64, DivisibleconstI64 on 64-bit systems and DivisibleconstU64 on 32-bit systems. Another way the divisibility optimizations were unreliable before was incorrectly triggering for x/3, x%3 even though they are written not to do that.

There is a real but small slowdown in the DivisibleWDivconst benchmarks on Mac because in the cases used in the benchmark, it is still faster (on Mac) to do the divisibility check than to remultiply. This may be worth further study. Perhaps when there is no rotate (meaning the divisor is odd), the divisibility optimization should be enabled always. In any event, this CL makes it possible to study that.

  benchmark \ host: s7, linux-amd64, mac, linux-arm64, linux-ppc64le, linux-386, s7:GOARCH=386, linux-arm
  (each column is the delta vs base for that host)
  LoadAdd ~ ~ ~ ~ ~ -1.59% ~ ~
  ExtShift ~ ~ -42.14% +0.10% ~ +1.44% +5.66% +8.50%
  Modify ~ ~ ~ ~ ~ ~ ~ -1.53%
  MullImm ~ ~ ~ ~ ~ +37.90% -21.87% +3.05%
  ConstModify ~ ~ ~ ~ -49.14% ~ ~ ~
  BitSet ~ ~ ~ ~ -15.86% -14.57% +6.44% +0.06%
  BitClear ~ ~ ~ ~ ~ +1.78% +3.50% +0.06%
  BitToggle ~ ~ ~ ~ ~ -16.09% +2.91% ~
  BitSetConst ~ ~ ~ ~ ~ ~ ~ -0.49%
  BitClearConst ~ ~ ~ ~ -28.29% ~ ~ -0.40%
  BitToggleConst ~ ~ ~ +8.89% -31.19% ~ ~ -0.77%
  MulNeg ~ ~ ~ ~ ~ ~ ~ ~
  Mul2Neg ~ ~ -4.83% ~ ~ -13.75% -5.92% ~
  DivconstI64 ~ ~ ~ ~ ~ -30.12% ~ +0.50%
  ModconstI64 ~ ~ -9.94% -4.63% ~ +3.15% ~ +5.32%
  DivisiblePow2constI64 -34.49% -12.58% ~ ~ -12.25% ~ ~ ~
  DivisibleconstI64 -24.69% -25.06% -0.40% -2.27% -42.61% -3.31% ~ +1.63%
  DivisibleWDivconstI64 ~ ~ ~ ~ ~ -17.55% ~ -0.60%
  DivconstU64/3 ~ ~ ~ ~ ~ +1.51% ~ ~
  DivconstU64/5 ~ ~ ~ ~ ~ ~ ~ ~
  DivconstU64/37 ~ ~ -0.18% ~ ~ +2.70% ~ ~
  DivconstU64/1234567 ~ ~ ~ ~ ~ ~ ~ +0.12%
  ModconstU64 ~ ~ ~ -0.24% ~ -5.10% -1.07% -1.56%
  DivisibleconstU64 ~ ~ ~ ~ ~ -29.01% -59.13% -50.72%
  DivisibleWDivconstU64 ~ ~ -12.18% -18.88% ~ -5.50% -3.91% +5.17%
  DivconstI32 ~ ~ -0.48% ~ -34.69% +89.01% -6.01% -16.67%
  ModconstI32 ~ +2.95% -0.33% ~ ~ -2.98% -5.40% -8.30%
  DivisiblePow2constI32 ~ ~ ~ ~ ~ ~ ~ -16.22%
  DivisibleconstI32 ~ ~ ~ ~ ~ -37.27% -47.75% -25.03%
  DivisibleWDivconstI32 -11.59% +5.22% -12.99% -23.83% ~ +45.95% -7.03% -10.01%
  DivconstU32 ~ ~ ~ ~ ~ +74.71% +4.81% ~
  ModconstU32 ~ ~ +0.53% +0.18% ~ +51.16% ~ ~
  DivisibleconstU32 ~ ~ ~ -0.62% ~ -4.25% ~ ~
  DivisibleWDivconstU32 -2.77% +5.56% +11.12% -5.15% ~ +48.70% +25.11% -4.07%
  DivconstI16 -6.06% ~ -0.33% +0.22% ~ ~ -9.68% +5.47%
  ModconstI16 ~ ~ +4.44% +2.82% ~ ~ ~ +5.06%
  DivisiblePow2constI16 ~ ~ ~ ~ ~ ~ ~ -0.17%
  DivisibleconstI16 ~ ~ -0.23% ~ ~ ~ +4.60% +6.64%
  DivisibleWDivconstI16 -1.44% -0.43% +13.48% -5.76% ~ +1.62% -23.15% -9.06%
  DivconstU16 +1.61% ~ -0.35% -0.47% ~ ~ +15.59% ~
  ModconstU16 ~ ~ ~ ~ ~ -0.72% ~ +14.23%
  DivisibleconstU16 ~ ~ -0.05% +3.00% ~ ~ ~ +5.06%
  DivisibleWDivconstU16 +52.10% +0.75% +17.28% +4.79% ~ -37.39% +5.28% -9.06%
  DivconstI8 ~ ~ -0.34% -0.96% ~ ~ -9.20% ~
  ModconstI8 +2.29% ~ +4.38% +2.96% ~ ~ ~ ~
  DivisiblePow2constI8 ~ ~ ~ ~ ~ ~ ~ ~
  DivisibleconstI8 ~ ~ ~ ~ ~ ~ +6.04% ~
  DivisibleWDivconstI8 -26.44% +1.69% +17.03% +4.05% ~ +32.48% -24.90% ~
  DivconstU8 -4.50% +14.06% -0.28% ~ ~ ~ +4.16% +0.88%
  ModconstU8 ~ ~ +25.84% -0.64% ~ ~ ~ ~
  DivisibleconstU8 ~ ~ -5.70% ~ ~ ~ ~ ~
  DivisibleWDivconstU8 +49.55% +9.07% ~ +4.03% +53.87% -40.03% +39.72% -3.01%
  Mul2 ~ ~ ~ ~ ~ ~ ~ ~
  MulNeg2 ~ ~ ~ ~ -11.73% ~ ~ -0.02%
  EfaceInteger ~ ~ ~ ~ ~ +18.11% ~ +2.53%
  TypeAssert +33.90% +2.86% ~ ~ ~ -1.07% -5.29% -1.04%
  Div64UnsignedSmall ~ ~ ~ ~ ~ ~ ~ ~
  Div64Small ~ ~ ~ ~ ~ -0.88% ~ +2.39%
  Div64SmallNegDivisor ~ ~ ~ ~ ~ ~ ~ +0.35%
  Div64SmallNegDividend ~ ~ ~ ~ ~ -0.84% ~ +3.57%
  Div64SmallNegBoth ~ ~ ~ ~ ~ -0.86% ~ +3.55%
  Div64Unsigned ~ ~ ~ ~ ~ ~ ~ -0.11%
  Div64 ~ ~ ~ ~ ~ ~ ~ +0.11%
  Div64NegDivisor ~ ~ ~ ~ ~ -1.29% ~ ~
  Div64NegDividend ~ ~ ~ ~ ~ -1.44% ~ ~
  Div64NegBoth ~ ~ ~ ~ ~ ~ ~ +0.28%
  Mod64UnsignedSmall ~ ~ ~ ~ ~ +0.48% ~ +0.93%
  Mod64Small ~ ~ ~ ~ ~ ~ ~ ~
  Mod64SmallNegDivisor ~ ~ ~ ~ ~ ~ ~ +1.44%
  Mod64SmallNegDividend ~ ~ ~ ~ ~ +0.22% ~ +1.37%
  Mod64SmallNegBoth ~ ~ ~ ~ ~ ~ ~ -2.22%
  Mod64Unsigned ~ ~ ~ ~ ~ -0.95% ~ +0.11%
  Mod64 ~ ~ ~ ~ ~ ~ ~ ~
  Mod64NegDivisor ~ ~ ~ ~ ~ ~ ~ -0.02%
  Mod64NegDividend ~ ~ ~ ~ ~ ~ ~ ~
  Mod64NegBoth ~ ~ ~ ~ ~ ~ ~ -0.02%
  MulconstI32/3 ~ ~ ~ -25.00% ~ ~ ~ +47.37%
  MulconstI32/5 ~ ~ ~ +33.28% ~ ~ ~ +32.21%
  MulconstI32/12 ~ ~ ~ -2.13% ~ ~ ~ -0.02%
  MulconstI32/120 ~ ~ ~ +2.93% ~ ~ ~ -0.03%
  MulconstI32/-120 ~ ~ ~ -2.17% ~ ~ ~ -0.03%
  MulconstI32/65537 ~ ~ ~ ~ ~ ~ ~ +0.03%
  MulconstI32/65538 ~ ~ ~ ~ ~ -33.38% ~ +0.04%
  MulconstI64/3 ~ ~ ~ +33.35% ~ -0.37% ~ -0.13%
  MulconstI64/5 ~ ~ ~ -25.00% ~ -0.34% ~ ~
  MulconstI64/12 ~ ~ ~ +2.13% ~ +11.62% ~ +2.30%
  MulconstI64/120 ~ ~ ~ -1.98% ~ ~ ~ ~
  MulconstI64/-120 ~ ~ ~ +0.75% ~ ~ ~ ~
  MulconstI64/65537 ~ ~ ~ ~ ~ +5.61% ~ ~
  MulconstI64/65538 ~ ~ ~ ~ ~ +5.25% ~ ~
  MulconstU32/3 ~ +0.81% ~ +33.39% ~ +77.92% ~ -32.31%
  MulconstU32/5 ~ ~ ~ -24.97% ~ +77.92% ~ -24.47%
  MulconstU32/12 ~ ~ ~ +2.06% ~ ~ ~ +0.03%
  MulconstU32/120 ~ ~ ~ -2.74% ~ ~ ~ +0.03%
  MulconstU32/65537 ~ ~ ~ ~ ~ ~ ~ +0.03%
  MulconstU32/65538 ~ ~ ~ ~ ~ -33.42% ~ -0.03%
  MulconstU64/3 ~ ~ ~ +33.33% ~ -0.28% ~ +1.22%
  MulconstU64/5 ~ ~ ~ -25.00% ~ ~ ~ -0.64%
  MulconstU64/12 ~ ~ ~ +2.30% ~ +11.59% ~ +0.14%
  MulconstU64/120 ~ ~ ~ -2.82% ~ ~ ~ +0.04%
  MulconstU64/65537 ~ +0.37% ~ ~ ~ +5.58% ~ ~
  MulconstU64/65538 ~ ~ ~ ~ ~ +5.16% ~ ~
  ShiftArithmeticRight ~ ~ ~ ~ ~ -10.81% ~ +0.31%
  Switch8Predictable +14.69% ~ ~ ~ ~ -24.85% ~ ~
  Switch8Unpredictable ~ -0.58% -3.80% ~ ~ -11.78% ~ -0.79%
  Switch32Predictable -10.33% +17.89% ~ ~ ~ +5.76% ~ ~
  Switch32Unpredictable -3.15% +1.19% +9.42% ~ ~ -10.30% -5.09% +0.44%
  SwitchStringPredictable +70.88% +20.48% ~ ~ ~ +2.39% ~ +0.31%
  SwitchStringUnpredictable ~ +3.91% -5.06% -0.98% ~ +0.61% +2.03% ~
  SwitchTypePredictable +146.58% -1.10% ~ -12.45% ~ -0.46% -3.81% ~
  SwitchTypeUnpredictable +0.46% -0.83% ~ +4.18% ~ +0.43% ~ +0.62%
  SwitchInterfaceTypePredictable -13.41% -10.13% +11.03% ~ ~ -4.38% ~ +0.75%
  SwitchInterfaceTypeUnpredictable -6.37% -2.14% ~ -3.21% ~ -4.20% ~ +1.08%

Fixes #63110.
Fixes #75954.
Change-Id: I55a876f08c6c14f419ce1a8cbba2eaae6c6efbf0
Reviewed-on: https://go-review.googlesource.com/c/go/+/714160
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
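A minimal sketch of the source patterns this commit message is talking about (illustrative code, not taken from the CL; the function names are mine):

    package div

    // Half: len(x) is never negative, so prove can mark the signed division by 2
    // as unsigned and the compiler can emit a plain shift instead of the signed
    // fixup sequence.
    func Half(x []byte) int {
        return len(x) / 2
    }

    // Wrap: i%uint(len(x)) is always below uint(len(x)), so the bounds check on
    // the load can be dropped (compare the runtime/mfinal.go diff above).
    func Wrap(x []byte, i uint) byte {
        return x[i%uint(len(x))]
    }

    // Divisible: with the new divisible pass, x%c == 0 for a constant c is
    // rewritten so that no remainder needs to be computed.
    func Divisible(x uint64) bool {
        return x%9 == 0
    }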

3b96eebcbd
cmd/compile: rewrite the constant parts of the prove pass
Handles a lot more cases where constant ranges can eliminate various (mostly bounds failure) paths.

Fixes #66826
Fixes #66692
Fixes #48213
Update #57959

TODO: remove constant logic from poset code, no longer needed.

Change-Id: Id196436fcd8a0c84c7d59c04f93bd92e26a0fd7e
Reviewed-on: https://go-review.googlesource.com/c/go/+/599096
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
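A hedged sketch of the kind of constant-range reasoning this enables (my own example, not a test from the CL): once prove tracks the constant bounds established by the early return, both array accesses are provably in range and their bounds-failure paths can be eliminated.

    package p

    func sum(a *[16]int, i int) int {
        if i < 0 || i >= 10 {
            return 0
        }
        // Here 0 <= i < 10, so i < 16 and i+5 < 15 < 16: both bounds checks can go away.
        return a[i] + a[i+5]
    }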

b2fd76ab8d
test: migrate remaining files to go:build syntax
Most of the test cases in the test directory use the new go:build syntax already. Convert the rest. In general, try to place the build constraint line below the test directive comment in more places.

For #41184.
For #60268.

Change-Id: I11c41a0642a8a26dc2eda1406da908645bbc005b
Cq-Include-Trybots: luci.golang.try:gotip-linux-386-longtest,gotip-linux-amd64-longtest,gotip-windows-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/536236
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
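The conversion itself is mechanical; a hypothetical example of a constraint line before and after (not a specific file from the CL):

    // A legacy constraint line such as
    //
    //	// +build amd64,!gcflags_noopt
    //
    // becomes
    //
    //	//go:build amd64 && !gcflags_noopt
    //
    // and, per the message above, is placed below the test directive
    // (e.g. errorcheck) comment rather than above it.
    package p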

4ee1d542ed
cmd/compile: sparse conditional constant propagation
Sparse conditional constant propagation can discover optimization
opportunities that cannot be found by just combining constant folding,
constant propagation, and dead code elimination separately.

This is a re-submit of PR#59575, which fixes a broken dominance relationship caught by ssacheck.
Updates https://github.com/golang/go/issues/59399
Change-Id: I57482dee38f8e80a610aed4f64295e60c38b7a47
GitHub-Last-Rev:
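A classic shape that SCCP handles in one pass (an assumed illustration, not code from the CL): evaluating the branch condition over the constant lattice shows the branch is never taken, so x stays the constant 1 and the final expression folds too.

    package p

    func f() int {
        x := 1
        if x != 1 {
            x = 100 // never reached; SCCP does not propagate along this edge
        }
        return x * 3 // folds to the constant 3
    }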

02d234e34d
Revert "cmd/compile: sparse conditional constant propagation"
This reverts CL 483875.

Reason for revert: appears to cause internal compiler errors on the ssacheck builder.

Change-Id: I662418384291470c1962c417797a5890dd9aa7a4
Reviewed-on: https://go-review.googlesource.com/c/go/+/497855
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Bryan Mills <bcmills@google.com>

fa50248ce6
cmd/compile: sparse conditional constant propagation
Sparse conditional constant propagation can discover optimization opportunities that cannot be found by just combining constant folding, constant propagation, and dead code elimination separately.
Updates #59399
Change-Id: Ia954e906480654a6f0cc065d75b5912f96f36b2e
GitHub-Last-Rev:

4a9064ef41
cmd/compile: fix ordering for short-circuiting ops
Make sure the side effects inside short-circuited operations (&& and ||)
happen correctly.
Before this CL, we attached the side effects to the node itself using
exprInPlace. That caused other side effects in sibling expressions
to get reordered with respect to the short circuit side effect.
Instead, rewrite a && b like:

    r := a
    if r {
        r = b
    }
That code we can keep correctly ordered with respect to other
side-effects extracted from part of a big expression.
exprInPlace seems generally unsafe. But this was the only case where
exprInPlace is called not at the top level of an expression, so I
don't think the other uses can actually trigger an issue (there can't
be a sibling expression). TODO: maybe those cases don't need "in
place", and we can retire that function generally.
This CL needed a small tweak to the SSA generation of OIF so that the
short circuit optimization still triggers. The short circuit optimization
looks for triangles but not diamonds, so don't bother allocating a block
if it will be empty.
Go 1 benchmarks are in the noise.
Fixes #30566
Change-Id: I19c04296bea63cbd6ad05f87a63b005029123610
Reviewed-on: https://go-review.googlesource.com/c/go/+/165617
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
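The rewrite described above, spelled out as ordinary Go (a sketch of the lowering, not compiler code): the && becomes an assignment plus a conditional reassignment, which the order pass can then interleave correctly with other side effects extracted from the enclosing expression.

    package p

    func and(a, b func() bool) bool {
        // a() && b() is lowered roughly as:
        r := a()
        if r {
            r = b()
        }
        return r
    }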

8c1c6702f1
test: restore binary.BigEndian use in checkbce
CL 136855 removed the encoding/binary dependency from the checkbce.go test by defining a local Uint64 to fix the noopt builder; then a more general mechanism to skip tests on the noopt builder was introduced in CL 136898, so we can now restore the binary.Uint64 calls in checkbce.

Change-Id: I3efbb41be0bfc446a7e638ce6a593371ead2684f
Reviewed-on: https://go-review.googlesource.com/137056
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

b3369063e5
test: skip some tests on noopt builder
Adds a new build tag "gcflags_noopt" that can be used in test/*.go tests.

Fixes #27833

Change-Id: I4ea0ccd9e9e58c4639de18645fec81eb24a3a929
Reviewed-on: https://go-review.googlesource.com/136898
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
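How a test might use the new tag (a hypothetical file header, shown in the current go:build form rather than the // +build form this CL predates): a test whose checks depend on optimization excludes itself on the noopt builder.

    // errorcheck -0 -d=ssa/check_bce/debug=1

    //go:build !gcflags_noopt

    package main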

6054fef17f
test: fix bcecheck test on noopt builder
The noopt builder is configured by setting GO_GCFLAGS=-N -l, but the
test/run.go test harness doesn't look at GO_GCFLAGS when processing
"errorcheck" files, it just calls compile:
cmdline := []string{goTool(), "tool", "compile", /* etc */}
This is working as intended, since it makes the tests more robust and
independent from the environment; errorcheck files are supposed to set
additional building flags, when needed, like in:
// errorcheck -0 -N -l
The test/checkbce.go test used to work on the noopt builder (even if
bce is not active on -N -l) because the test was self-contained and
the file always compiled with optimizations enabled.
In CL 107355, a new bce test dependent on an external package
(encoding/binary) was added. On the noopt builder the external package
is built using -N -l, and this causes a test failure that broke the
noopt builder:
https://build.golang.org/log/b2be319536285e5807ee9d66d6d0ec4d57433768
To reproduce the failure, one can do:
$ go install -a -gcflags="-N -l" std
$ go run run.go -- checkbce.go
This change fixes the noopt builder breakage by removing the bce test
dependency on encoding/binary by defining a local Uint64() function to
be used in the test.
Change-Id: Ife71aab662001442e715c32a0b7d758349a63ff1
Reviewed-on: https://go-review.googlesource.com/136855
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
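A sketch of the kind of local replacement this describes (the name and byte order here are assumptions for illustration; the actual test may differ):

    package main

    // Uint64 stands in for binary.LittleEndian.Uint64 so the test does not
    // import encoding/binary.
    func Uint64(b []byte) uint64 {
        _ = b[7] // explicit bounds hint, in the style of encoding/binary
        return uint64(b[0]) | uint64(b[1])<<8 | uint64(b[2])<<16 | uint64(b[3])<<24 |
            uint64(b[4])<<32 | uint64(b[5])<<40 | uint64(b[6])<<48 | uint64(b[7])<<56
    }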

dd5e9b32ff
cmd/compile: add testcase for #24876
This is still not fixed; the testcase reflects that there are still a few bounds checks. Let's pin down the good alternative with an explicit test, though.

Updates #24876

Change-Id: I4da35eb353e19052bd7b69ea6190a69ced8b9b3d
Reviewed-on: https://go-review.googlesource.com/107355
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

f700f89b0b
test: add missing copyright header to checkbce.go
Change-Id: Iafeb8e033c876f482caa17cca414fe13b0fadb12
Reviewed-on: https://go-review.googlesource.com/43613
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>

4fc45ae879
cmd/compile: improve generic rules for BCE based on AND operations.
Match more patterns generated by the compiler where the index for a bounds check is bounded through an AND operation, with different register sizes. These rules trigger a dozen times in a bootstrap.

Change-Id: Ic9fff16f21d08580f19a366c3ee1a372e58357d1
Reviewed-on: https://go-review.googlesource.com/37442
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
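The user-visible shape these rules target (an assumed example): masking the index bounds it below the array length, so the bounds check disappears regardless of the width of the index's type.

    package p

    func lookup(t *[64]byte, i uint32) byte {
        return t[i&63] // i&63 < 64 == len(t), so no bounds check is needed
    }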

73f92f9b04
cmd/compile: use len(s)<=cap(s) to remove more bounds checks
When we discover a relation x <= len(s), also discover the relation x <= cap(s). That way, in situations like:

    a := s[x:] // tests 0 <= x <= len(s)
    b := s[:x] // tests 0 <= x <= cap(s)

the second check can be eliminated.

Fixes #16813

Change-Id: Ifc037920b6955e43bac1a1eaf6bac63a89cfbd44
Reviewed-on: https://go-review.googlesource.com/33633
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
Reviewed-by: David Chase <drchase@google.com>
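The message's example as a complete function (a sketch; the names are mine): once the first slice has checked x <= len(s), the derived relation x <= len(s) <= cap(s) makes the second slice's check redundant.

    package p

    func split(s []byte, x int) ([]byte, []byte) {
        a := s[x:] // checks 0 <= x <= len(s)
        b := s[:x] // needs 0 <= x <= cap(s), now implied by the line above
        return a, b
    }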

6286188986
cmd/compile: optimize integer "in range" expressions
Use unsigned comparisons to reduce from two comparisons to one for integer "in range" checks, such as a <= b && b < c. We already do this for bounds checks. Extend it to user code.

This is much easier to do in the front end than SSA. A back end optimization would be more powerful, but this is a good start.

This reduces the power of some of SSA prove inferences (#16653), but those regressions appear to be rare and not worth holding this CL for.

Fixes #15844.
Fixes #16697.

strconv benchmarks:

  name old time/op new time/op delta
  Atof64Decimal-8 41.4ns ± 3% 38.9ns ± 2% -5.89% (p=0.000 n=24+25)
  Atof64Float-8 48.5ns ± 0% 46.8ns ± 3% -3.64% (p=0.000 n=20+23)
  Atof64FloatExp-8 97.7ns ± 4% 93.5ns ± 1% -4.25% (p=0.000 n=25+20)
  Atof64Big-8 187ns ± 8% 162ns ± 2% -13.54% (p=0.000 n=24+22)
  Atof64RandomBits-8 250ns ± 6% 233ns ± 5% -6.76% (p=0.000 n=25+25)
  Atof64RandomFloats-8 160ns ± 0% 152ns ± 0% -5.00% (p=0.000 n=21+22)
  Atof32Decimal-8 41.1ns ± 1% 38.7ns ± 2% -5.86% (p=0.000 n=24+24)
  Atof32Float-8 46.1ns ± 1% 43.5ns ± 3% -5.63% (p=0.000 n=21+24)
  Atof32FloatExp-8 101ns ± 4% 100ns ± 2% -1.59% (p=0.000 n=24+23)
  Atof32Random-8 136ns ± 3% 133ns ± 3% -2.83% (p=0.000 n=22+22)
  Atoi-8 33.8ns ± 3% 30.6ns ± 3% -9.51% (p=0.000 n=24+25)
  AtoiNeg-8 31.6ns ± 3% 29.1ns ± 2% -8.05% (p=0.000 n=23+24)
  Atoi64-8 48.6ns ± 1% 43.8ns ± 1% -9.81% (p=0.000 n=20+23)
  Atoi64Neg-8 47.1ns ± 4% 42.0ns ± 2% -10.83% (p=0.000 n=25+25)
  FormatFloatDecimal-8 177ns ± 9% 178ns ± 6% ~ (p=0.460 n=25+25)
  FormatFloat-8 282ns ± 6% 282ns ± 3% ~ (p=0.954 n=25+22)
  FormatFloatExp-8 259ns ± 7% 255ns ± 6% ~ (p=0.089 n=25+24)
  FormatFloatNegExp-8 253ns ± 6% 254ns ± 6% ~ (p=0.941 n=25+24)
  FormatFloatBig-8 340ns ± 6% 341ns ± 8% ~ (p=0.600 n=22+25)
  AppendFloatDecimal-8 79.4ns ± 0% 80.6ns ± 6% ~ (p=0.861 n=20+25)
  AppendFloat-8 175ns ± 3% 174ns ± 0% ~ (p=0.722 n=25+20)
  AppendFloatExp-8 142ns ± 4% 142ns ± 2% ~ (p=0.948 n=25+24)
  AppendFloatNegExp-8 137ns ± 2% 138ns ± 2% +0.70% (p=0.001 n=24+25)
  AppendFloatBig-8 218ns ± 3% 218ns ± 4% ~ (p=0.596 n=25+25)
  AppendFloatBinaryExp-8 80.0ns ± 4% 78.0ns ± 1% -2.43% (p=0.000 n=24+21)
  AppendFloat32Integer-8 82.3ns ± 3% 79.3ns ± 4% -3.69% (p=0.000 n=24+25)
  AppendFloat32ExactFraction-8 143ns ± 2% 143ns ± 0% ~ (p=0.177 n=23+19)
  AppendFloat32Point-8 175ns ± 3% 175ns ± 3% ~ (p=0.062 n=24+25)
  AppendFloat32Exp-8 139ns ± 2% 137ns ± 4% -1.05% (p=0.001 n=24+24)
  AppendFloat32NegExp-8 134ns ± 0% 137ns ± 4% +2.06% (p=0.000 n=22+25)
  AppendFloat64Fixed1-8 97.8ns ± 0% 98.6ns ± 3% ~ (p=0.711 n=20+25)
  AppendFloat64Fixed2-8 110ns ± 3% 110ns ± 5% -0.45% (p=0.037 n=24+24)
  AppendFloat64Fixed3-8 102ns ± 3% 102ns ± 3% ~ (p=0.684 n=24+24)
  AppendFloat64Fixed4-8 112ns ± 3% 110ns ± 0% -1.43% (p=0.000 n=25+18)
  FormatInt-8 3.18µs ± 4% 3.10µs ± 6% -2.54% (p=0.001 n=24+25)
  AppendInt-8 1.81µs ± 5% 1.80µs ± 5% ~ (p=0.648 n=25+25)
  FormatUint-8 812ns ± 6% 816ns ± 6% ~ (p=0.777 n=25+25)
  AppendUint-8 536ns ± 4% 538ns ± 3% ~ (p=0.798 n=20+22)
  Quote-8 605ns ± 6% 602ns ± 9% ~ (p=0.573 n=25+25)
  QuoteRune-8 99.5ns ± 8% 100.2ns ± 7% ~ (p=0.432 n=25+25)
  AppendQuote-8 361ns ± 3% 363ns ± 4% ~ (p=0.085 n=25+25)
  AppendQuoteRune-8 23.3ns ± 3% 22.4ns ± 2% -3.79% (p=0.000 n=25+24)
  UnquoteEasy-8 146ns ± 4% 145ns ± 5% ~ (p=0.112 n=24+24)
  UnquoteHard-8 804ns ± 6% 771ns ± 6% -4.10% (p=0.000 n=25+24)

Change-Id: Ibd384e46e90f1cfa40503c8c6352a54c65b72980
Reviewed-on: https://go-review.googlesource.com/27652
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
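A sketch of the user-code pattern this targets (illustrative; the front end applies the rewrite automatically when the bounds permit it): the two signed comparisons collapse into a single unsigned comparison, the same trick long used for bounds checks.

    package p

    func inRange(i int) bool {
        return 10 <= i && i < 100 // can compile to one unsigned compare: uint(i-10) < 90
    }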

6c6089b3fd
cmd/compile: bce when max and limit are consts
Removes 49 more bounds checks in make.bash. For example:

    var a [100]int
    for i := 0; i < 50; i++ {
        use(a[i+25])
    }
Change-Id: I85e0130ee5d07f0ece9b17044bba1a2047414ce7
Reviewed-on: https://go-review.googlesource.com/21379
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>

8448d3aace
cmd/compile: fold CMPconst and SHR
Fold the comparison when the SHR result is small. Useful for:

- murmur-mix-like hashing where higher bits are desirable, i.e. hash = uint32(i * C) >> 18
- integer log2 via De Bruijn sequence: http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogDeBruijn

Change-Id: If70ae18cb86f4cc83ab6213f88ced03cc4986156
Reviewed-on: https://go-review.googlesource.com/21514
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
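A sketch of code that benefits, in the spirit of the hashing use case above (an assumed example, not from the CL): after the shift the value fits in 14 bits, so a comparison of the SHR result against the table length folds away.

    package p

    var table [1 << 14]byte

    func bucket(i uint32) byte {
        h := (i * 0x9E3779B1) >> 18 // h < 1<<14
        return table[h]             // the check h < len(table) folds after the shift
    }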

1747788c56
cmd/compile: add a pass to print bound checks
Since BCE happens over several passes (opt, loopbce, prove), it's easy to regress, especially with rewriting. The pass is only activated with a special debug flag.

Change-Id: I46205982e7a2751156db8e875d69af6138068f59
Reviewed-on: https://go-review.googlesource.com/21510
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
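A minimal errorcheck file in the style of test/checkbce.go (the flag spelling and error text here are assumptions based on the current test, not quotes from this CL): the file compiles with the debug flag and asserts on the checks the pass reports.

    // errorcheck -0 -d=ssa/check_bce/debug=1

    package main

    func f(a []int, i int) int {
        return a[i] // ERROR "Found IsInBounds$"
    }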