perf: add cache to address codec #20122

Open
wants to merge 17 commits into base: main
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -65,6 +65,7 @@ Every module contains its own CHANGELOG.md. Please refer to the module you are i

### Improvements

* (codec) [#20122](https://github.com/cosmos/cosmos-sdk/pull/20122) Add a cache to the address codec.
* (types) [#19869](https://github.com/cosmos/cosmos-sdk/pull/19869) Removed `Any` type from `codec/types` and replaced it with an alias for `cosmos/gogoproto/types/any`.
* (server) [#19854](https://github.com/cosmos/cosmos-sdk/pull/19854) Add customizability to start command.
* Add `StartCmdOptions` in `server.AddCommands` instead of `servertypes.ModuleInitFlags`. To set custom flags set them in the `StartCmdOptions` struct on the `AddFlags` field.
JulianToledano marked this conversation as resolved.
117 changes: 114 additions & 3 deletions codec/address/bech32_codec.go
@@ -3,29 +3,100 @@ package address
import (
"errors"
"strings"
"sync"
"sync/atomic"

"github.com/hashicorp/golang-lru/simplelru"

"cosmossdk.io/core/address"
errorsmod "cosmossdk.io/errors"

"github.com/cosmos/cosmos-sdk/internal/conv"
sdkAddress "github.com/cosmos/cosmos-sdk/types/address"
"github.com/cosmos/cosmos-sdk/types/bech32"
sdkerrors "github.com/cosmos/cosmos-sdk/types/errors"
)

const (
// TODO: ideally sdk.GetBech32PrefixValAddr("") should be used but currently there's a cyclical import.
// Once globals are deleted the cyclical import won't happen.
suffixValAddr = "valoper"
suffixConsAddr = "valcons"
)

var errEmptyAddress = errors.New("empty address string is not allowed")

// cache variables
var (
accAddrMu sync.Mutex
accAddrCache *simplelru.LRU
consAddrMu sync.Mutex
consAddrCache *simplelru.LRU
valAddrMu sync.Mutex
valAddrCache *simplelru.LRU

isCachingEnabled atomic.Bool
)

func init() {
var err error
isCachingEnabled.Store(true)
Contributor

nit: this is always true

Contributor Author

Yeah, I followed the same style as in address.go, although I don't think it makes much sense.


// In total the caches hold 61k entries (60,000 + 500 + 500). A key is 32 bytes and a value is around 50-70 bytes,
// so roughly 92 bytes per entry: 92 * 61k * 2 (LRU) bytes ~ 11 MB.
if accAddrCache, err = simplelru.NewLRU(60000, nil); err != nil {
Contributor

Personal preference: it would be great to make this configurable so people can choose how much memory to trade for CPU.

Contributor Author

Agreed. I'm not sure what the best approach is to solve this, as LRUs are currently shared between codecs and defined as globals.
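
One possible shape for a configurable size is a functional option applied when the codec is constructed. The sketch below is hypothetical: `Option`, `WithCacheSize`, `codecConfig`, and `newCodecConfig` are illustrative names that do not exist in the SDK, and it assumes each codec would own its cache rather than share the package-level LRUs used in this PR.

```go
package main

import "fmt"

// defaultCacheSize mirrors the account cache size used in this PR.
const defaultCacheSize = 60000

// codecConfig holds the tunables for a (hypothetical) cached codec.
type codecConfig struct {
	cacheSize int
}

// Option mutates a codecConfig before the codec is built.
type Option func(*codecConfig)

// WithCacheSize overrides the number of entries the cache may hold.
func WithCacheSize(n int) Option {
	return func(c *codecConfig) { c.cacheSize = n }
}

// newCodecConfig starts from the defaults and applies the options in order.
func newCodecConfig(opts ...Option) codecConfig {
	cfg := codecConfig{cacheSize: defaultCacheSize}
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	fmt.Println(newCodecConfig().cacheSize)
	fmt.Println(newCodecConfig(WithCacheSize(1000)).cacheSize)
}
```

With the globals still in place, the first codec constructed would effectively decide the size for everyone, so this only makes sense once the caches stop being shared.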

panic(err)
}
if consAddrCache, err = simplelru.NewLRU(500, nil); err != nil {
panic(err)
}
if valAddrCache, err = simplelru.NewLRU(500, nil); err != nil {
panic(err)
}
Contributor

🤔 Do we need three different caches? I was wondering whether this is for sharding or pinning. The LRU should keep the frequently used entries in cache anyway. WDYT?

Contributor Author

The bytes of the addresses are the same, but there are three possible outputs (AccAddress, ValAddress, ConsAddress). I see three options here:

  1. Use three different caches.
  2. Add the prefix of the address codec to the key.
  3. Store a map in the LRU with prefix as keys (e.g., map["cosmos"] = "cosmos1dr3...")

I've chosen the first option to maintain consistency with how things were done before.

cosmos-sdk/types/address.go

Lines 100 to 111 in 8e60f3b

var (
// AccAddress.String() is expensive and if unoptimized dominantly showed up in profiles,
// yet has no mechanisms to trivially cache the result given that AccAddress is a []byte type.
accAddrMu sync.Mutex
accAddrCache *simplelru.LRU
consAddrMu sync.Mutex
consAddrCache *simplelru.LRU
valAddrMu sync.Mutex
valAddrCache *simplelru.LRU
isCachingEnabled atomic.Bool
)
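
For comparison, option 2 from the list above (one cache, prefix in the key) could look roughly like this. It is a dependency-free sketch, not the PR's implementation: a plain map stands in for the LRU, `encode` is a hex placeholder for bech32 encoding, and `cacheKey` and `sharedCache` are made-up names.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"sync"
)

// cacheKey combines the bech32 prefix with the raw address bytes so that
// codecs with different prefixes can share one cache without colliding.
type cacheKey struct {
	prefix string
	addr   string // string(bz): an immutable copy of the address bytes
}

// sharedCache stands in for a single LRU; the map never evicts.
type sharedCache struct {
	mu      sync.Mutex
	entries map[cacheKey]string
}

// encode is a placeholder for bech32 encoding (hex is used only for illustration).
func encode(prefix string, bz []byte) string {
	return prefix + "1" + hex.EncodeToString(bz)
}

// bytesToString returns the cached encoding for (prefix, bz) or computes and stores it.
func (c *sharedCache) bytesToString(prefix string, bz []byte) string {
	key := cacheKey{prefix: prefix, addr: string(bz)}
	c.mu.Lock()
	defer c.mu.Unlock()
	if s, ok := c.entries[key]; ok {
		return s
	}
	s := encode(prefix, bz)
	c.entries[key] = s
	return s
}

func main() {
	c := &sharedCache{entries: map[cacheKey]string{}}
	bz := []byte{0x01, 0x02}
	fmt.Println(c.bytesToString("cosmos", bz))
	fmt.Println(c.bytesToString("cosmosvaloper", bz))
}
```

The same bytes produce two entries, one per prefix, which is the property the three separate caches provide today.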

}

type Bech32Codec struct {
Bech32Prefix string
}

-var _ address.Codec = &Bech32Codec{}
type cachedBech32Codec struct {
codec Bech32Codec
mu *sync.Mutex
JulianToledano marked this conversation as resolved.
cache *simplelru.LRU
}

var (
_ address.Codec = &Bech32Codec{}
_ address.Codec = &cachedBech32Codec{}
)

func NewBech32Codec(prefix string) address.Codec {
-return Bech32Codec{prefix}
ac := Bech32Codec{prefix}
if !isCachingEnabled.Load() {
return ac
}

lru := accAddrCache
mu := &accAddrMu
if strings.HasSuffix(prefix, suffixValAddr) {
lru = valAddrCache
mu = &valAddrMu
} else if strings.HasSuffix(prefix, suffixConsAddr) {
lru = consAddrCache
mu = &consAddrMu
}

return cachedBech32Codec{
codec: ac,
cache: lru,
mu: mu,
}
}

// StringToBytes decodes a bech32 address string to bytes
func (bc Bech32Codec) StringToBytes(text string) ([]byte, error) {
if len(strings.TrimSpace(text)) == 0 {
-return []byte{}, errors.New("empty address string is not allowed")
return []byte{}, errEmptyAddress
}

hrp, bz, err := bech32.DecodeAndConvert(text)
@@ -61,3 +132,43 @@ func (bc Bech32Codec) BytesToString(bz []byte) (string, error) {

return text, nil
}

func (cbc cachedBech32Codec) BytesToString(bz []byte) (string, error) {
if len(bz) == 0 {
return "", nil
}

// The codec's prefix is added to the key so that keys stay unique when codecs with different bech32 prefixes share a cache.
key := cbc.codec.Bech32Prefix + conv.UnsafeBytesToStr(bz)
cbc.mu.Lock()
defer cbc.mu.Unlock()

if addr, ok := cbc.cache.Get(key); ok {
Contributor

Not sure if this is possible, but if the bytes match an existing bech32 address string, it could cause trouble. Adding key prefixes, or not sharing the same cache, may solve this.

Contributor Author

See first comment. Currently, there are three different caches, but adding key prefixes may be a solution if we prefer just one.

return addr.(string), nil
}

addr, err := cbc.codec.BytesToString(bz)
if err != nil {
return "", err
}
cbc.cache.Add(key, addr)

return addr, nil
}
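
The collision concern raised in the review applies because both lookup directions store into the same cache: one keyed by the address text with a `[]byte` value, the other keyed by prefix plus raw bytes with a `string` value. If the two keys ever coincided, the unchecked `addr.(string)` assertion would panic. One way to rule that out, sketched here under simplifying assumptions (a map instead of the LRU, no mutex, no prefix, and the made-up names `strKey`, `bytesKey`, and `dualCache`), is to give each direction its own key type and use the comma-ok form of the assertion.

```go
package main

import "fmt"

// Distinct key types keep the two directions apart: an interface-keyed map
// compares the dynamic type as well as the value.
type (
	strKey   string // key for string -> bytes lookups
	bytesKey string // key for bytes -> string lookups
)

// dualCache stores both directions in one map, like the shared LRU in the PR.
type dualCache struct {
	entries map[any]any
}

func newDualCache() *dualCache {
	return &dualCache{entries: map[any]any{}}
}

// storeDecoded caches the bytes decoded from an address string.
func (c *dualCache) storeDecoded(text string, bz []byte) {
	c.entries[strKey(text)] = bz
}

// storeEncoded caches the address string encoded from raw bytes.
func (c *dualCache) storeEncoded(bz []byte, text string) {
	c.entries[bytesKey(bz)] = text
}

// lookupEncoded uses the comma-ok assertion, so a missing or mistyped entry
// is reported as a miss instead of causing a panic.
func (c *dualCache) lookupEncoded(bz []byte) (string, bool) {
	text, ok := c.entries[bytesKey(bz)].(string)
	return text, ok
}

func main() {
	c := newDualCache()
	raw := []byte("cosmos1abc") // raw bytes that happen to spell an address string

	c.storeDecoded("cosmos1abc", []byte{0x01})
	_, hit := c.lookupEncoded(raw)
	fmt.Println(hit) // false: the string-direction entry is not visible here

	c.storeEncoded(raw, "cosmos1xyz")
	text, hit := c.lookupEncoded(raw)
	fmt.Println(text, hit)
}
```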
JulianToledano marked this conversation as resolved.

func (cbc cachedBech32Codec) StringToBytes(text string) ([]byte, error) {
cbc.mu.Lock()
defer cbc.mu.Unlock()

if addr, ok := cbc.cache.Get(text); ok {
return addr.([]byte), nil
}

addr, err := cbc.codec.StringToBytes(text)
if err != nil {
return nil, err
Contributor

🤔 It could be worth caching failures, too. Some benchmarks would be interesting.

}
cbc.cache.Add(text, addr)

return addr, nil
}
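
The suggestion to cache failures could be prototyped by storing the error next to the result, as in the sketch below. It is an illustration only: a map replaces the LRU, `decode` is a placeholder for bech32 decoding, and `result` and `negativeCache` are hypothetical names. Whether it pays off would need the benchmarks the reviewer asks for.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var errEmptyAddress = errors.New("empty address string is not allowed")

// result stores either the decoded bytes or the error the decoder returned.
type result struct {
	bz  []byte
	err error
}

// negativeCache remembers failures as well as successes.
type negativeCache struct {
	entries map[string]result
	decodes int // number of calls that reached the underlying decoder
}

// decode is a placeholder for the real bech32 decoding.
func (c *negativeCache) decode(text string) ([]byte, error) {
	c.decodes++
	if strings.TrimSpace(text) == "" {
		return nil, errEmptyAddress
	}
	return []byte(text), nil
}

// stringToBytes serves both hits and previously seen failures from the cache.
func (c *negativeCache) stringToBytes(text string) ([]byte, error) {
	if r, ok := c.entries[text]; ok {
		return r.bz, r.err
	}
	bz, err := c.decode(text)
	c.entries[text] = result{bz: bz, err: err}
	return bz, err
}

func main() {
	c := &negativeCache{entries: map[string]result{}}
	_, err1 := c.stringToBytes(" ")
	_, err2 := c.stringToBytes(" ")
	fmt.Println(err1 == err2, c.decodes)
}
```

One trade-off to weigh: invalid strings come from untrusted input, so in a bounded LRU a stream of distinct bad addresses could evict useful entries.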
JulianToledano marked this conversation as resolved.
133 changes: 133 additions & 0 deletions codec/address/bech32_codec_test.go
@@ -0,0 +1,133 @@
package address

import (
"crypto/rand"
"sync"
"sync/atomic"
"testing"

"github.com/hashicorp/golang-lru/simplelru"
"github.com/stretchr/testify/assert"

"github.com/cosmos/cosmos-sdk/internal/conv"
)

func generateAddresses(totalAddresses int) ([][]byte, error) {
	keys := make([][]byte, totalAddresses)
	for i := 0; i < totalAddresses; i++ {
		// Allocate a fresh buffer per address; reusing one slice would make
		// every entry alias the same bytes.
		addr := make([]byte, 32)
		if _, err := rand.Read(addr); err != nil {
			return nil, err
		}
		keys[i] = addr
	}

	return keys, nil
}
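
A pitfall worth keeping in mind with helpers of this shape: a Go slice is a view onto a backing array, so storing one reused buffer in every slot yields N references to the same bytes rather than N addresses. The standalone sketch below contrasts the two variants; `freshAddresses` and `sharedAddresses` are illustrative names, not part of the PR.

```go
package main

import (
	"bytes"
	"crypto/rand"
	"fmt"
)

// freshAddresses allocates a new buffer per address, so entries are independent.
func freshAddresses(n, size int) ([][]byte, error) {
	out := make([][]byte, n)
	for i := range out {
		addr := make([]byte, size)
		if _, err := rand.Read(addr); err != nil {
			return nil, err
		}
		out[i] = addr
	}
	return out, nil
}

// sharedAddresses reuses one buffer, so every entry points at the same
// backing array and ends up holding the last value written.
func sharedAddresses(n, size int) ([][]byte, error) {
	out := make([][]byte, n)
	addr := make([]byte, size)
	for i := range out {
		if _, err := rand.Read(addr); err != nil {
			return nil, err
		}
		out[i] = addr
	}
	return out, nil
}

func main() {
	shared, err := sharedAddresses(2, 32)
	if err != nil {
		panic(err)
	}
	fresh, err := freshAddresses(2, 32)
	if err != nil {
		panic(err)
	}
	fmt.Println(bytes.Equal(shared[0], shared[1])) // true: same backing array
	fmt.Println(&fresh[0][0] == &fresh[1][0])      // false: separate buffers
}
```

For the benchmark and fuzz seeds this matters because aliased entries mean only one distinct address is ever exercised.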

func TestNewBech32Codec(t *testing.T) {
tests := []struct {
name string
prefix string
lru *simplelru.LRU
address string
}{
{
name: "create accounts cached bech32 codec",
prefix: "cosmos",
lru: accAddrCache,
address: "cosmos1p8s0p6gqc6c9gt77lgr2qqujz49huhu6a80smx",
},
{
name: "create validator cached bech32 codec",
prefix: "cosmosvaloper",
lru: valAddrCache,
address: "cosmosvaloper1sjllsnramtg3ewxqwwrwjxfgc4n4ef9u2lcnj0",
},
{
name: "create consensus cached bech32 codec",
prefix: "cosmosvalcons",
lru: consAddrCache,
address: "cosmosvalcons1ntk8eualewuprz0gamh8hnvcem2nrcdsgz563h",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
assert.Equal(t, 0, tt.lru.Len())
ac := NewBech32Codec(tt.prefix)
cached, ok := ac.(cachedBech32Codec)
assert.True(t, ok)
assert.Equal(t, cached.cache, tt.lru)

addr, err := ac.StringToBytes(tt.address)
assert.NoError(t, err)
assert.Equal(t, 1, tt.lru.Len())

cachedAddr, ok := tt.lru.Get(tt.address)
assert.True(t, ok)
assert.Equal(t, addr, cachedAddr)

accAddr, err := ac.BytesToString(addr)
assert.NoError(t, err)
assert.Equal(t, 2, tt.lru.Len())

cachedStrAddr, ok := tt.lru.Get(cached.codec.Bech32Prefix + conv.UnsafeBytesToStr(addr))
assert.True(t, ok)
assert.Equal(t, accAddr, cachedStrAddr)
})
}
}
JulianToledano marked this conversation as resolved.

func TestMultipleBech32Codec(t *testing.T) {
JulianToledano marked this conversation as resolved.
cosmosAc, ok := NewBech32Codec("cosmos").(cachedBech32Codec)
assert.True(t, ok)
stakeAc, ok := NewBech32Codec("stake").(cachedBech32Codec)
assert.True(t, ok)
assert.Equal(t, cosmosAc.cache, stakeAc.cache)

addr := make([]byte, 32)
_, err := rand.Read(addr)
assert.NoError(t, err)

cosmosAddr, err := cosmosAc.BytesToString(addr)
assert.NoError(t, err)
stakeAddr, err := stakeAc.BytesToString(addr)
assert.NoError(t, err)
assert.NotEqual(t, cosmosAddr, stakeAddr)

cachedCosmosAddr, err := cosmosAc.BytesToString(addr)
assert.NoError(t, err)
assert.Equal(t, cosmosAddr, cachedCosmosAddr)

cachedStakeAddr, err := stakeAc.BytesToString(addr)
assert.NoError(t, err)
assert.Equal(t, stakeAddr, cachedStakeAddr)
}

func TestBech32CodecRace(t *testing.T) {
ac := NewBech32Codec("cosmos")
myAddrBz := []byte{0x1, 0x2, 0x3, 0x4, 0x5}

var (
wgStart, wgDone sync.WaitGroup
errCount atomic.Uint32
)
const n = 3
wgStart.Add(n)
wgDone.Add(n)
for i := 0; i < n; i++ {
go func() {
wgStart.Done()
wgStart.Wait() // wait for all routines started

got, err := ac.BytesToString(myAddrBz)
if err != nil || got != "cosmos1qypqxpq9dc9msf" {
errCount.Add(1)
}
wgDone.Done()
}()
}
wgDone.Wait() // wait for all routines completed
assert.Equal(t, uint32(0), errCount.Load())
}
35 changes: 35 additions & 0 deletions codec/address/bench_test.go
@@ -0,0 +1,35 @@
package address

import (
"testing"

"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"

"cosmossdk.io/core/address"
)

func BenchmarkCodecWithCache(b *testing.B) {
cdc := NewBech32Codec("cosmos")
bytesToString(b, cdc)
}

func BenchmarkCodecWithoutCache(b *testing.B) {
cdc := Bech32Codec{Bech32Prefix: "cosmos"}
bytesToString(b, cdc)
}

func bytesToString(b *testing.B, cdc address.Codec) {
b.Helper()
addresses, err := generateAddresses(10)
require.NoError(b, err)

b.ReportAllocs()
b.ResetTimer()

for i := 0; i < b.N; i++ {
_, err := cdc.BytesToString(addresses[i%len(addresses)])
assert.NoError(b, err)
}
}
66 changes: 66 additions & 0 deletions codec/address/fuzz_test.go
@@ -0,0 +1,66 @@
package address

import (
"errors"
"testing"

"github.com/stretchr/testify/require"

"cosmossdk.io/core/address"

sdkAddress "github.com/cosmos/cosmos-sdk/types/address"
)

func FuzzCachedAddressCodec(f *testing.F) {
if testing.Short() {
f.Skip()
}

addresses, err := generateAddresses(2)
require.NoError(f, err)

for _, addr := range addresses {
f.Add(addr)
}
cdc := NewBech32Codec("cosmos")

f.Fuzz(func(t *testing.T, addr []byte) {
checkAddress(t, addr, cdc)
})
}

func FuzzAddressCodec(f *testing.F) {
if testing.Short() {
f.Skip()
}
addresses, err := generateAddresses(2)
require.NoError(f, err)

for _, addr := range addresses {
f.Add(addr)
}

cdc := Bech32Codec{Bech32Prefix: "cosmos"}

f.Fuzz(func(t *testing.T, addr []byte) {
checkAddress(t, addr, cdc)
})
}

func checkAddress(t *testing.T, addr []byte, cdc address.Codec) {
t.Helper()
if len(addr) > sdkAddress.MaxAddrLen {
return
}
strAddr, err := cdc.BytesToString(addr)
if err != nil {
t.Fatal(err)
}
b, err := cdc.StringToBytes(strAddr)
if err != nil {
if !errors.Is(err, errEmptyAddress) {
t.Fatal(err)
}
}
require.Equal(t, len(addr), len(b))
}