perf: maintain a cache of block information in ChainStore #12997

Open
nagisa opened this issue Feb 25, 2025 · 2 comments

Comments

@nagisa
Collaborator

nagisa commented Feb 25, 2025

Currently all ChainStore queries such as get_block_header, get_block_hash_by_height, get_block_height, get_block_header_on_chain_by_height and so on go straight to rocksdb.

In practice, most of the queries will involve just the couple dozen most recent heights [1], so in theory it should be possible to improve certain workloads significantly. If ChainStore maintained deserialized data for these recent blocks in warm memory, these queries could avoid a trip through the rocksdb layers (all the way to its block cache), the deserialization overhead, etc.
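A minimal sketch of what such a cache could look like, assuming a fixed window of recent heights. `RecentBlocks`, `HeaderInfo`, `BlockHash` and `get_header` are hypothetical names made up for illustration, not existing nearcore types, and the sketch keeps only one block per height (no fork handling):

```rust
use std::collections::{BTreeMap, HashMap};

// Hypothetical stand-ins; the real ChainStore types carry much more data.
type BlockHash = [u8; 32];

#[derive(Clone, Debug, PartialEq)]
struct HeaderInfo {
    hash: BlockHash,
    height: u64,
}

/// Keeps deserialized info for at most `capacity` of the highest heights seen.
struct RecentBlocks {
    capacity: usize,
    by_height: BTreeMap<u64, BlockHash>,
    by_hash: HashMap<BlockHash, HeaderInfo>,
}

impl RecentBlocks {
    fn new(capacity: usize) -> Self {
        Self { capacity, by_height: BTreeMap::new(), by_hash: HashMap::new() }
    }

    /// Record a block, then evict the lowest heights beyond `capacity`.
    fn insert(&mut self, info: HeaderInfo) {
        // If another block was cached at this height, drop it first.
        if let Some(old) = self.by_height.insert(info.height, info.hash) {
            if old != info.hash {
                self.by_hash.remove(&old);
            }
        }
        self.by_hash.insert(info.hash, info);
        while self.by_height.len() > self.capacity {
            if let Some((_, hash)) = self.by_height.pop_first() {
                self.by_hash.remove(&hash);
            }
        }
    }

    fn hash_by_height(&self, height: u64) -> Option<BlockHash> {
        self.by_height.get(&height).copied()
    }

    fn header(&self, hash: &BlockHash) -> Option<&HeaderInfo> {
        self.by_hash.get(hash)
    }
}

/// Serve from the cache when possible, otherwise fall back to the database.
/// `load_from_db` stands in for the existing rocksdb-backed lookup.
fn get_header(
    cache: &RecentBlocks,
    hash: &BlockHash,
    load_from_db: impl Fn(&BlockHash) -> Option<HeaderInfo>,
) -> Option<HeaderInfo> {
    cache.header(hash).cloned().or_else(|| load_from_db(hash))
}
```

The height index is an ordered map so that eviction can always drop the oldest height cheaply. A real implementation would also have to decide how to treat forks and when to populate the cache (on block processing versus on first read).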

A particular workload where such queries appear to have a significant overhead is a native token transfer benchmark with a lower number of accounts. A profile of such a workload can be seen here.

Footnotes

  1. Verify this theory, both in a benchmark and on a real node! I imagine we could log how far behind HEAD each query reaches and then plot a frequency graph. My intuition is that it should take the shape of a normal distribution bell curve.
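The measurement described in the footnote could be sketched roughly as follows. `DepthHistogram` and its methods are hypothetical names, and how the counts would be exported (logs, metrics) is left open:

```rust
use std::collections::BTreeMap;

/// Counts how many queries reached a given number of blocks behind HEAD.
#[derive(Default)]
struct DepthHistogram {
    counts: BTreeMap<u64, u64>,
    total: u64,
}

impl DepthHistogram {
    /// Record one query for `queried_height` while the head is at `head_height`.
    fn record(&mut self, head_height: u64, queried_height: u64) {
        // Queries at or above the head are counted as depth 0.
        let depth = head_height.saturating_sub(queried_height);
        *self.counts.entry(depth).or_insert(0) += 1;
        self.total += 1;
    }

    /// Fraction of recorded queries that were at most `window` blocks behind
    /// HEAD, i.e. the share a cache of that window size could have served.
    fn fraction_within(&self, window: u64) -> f64 {
        if self.total == 0 {
            return 0.0;
        }
        let hits: u64 = self.counts.range(..=window).map(|(_, c)| *c).sum();
        hits as f64 / self.total as f64
    }
}
```

Dumping `counts` gives the data for the frequency graph, and `fraction_within` gives a quick estimate of the hit rate for a candidate cache size before any cache is built.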

@mooori
Contributor

mooori commented Feb 26, 2025

Can this lead to issues with stateless validation? For instance in the following scenario:

  • The node generating the state witness has Block B in its cache, so the corresponding storage access is not recorded.
  • B is not included in the state witness.
  • A chunk validator cannot process a transaction that requires info related to B.

If yes, one workaround might be clearing the cache before starting to produce a new state witness. Then the first access to B always goes through storage, but subsequent requests hit the cache.

@mooori
Contributor

mooori commented Feb 26, 2025

> Can this lead to issues with stateless validation?

It's not an issue because chunk validators maintain chain state by themselves and the related getters go through chain instead of state_witness (example).

Confirmed with @pugachAG that chunk validators can answer ChainStore::get_* queries by themselves without relying on the state witness, as long as they've processed the corresponding block.
