Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial analysis is slow for diesel #4186

Closed
weiznich opened this issue Apr 28, 2020 · 24 comments
Closed

Initial analysis is slow for diesel #4186

weiznich opened this issue Apr 28, 2020 · 24 comments
Labels
S-actionable Someone could pick this issue up and work on it right now

Comments

@weiznich
Copy link
Contributor

Now with #4053 fixed I've tried the new weekly release (rust-analyzer 90f8378) on diesel again. It does work again, but the initial analysis needs now a couple of minutes (maybe 10 or so, didn't measure it exactly) of time, till completion is working and all type hints are shown. As this worked much faster on the previous version (rust-analyzer c388130) I've used I would consider this as performance regression.

Pinging @flodiebold as this is probably caused by new Chalk recursive solver in the 2020-04-20 release.

@flodiebold
Copy link
Member

How long does rust-analyzer analysis-stats . take for you in the diesel repository? I don't remember it taking that long with the fix 🤔

@weiznich
Copy link
Contributor Author

I've tried to run rust-analyzer analysis-stats . on the diesel repo. I've aborted the command after more than 2 hours, after it seemed to be stuck with the following message for more than 1 hour:

rust-analyzer --version
rust-analyzer 90f8378
rust-analyzer analysis-stats .                         
[ERROR ra_project_model] cyclic dependency diesel -> diesel_derives
Database loaded, 156 roots, 293.606385ms
Crates in this dir: 46
Total modules found: 602
Total declarations: 8406
Total functions: 3688
Item Collection: 18.829232898s, 0b allocated 0b resident
293/3688 7% processing: insert::insert_only_default_values_with_returning^C
rust-analyzer analysis-stats .  8080,46s user 17,49s system 99% cpu 2:15:38,94 total

(My old version does not complete this command either, it crashes with a trace similar to that one in #4053)

Steps to reproduce:

git clone https://github.com/diesel-rs/diesel
git checkout be0854094c79ae99ba3c893242a4be358ac0b0c5
cd diesel/diesel
rust-analyzer analysis-stats .

@lnicola
Copy link
Member

lnicola commented Apr 28, 2020

On my system it gets a bit further:

[ERROR ra_project_model] cyclic dependency diesel -> diesel_derives
Database loaded, 156 roots, 20.730520662s
Crates in this dir: 46
Total modules found: 602
Total declarations: 8406
Total functions: 3688
Item Collection: 16.365732395s, 0b allocated 0b resident
1656/3688 44% processing: types::test_chrono_types_sqlite^C                                                                                      

A quick profile:

8.94%  libc-2.31.so      [.] _int_malloc
8.02%  libc-2.31.so      [.] _int_free
7.23%  rust-analyzer     [.] <chalk_ir::Ty<I> as chalk_ir::fold::SuperFold<I,TI>>::super_fold_with
4.77%  libc-2.31.so      [.] malloc
4.71%  rust-analyzer     [.] <alloc::vec::Vec<T> as alloc::vec::SpecExtend<T,I>>::from_iter
3.02%  libc-2.31.so      [.] malloc_consolidate
2.47%  rust-analyzer     [.] chalk_ir::fold::boring_impls::<impl chalk_ir::fold::Fold<I,TI> for chalk_ir::Parameter<I>>::fold_with
2.44%  rust-analyzer     [.] core::iter::adapters::process_results
2.32%  rust-analyzer     [.] <chalk_ir::Goal<I> as chalk_ir::fold::SuperFold<I,TI>>::super_fold_with
2.13%  rust-analyzer     [.] chalk_solve::recursive::Solver<I>::solve_goal
1.99%  libc-2.31.so      [.] realloc
1.69%  libc-2.31.so      [.] cfree@GLIBC_2.2.5
1.53%  rust-analyzer     [.] chalk_ir::_DERIVE_chalk_ir_fold_Fold_I_TI_FOR_WhereClause::<impl chalk_ir::fold::Fold<I,_TI> for chalk_ir::WhereClause<I>>::fold_with
1.35%  rust-analyzer     [.] core::sync::atomic::atomic_load
1.33%  rust-analyzer     [.] alloc::sync::Arc<T>::drop_slow
1.25%  rust-analyzer     [.] <salsa::derived::DerivedStorage<DB,Q,MP> as salsa::plumbing::QueryStorageOps<DB,Q>>::try_fetch
1.21%  rust-analyzer     [.] core::ptr::drop_in_place
1.04%  rust-analyzer     [.] <chalk_ir::Ty<I> as chalk_ir::visit::SuperVisit<I>>::super_visit_with
1.02%  libc-2.31.so      [.] __memmove_avx_unaligned_erms

@flodiebold
Copy link
Member

90f8378 is not the current weekly release, it's one week old and does not include the performance improvements.

@lnicola
Copy link
Member

lnicola commented Apr 28, 2020

@flodiebold it's still slow for me (on 5671bac)

@flodiebold flodiebold reopened this Apr 28, 2020
@flodiebold
Copy link
Member

So I've looked into the overflow crash we sometimes have in Diesel, and it's caused by these impls:

                impl<DB: Backend> QueryFragment<DB> for star where
                    <table as QuerySource>::FromClause: QueryFragment<DB>,

of which there are a lot. Currently when we encounter that where clause, it causes us to try to enumerate all types that could implement QueryFragment. So that's probably also a cause of the slowness in Diesel.

@weiznich
Copy link
Contributor Author

weiznich commented May 5, 2020

Is there anything that could done here from diesels side to improve the situation?
Those code fragments are generated by the diesel::table! macro and are part of most applications that use diesel. In fact for each database table one of those definitions is generated, each for a distinct pair of star and table types. So I would assume the analysis time is directly correlated to the number of table! call?

@flodiebold
Copy link
Member

I think they are basically fine, we're just handling them a bit wrong. There's a WIP PR in Chalk that might fix this.

@pksunkara
Copy link
Contributor

This has definitely improved per my observations. It only takes 5 mins for RA to be ready now compared to 30 mins from before.

@lnicola lnicola added the S-unactionable Issue requires feedback, design decisions or is blocked on other work label Jan 22, 2021
@pksunkara
Copy link
Contributor

This has regressed since this week's release. RA process is stuck at 100% when indexing diesel. (Ran it for more than 30 mins).

@weiznich
Copy link
Contributor Author

weiznich commented Mar 9, 2021

@edwin0cheng Can this probably be caused by #7513? According to my last investigations with the rustc macro by example performance this seems to be a main bottle neck for diesel. (See here and here.
@lnicola What is required to make this actionable again. It's fine to apply such labels, but you should maybe leave something for the issue creator there to get this going again. Something like: This is blocked by xyz or we do need a more minimal example to reproduce that or whatever.

As another note: It may be useful to run some sort of regular benchmark on a set of crates that are known to stress rustc/rust-analyzer to prevent/detect such regressions earlier. Maybe it's even possible to use the rustc-perf-testsuite for this as this is already a collection of crates with similar problems.

@edwin0cheng
Copy link
Member

edwin0cheng commented Mar 9, 2021

Yes, it is infact caused by #7513, I am trying to figure out what happened now.

[Update]
It stuck on https://github.com/diesel-rs/diesel/blob/493541ddcecfb110814681ae83289adb149ee253/diesel/src/type_impls/tuples.rs#L502, and Um.. this macro call is huge. https://github.com/diesel-rs/diesel/blob/493541ddcecfb110814681ae83289adb149ee253/diesel/src/macros/tuples.rs and I need sometimes to investigate.

@lnicola
Copy link
Member

lnicola commented Mar 9, 2021

@weiznich at that point the chalk issue was more or less fixed and we weren't aware of other specific problems. But you're right, I'll try to comment when marking issues as such.

It may be useful to run some sort of regular benchmark on a set of crates that are known to stress rustc/rust-analyzer to prevent/detect such regressions earlier. Maybe it's even possible to use the rustc-perf-testsuite for this as this is already a collection of crates with similar problems.

We actually have that, but only for two or three crates: https://rust-analyzer.github.io/metrics/. They're gathered on GitHub Actions (we don't have other infra) and they're pretty noisy as-is. I don't think #7513 ended up in there, but it wouldn't have been visible anyway (the big bump you see was an unrelated PR). I actually wanted to add diesel and maybe some blockchain projects to it, but if analysis takes 30 minutes, we probably can't afford it.

@lnicola lnicola added S-actionable Someone could pick this issue up and work on it right now and removed S-unactionable Issue requires feedback, design decisions or is blocked on other work labels Mar 9, 2021
@pksunkara
Copy link
Contributor

Before yesterday's release, analysis of diesel was only taking around 5 mins. I think it should be okay adding to benchmark suite with that in mind.

@weiznich
Copy link
Contributor Author

weiznich commented Mar 9, 2021

We actually have that, but only for two or three crates: https://rust-analyzer.github.io/metrics/. They're gathered on GitHub Actions (we don't have other infra) and they're pretty noisy as-is. I don't think #7513 ended up in there, but it wouldn't have been visible anyway (the big bump you see was an unrelated PR). I actually wanted to add diesel and maybe some blockchain projects to it, but if analysis takes 30 minutes, we probably can't afford it.

Can you provide a link to the actual github action running those benchmarks? I assume they are not really time critical, right? So adding more benchmark crates should be fine as long as the queue does not grow continuously. It might be even OK to just run certain benchmarks once a day or something like that detect large regressions as this one + have some way to run them for certain specific PR's. I assume most PR's do not change the performance that much, but a rewrite of some "core" component is certainly something that should maybe have a separate benchmark run. I've setup a similar "on-demand" benchmark action for diesel, so I can provide at least some knowledge on how to do that.

@lnicola
Copy link
Member

lnicola commented Mar 9, 2021

It happens in https://github.com/rust-analyzer/rust-analyzer/blob/master/.github/workflows/metrics.yaml, which points to https://github.com/rust-analyzer/rust-analyzer/blob/master/xtask/src/metrics.rs.

I'm not sure what the free GitHub Actions budget is, but I assume that it stops working or gets throttled after a certain number of minutes. We already take about an hour for the nightly release artifacts -- multiplied by ten platforms --, and spending say half an hour more on each pull request would be a lot, since we merge about 30-60 of them each week. That's in addition to running the tests on PRs and bors r+ (about 10-15 minutes each).

As for running these only on-demand, I would prefer to avoid it. I can probably donate some CPU time on hardware I own to keep running them on every PR.

@weiznich
Copy link
Contributor Author

weiznich commented Mar 9, 2021

According to this github documentation the limit for the free tier are 20 concurrent jobs a max 6h per job. So from that point of view there would likely be quite a bit of room for additional benchmarks left.

@lnicola
Copy link
Member

lnicola commented Mar 9, 2021

The limit seems to be either 2000 minutes per month or unlimited: https://github.com/pricing. It says "free for public repositories" in the table, but not if you scroll down to "Code workflow".

@pksunkara
Copy link
Contributor

That limit is only for private repos.

@flodiebold
Copy link
Member

@edwin0cheng can we make an MBE benchmark out of those macro calls?

@weiznich
Copy link
Contributor Author

Probably interesting candidates form diesel:

  • Expanding the table! macro. This one is likely to be used by quite a lot of third party crates which depend on diesel to setup their schema definition.
  • This macro call in diesel itself. It is used to generate a lot of trait impls for tuples of various sizes, via this and this macro. You can indirectly control the complexity here by the x-columns-table feature from diesel (with x equals on of 32, 64 or 128, which basically controls the max tuple size)

@edwin0cheng
Copy link
Member

@edwin0cheng can we make an MBE benchmark out of those macro calls?

One of problems of making these macro calls inside MBE benchmark are, they are all recursive macro calls, which MBE itself does not handle.

I would recommend adding another macro benchmark in ide level instead.

@edwin0cheng
Copy link
Member

I am currently testing with this which is much smaller to work with.

32-column-tables takes around 2 mins for rust-analyzer analysis-stats but 64-column-tables just is stuck.

bors bot added a commit that referenced this issue Mar 13, 2021
7994: Speed up mbe matching in heavy recursive cases r=edwin0cheng a=edwin0cheng

In some cases (e.g.  #4186), mbe matching is very slow due to a lot of copy and allocation for bindings, this PR try to solve this problem by introduce a semi "link-list" approach for bindings building.

I used this [test case](https://github.com/weiznich/minimal_example_for_rust_81262) (for `features(32-column-tables)`) to run following command to benchmark:
```
time rust-analyzer analysis-stats  --load-output-dirs ./ 
```

Before this PR : 2 mins
After this PR: 3 seconds.

However, for 64-column-tables cases, we still need 4 mins to complete. 

I will try to investigate in the following weeks.

bors r+




Co-authored-by: Edwin Cheng <[email protected]>
@edwin0cheng
Copy link
Member

@weiznich I tested #7994 for diesel and it seem to not stuck again, would you mind to try it in your machine ?

bors bot added a commit that referenced this issue Mar 13, 2021
7994: Speed up mbe matching in heavy recursive cases r=edwin0cheng a=edwin0cheng

In some cases (e.g.  #4186), mbe matching is very slow due to a lot of copy and allocation for bindings, this PR try to solve this problem by introduce a semi "link-list" approach for bindings building.

I used this [test case](https://github.com/weiznich/minimal_example_for_rust_81262) (for `features(32-column-tables)`) to run following command to benchmark:
```
time rust-analyzer analysis-stats  --load-output-dirs ./ 
```

Before this PR : 2 mins
After this PR: 3 seconds.

However, for 64-column-tables cases, we still need 4 mins to complete. 

I will try to investigate in the following weeks.

bors r+




Co-authored-by: Edwin Cheng <[email protected]>
@Veykril Veykril closed this as completed Jan 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-actionable Someone could pick this issue up and work on it right now
Projects
None yet
Development

No branches or pull requests

6 participants