`lstat()` dominates in the case of small coverage samples #625

nedbat · 2017-12-22T23:13:46Z

Originally reported by Buck Evan (Bitbucket: bukzor, GitHub: bukzor)

The hypothesis library recently added coverage-led fuzzing, in which it needs to run a very short test many times, examining the coverage between trials. This (currently) involves many calls to coverage.Collector.save_data, which in turn causes many calls to realpath (and thus lstat). In the extreme case, lstat() ends up taking about 40% of the run time.

Can you please help me design a remedy? Some alternatives that I can think of:

1. Add a cache to files.abs_file.
2. Replace the call to abs_file with a call to files.canonical_filename, since canonical_filename already has a cache.
3. Delegate the filename-normalization responsibility from Collector to CoverageData, so that we can specialize CoverageData and fix this within our dependent library.
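For illustration, the first alternative above (a cache in front of abs_file) might look roughly like the sketch below. This is not coverage.py's code: `abs_file` here is a simplified stand-in that only calls `os.path.realpath` and `os.path.abspath`, and `cached_abs_file` is a hypothetical name. It also assumes the working directory and symlinks do not change during the run, since a cached result would go stale otherwise.

```python
import functools
import os


def abs_file(path):
    """Simplified stand-in for files.abs_file (assumption, not the real code).

    realpath() resolves symlinks component by component, which is where
    the repeated lstat() calls come from.
    """
    return os.path.abspath(os.path.realpath(path))


@functools.lru_cache(maxsize=None)
def cached_abs_file(path):
    """Hypothetical memoized wrapper: each distinct path is resolved once."""
    return abs_file(path)


if __name__ == "__main__":
    # Repeated lookups of the same name only touch the filesystem once.
    for _ in range(1000):
        cached_abs_file("example.py")
    print(cached_abs_file.cache_info())
```

With many short trials over the same handful of files, nearly every lookup after the first would be a dictionary hit rather than a chain of lstat() calls.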

Bitbucket: https://bitbucket.org/ned/coveragepy/issue/625


nedbat · 2017-12-23T20:53:44Z

Option 2 seems like the simplest thing to do, and I don't see a downside.

nedbat · 2017-12-23T21:07:19Z

Hmm, actually, canonical_filename searches for relative filenames on sys.path... I wish I understood better why it needs to do that.

nedbat · 2017-12-29T15:36:29Z

Original comment by Buck Evan (Bitbucket: bukzor, GitHub: bukzor)

I believe that's the semantics of a module with a relative __file__?

nedbat · 2018-05-14T18:20:06Z

This should be fixed with commit 44f0e230c68e611c9edfdf28fbdad73dd502afe5 (bb).

nedbat closed this as completed May 14, 2018

nedbat added the major and enhancement labels Jun 23, 2018
