Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[stdlib][feature request] use Tuple[K,V] instead of DictEntry[K,V] #4138

Closed
wants to merge 2 commits into from

Conversation

illiasheshyn
Copy link
Contributor

Signed-off-by: Illia Sheshyn [email protected]

as per #2554

Changes

  • Dict type owns computed key hashes
  • removes usages and doc mentions of DictEntry type

self.size -= 1
return entry_value^.reap_value()
return entry_value[1]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't actually think this is right, but I haven't learnt Mojo to the extent of refactoring this bit correctly.

@gryznar
Copy link
Contributor

gryznar commented Mar 11, 2025

What is the impact of this change in terms of performance?

@martinvuyk
Copy link
Contributor

What is the impact of this change in terms of performance?

I also want to know that. FYI public benchmarking script is currently broken, so you will need to use the trick from #4054 for now

@illiasheshyn
Copy link
Contributor Author

I'll run the benchmarks 👍 Does the benchmarking script run with the wrong stdlib without the patch @martinvuyk ?

Also what's the approximate runtime to execute all 9 benchmarks? For reference, I'm on M1.

@martinvuyk
Copy link
Contributor

martinvuyk commented Mar 11, 2025

Does the benchmarking script run with the wrong stdlib without the patch @martinvuyk ?

yes It currently uses the packaged stdlib instead of the in-tree one

Also what's the approximate runtime to execute all 9 benchmarks? For reference, I'm on M1.

No idea, never timed the dict benchmarks. You can try it with fewer dict sizes first 🤷‍♂️

@illiasheshyn
Copy link
Contributor Author

illiasheshyn commented Mar 12, 2025

After running the benchmarks I no longer think this is a good idea.

main:

-- Testing: 1 tests, 1 workers --
Testing: 
PASS: Mojo Standard Library Benchmarks :: collections/bench_dict.mojo (1 of 1)
Exit Code: 0

Command Output (stdout):
--
Running bench_dict_init
Running bench_dict_insert[10]
Running bench_dict_lookup[10]
Running bench_dict_insert[30]
Running bench_dict_lookup[30]
Running bench_dict_insert[50]
Running bench_dict_lookup[50]
Running bench_dict_insert[100]
Running bench_dict_lookup[100]
Running bench_dict_insert[1000]
Running bench_dict_lookup[1000]
Running bench_dict_insert[10000]
Running bench_dict_lookup[10000]
--------------------------------------------------------------
| name                     | met (ms)              | iters   |
--------------------------------------------------------------
| bench_dict_init          | 0.11664069999999999   | 10000   |
| bench_dict_insert[10]    | 0.0023127957514038847 | 518383  |
| bench_dict_lookup[10]    | 0.0004064344464499991 | 2955126 |
| bench_dict_insert[30]    | 0.0015739644349956289 | 762941  |
| bench_dict_lookup[30]    | 0.0003881338210539945 | 3089200 |
| bench_dict_insert[50]    | 0.0020235444374895606 | 592709  |
| bench_dict_lookup[50]    | 0.0004001546309666075 | 2989052 |
| bench_dict_insert[100]   | 0.002014405277635384  | 596934  |
| bench_dict_lookup[100]   | 0.0004345849029736721 | 2759921 |
| bench_dict_insert[1000]  | 0.0017241858515323965 | 696556  |
| bench_dict_lookup[1000]  | 0.0003855355258063531 | 3112315 |
| bench_dict_insert[10000] | 0.0031414045873729016 | 380610  |
| bench_dict_lookup[10000] | 0.0003457493817437144 | 3410398 |
--------------------------------------------------------------

"bench_dict_memory_size[ 10 ]", 576 ,0
"bench_dict_memory_size[ 30 ]", 2160 ,0
"bench_dict_memory_size[ 50 ]", 4272 ,0
"bench_dict_memory_size[ 100 ]", 8752 ,0
"bench_dict_memory_size[ 1000 ]", 69680 ,0
"bench_dict_memory_size[ 10000 ]", 557104 ,0

--
Command Output (stderr):
--
RUN: at line 13: mojo /Users/illiasheshyn/projects/main/max/mojo/stdlib/benchmarks/collections/bench_dict.mojo
+ mojo /Users/illiasheshyn/projects/main/max/mojo/stdlib/benchmarks/collections/bench_dict.mojo

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 

Testing Time: 755.38s

Total Discovered Tests: 1
  Passed: 1 (100.00%)

vs the current branch:

-- Testing: 1 tests, 1 workers --
Testing: 
PASS: Mojo Standard Library Benchmarks :: collections/bench_dict.mojo (1 of 1)
Exit Code: 0

Command Output (stdout):
--
Running bench_dict_init
Running bench_dict_insert[10]
Running bench_dict_lookup[10]
Running bench_dict_insert[30]
Running bench_dict_lookup[30]
Running bench_dict_insert[50]
Running bench_dict_lookup[50]
Running bench_dict_insert[100]
Running bench_dict_lookup[100]
Running bench_dict_insert[1000]
Running bench_dict_lookup[1000]
Running bench_dict_insert[10000]
Running bench_dict_lookup[10000]
--------------------------------------------------------------
| name                     | met (ms)              | iters   |
--------------------------------------------------------------
| bench_dict_init          | 0.1753140274050007    | 7079    |
| bench_dict_insert[10]    | 0.0030153959273198573 | 401210  |
| bench_dict_lookup[10]    | 0.0005599123669446356 | 2182738 |
| bench_dict_insert[30]    | 0.0022467145505445105 | 532956  |
| bench_dict_lookup[30]    | 0.0005172997543559624 | 2316767 |
| bench_dict_insert[50]    | 0.002724266163198065  | 436439  |
| bench_dict_lookup[50]    | 0.0005368650139242661 | 2238179 |
| bench_dict_insert[100]   | 0.0026650843007495887 | 446378  |
| bench_dict_lookup[100]   | 0.0005356770767102103 | 2241201 |
| bench_dict_insert[1000]  | 0.0025914859420305596 | 460415  |
| bench_dict_lookup[1000]  | 0.0004987785432849825 | 2404506 |
| bench_dict_insert[10000] | 0.00419460891683742   | 283060  |
| bench_dict_lookup[10000] | 0.0004564153054408435 | 2643818 |
--------------------------------------------------------------

"bench_dict_memory_size[ 10 ]", 472 ,0
"bench_dict_memory_size[ 30 ]", 1672 ,0
"bench_dict_memory_size[ 50 ]", 3272 ,0
"bench_dict_memory_size[ 100 ]", 6728 ,0
"bench_dict_memory_size[ 1000 ]", 53320 ,0
"bench_dict_memory_size[ 10000 ]", 426056 ,0

--
Command Output (stderr):
--
RUN: at line 13: mojo /Users/illiasheshyn/projects/main/max/mojo/stdlib/benchmarks/collections/bench_dict.mojo
+ mojo /Users/illiasheshyn/projects/main/max/mojo/stdlib/benchmarks/collections/bench_dict.mojo

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 

Testing Time: 789.58s

Total Discovered Tests: 1
  Passed: 1 (100.00%)

Surfacing the DictEntry as a Tuple does not seem like a beneficial endeavor wrt runtime performance, however, if the goal is to be compatible with Python then I guess we can do that.

@gryznar
Copy link
Contributor

gryznar commented Mar 12, 2025

I don't think that sacrificing performance for Python compatibility is the way to go for language which aims to squeeze every bit of performance from hardware. I don't see sense of being compatible with Python but at moderate speed (which this compatibility will cause). In such case why to even bother to choose Mojo if there is already C++ for CPU or CUDA for GPU (not speaking about raw assembly) which do not care about Python ballast and will be way faster?

@martinvuyk
Copy link
Contributor

martinvuyk commented Mar 12, 2025

I also don't think it's worth sacrificing performance yet. Our dict is still slower than other implementations.

My hope is that in the future we get to control how a type is unpacked with a dunder method, and that way DictEntry can be unpacked with tuple syntax while still having .key and .value and having nice pythonic syntax.

I also think we could make _DictEntryIter.__next__() return a Tuple[K, V] without a pointer to be able to be unpacked. But I don't know what the mutability and perf. implications inside the loop itself that change would have. I think someone from Modular would have to chime in on this.

@gryznar
Copy link
Contributor

gryznar commented Mar 12, 2025

I'd love to see priorities as follow:

  1. SOTA performance keeping correctness
  2. Safety
  3. Language consistency (API not being thorn between C, Python and Mojo itself)
  4. Python compatibility

@soraros
Copy link
Contributor

soraros commented Mar 12, 2025

I'd like to know why there is any performance difference at all. I thought they would generate identical assembly.

@illiasheshyn
Copy link
Contributor Author

illiasheshyn commented Mar 12, 2025

I'd like to know why there is any performance difference at all. I thought they would generate identical assembly.

My understanding so far is the change in data locality. This code is doing an additional index for the key's hash, which is now in its own array, while the key and hash are very frequently read together.

I'll think about it some more, but the problem that I see would be constructing a tuple for the entry if I were to change how those values are stored within Dict

@soraros
Copy link
Contributor

soraros commented Mar 13, 2025

I see. I missed the part where the hash is no longer stored in the entry. So this PR is actually two changes combined: moving the hash out of line and converting the struct to a tuple. We need to be careful with the former I think.

@illiasheshyn
Copy link
Contributor Author

We need to be careful with the former I think.

@soraros yeah, that was just an idea I thought was easy to try out. Now that I see the performance impact I don't want to proceed with this change. Hope that the PR assists future discussions in some way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants