Support parallel compilation of all input files #1382

Open
xermicus opened this issue Jun 23, 2023 · 6 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@xermicus
Contributor

Input files can be compiled in parallel.

  • Implement this for solang compile
  • Occurrences of parallel solang compile in our CI jobs are no longer needed
@xermicus xermicus added enhancement New feature or request good first issue Good for newcomers labels Jun 23, 2023
@seanyoung
Contributor

I think a few things are needed.

  • The FileResolver should be wrapped in a std::sync::RwLock
  • Optionally, the parse tree should be cached in the FileResolver, so we don't waste cycles reparsing the same file (e.g. files that are imported)
  • Files should be processed in a thread worker pool fashion
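
As a rough illustration of the three points above, here is a minimal Rust sketch. It is not Solang's code: `ParseTree`, `CachingResolver`, and `compile_all` are hypothetical names, the "parser" only counts lines, and the worker pool is built from `std::thread::scope` rather than any crate Solang may actually use.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, RwLock};
use std::thread;

/// Hypothetical stand-in for a parsed source unit (not Solang's real type).
#[derive(Debug)]
pub struct ParseTree {
    pub path: String,
    pub line_count: usize,
}

/// Hypothetical resolver that caches parse trees behind an RwLock.
#[derive(Default)]
pub struct CachingResolver {
    cache: RwLock<HashMap<String, Arc<ParseTree>>>,
}

impl CachingResolver {
    /// Return the cached tree for `path`, or "parse" `source` and cache it.
    pub fn parse(&self, path: &str, source: &str) -> Arc<ParseTree> {
        // Fast path: a shared read lock, released at the end of this statement.
        let cached = self.cache.read().unwrap().get(path).cloned();
        if let Some(tree) = cached {
            return tree;
        }
        // Slow path: do the work without holding any lock, then publish it.
        let parsed = Arc::new(ParseTree {
            path: path.to_string(),
            line_count: source.lines().count(),
        });
        let mut cache = self.cache.write().unwrap();
        // If another thread won the race, keep its tree and drop ours.
        let tree = Arc::clone(cache.entry(path.to_string()).or_insert(parsed));
        tree
    }

    /// Number of distinct files currently cached.
    pub fn cached_files(&self) -> usize {
        self.cache.read().unwrap().len()
    }
}

/// Process all `(path, source)` inputs on `workers` threads.
/// The order of the returned trees is not deterministic.
pub fn compile_all(
    resolver: &CachingResolver,
    inputs: &[(String, String)],
    workers: usize,
) -> Vec<Arc<ParseTree>> {
    let next = AtomicUsize::new(0); // index of the next unclaimed input
    let results = Mutex::new(Vec::new());
    thread::scope(|s| {
        for _ in 0..workers.max(1) {
            s.spawn(|| loop {
                let index = next.fetch_add(1, Ordering::Relaxed);
                let Some((path, source)) = inputs.get(index) else { break };
                let tree = resolver.parse(path, source);
                results.lock().unwrap().push(tree);
            });
        }
    });
    results.into_inner().unwrap()
}
```

Calling `compile_all(&resolver, &inputs, 4)` with the same path listed twice yields two results but only one cache entry, which is the saving the caching bullet is after.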

@LucasSte
Contributor

I think this issue is a bit more difficult than it looks. If one file depends on another, they cannot be built in parallel, due to dependency resolution in sema. At least the parser and the lexer can run in parallel.

@seanyoung
Contributor

> I think this issue is a bit more difficult than it looks. If one file depends on another, they cannot be built in parallel, due to dependency resolution in sema. At least the parser and the lexer can run in parallel.

I don't understand what you mean. What do you see as a problem?

@LucasSte
Contributor

LucasSte commented Jun 27, 2023

@seanyoung Consider this case:

file A.sol:

contract A { ... }

file B.sol:

contract B is A { ... }

file C.sol:

contract C {
    A other;
    function foo(address addr) external {
        other = new A{address: addr}();
    }
}

I can invoke Solang using: solang compile --target Solana A.sol B.sol C.sol

File B.sol depends on A.sol. Semantic analysis of B can only happen after contract A is fully resolved, even though they might generate different binaries. Parallel compilation of A and B is not possible.

Contract C in C.sol also needs contract A to be resolved. In addition, the Solana account collection in codegen expects the CFGs of all contracts to be ready in order to collect the accounts for function foo. Again, parallel compilation of C and A is not possible.

The way I see it, we can either enable parallel compilation and let the compiler do repeated work in these cases (e.g. resolve A.sol solely for B.sol in one thread to generate B's binary, while A.sol is built in another thread to generate A's binary), or construct a dependency tree to identify what can be parallelized and use many synchronization mechanisms throughout the code to make this work.
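
To make the dependency-tree option concrete, here is a small Rust sketch that groups files into "waves" using Kahn-style topological layering. It only illustrates the idea and does not reflect how Solang resolves imports; `parallel_waves` and the hand-written dependency map are hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Group files into waves: each file depends only on files in earlier
/// waves, so the files inside one wave could be processed in parallel.
///
/// `deps` maps every file to the files it depends on. Returns `None` if
/// there is a cycle or a dependency that is not itself a key of `deps`.
pub fn parallel_waves<'a>(
    deps: &BTreeMap<&'a str, Vec<&'a str>>,
) -> Option<Vec<Vec<String>>> {
    // Unresolved dependencies per file that has not been scheduled yet.
    let mut remaining: BTreeMap<&'a str, BTreeSet<&'a str>> = deps
        .iter()
        .map(|(file, d)| (*file, d.iter().copied().collect()))
        .collect();

    let mut waves = Vec::new();
    while !remaining.is_empty() {
        // Files whose dependencies are all resolved are ready now.
        let ready: Vec<&'a str> = remaining
            .iter()
            .filter(|(_, d)| d.is_empty())
            .map(|(file, _)| *file)
            .collect();
        if ready.is_empty() {
            return None; // nothing can make progress
        }
        for file in &ready {
            remaining.remove(file);
        }
        // Mark the files scheduled in this wave as resolved for the rest.
        for d in remaining.values_mut() {
            for file in &ready {
                d.remove(file);
            }
        }
        waves.push(ready.iter().map(|file| file.to_string()).collect());
    }
    Some(waves)
}
```

For the three files above (B.sol and C.sol each depending on A.sol), this puts A.sol in the first wave and B.sol and C.sol together in the second.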

@seanyoung
Contributor

> File B.sol depends on A.sol. Semantic analysis of B can only happen after contract A is fully resolved, even though they might generate different binaries.

This is not how Solang works and it could never work that way.

Each file on the command line gets a new Namespace. When a file is imported, we call sema (recursively) with the existing namespace and then walk the parse tree of the imported file. So, the parse tree for the same file can be used concurrently in different threads.

You are suggesting that when B.sol imports A.sol, it should use the Namespace of A.sol rather than the parse tree. That would be wrong and would lead to incorrect compilation. Each import needs to go through sema for its own Namespace.

There are global things like user defined types which could have different definitions in different files. When you then import another file, that imported file needs to use the correct global definitions.

So, sema and the following stages can run in parallel. Since we're using an LALR grammar, the parser stage should be pretty fast, so I suspect parallelizing it will make little difference.
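
The model described above can be sketched in Rust roughly as follows. This is a simplified illustration, not Solang's implementation: `SourceUnit`, `Namespace`, `sema`, and `compile_inputs` are hypothetical stand-ins, and "sema" here only records contract names. The point is that the parse trees are shared read-only, while every command-line input gets its own namespace on its own thread.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

/// Hypothetical parse tree: the contracts a file declares and its imports.
pub struct SourceUnit {
    pub contracts: Vec<String>,
    pub imports: Vec<String>,
}

/// Parse trees keyed by file name, shared read-only between threads.
pub type Trees = HashMap<String, Arc<SourceUnit>>;

/// Hypothetical per-input namespace (not Solang's real `Namespace`).
#[derive(Debug, PartialEq)]
pub struct Namespace {
    pub root: String,
    pub contracts: Vec<String>,
}

/// Walk the parse tree of `file` into `ns`, resolving its imports first.
/// Panics if `file` is missing from `trees`; import cycles are not handled.
fn sema(file: &str, trees: &Trees, ns: &mut Namespace) {
    let unit = &trees[file];
    for import in &unit.imports {
        sema(import, trees, ns);
    }
    ns.contracts.extend(unit.contracts.iter().cloned());
}

/// Run "sema" for each input file on its own thread, one namespace per input.
/// The result order matches the order of `inputs`.
pub fn compile_inputs(inputs: &[&str], trees: &Trees) -> Vec<Namespace> {
    thread::scope(|s| {
        let handles: Vec<_> = inputs
            .iter()
            .map(|file| {
                s.spawn(move || {
                    let mut ns = Namespace {
                        root: file.to_string(),
                        contracts: Vec::new(),
                    };
                    sema(file, trees, &mut ns);
                    ns
                })
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

With A.sol and B.sol as inputs, where B.sol imports A.sol, contract A ends up in both namespaces: it is resolved once per namespace, but the parse tree of A.sol is built only once and shared.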

@LucasSte
Contributor

> So, sema and the following stages can run in parallel. Since we're using an LALR grammar, the parser stage should be pretty fast, so I suspect parallelizing it will make little difference.

I apologize. I wasn't aware that Solang worked that way. By building both A.sol and B.sol, the compiler is doing repeated work resolving contract A, isn't it? Shouldn't we resolve A only once?
