when building, use a temporary directory instead of dumping garbage in cwd #298

Closed
andrewrk opened this issue Apr 2, 2017 · 4 comments
@andrewrk
Member

andrewrk commented Apr 2, 2017

Currently when you invoke zig, it dumps .o files everywhere. Instead it should have a scratch space temporary directory where it can do this work.

When choosing a filename for temporary objects, one of two strategies should be chosen:

  • The file is cacheable, in which case Zig should compute a hash, look for an existing object with the hash, and use that, otherwise create it, and leave the cached file around for next time.
  • The file is not cacheable, in which case Zig should create a random filename so that it will be unique, and then delete the file at the end.
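
A minimal sketch of the two naming strategies, written in Python for illustration only. The helper names (cached_object_path, get_or_build, random_object_path), the ".o" suffix handling, and the choice of inputs to hash are assumptions, not anything Zig implements:

```python
import hashlib
import os
import secrets
import tempfile


def cached_object_path(scratch_dir, inputs):
    """Cacheable case: derive the filename from a hash of the inputs."""
    digest = hashlib.sha256()
    for item in inputs:
        data = item.encode("utf-8")
        # Length-prefix each input so ["ab", "c"] and ["a", "bc"] hash differently.
        digest.update(len(data).to_bytes(8, "little"))
        digest.update(data)
    return os.path.join(scratch_dir, digest.hexdigest() + ".o")


def get_or_build(scratch_dir, inputs, build):
    """Return (path, was_cached); build(path) runs only on a cache miss."""
    path = cached_object_path(scratch_dir, inputs)
    if os.path.exists(path):
        return path, True
    build(path)  # the file is left in place for next time
    return path, False


def random_object_path(scratch_dir):
    """Non-cacheable case: a random name; the caller deletes the file when done."""
    return os.path.join(scratch_dir, secrets.token_hex(16) + ".o")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        make_empty = lambda p: open(p, "wb").close()
        print(get_or_build(scratch, ["main.zig", "--strip"], make_empty)[1])
        print(get_or_build(scratch, ["main.zig", "--strip"], make_empty)[1])
```

In this sketch the second call with identical inputs finds the existing file and skips the build, while the random path is only ever used once.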
@andrewrk andrewrk added the enhancement Solving this issue will likely involve adding new logic or components to the codebase. label Apr 2, 2017
@andrewrk andrewrk added this to the 0.1.0 milestone Apr 2, 2017
@andrewrk
Member Author

andrewrk commented Apr 27, 2017

Here's my proposal for how this is going to work:

How the Cache Works

When you build an executable, library, or object, the output goes into a
cache directory. By default this is a directory named ./zig-cache/ in the
current working directory, but you can override this with the --cache-dir
command line argument. For the common automatically built object files,
builtin.o, compiler_rt.o, and test_runner.o, Zig passes a global path for
the cache dir, something like ~/.zig/cache/, or whatever the XDG standard is.
This way, the cache of these objects is correctly shared across multiple projects.

In addition to outputting the build artifacts, Zig computes metadata about what
it compiled. The metadata consists of:

  • Manifest: A list of all files and their mtimes at the time of compilation.
    • The Zig binary itself, and recursively all of its dynamic
      library dependencies.
    • The root source file, if any, and all of the source files that it
      imported.
    • Any files passed to @embedFile.
    • Any files read by libclang when using @cImport.
    • Object files passed in with --object.
    • Assembly files passed in with --assembly.
    • Linker script passed in with --linker-script.
    • The produced object, library, executable, or .h file.
  • Hash: A sha256 hash, which consists of:
    • The root source file name, if any.
    • Object file names passed with --object.
    • Assembly file names passed in with --assembly.
    • Linker script file name passed in with --linker-script.
    • All compile variables and their values.
    • --libc-include-dir [path] directory where libc stdlib.h resides
    • --name [name] override output name
    • --strip exclude debug symbols
    • --target-arch [name] specify target architecture
    • --target-environ [name] specify target environment
    • --target-os [name] specify target operating system
    • --zig-std-dir [path] directory where zig standard library resides
    • -dirafter [dir] same as -isystem but do it last
    • -isystem [dir] add additional search path for other .h files
    • --ar-path [path] set the path to ar
    • --dynamic-linker [path] set the path to ld.so
    • --each-lib-rpath add rpath for each used dynamic library
    • --libc-lib-dir [path] directory where libc crt1.o resides
    • --libc-static-lib-dir [path] directory where libc crtbegin.o resides
    • --library [lib] link against lib
    • --library-path [dir] add a directory to the library search path
    • -L[dir] alias for --library-path
    • -rdynamic add all symbols to the dynamic symbol table
    • -rpath [path] add directory to the runtime library search path
    • -mconsole (windows) --subsystem console to the linker
    • -mwindows (windows) --subsystem windows to the linker
    • -municode (windows) link with unicode
    • -framework [name] (darwin) link against framework
    • -mios-version-min [ver] (darwin) set iOS deployment target
    • -mlinker-version [ver] (darwin) override linker version
    • -mmacosx-version-min [ver] (darwin) set Mac OS X deployment target
    • --ver-major [ver] dynamic library semver major version
    • --ver-minor [ver] dynamic library semver minor version
    • --ver-patch [ver] dynamic library semver patch version

The manifest is serialized into a plain text file and stored in the cache,
with the path {cache_dir}/{hash_base64}.
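
To make the hash and the manifest path concrete, here is a rough Python sketch. It hashes names and option values only, as the list above describes. The function names (cache_key, manifest_path), the length-prefixed encoding, and the use of the URL-safe base64 alphabet (so the key cannot contain a path separator) are assumptions made for illustration:

```python
import base64
import hashlib
import os


def cache_key(root_source, objects, options):
    """Hash file names and option values (not file contents)."""
    h = hashlib.sha256()

    def feed(text):
        data = text.encode("utf-8")
        # Length-prefix every field so adjacent fields cannot run together.
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)

    feed(root_source or "")
    feed(str(len(objects)))
    for name in objects:
        feed(name)
    for key in sorted(options):  # sorted so dict ordering never changes the key
        feed(key)
        feed(str(options[key]))
    # The standard base64 alphabet includes '/', so use the URL-safe variant.
    return base64.urlsafe_b64encode(h.digest()).decode("ascii").rstrip("=")


def manifest_path(cache_dir, key):
    """{cache_dir}/{hash_base64}"""
    return os.path.join(cache_dir, key)


if __name__ == "__main__":
    key = cache_key("main.zig", ["foo.o"], {"--strip": True, "--target-os": "linux"})
    print(manifest_path("zig-cache", key))
```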

When building an executable, library, or object, Zig first computes this hash
and looks for this manifest file. If it is missing, then Zig proceeds with the
build. It is undesirable for build artifacts to contain a gnarly hash in their
filename because this path shows up in compile errors and stack traces. So,
Zig atomically creates a new empty directory with a short name such as 0
(if the directory exists then it tries 1 and so on). Once this new empty
directory is created, it adds the path to this directory to the manifest, and
creates the manifest file.
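
The "try 0, then 1, and so on" step could look like the following Python sketch. It relies on os.mkdir failing when the name already exists, so two concurrent builds cannot claim the same directory. The function name claim_output_dir is hypothetical:

```python
import os
import tempfile


def claim_output_dir(cache_dir):
    """Create and return the first free numbered directory under cache_dir."""
    n = 0
    while True:
        path = os.path.join(cache_dir, str(n))
        try:
            # mkdir either creates the directory or raises; there is no
            # window between "check" and "create" for another process.
            os.mkdir(path)
            return path
        except FileExistsError:
            n += 1


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as cache:
        print(os.path.basename(claim_output_dir(cache)))
        print(os.path.basename(claim_output_dir(cache)))
```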

On the other hand, if Zig finds the manifest file, then it checks the list of
files and their mtimes for consistency. If anything changed, Zig proceeds with
the build as described above. If everything is the same, Zig concludes that
no work needs to be done, and exits with success.
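
A sketch of the consistency check, in Python. The manifest format used here (one "mtime_ns path" line per file) is an assumption, since the proposal only says the manifest is plain text. The names write_manifest and is_up_to_date are likewise hypothetical:

```python
import os
import tempfile


def write_manifest(manifest_file, paths):
    """Record each file's mtime (in nanoseconds) next to its path."""
    with open(manifest_file, "w", encoding="utf-8") as f:
        for p in paths:
            f.write(f"{os.stat(p).st_mtime_ns} {p}\n")


def is_up_to_date(manifest_file):
    """True only if the manifest exists and every listed mtime still matches."""
    try:
        with open(manifest_file, encoding="utf-8") as f:
            lines = f.read().splitlines()
    except FileNotFoundError:
        return False  # no manifest: proceed with the build
    for line in lines:
        recorded, path = line.split(" ", 1)
        try:
            current = os.stat(path).st_mtime_ns
        except FileNotFoundError:
            return False  # a listed file disappeared
        if current != int(recorded):
            return False  # a listed file changed
    return True
```

A build driver would call is_up_to_date first and exit with success when it returns True, otherwise rebuild and call write_manifest.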

@andrewrk
Member Author

Plan B

Because all of this has to be done generically in the Zig Build System anyway,
so that it can work for arbitrary commands, arbitrary steps, and building
C/C++ projects, maybe that's where the logic should go. That just leaves
builtin.o, compiler_rt.o, and test_runner.o to solve.

One idea is that there could be a command line argument to provide these object
files instead of having Zig automatically build them. Instead of caching them
globally, we would detect when they did not need to be rebuilt using the same
manifest/hash information as for the build that produced the object files.

So Zig would have a command line argument to output the list of zig source
files touched to a file, and likewise for these special object files, and
the build system would pass an option to make them output to the cache
directory and integrate them into the manifest/hash system.

This leaves the problem of when zig is invoked without the build system,
using build_exe, build_lib, or test directly. One solution to this
would be to use a temporary directory for these files and then delete it
upon completion. So when building with one of these commands directly,
there would be no caching. This is less than ideal with test, but not
a showstopper: building test_runner.o, builtin.o, and compiler_rt.o
for a source file with a single empty test takes up 20% of the build time.
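
For the direct-invocation case, the "temporary directory, deleted upon completion" behavior can be sketched in Python as below. The wrapper name build_without_cache and the "zig-tmp-" prefix are made up for this example:

```python
import os
import tempfile


def build_without_cache(build):
    """Run build(tmp_dir) in a fresh directory that is removed afterwards."""
    with tempfile.TemporaryDirectory(prefix="zig-tmp-") as tmp_dir:
        # Intermediate objects such as builtin.o and compiler_rt.o would be
        # written inside tmp_dir; nothing lands in the current directory.
        return build(tmp_dir)
    # The directory and its contents are deleted when the with-block exits,
    # including when build() raises.
```

Because nothing survives the call, every invocation rebuilds the intermediate objects, which is the cost described above.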

We can eliminate test_runner.o by making it not be a separate object, and
make it get access to a list of test functions without having to cross the
object file boundary.

One way we could solve this is to wait until we self-host. Then this
manifest/hash system will be part of the standard library and it can be
shared by the build system and the compiler.

A nice benefit of Plan B is we would get to write it in Zig instead of C++.

In summary:

  1. Update test to not need test_runner.o at all.
  2. Update the compiler to put builtin.o and compiler_rt.o in a temporary
    directory while building, and delete the tmp dir after completion.
  3. Do the caching strategy in the Zig Build System and make the caching API
    available generically for building anything. Don't try to solve caching
    builtin.o and compiler_rt.o yet.

@andrewrk
Member Author

Steps 1 and 2 are done. Step 3 is for milestone 0.2.0. See also #330.

@andrewrk
Member Author

andrewrk commented May 2, 2017

Duplicate of #330

@andrewrk andrewrk closed this as completed May 2, 2017