proposal: improved compiler test suite (compiler errors, run and compare, translate-c, etc.) #11288

Closed
Tracked by #89
mitchellh opened this issue Mar 24, 2022 · 18 comments
Labels: accepted (This proposal is planned.), enhancement (Solving this issue will likely involve adding new logic or components to the codebase.), proposal (This issue suggests modifications. If it also has the "accepted" label then it is planned.)

@mitchellh (Contributor) commented Mar 24, 2022

Requirement for #89

Proposal

An improved test harness for the compiler that supports the following:

  • Utilizes a directory structure of real source files (Zig, C, and assembly) to define compiler test cases.
  • Test settings and assertions are configured using a manifest. The manifest can be embedded as the final comment of a Zig file, or it can exist in a separate file alongside the test file.
  • Multiple test types:
    • run - Build and run to completion and compare output
    • error - Build and expect specific error output
    • translate-c - Translate a “.c” file and assert that an expected “zig” file is produced. The Zig file can optionally define a subsequent test to run (i.e. run), so you can translate C and then also assert that the proper behavior happens.
    • incremental - Build and incrementally update a program. Each incremental step is itself a test case so that we can support incremental updates introducing errors, and a subsequent update fixing the errors. [stage2-only]
    • cli - Run a specific zig CLI command (typically, zig build test) to test the Zig CLI or Zig build machinery.
  • Concurrency. Test cases will run on a thread pool similar to compilation.
  • Test filtering by name, test type, or backend.
  • Test dependencies. Tests can note that they are dependent on another test in order to optimize parallelism or build order.
  • Adding or modifying tests will not trigger a full rebuild of the compiler.

This will be built on the existing test-stage2 harness (src/test.zig) and won’t be a totally new implementation. This is almost purely an iterative improvement of the existing harness rather than writing anything new.

The tests will continue to be invoked using zig build test-stage2 or similar targets.

NOTE: Some of these are not new features; they are features that are currently unimplemented in the stage2 test harness or that just need slight improvement there.

Backwards Compatibility

A limited amount of backwards compatibility with the existing compiler test API will be retained so we can incrementally introduce these test cases. In some cases, certain test types such as stack trace tests and certain native backend tests will continue to use the Zig API and will not support the file-based definition format. Therefore, we will continue to support the necessary subset of the Zig API.

Note that changes to these tests will not gain the benefits of the above harness. For example, changes will trigger a full rebuild. They won’t be parallelized as well, etc. However, these tests tend to have very few cases and/or are updated very infrequently.

Test Definitions

Tests are now defined using files within a directory structure rather than a Zig API. This has benefits:

  1. Adding, changing, or removing a test case does not cause the full test harness (including the compiler) to rebuild.
  2. Adding, changing, or removing test cases is a lot more contributor friendly.

Directory Structure

The directory structure has no meaning to the test definitions. Any test can go into any directory, and multiple tests can go into a single directory. The test harness determines test mode and settings using the manifest (defined below). Why? Originally, Andrew and I felt the directory structure should imply some stuff such as backend (stage1 vs stage2). But we pulled back on this initial plan because many tests should behave the same on stage1 AND stage2, and incremental tests want to test multiple test modes.

However, we may wish to enforce directory idioms, such as all stage1-exclusive tests going into a tests/stage1 directory. The test harness itself will not care about these details.

The test name is determined by the filename. It is an error to have duplicate test names. Therefore, two tests named foo.zig and foo.S would produce an error, since both are named “foo” despite having different extensions.

Test Manifests

Manifest Syntax

The first line of the manifest is the test type.

Subsequent non-empty lines are key=value configuration for that test type.

An empty line followed by additional data is “trailing” configuration that is dependent on the test type. For example, for error tests, it defines the expected error output to match.

An error test manifest:

error
backend=stage1,stage2
output_mode=exe

:3:19: error: foo

A run test manifest:

run

I am expected stdout! Hello!

A cli test manifest:

cli

build test
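
To make the syntax concrete, here is a minimal sketch of how a harness might parse this format. It is written against a circa-0.9 Zig std API; the Manifest struct and the error names are hypothetical, not part of the proposal:

const std = @import("std");

const Manifest = struct {
    test_type: []const u8, // "run", "error", "translate-c", ...
    config: std.StringHashMap([]const u8),
    trailing: []const u8, // type-dependent trailing data
};

fn parseManifest(allocator: std.mem.Allocator, text: []const u8) !Manifest {
    // The first blank line separates the header from the trailing data.
    const sep = std.mem.indexOf(u8, text, "\n\n");
    const head = if (sep) |i| text[0..i] else text;
    const trailing = if (sep) |i| text[i + 2 ..] else "";

    var lines = std.mem.tokenize(u8, head, "\n");
    // The first line of the manifest is the test type.
    const test_type = lines.next() orelse return error.EmptyManifest;

    // Subsequent non-empty lines are key=value configuration.
    var config = std.StringHashMap([]const u8).init(allocator);
    while (lines.next()) |line| {
        const eq = std.mem.indexOfScalar(u8, line, '=') orelse
            return error.InvalidConfigLine;
        try config.put(line[0..eq], line[eq + 1 ..]);
    }

    return Manifest{ .test_type = test_type, .config = config, .trailing = trailing };
}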

Manifest Location

The test manifest can be either embedded or adjacent. Only Zig and C files support embedded manifests.

A manifest must have a test. However, a test does not require a manifest. There are a handful of implicit manifest conditions defined in a later section.

For embedded manifests, the manifest is the last comment in a Zig or C file. The comment is not stripped for test types that compile the source. For example, in my_test.zig:

// other comments, not the manifest

export fn main() void {
  // code
}

// error
// output_mode=lib
//
// 7:2: error: bar
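
A sketch of locating that trailing comment block, under the same 0.9-era assumptions as the parser sketch above (the function name is hypothetical). Stripping the // prefixes from the returned slice then yields the manifest text:

/// Returns the final run of "//" line comments at the end of the
/// source file, or null if the file does not end in a comment block.
fn findEmbeddedManifest(source: []const u8) ?[]const u8 {
    var run_start: ?usize = null;
    var offset: usize = 0;
    var lines = std.mem.split(u8, source, "\n");
    while (lines.next()) |line| {
        const trimmed = std.mem.trim(u8, line, " \t\r");
        if (std.mem.startsWith(u8, trimmed, "//")) {
            // Remember where the current run of comment lines began.
            if (run_start == null) run_start = offset;
        } else if (trimmed.len != 0) {
            // Non-blank, non-comment source resets the run.
            run_start = null;
        }
        offset += line.len + 1;
    }
    return if (run_start) |s| source[s..] else null;
}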

Manifests can also be defined adjacent to a test. In this case, the manifest file must have the same filename as the test file and end in the .manifest suffix. Adjacent manifests are supported specifically for non-Zig tests and multi-file or incremental tests.

Example:

my_test.zig:

export fn main() void {
  // code
}

my_test.manifest:

error
output_mode=lib

7:2: error: bar

Implicit Manifests

Whilst a manifest requires a test, a test does not require a manifest. If a manifest is not found for a test file, a default run manifest is assumed (must compile and exit with exit code 0).

Incremental Tests

Tests that exercise the stage2 compiler’s incremental compilation have an additional special filename format: each incremental compilation step is suffixed numerically to denote ordering.

For example, foo_0.zig, foo_1.zig would denote that the test named “foo” is incremental and will run test case 0 followed by test case 1 within the same stage2 compilation context. Each individual test case (foo_0, foo_1) can define its own test mode. This enables us to test that the incremental compiler handles errors after successes, and so on.

For naming, the following test names will be created and filterable:

  • “foo” - This will run the full foo incremental sequence and all cases within foo.
  • “foo_1” - This will run all the incremental steps up to and including step 1, but no further.

The following conditions are errors:

  • Incremental test ordering cannot have holes. You cannot have foo_0 and foo_2 without a foo_1. Likewise, you cannot have a foo_1 without a foo_0.
  • Incremental tests cannot conflict with non-incremental tests. You cannot have a foo_0.zig and a foo.zig. This is an error since it would create a duplicate test name “foo” whilst also being unclear to contributors.
  • Incremental tests must start with a _0 suffix. The _0 is not implied. For example, foo.zig and foo_1.zig do not comprise an incremental test case; the harness will see _1 as an incremental step with a missing _0 and report an error. A sketch of these rules follows this list.
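
A sketch of these naming rules, again in 0.9-era Zig; parseTestName, validateSteps, and the error names are hypothetical:

/// Split a file stem into base name and optional incremental step,
/// e.g. "foo_1" -> base "foo", step 1; "foo" -> base "foo", no step.
fn parseTestName(stem: []const u8) struct { base: []const u8, step: ?u32 } {
    const us = std.mem.lastIndexOfScalar(u8, stem, '_') orelse
        return .{ .base = stem, .step = null };
    const step = std.fmt.parseInt(u32, stem[us + 1 ..], 10) catch
        return .{ .base = stem, .step = null };
    return .{ .base = stem[0..us], .step = step };
}

/// Validate the steps collected for one base name, sorted ascending.
fn validateSteps(has_plain_file: bool, steps: []const u32) !void {
    // foo.zig may not coexist with foo_0.zig, foo_1.zig, ...
    if (has_plain_file and steps.len != 0) return error.DuplicateTestName;
    // The sequence must start at 0 and be contiguous; this also
    // rejects a foo_1 without a foo_0.
    for (steps) |step, i| {
        if (step != i) return error.IncrementalHole;
    }
}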

Test Execution and Ordering

The test harness will do the following at a very high-level:

  1. Collect and queue all tests by recursively traversing the test case directory (filtering happens here).
  2. Execute the tests using a threadpool in the order they were traversed.

There is no guarantee on test ordering.

Test failures are output to the console immediately upon happening. Failures are not buffered in memory except for the minimal amount needed to build the error message. In the case of OOM, a message will be output with the name of the failing test but no further information (if execution reached this point, the filename is already in memory).
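
A minimal sketch of that execution model, under the same 0.9-era assumptions; TestCase, runOne, and the failure message format are hypothetical placeholders:

const TestCase = struct { name: []const u8 }; // placeholder

fn runAll(allocator: std.mem.Allocator, cases: []const TestCase, thread_count: usize) !void {
    var next: usize = 0;
    const threads = try allocator.alloc(std.Thread, thread_count);
    defer allocator.free(threads);
    // Workers claim cases through an atomic counter, so completion
    // order is unspecified, matching the "no ordering guarantee" above.
    for (threads) |*t| t.* = try std.Thread.spawn(.{}, worker, .{ cases, &next });
    for (threads) |t| t.join();
}

fn worker(cases: []const TestCase, next: *usize) void {
    while (true) {
        const i = @atomicRmw(usize, next, .Add, 1, .SeqCst);
        if (i >= cases.len) return;
        runOne(cases[i]) catch |err| {
            // Report immediately; nothing is buffered beyond this message.
            std.debug.print("FAIL {s}: {s}\n", .{ cases[i].name, @errorName(err) });
        };
    }
}

fn runOne(case: TestCase) !void {
    _ = case; // placeholder: build and check a single test case
}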

Backends

Backends are specified using the backend manifest configuration option, which is available for all test types. This is a comma-separated list of the backends to test against. A backend can be prefixed with ! to test all backends except that specific backend. A sketch of this selection logic follows the list of supported backends below.

Supported backends:

  • stage1
  • stage2 - Stage2 using the default backend (LLVM or native).
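
A sketch of this selection rule (a hypothetical helper that matches backend names as plain strings):

/// Returns whether `backend` is selected by a backend= spec such as
/// "stage1,stage2" or "!stage1". A missing spec selects every backend.
fn backendSelected(spec: ?[]const u8, backend: []const u8) bool {
    const s = spec orelse return true; // default: all backends
    const has_exclusions = std.mem.indexOfScalar(u8, s, '!') != null;
    var it = std.mem.tokenize(u8, s, ",");
    while (it.next()) |entry| {
        if (std.mem.startsWith(u8, entry, "!")) {
            if (std.mem.eql(u8, entry[1..], backend)) return false;
        } else if (std.mem.eql(u8, entry, backend)) {
            return true;
        }
    }
    // An exclusion list implicitly includes everything not excluded.
    return has_exclusions;
}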

Native Backends

There are a handful of tests for the native backends. These primarily test things that the behavior tests already cover. According to Andrew, they historically exist only for backends that cannot yet execute behavior tests using at least the simplified test runner. Therefore, the plan for now is to either remove these (since they are covered by behavior tests) or keep them using the old Zig API.

Test Filtering

The -Dtest-filter build option will be used to filter tests by name.

Different test targets from the zig build command will be used to filter by test types, i.e. error tests vs run tests.

Test Types

Run Tests

Run tests build an exe and run it to completion. Any exit code other than zero is a test failure. Additionally, output can be specified and it will be asserted to match the output on stdout byte for byte.

The following type-specific manifest options are supported:

  • translate-c (default: false) - If non-empty, C files are processed through translate-c prior to being run. Without this, any C files use the Zig compiler’s C frontend (clang at the time of writing).

Manifest trailing data is the stdout data to assert. stderr is not matched.

Run tests use the suffix of the associated test file to perform the proper build. For example, .zig files go through the normal Zig compilation pipeline. .S files are assembled and linked.

Example, a run test that just asserts exit code 0 and ignores all output:

//! run

Example, a run test that asserts output in addition to the exit code being 0:

//! run
//!
//! Hello, World!
//!
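
A sketch of the pass/fail check for a run test, using the 0.9-era std.ChildProcess.Term shape; the function and error names are hypothetical:

/// The child must exit 0; if the manifest had trailing data, stdout
/// must match it byte for byte. stderr is never compared.
fn checkRunResult(term: std.ChildProcess.Term, stdout: []const u8, expected: ?[]const u8) !void {
    switch (term) {
        .Exited => |code| if (code != 0) return error.NonZeroExit,
        else => return error.AbnormalTermination,
    }
    if (expected) |want| {
        if (!std.mem.eql(u8, stdout, want)) return error.StdoutMismatch;
    }
}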

Error Tests

Error tests build an exe, obj, or lib and assert that specific errors exist. This only tests errors that happen during compilation and not during runtime. The built exe (if the output mode is exe) is not executed. Runtime errors should be checked using the run test type with a panic handler.

The compiler is expected to error. If the compiler is subprocessed (i.e. testing stage1), then a non-zero exit code will be expected.

The following type-specific manifest options are supported:

  • output_mode (default: obj) - one of exe, lib, obj. This is the compilation output mode.
  • is_test (default: false) - if non-empty, is_test is set on the compilation.

The manifest trailing data is the error output to assert. One error is expected per line, and each line is matched using a substring match, which lets you omit details like the filename. Non-empty trailing data is required. A sketch of this matching follows the examples below.

Example:

// error
//
// :1:11: error: use of undeclared identifier 'B'

Example that specifies an output mode:

// error
// output_mode=lib
//
// :1:11: error: use of undeclared identifier 'B'

Example that tests errors in test mode:

// error
// is_test=1
//
// :1:11: error: use of undeclared identifier 'B'
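
A sketch of the per-line substring matching described above (a hypothetical helper; it assumes the // prefixes have already been stripped from the trailing data):

/// One expected error per line of the trailing data; each line must
/// appear somewhere in the compiler's error output, so details such
/// as the filename may be omitted from the expectation.
fn checkExpectedErrors(trailing: []const u8, compiler_output: []const u8) !void {
    var lines = std.mem.tokenize(u8, trailing, "\n");
    while (lines.next()) |line| {
        const want = std.mem.trim(u8, line, " \t\r");
        if (std.mem.indexOf(u8, compiler_output, want) == null)
            return error.ExpectedErrorNotFound;
    }
}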

Translate C Tests

translate-c tests assert that a C file translates into specific Zig output by checking one or more substring matches against the resulting Zig source after translation.

This test has no type-specific configuration.

The manifest trailing data is one or more sets of strings to compare against. Sets are separated by a line containing only a comma (see the sketch after the examples below).

Example of a single match:

// translate-c
//
//pub const struct_Color = extern struct {
//    r: u8,
//    g: u8,
//    b: u8,
//};

Example of multiple matches, in which case every match must be found:

// translate-c
//
//pub const struct_Color = extern struct {
//    r: u8,
//    g: u8,
//    b: u8,
//};
//,
//const struct_unnamed_2 = extern struct {};
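
A sketch of splitting the trailing data into match sets and checking each against the translated source (a hypothetical helper, under the same assumptions as the sketches above):

/// Sets are separated by a line containing only ",". Every set must
/// appear verbatim (as a substring) in the translated Zig source.
fn checkTranslation(trailing: []const u8, translated_zig: []const u8) !void {
    var sets = std.mem.split(u8, trailing, "\n,\n");
    while (sets.next()) |set| {
        if (std.mem.indexOf(u8, translated_zig, set) == null)
            return error.TranslationMismatch;
    }
}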

Deprecated or Ignored Test Types

The following test types will be deprecated since they are just special cases of the file-definable test types. In all cases, these tests are simply migrated, not lost.

  • assemble_and_link - This is just the run test case above where the run input supports assembly files.
  • runtime_safety - This is a special case of run.

The following test types are ignored and kept as-is. They are tests that aren’t frequently updated and don’t have many cases:

  • stack_traces
  • standalone - This looks like something we can convert to this new test suite later, but is ignored for now due to some complexities and details that we should probably iron out in a dedicated issue.
  • native-backend compiler error and run tests, more details on this in the "backends" section in this proposal

Abandoned Ideas

The ideas below were initially discussed or existed but ultimately abandoned:

Explicit Test Dependencies

One configuration setting that can be specified in the manifest for all test types is an after setting. This is a comma-separated list of tests that you want to be executed first before your test.

This is optional. Tests should not have side effects, and therefore the ordering should not matter. However, this feature could be used to point at other tests that cover functionality which is used but not tested by the current test, thereby optimizing test suite speed.

CLI Tests

CLI tests execute specific zig CLI commands and assert exit code zero.

CLI tests have no type-specific manifest options.

The manifest trailing data specifies one command to run per line (all assumed to be subcommands of zig). This lets you test multiple targets separately. They are run in order. If no trailing data is specified, zig build test is run.

CLI tests can run against any Zig binary, so these are a good way to test consistent behavior against both a stage1 and stage2 binary build.

Example, with a custom target:

cli

build test-custom-target

Example, running a file:

cli

run hello.zig

Example, running and building:

cli

run hello.zig
build
@andrewrk added this to the 0.10.0 milestone Mar 24, 2022
@andrewrk added the enhancement and proposal labels Mar 24, 2022
@andrewrk (Member) commented Mar 24, 2022

A few notes:

  • translate-c - Build a “.c” file and assert an expected “zig” file is produced. The Zig file can further optionally define a subsequent test to run (i.e. run) so you can translate C then optionally assert the proper behavior also happens.

Note that for run-translated-c tests, the output Zig code is not checked against any expected text; it is only executed and checked for behavior. The existing description of translate-c tests in this proposal does not have a way to encode this. All translate-c tests only want to do one or the other: run, or compare the translation, never both.

Test dependencies. Tests can note that they are dependent on another test in order to optimize parallelism or build order.

I don't think we should have any notion of test dependencies. All tests should be able to be independently run from each other.

I see that you addressed this in more detail in the "Explicit Test Dependencies" section. If you feel strongly that this is useful and helpful then I'm fine with going with it. My only resistance is that it's a lot of meta work for dubious benefits. I'm just imagining myself reviewing a big PR that tries to add test dependencies all over the place and then a bunch of people, who contribute suspiciously seldomly, bikeshedding over it.

backend=stage1,stage2

Failing backends should be excluded; the default should be all backends. This follows the philosophy of the std.builtin.CompilerBackend enum:

zig/lib/std/builtin.zig (lines 668 to 721 at 2af6971):

/// This enum is set by the compiler and communicates which compiler backend is
/// used to produce machine code.
/// Think carefully before deciding to observe this value. Nearly all code should
/// be agnostic to the backend that implements the language. The use case
/// to use this value is to **work around problems with compiler implementations.**
///
/// Avoid failing the compilation if the compiler backend does not match a
/// whitelist of backends; rather one should detect that a known problem would
/// occur in a blacklist of backends.
///
/// The enum is nonexhaustive so that alternate Zig language implementations may
/// choose a number as their tag (please use a random number generator rather
/// than a "cute" number) and codebases can interact with these values even if
/// this upstream enum does not have a name for the number. Of course, upstream
/// is happy to accept pull requests to add Zig implementations to this enum.
///
/// This data structure is part of the Zig language specification.
pub const CompilerBackend = enum(u64) {
    /// It is allowed for a compiler implementation to not reveal its identity,
    /// in which case this value is appropriate. Be cool and make sure your
    /// code supports `other` Zig compilers!
    other = 0,
    /// The original Zig compiler created in 2015 by Andrew Kelley.
    /// Implemented in C++. Uses LLVM.
    stage1 = 1,
    /// The reference implementation self-hosted compiler of Zig, using the
    /// LLVM backend.
    stage2_llvm = 2,
    /// The reference implementation self-hosted compiler of Zig, using the
    /// backend that generates C source code.
    /// Note that one can observe whether the compilation will output C code
    /// directly with `object_format` value rather than the `compiler_backend` value.
    stage2_c = 3,
    /// The reference implementation self-hosted compiler of Zig, using the
    /// WebAssembly backend.
    stage2_wasm = 4,
    /// The reference implementation self-hosted compiler of Zig, using the
    /// arm backend.
    stage2_arm = 5,
    /// The reference implementation self-hosted compiler of Zig, using the
    /// x86_64 backend.
    stage2_x86_64 = 6,
    /// The reference implementation self-hosted compiler of Zig, using the
    /// aarch64 backend.
    stage2_aarch64 = 7,
    /// The reference implementation self-hosted compiler of Zig, using the
    /// x86 backend.
    stage2_x86 = 8,
    /// The reference implementation self-hosted compiler of Zig, using the
    /// riscv64 backend.
    stage2_riscv64 = 9,
    _,
};

The -Dtest-filter build option will be used to filter tests by name (this already exists, but is unimplemented).

It exists and is implemented everywhere, except that the stage2 compiler does not yet support --test-filter for zig test.

output_mode (default: exe) - one of exe, lib, obj. This is the compilation output mode.

Default output mode for error tests should be obj because it drags in less std lib code and does not create any sub-compilations.

The manifest trailing data is the error output to assert.

The compile errors should go at the end of the file so that adding/removing lines in the expected errors does not shift the other error lines up and down. To keep it consistent I suggest that the entire manifest is at the very end of the file and consists of all the line comments in a row at the end of the file. This also frees us up to test doc comments such as the //! comments at the top of a file, which have some semantic meaning, unlike line comments.

CLI Tests

If we want to be able to move our existing CLI tests to this mechanism then we need a way to check the following things:

  • after running zig init-exe does zig build run work and produce the expected output?
  • after running zig init-lib does zig build test work and produce the expected output?
  • after executing the zig command, does there exist the expected .s file (the godbolt use case)
  • after running the zig command which is expected to produce an error, did we get the correct error message?
  • after running zig fmt did it do UTF-16 decoding and did it find and reformat the test file as expected?

See test/cli.zig for more details. All these tests currently have bespoke logic for testing these parts of the CLI.

Perhaps the best option would be to leave CLI testing alone, since the common case of exit 0 is already handled better through the other test mechanisms described in this proposal.

@andrewrk added the accepted label Mar 24, 2022
@mitchellh (Contributor, Author) commented Mar 24, 2022

Note that for run-translated-c tests, the output Zig code is not checked against any expected text; it is only executed and checked for behavior. The existing description of translate-c tests in this proposal does not have a way to encode this. All translate-c tests only want to do one or the other: run, or compare the translation, never both.

I think a valid solution for this is to support run for c files. The implicit behavior of a .c file is to run translate-c, but you can create an explicit manifest that does run. If the file ends in .c then it is translated first, but if the mode is run, the output is not compared to anything.

I don't think we should have any notion of test dependencies. All tests should be able to be independently run from each other.

Let's drop it unless @matu3ba can convince you otherwise!

Failing backends should be excluded; the default should be all backends. This follows the philosophy of the std.builtin.CompilerBackend enum.

How about this wording:

The backend defaults to all backends, but can be explicitly specified to override that behavior.

The one thing I'm thinking about is compiler errors: in stage2 we've already changed the wording of, for example, out-of-bounds messages to be much more explicit, so we don't want those to always run on all backends.

Default output mode for error tests should be obj because it drags in less std lib code and does not create any sub-compilations.

Ack.

The compile errors should go at the end of the file so that adding/removing lines in the expected errors does not shift the other error lines up and down. To keep it consistent I suggest that the entire manifest is at the very end of the file and consists of all the line comments in a row at the end of the file. This also frees us up to test doc comments such as the //! comments at the top of a file, which have some semantic meaning, unlike line comments.

Greatttttt point. Updating this now.

CLI Tests

If we want to be able to move our existing CLI tests to this mechanism then we need a way to check the following things:

  • after running zig init-exe does zig build run work and produce the expected output?
  • after running zig init-lib does zig build test work and produce the expected output?
  • after executing the zig command, does there exist the expected .s file (the godbolt use case)
  • after running the zig command which is expected to produce an error, did we get the correct error message?
  • after running zig fmt did it do UTF-16 decoding and did it find and reformat the test file as expected?

See test/cli.zig for more details. All these tests currently have bespoke logic for testing these parts of the CLI.

Perhaps the best option would be to leave CLI testing alone, since the common case of exit 0 is already handled better through the other test mechanisms described in this proposal.

Agreed, let's leave them alone for now, and then reconsider later.

@topolarity (Contributor) commented

The compile errors should go at the end of the file so that adding/removing lines in the expected errors does not shift the other error lines up and down. To keep it consistent I suggest that the entire manifest is at the very end of the file and consists of all the line comments in a row at the end of the file. This also frees us up to test doc comments such as the //! comments at the top of a file, which have some semantic meaning, unlike line comments.

Just as a heads up, the way this is handled in #11284 right now is that the container doc comments are simply not considered part of the source, so they don't affect line numbers.

I have no problem with either direction - It was just the choice I made at the time to get things up and running

@mitchellh (Contributor, Author) commented

I have no problem with either direction - It was just the choice I made at the time to get things up and running

It'd be much easier to not modify files, because then we can pass them as-is into the compiler, specifically around test modes that require subprocessing to zig. We can avoid a lot of machinery around creating temporary files and so on in that case.

If you have no problem either way, I'd say let's move them to the end and make them normal comments (// not //!).

@topolarity (Contributor) commented

Great. I'm happy to roll that and other sufficiently-mature parts of the manifest syntax into #11284, depending on how quickly we want to get that one merged

Or we can target a merge as-is and re-align with this proposal later - I'll defer to @andrewrk in the review on that one

@matu3ba (Contributor) commented Mar 24, 2022

I don't think we should have any notion of test dependencies. All tests should be able to be independently run from each other.

Let's drop it unless @matu3ba can convince you otherwise!

I am fine with either statement. My intention was to clarify the situation.

Concurrency. Test cases will run on a thread pool similar to compilation.

This does not explain whether (and if so, how many) processes are spawned to isolate error sources when using a thread pool.
It should probably be mentioned whether single-process architectures will default to 1 thread to eliminate the side effects of threads,
and whether (and if so, how) side effects of multi-threading with memory problems are mitigated (heisenbugs, error source identification, etc.).
I am asking that the assumptions and assertions regarding parallelism/concurrency be written out explicitly here.

There is no guarantee on test ordering.

The test output should then be defined. The error trace is buffered somewhere with allocations, so the behavior on OOM should be specified.

Question on embedded manifests:
This sounds very much like an eventual libstd feature, or people will at least push for it. What is the strategy to deal with that? Can you clarify, to prevent future churn in discussing this feature?

@matu3ba (Contributor) commented Mar 24, 2022

Error tests build an exe, obj, or lib and assert that specific errors exist.

Rephrase to mention that the error may occur during compilation or while running the binary.

For naming, the following test names will be created and filterable:

Edge cases: is naming a test foo_0.zig then forbidden outright, or only if foo.zig exists? Do incremental tests start with foo_0.zig?

exit code

Please specify what constitutes success and error, for completeness and consistency.

@mitchellh (Contributor, Author) commented

Question on embedded manifests:
This sounds very much like an eventual libstd feature, or people will at least push for it. What is the strategy to deal with that? Can you clarify, to prevent future churn in discussing this feature?

I'd say that this is a bespoke format for a bespoke use case and it doesn't belong as-is in libstd since it isn't a general purpose thing. It also isn't a particularly complex task. The tokenizer/parser is in libstd, you could build it yourself. That would be my opinion, anyways. Other projects (such as Go) do not expose their compiler test "manifest"-equivalent parsing into the standard library and have historically said the same thing: you have the parser available, build it yourself.

The test output should then be defined. The error trace is buffered somewhere with allocations, so the behavior on OOM should be specified.

Good point. I would recommend outputting failures as soon as they exist so we don't have to deal with buffering and memory allocations for that (beyond the buffering necessary for building up the error message/trace for that single failure). What does the normal test runner promise?

I agree specifying the OOM behavior here is a good idea.

This does not explain whether (and if so, how many) processes are spawned to isolate error sources when using a thread pool.
It should probably be mentioned whether single-process architectures will default to 1 thread to eliminate the side effects of threads,
and whether (and if so, how) side effects of multi-threading with memory problems are mitigated (heisenbugs, error source identification, etc.).
I am asking that the assumptions and assertions regarding parallelism/concurrency be written out explicitly here.

Defer to @andrewrk for clarification here, but my interpretation was that we'd use multiple threads in a single process. These tests aren't expected to have any side effects (and it is a bug if there are). The compiler itself is the same way. If there are heisenbugs/flaky tests, we chase them down as we normally do with that class of bug. A -fsingle-threaded flag can be introduced to at least help check that it's related to concurrency.

@mitchellh (Contributor, Author) commented

Error tests build an exe, obj, or lib and assert that specific errors exist.

Rephrase to mention that the error may occur during compilation or while running the binary.

👍

For naming, the following test names will be created and filterable:

Edge cases: is naming a test foo_0.zig then forbidden outright, or only if foo.zig exists? Do incremental tests start with foo_0.zig?

Good clarifications, will make some edits. My proposal:

foo_0.zig is allowed so long as foo.zig does not exist, and having a single-step incremental test is completely fine. Incremental tests must start with a _0 suffix. Having only foo_1 means step 0 is seen as missing, and holes are expressly not allowed in incremental tests.

My reasoning is just to avoid two ways to do the same thing. There is only one way to do incremental tests and that is having an explicit numeric suffix.

exit code

Please specify what constitutes success and error, for completeness and consistency.

I did in the first sentence, but I'll repeat it for consistency and being explicit.

@matu3ba (Contributor) commented Mar 24, 2022

I did in the first sentence, but I'll repeat it for consistency and being explicit.

“Any exit code other than zero is a test failure.” is mentioned in Run Tests, but not in Error Tests.

@ehaas (Contributor) commented Mar 24, 2022

translate-c tests test that a C file is translated into a specific Zig file, byte-for-byte.

The current translate-c tests do substring matching when checking for specific translations, because some predefined macros and constants can vary depending on the target.

For example with -target i386-windows-gnu:

pub const __LONG_MAX__ = @as(c_long, 2147483647);

But with -target aarch64-linux-gnu:

pub const __LONG_MAX__ = @import("std").zig.c_translation.promoteIntLiteral(c_long, 9223372036854775807, .decimal);

Even if we force the tests to always use a specific target, the double-underscore namespace is reserved for the compiler (clang in this case), which means an LLVM version update could add or remove those types of constants, requiring all the tests to be updated even if nothing really changed.

@mitchellh (Contributor, Author) commented Mar 24, 2022

Thanks @ehaas, I missed that bit of nuance.

I think in that case, then, we should cleanly separate translate-c into requiring its own manifest, which would probably contain the same assertions as https://github.com/ziglang/zig/blob/master/test/translate_c.zig. The tests that use dynamic values (like default_enum_type) can keep using the Zig API, since that is going to stick around. And for run-translated-c tests, the existing change I made (that they are just run types) still works.

For the others, they should specify a manifest with the expected output and it should follow the same substring matching semantics as today.

Thoughts on that?

Question then, since // is a valid comment syntax in C, should we support the embedded manifest syntax in C? Or should we require an adjacent manifest for each of them?

@andrewrk (Member) commented Mar 24, 2022

I think a valid solution for this is to support run for c files. The implicit behavior of a .c file is to run translate-c, but you can create an explicit manifest that does run. If the file ends in .c then it is translated first, but if the mode is run, the output is not compared to anything.

Note that when you do zig run hello.c -lc what is actually happening is the C code is getting compiled and executed, not translated. Zig supports compiling .c files so I don't see any reason to have .c files be different than .zig files, especially as we transition to having aro as a C frontend (#10795). However even with the current Clang frontend for .c source code, I think "translate-c" should be a distinct test category from traditional C source code compilation, error checking, and execution.

Or we can target a merge as-is and re-align with this proposal later - I'll defer to @andrewrk in the review on that one

I'm happy to do an early merge - I didn't look at it yet because you said you wanted to push some more changes today, something about directories.

Generally I think this proposal can and should be rolled out with incremental enhancements such as the one in your PR.

Defer to @andrewrk for clarification here, but my interpretation was that we'd use multiple threads in a single process.

👍

A -fsingle-threaded flag can be introduced to at least help check that it's related to concurrency.

This already exists and is already observed by the test harness :-)

Question then, since // is a valid comment syntax in C, should we support the embedded manifest syntax in C? Or should we require an adjacent manifest for each of them?

It seems reasonable to have the same manifest syntax in both .c source files and .zig source files to me.

@ehaas (Contributor) commented Mar 25, 2022

Question then, since // is a valid comment syntax in C, should we support the embedded manifest syntax in C? Or should we require an adjacent manifest for each of them?

I think embedding the manifest in C would be fine. One thing to note is that it's not always matching against one string; sometimes the test runner searches for multiple strings in the output so we'd need a good syntax for that. Example here

@mitchellh (Contributor, Author) commented

I think embedding the manifest in C would be fine. One thing to note is that it's not always matching against one string; sometimes the test runner searches for multiple strings in the output so we'd need a good syntax for that. Example here

Got it, I've updated the proposal to support that. The manifest would look like the following; each case would be separated by a line containing only ,. I checked all translate-c tests and none expect this output, and I don't think any C code would be translated into that on a single line.

// translate-c
//
//pub const struct_Color = extern struct {
//    r: u8,
//    g: u8,
//    b: u8,
//};
//,
//const struct_unnamed_2 = extern struct {};

Note that when you do zig run hello.c -lc what is actually happening is the C code is getting compiled and executed, not translated. Zig supports compiling .c files so I don't see any reason to have .c files be different than .zig files, especially as we transition to having aro as a C frontend (#10795). However even with the current Clang frontend for .c source code, I think "translate-c" should be a distinct test category from traditional C source code compilation, error checking, and execution.

@andrewrk Yep, I've split out translate-c to be a distinct test category. For the run test type, I also added a translate-c configuration that can be set to non-empty to force C files to be run through translate-c prior to being run. This is effectively the run-translated-c test type.

It seems reasonable to have the same manifest syntax in both .c source files and .zig source files to me.

Done.

@mitchellh changed the title from “proposal: improved compiler test suite (compiler errors, run and compare, CLI tests, etc.)” to “proposal: improved compiler test suite (compiler errors, run and compare, translate-c, etc.)” Mar 25, 2022
@andrewrk (Member) commented

Now that #11530 is merged, let's break this proposal into separate, independently completable issues for each task that is left.

@andrewrk (Member) commented Jun 7, 2022

Alright I'm closing this. Anyone is free to extract the unsolved issues out of this proposal into separately completable issues 👍

@andrewrk closed this as completed Jun 7, 2022
tomhoule added a commit to prisma/prisma-engines that referenced this issue Aug 1, 2022
The main problems this addresses are:

- Diffs are noisy.
  Datamodel tests involve large string literals, and very little
  significant information otherwise. In a single file, even for simple
  changes like reordering or renaming tests, or changing indentation,
  the diffs get unreadable quickly, as the large blocks of Prisma
  schemas and validation errors get mixed with the test function names
  and calls to test helpers. This state of things disincentivizes
  reorganizing tests. The new setup has one test per file, with the test
  name being the file name, including its full path from the validation
  tests root.

- Churn.
  We often need to subtly adjust many tests to account for trivial
  changes in the public APIs, when all we want to do is assert that a
  certain schema is valid, or invalid with a given error. This prevents
  us from evolving the datamodel API as much as we would like, and as a
  consequence of these changes being a lot of work, we have a lot of
  inconsistency in how tests are written and which test helpers are
  used. This has manifested recently in difficulties simply moving
  tests, without changing them, from `libs/datamodel/core` to `psl/psl`.

  With this test suite, we rely on a lot less context, setup and noise
  (variable names for expectations). We don't expect the shape of these
  tests to evolve much, churn should be minimal and if we need to change
  something, it will be possible to autofix through UPDATE_EXPECT=1.

- The test suite gets slower to compile as we add more tests, and that
  noticeably hurts our feedback loops.
  This setup is less rust code to compile: each test case translates to
  a single test function containing a single synchronous function call.
  We can transparently make test runs compile no rust code at all in the
  future if we need to.

Prior art:

- In prisma-engines: we already did that in introspection engine tests in that commit: ef84679
- In other compiler-like projects, this is a common approach. Here are a few examples:
  - go compiler (https://github.com/golang/go/blob/master/test/assign1.go)
  - rustc UI tests (https://github.com/rust-lang/rust/tree/master/src/test/ui)
  - zig compiler (ziglang/zig#11288)

next steps: same setup for reformatting tests
tomhoule added a second commit to prisma/prisma-engines that referenced this issue Aug 1, 2022, with the same message as above.