Flakiness in quick tests #2783

Open
mlavacca opened this issue Aug 5, 2022 · 0 comments

Problem Statement

The conformance tests are currently not executed. When CI runs them, a test environment is set up, the tests are skipped, and the environment is destroyed. This rapid setup and teardown, in which nothing is actually tested, produced a test flake. The test logs are below.

panic: failed to initialize kong data-plane client: making HTTP request: Get "http://172.18.0.240:8001/": read tcp 172.18.0.1:48740->172.18.0.240:8001: read: connection reset by peer

goroutine 354 [running]:
github.com/kong/kubernetes-ingress-controller/v2/internal/util/test.DeployControllerManagerForCluster.func1()
	/home/runner/work/kubernetes-ingress-controller/kubernetes-ingress-controller/internal/util/test/controller_manager.go:83 +0x248
created by github.com/kong/kubernetes-ingress-controller/v2/internal/util/test.DeployControllerManagerForCluster
	/home/runner/work/kubernetes-ingress-controller/kubernetes-ingress-controller/internal/util/test/controller_manager.go:79 +0x7a6
FAIL	github.com/kong/kubernetes-ingress-controller/v2/test/conformance	93.110s

This flake may have been caused by the following race condition:

  1. The cluster is created and Kong is deployed.
  2. The ingress controller is started in a separate goroutine.
  3. The test starts and ends almost immediately.
  4. Cleanup starts and the Kong deployment is deleted.
  5. Meanwhile, the ingress controller has not finished starting up. It tries to create a new KongClient, which fails because Kong is no longer running.
  6. The test fails.
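
The suspected sequence can be modelled in isolation with a small, self-contained Go sketch. Everything in it is hypothetical and exists only for illustration: `fakeKong`, `startController`, and `runSuite` are made-up names, the 50 ms startup delay is arbitrary, and none of this is the project's actual `DeployControllerManagerForCluster` code. It shows how cleanup that runs before the controller goroutine finishes starting leads to a failed client initialization, and how waiting for a startup signal before cleanup would avoid it, assuming the race really is the cause.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// fakeKong is a hypothetical stand-in for the Kong deployment. It only
// tracks whether the admin API would still be reachable.
type fakeKong struct {
	running atomic.Bool
}

// newClient mimics data-plane client initialization: it fails once Kong is gone.
func (k *fakeKong) newClient() error {
	if !k.running.Load() {
		return errors.New("connection reset by peer")
	}
	return nil
}

// startController launches the simulated controller in a goroutine and
// returns a channel that receives the startup result exactly once.
func startController(ctx context.Context, k *fakeKong, startupDelay time.Duration) <-chan error {
	done := make(chan error, 1)
	go func() {
		select {
		case <-time.After(startupDelay): // simulated slow startup
			done <- k.newClient()
		case <-ctx.Done():
			done <- ctx.Err()
		}
	}()
	return done
}

// runSuite models the steps listed above. With waitForStartup set to false,
// cleanup runs right after the skipped tests, as in the suspected race.
func runSuite(waitForStartup bool) error {
	kong := &fakeKong{}
	kong.running.Store(true) // step 1: Kong is deployed

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	started := startController(ctx, kong, 50*time.Millisecond) // step 2

	// Step 3: the tests are skipped, so there is nothing to run here.

	if waitForStartup {
		// Block until the controller has finished starting, then clean up.
		if err := <-started; err != nil {
			return fmt.Errorf("controller startup: %w", err)
		}
		kong.running.Store(false)
		return nil
	}

	kong.running.Store(false)         // step 4: cleanup deletes Kong
	if err := <-started; err != nil { // step 5: startup finishes too late
		return fmt.Errorf("controller startup: %w", err)
	}
	return nil
}

func main() {
	fmt.Println("without waiting:", runSuite(false))
	fmt.Println("with waiting:   ", runSuite(true))
}
```

In this model the first call returns an error and the second returns nil. Returning the startup error over a channel, rather than panicking inside the goroutine, also lets the caller decide whether a failure during teardown should fail the suite.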

The above is speculation based on the logs and on the code executed during the tests. The goal of this issue is to investigate whether the flake can be attributed to that race condition or to something else, and whether it can harm the stability of our tests.
