Feature/python test runner #417

Merged 146 commits on Aug 24, 2022
Commits
a971830
start implementing a python launch controller
dothebart Jul 4, 2022
0226c4e
make it work for the first time
dothebart Jul 5, 2022
1c311e5
try launching outside of oskar.
dothebart Aug 2, 2022
c9b3760
no more pipes needed
dothebart Jul 5, 2022
f7abec9
adjust report directory
dothebart Jul 6, 2022
f797024
fix paths, thread naming.
dothebart Jul 6, 2022
9ffa927
fallback if no env is configured
dothebart Jul 6, 2022
cb3d867
lint
dothebart Jul 6, 2022
02f086e
more work on cluster etc
dothebart Jul 6, 2022
fdccda0
silence, proper error message for missing variable
dothebart Jul 6, 2022
6ac8ab6
convert params
dothebart Jul 6, 2022
30be55e
lint
dothebart Jul 6, 2022
14c5b9a
fix slot
dothebart Jul 7, 2022
e7f4cd0
fix arangosh.conf, launching of subsequent testruns
dothebart Jul 7, 2022
cb69dd3
try to launch it from fish
dothebart Jul 7, 2022
eae8168
implement 7zip
dothebart Jul 7, 2022
b5d651e
add modules to the docker container
dothebart Jul 7, 2022
3c7825a
more printing
dothebart Jul 7, 2022
8383c4c
fix handling
dothebart Jul 7, 2022
2eb57a8
Add pip3
KVS85 Jul 7, 2022
295c7e0
Fix typo
KVS85 Jul 7, 2022
70fa195
Typo 2
KVS85 Jul 7, 2022
7d22e6f
handle INNERWORKDIR
dothebart Jul 7, 2022
e4fa44f
fix missing line break
dothebart Jul 7, 2022
6e91089
export settings
dothebart Jul 8, 2022
bd19986
fix typo
dothebart Jul 8, 2022
0a423ee
on windows skip !windows tests
dothebart Jul 8, 2022
fdfa129
lint, refactor, simplify
dothebart Jul 8, 2022
ba6d7a5
install 7z
dothebart Jul 8, 2022
8e2a71d
export core directory
dothebart Jul 11, 2022
bb55fde
work on fish integration
dothebart Jul 11, 2022
4003587
similarize for new python job scheduler
dothebart Jul 11, 2022
6915769
work on reprot generating
dothebart Jul 12, 2022
78ecbfd
try to implement timeout
dothebart Jul 12, 2022
22f9f51
also upload 7z and txt
dothebart Jul 12, 2022
5cd66d6
also upload 7z and txt
dothebart Jul 12, 2022
d6508d9
fix deadline
dothebart Jul 12, 2022
6f0b17e
fix workspace handling
dothebart Jul 12, 2022
4e4e8b2
fix temporary directory handling
dothebart Jul 12, 2022
6018f9a
make sure out temp directory exists
dothebart Jul 12, 2022
0d899f5
RTFM fail
dothebart Jul 12, 2022
9823716
don't put it to the workspace
dothebart Jul 13, 2022
156d767
implement gtest invoking
dothebart Jul 13, 2022
7562862
cleanup
dothebart Jul 13, 2022
3acedf9
sort, lint
dothebart Jul 13, 2022
4ed0cf1
prefer INNERWORKDIR
dothebart Jul 13, 2022
a584056
implement writing test.log
dothebart Jul 13, 2022
51fe696
implement html report
dothebart Jul 13, 2022
067e52f
bring back function deletet to early
dothebart Aug 2, 2022
80c9e54
install the windows boomerang handler on top level
dothebart Jul 14, 2022
f5edab9
fix include
dothebart Jul 14, 2022
9bb5fd0
fix reference
dothebart Jul 14, 2022
158268f
print before killing shit
dothebart Jul 14, 2022
9d9da8a
work on timeout
dothebart Jul 15, 2022
33324a1
finish deadline handling, rename script
dothebart Jul 18, 2022
d9b1cba
fix exit code handling
dothebart Jul 18, 2022
7c0e238
lint
dothebart Jul 18, 2022
fedca4d
thanks @mpoeter for ps aid
dothebart Jul 19, 2022
e601103
make the thread identifier the test plus a growing number
dothebart Jul 19, 2022
6774b4a
implement central final deadline, which will kick in after 2 minutes
dothebart Jul 19, 2022
e050e41
remove debug output
dothebart Jul 19, 2022
6ecf3ac
use /usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/snap/bi…
dothebart Jul 19, 2022
dbc55db
wintendo next try
dothebart Jul 19, 2022
37e7fe9
wintendo next try
dothebart Jul 19, 2022
d02e3f4
wintendo go home
dothebart Jul 19, 2022
9f316b3
fix calculation of hard time limit
dothebart Jul 19, 2022
2c31934
make sure nobody changes the exit code to good
dothebart Jul 19, 2022
510f52c
add monkey patches
dothebart Jul 19, 2022
299a8ad
cleanup deadline
dothebart Jul 20, 2022
192967f
ignore exceptions if no process is there
dothebart Jul 20, 2022
9dc94b2
deadline handling: prioritize incomming lines over timeout counting
dothebart Jul 20, 2022
8fa4ea2
fix directory handling
dothebart Jul 20, 2022
e6368d7
work on result presentation
dothebart Jul 21, 2022
f74ce49
cleanup
dothebart Jul 21, 2022
b2bdab6
let the file remain open for further info
dothebart Jul 21, 2022
fa89e39
fix environment variable handling
dothebart Jul 21, 2022
f8b65fa
documentation
dothebart Jul 21, 2022
f1921fd
fix port handling
dothebart Jul 21, 2022
ae07a8b
work on deadline
dothebart Jul 21, 2022
b7a29ed
fix hard deadline handling
dothebart Jul 21, 2022
6cd94a0
make it 20s
dothebart Jul 21, 2022
00144a4
need more time
dothebart Jul 21, 2022
8510dbc
list processes so we may guess whats actually going on
dothebart Jul 21, 2022
122f15c
kill all, then waitpid all
dothebart Jul 22, 2022
441ab0e
make threads provide half a slot.
dothebart Jul 22, 2022
79652af
be sure to catch
dothebart Jul 22, 2022
3f4a473
resume just in case, then kill
dothebart Jul 22, 2022
d500a89
resume just in case, then kill
dothebart Jul 22, 2022
b3f2346
ignore resume errors
dothebart Jul 25, 2022
1472a13
increase volume
dothebart Jul 25, 2022
586da13
lint
dothebart Jul 25, 2022
6dbdb4a
lint
dothebart Jul 25, 2022
b72cb41
catch more
dothebart Jul 25, 2022
79d29fc
add multipliers
dothebart Jul 26, 2022
e762bdd
more load, print load avg
dothebart Jul 27, 2022
350e481
fix sorting by prio - biggest values first
dothebart Jul 27, 2022
0617478
cleanup crash report for size
dothebart Jul 27, 2022
1ff13b7
if test indicates its been crashing create report as well.
dothebart Jul 27, 2022
b10e155
more threat to the machine.
dothebart Jul 27, 2022
f096bc9
timeout
dothebart Jul 27, 2022
12eff18
fix typo
dothebart Jul 27, 2022
66f6c69
delete tzdata subdir first
dothebart Jul 28, 2022
434ee8f
use load and sockets for throttle control
dothebart Jul 28, 2022
2154a58
install required python libs
dothebart Jul 28, 2022
f7afa24
only see for load [0, 1]
dothebart Jul 28, 2022
c9235a2
increase container version
dothebart Jul 28, 2022
ea4bb9d
anounce deadline at start
dothebart Jul 28, 2022
8c851db
don't print to logfile
dothebart Jul 29, 2022
52d7ac7
give better feedback if arangosh fails to launch in first place, than…
dothebart Jul 29, 2022
a9938a7
Update helper.linux.fish
KVS85 Aug 2, 2022
ac06090
tschuess ruby
dothebart Aug 2, 2022
c6b7b97
Merge branch 'master' into feature/python_test_runner
dothebart Aug 10, 2022
052df56
Merge branch 'master' of github.com:arangodb/oskar into feature/pytho…
dothebart Aug 12, 2022
b660d08
re-sync to be stock RTA
dothebart Aug 12, 2022
f47b0ba
fix container numbers, adjust #3
dothebart Aug 12, 2022
0cba0b0
sync to rta
dothebart Aug 12, 2022
e3f52a9
resync
dothebart Aug 15, 2022
9ce6a70
this is not needed anymore
dothebart Aug 15, 2022
e4d3303
add --fix-missing
dothebart Aug 15, 2022
1a2a812
fresh python?
dothebart Aug 15, 2022
3acc581
revert to tar.gz
dothebart Aug 15, 2022
2f70af6
chaos tests in nightlies demand for longer timeouts, since tests run …
dothebart Aug 15, 2022
a09554d
Update README.md
dothebart Aug 15, 2022
5b68416
Update README.md
dothebart Aug 15, 2022
9a5beea
Update README.md
dothebart Aug 15, 2022
b7fb017
Update README.md
dothebart Aug 15, 2022
1a9c364
Update README.md
dothebart Aug 15, 2022
70d1842
remove more old stuff
dothebart Aug 15, 2022
46f058d
Merge branch 'feature/python_test_runner' of github.com:arangodb/oska…
dothebart Aug 15, 2022
2a718d8
ignore encoding errors
dothebart Aug 16, 2022
1af8187
increase timeout to hard self kill
dothebart Aug 16, 2022
50055d4
switch to one environment variable name
dothebart Aug 17, 2022
1363b8d
env
dothebart Aug 17, 2022
69b7a0e
limit the amount of coredumps
dothebart Aug 22, 2022
7b9b50c
ignore access denied to open sockets
dothebart Aug 22, 2022
0baa8b2
if we need to wait for the system to cool down on start...
dothebart Aug 22, 2022
f8841fe
make sure we don't come back good if nothing launched at all
dothebart Aug 22, 2022
d809254
them tiny boxes need more time
dothebart Aug 22, 2022
ba8d770
need more time
dothebart Aug 23, 2022
54ffe77
add deadline status to testfailurs.txt
dothebart Aug 23, 2022
5b9bd1d
need more time
dothebart Aug 23, 2022
34bb598
beautify testfailures.txt
dothebart Aug 23, 2022
94fba70
give machine estimate reasons at the start of the run
dothebart Aug 23, 2022
4a724b8
Merge branch 'main' of github.com:arangodb/oskar into feature/python_…
dothebart Aug 23, 2022
120b28f
case may matter
dothebart Aug 23, 2022
f966a1f
one more environment variable
dothebart Aug 24, 2022
48 changes: 30 additions & 18 deletions jenkins/helper/test_launch_controller.py
@@ -75,6 +75,8 @@ def get_workspace():
  return Path.cwd() / 'work'

  TEMP = Path("/tmp/")
+ if 'TMP' in os.environ:
+ TEMP = Path(os.environ['TMP'])
  if 'TEMP' in os.environ:
  TEMP = Path(os.environ['TEMP'])
  if 'INNERWORKDIR' in os.environ:
@@ -302,9 +304,11 @@ def __init__(self, definition_file):
  self.timeout = 1800
  if 'timeLimit'.upper() in os.environ:
  self.timeout = int(os.environ['timeLimit'.upper()])
+ elif 'timeLimit' in os.environ:
+ self.timeout = int(os.environ['timeLimit'])
  if psutil.cpu_count() <= 8:
- print("Small machine detected, trippling deadline!")
- self.timeout *= 3
+ print("Small machine detected, quadrupling deadline!")
+ self.timeout *= 4
  self.deadline = datetime.now() + timedelta(seconds=self.timeout)
  self.hard_deadline = datetime.now() + timedelta(seconds=self.timeout + 660)
  if definition_file.is_file():
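
The deadline logic in this hunk can be read in isolation as roughly the following. This is a minimal sketch, not the code from the PR: `compute_deadlines` and its parameters are hypothetical names, the environment and CPU count are passed in instead of being read from `os.environ` and `psutil`, and only the constants (1800, 8, 4, 660) and the lookup order of the two variable spellings are taken from the diff.

```python
from datetime import datetime, timedelta


def compute_deadlines(environ, cpu_count, now,
                      default_timeout=1800, small_machine_factor=4,
                      hard_grace=660):
    """Return (timeout, deadline, hard_deadline).

    Hypothetical helper; constants mirror the values visible in the diff.
    """
    timeout = default_timeout
    # The upper-case spelling wins; the mixed-case one is the fallback.
    if 'TIMELIMIT' in environ:
        timeout = int(environ['TIMELIMIT'])
    elif 'timeLimit' in environ:
        timeout = int(environ['timeLimit'])
    # Machines with 8 or fewer CPUs get a multiplied time budget.
    if cpu_count <= 8:
        timeout *= small_machine_factor
    deadline = now + timedelta(seconds=timeout)
    # The hard deadline trails the soft one by a fixed grace period.
    hard_deadline = now + timedelta(seconds=timeout + hard_grace)
    return timeout, deadline, hard_deadline


if __name__ == "__main__":
    print(compute_deadlines({'timeLimit': '100'}, 4, datetime.now()))
```

Passing the inputs in explicitly keeps the arithmetic testable without touching the real environment.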
@@ -315,7 +319,21 @@ def __init__(self, definition_file):
  for target in ['RelWithdebInfo', 'Debug']:
  if (bin_dir / target).exists():
  bin_dir = bin_dir / target
-
+ self.no_threads = psutil.cpu_count()
+ self.available_slots = round(self.no_threads * 2) #logical=False)
+ if IS_WINDOWS:
+ self.max_load = 0.85
+ self.max_load1 = 0.75
+ else:
+ self.max_load = self.no_threads * 0.9
+ self.max_load1 = self.no_threads * 0.9
+ # self.available_slots += (psutil.cpu_count(logical=True) - self.available_slots) / 2
+ print(f"""Machine Info:
+ - {psutil.cpu_count(logical=False)} Cores / {psutil.cpu_count(logical=True)} Threads
+ - {psutil.virtual_memory()} virtual Memory
+ - {self.max_load} / {self.max_load1} configured maximum load 0 / 1
+ - {self.available_slots} test slots
+ """)
  self.cfgdir = base_source_dir / 'etc' / 'relative'
  self.bin_dir = bin_dir
  self.base_path = base_source_dir
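
For reference, the slot and load limits this hunk moves into the configuration object amount to roughly the following. This is a sketch under assumptions: `compute_limits` is a hypothetical name, and the thread count and platform flag are parameters here, whereas the diff reads them from `psutil.cpu_count()` and `IS_WINDOWS`.

```python
def compute_limits(no_threads, is_windows):
    """Return (available_slots, max_load, max_load1).

    Hypothetical helper; the factors 2, 0.85, 0.75 and 0.9 come from the diff.
    """
    # Two test slots per logical CPU.
    available_slots = round(no_threads * 2)
    if is_windows:
        # Fixed thresholds on Windows.
        max_load, max_load1 = 0.85, 0.75
    else:
        # Elsewhere both thresholds scale with the number of CPUs.
        max_load = max_load1 = no_threads * 0.9
    return available_slots, max_load, max_load1


if __name__ == "__main__":
    print(compute_limits(16, is_windows=False))
```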
@@ -396,15 +414,6 @@ def __init__(self, cfg):
  self.cfg = cfg
  self.deadline_reached = False
  self.slot_lock = Lock()
- self.no_threads = psutil.cpu_count()
- self.available_slots = round(self.no_threads * 2) #logical=False)
- if IS_WINDOWS:
- self.max_load = 0.85
- self.max_load1 = 0.75
- else:
- self.max_load = self.no_threads * 0.9
- self.max_load1 = self.no_threads * 0.9
- # self.available_slots += (psutil.cpu_count(logical=True) - self.available_slots) / 2
  self.used_slots = 0
  self.scenarios = []
  self.arangosh = ArangoshExecutor(self.cfg, self.slot_lock)
@@ -429,7 +438,7 @@ def done_job(self, parallelity):

  def launch_next(self, offset, counter):
  """ launch one testing job """
- if self.scenarios[offset].parallelity > (self.available_slots - self.used_slots):
+ if self.scenarios[offset].parallelity > (self.cfg.available_slots - self.used_slots):
  return False
  try:
  sock_count = get_socket_count()
@@ -439,8 +448,8 @@ def launch_next(self, offset, counter):
  except psutil.AccessDenied:
  pass
  load = psutil.getloadavg()
- if ((load[0] > self.max_load) or
- (load[1] > self.max_load1)):
+ if ((load[0] > self.cfg.max_load) or
+ (load[1] > self.cfg.max_load1)):
  print(F"Load to high: {str(load)} waiting before spawning more")
  return False
  with self.slot_lock:
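
Taken together, the two `launch_next` hunks gate a launch on free slots first and on system load second. A minimal sketch of that decision follows, assuming a hypothetical pure function `may_launch`; the socket-count check visible in the diff is left out because its threshold is not shown, and `load` stands in for the tuple returned by `psutil.getloadavg()`.

```python
def may_launch(parallelity, available_slots, used_slots,
               load, max_load, max_load1):
    """Decide whether another test job may start (hypothetical helper)."""
    # Not enough free slots for this job's parallelity.
    if parallelity > (available_slots - used_slots):
        return False
    # load[0] and load[1] are the first two load-average entries;
    # either one above its limit blocks the launch.
    if load[0] > max_load or load[1] > max_load1:
        return False
    return True


if __name__ == "__main__":
    print(may_launch(2, 16, 10, (3.5, 3.0, 2.5), 7.2, 7.2))
```

Keeping the thresholds as arguments matches the direction of the change, where they are read from `self.cfg` rather than from the runner itself.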
@@ -514,6 +523,7 @@ def handle_deadline(self):

  def testing_runner(self):
  """ run testing suites """
+ # pylint: disable=too-many-branches
  mem = psutil.virtual_memory()
  os.environ['ARANGODB_OVERRIDE_DETECTED_TOTAL_MEMORY'] = str(int((mem.total * 0.8) / 9))

@@ -535,8 +545,8 @@ def testing_runner(self):
  used_slots = 0
  with self.slot_lock:
  used_slots = self.used_slots
- if self.available_slots > used_slots and start_offset < len(self.scenarios):
- print(f"Launching more: {self.available_slots} > {used_slots} {counter}")
+ if self.cfg.available_slots > used_slots and start_offset < len(self.scenarios):
+ print(f"Launching more: {self.cfg.available_slots} > {used_slots} {counter}")
  sys.stdout.flush()
  if self.launch_next(start_offset, counter):
  start_offset += 1
@@ -573,7 +583,9 @@ def generate_report_txt(self):
  for testrun in self.scenarios:
  print(testrun)
  if testrun.crashed or not testrun.success:
- summary += testrun.summary
+ summary += f"\n=== {testrun.name} ===\n{testrun.summary}"
+ if testrun.finish is None:
+ summary += f"\n=== {testrun.name} ===\nhasn't been launched at all!"
  print(summary)
  (get_workspace() / 'testfailures.txt').write_text(summary)

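
The new `testfailures.txt` layout can be illustrated as follows. This is a sketch with stated assumptions: `ScenarioResult` and `build_failure_summary` are hypothetical stand-ins for the runner's own classes, and because the diff as shown has lost its indentation, the `finish is None` check is assumed to sit at the same level as the failure check.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScenarioResult:
    """Hypothetical stand-in for one test scenario's outcome."""
    name: str
    success: bool
    crashed: bool = False
    summary: str = ""
    finish: Optional[float] = None  # None means the job never ran


def build_failure_summary(scenarios):
    """Concatenate one headed section per failed or never-launched run."""
    summary = ""
    for testrun in scenarios:
        if testrun.crashed or not testrun.success:
            summary += f"\n=== {testrun.name} ===\n{testrun.summary}"
        # Assumed nesting: checked for every run, not only failed ones.
        if testrun.finish is None:
            summary += f"\n=== {testrun.name} ===\nhasn't been launched at all!"
    return summary


if __name__ == "__main__":
    runs = [
        ScenarioResult("shell_client", True, finish=1.0),
        ScenarioResult("replication", False, summary="boom", finish=2.0),
        ScenarioResult("gtest", True),
    ]
    print(build_failure_summary(runs))
```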