Commit 62a22d8

Update installation instructions (CrayLabs#262)
The changes here describe how to install SmartSim on a wide variety of platforms. Some of this material was previously available, but in general should be easier to follow for users. In particular, this PR includes instructions for how to build SmartSim with the GPU-enabled ML backends and elucidates the process for using 'site-install' of SmartSim. [ committed by @ashao ] [ reviewed by @amandarichardsonn @al-rigazzi @MattToast ]
1 parent 35857f6 commit 62a22d8

15 files changed, +968 -621 lines changed

codecov.yml

+2 -2
Original file line number | Diff line number | Diff line change
@@ -10,11 +10,11 @@ coverage:
1010
patch:
1111
default:
1212
# Account for some variability in codecov
13-
threshold: 0.5%
13+
target: 0%
1414
project:
1515
default:
1616
# Account for some variability in codecov
17-
threshold: 0.5%
17+
target: 0%
1818

1919
parsers:
2020
gcov:

doc/changelog.rst

+1 -1
@@ -276,7 +276,7 @@ Released on May 5, 2021
276276

277277
Description:
278278
This release was dedicated to making the install process
279-
easier. SmartSim can be installed from PyPi now and the
279+
easier. SmartSim can be installed from PyPI now and the
280280
``smart`` cli tool makes installing the machine learning
281281
runtimes much easier.
282282

doc/developer.rst

+63 -21
@@ -6,18 +6,62 @@ Developer
66
This section details common practices and tips for contributors
77
to SmartSim and SmartRedis.
88

9+
==========================
10+
Building the Documentation
11+
==========================
12+
13+
Users can optionally build documentation of SmartSim through ``make docs`` or
14+
``make docks``. ``make docs`` requires the user to install the documentation
15+
build dependencies, whereas ``make docks`` only requires docker. ``make docks``
16+
is the recommended method for building the documentation locally, due to ease of
17+
use.
18+
19+
With docker
20+
===========
21+
22+
.. note::
23+
24+
To build the full documentation with ``make docks``, users need to install
25+
`docker <https://docs.docker.com/desktop/>`_ so that ``docker`` is available
26+
on the command line.
27+
28+
.. code-block:: bash
29+
30+
# From top level smartsim git repository directory
31+
make docks
32+
33+
Once the documentation has successfully built, users can open the main documents
34+
page from ``docs/develop/index.html``.
35+
36+
Without docker
37+
==============
38+
39+
.. note::
40+
41+
To build the full documentation via ``make docs``, users need to install
42+
``doxygen 1.9.1``. For Mac OS users, doxygen can be installed through ``brew
43+
install doxygen``
44+
45+
.. code-block:: bash
46+
47+
# From top level smartsim git repository directory
48+
git clone https://github.com/CrayLabs/SmartRedis.git smartredis
49+
make docs
50+
51+
Once the documentation has successfully built, users can open the main documents
52+
page from ``doc/_build/html/index.html``
53+
954
================
1055
Testing SmartSim
1156
================
1257

1358
.. note::
1459

15-
This section describes how to run the SmartSim (infrastructure library)
16-
test suite. For testing SmartRedis, see below
60+
This section describes how to run the SmartSim (infrastructure library) test
61+
suite. For testing SmartRedis, see below
1762

18-
SmartSim utilizes ``Pytest`` for running its test suite. In the
19-
top level of SmartSim, users can run multiple testing commands
20-
with the developer Makefile
63+
SmartSim utilizes ``Pytest`` for running its test suite. In the top level of
64+
SmartSim, users can run multiple testing commands with the developer Makefile
2165

2266
.. code-block:: text
2367
@@ -29,27 +73,26 @@ with the developer Makefile
2973
3074
.. note::
3175

32-
For the test to run, you must have the ``requirements-dev.txt``
33-
dependencies installed in your python environment.
76+
For the test to run, you must have the ``requirements-dev.txt`` dependencies
77+
installed in your python environment.
3478

3579

3680
Local
3781
=====
3882

39-
There are two levels of testing in SmartSim. The first
40-
runs by default and does not launch any jobs out onto
41-
a system through a workload manager like Cobalt.
83+
There are two levels of testing in SmartSim. The first runs by default and does
84+
not launch any jobs out onto a system through a workload manager like Cobalt.
4285

43-
If any of the above commands are used, the test suite will
44-
run the "light" test suite by default.
86+
If any of the above commands are used, the test suite will run the "light" test
87+
suite by default.
4588

4689

4790
PBSPro, Slurm, Cobalt, LSF
4891
==========================
4992

50-
To run the full test suite, users will have to be on a system
51-
with one of the above workload managers. Additionally, users will
52-
need to obtain an allocation of at least 3 nodes.
93+
To run the full test suite, users will have to be on a system with one of the
94+
above workload managers. Additionally, users will need to obtain an allocation
95+
of at least 3 nodes.
5396

5497
.. code-block:: bash
5598
@@ -63,22 +106,21 @@ need to obtain an allocation of at least 3 nodes.
63106
qsub -n 3 -t 00:10:00 -A account -q queue -I
64107
65108
# for LSF (with jsrun)
66-
bsub -Is -W 00:30 -nnodes 3 -P project $SHELL
109+
bsub -Is -W 00:30 -nnodes 3 -P project $SHELL
67110
68111
Values for queue, account, or project should be substituted appropriately.
69112

70-
Once in an iterative allocation, users will need to set the test
71-
launcher environment variable: ``SMARTSIM_TEST_LAUNCHER`` to one
72-
of the following values
113+
Once in an iterative allocation, users will need to set the test launcher
114+
environment variable: ``SMARTSIM_TEST_LAUNCHER`` to one of the following values
73115

74116
- slurm
75117
- cobalt
76118
- pbs
77119
- lsf
78120
- local
79121

80-
If tests have to run on an account or project,
81-
the environment variable ``SMARTSIM_TEST_ACCOUNT`` can be set.
122+
If tests have to run on an account or project, the environment variable
123+
``SMARTSIM_TEST_ACCOUNT`` can be set.
82124

83125
-------------------------------------------------------
84126
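Note on the launcher selection in the doc/developer.rst hunk above: a small helper could validate the launcher value before it is exported. The following is only a sketch. ``build_test_env`` is a hypothetical helper name and is not part of SmartSim; only the variable names ``SMARTSIM_TEST_LAUNCHER`` and ``SMARTSIM_TEST_ACCOUNT`` and the five launcher values are taken from the diff.

```python
import os

# Launcher values listed in the documentation hunk above.
VALID_LAUNCHERS = ("slurm", "cobalt", "pbs", "lsf", "local")


def build_test_env(launcher, account=None, base_env=None):
    """Return a copy of the environment with the SmartSim test variables set.

    Hypothetical helper for illustration only; not part of SmartSim.
    """
    if launcher not in VALID_LAUNCHERS:
        raise ValueError(
            f"launcher must be one of {VALID_LAUNCHERS}, got {launcher!r}"
        )
    # Copy so the caller's (or the process') environment is not mutated.
    env = dict(os.environ if base_env is None else base_env)
    env["SMARTSIM_TEST_LAUNCHER"] = launcher
    if account is not None:
        # Only needed when tests must run under an account or project.
        env["SMARTSIM_TEST_ACCOUNT"] = account
    return env


if __name__ == "__main__":
    env = build_test_env("slurm", account="my_account")
    print(env["SMARTSIM_TEST_LAUNCHER"], env["SMARTSIM_TEST_ACCOUNT"])
```

The returned mapping could be passed as ``env=`` to ``subprocess.run`` when invoking the test suite from inside an allocation.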

doc/experiment.rst

+31 -30
@@ -3,14 +3,14 @@
33
Experiments
44
***********
55

6-
The Experiment acts as both a factory class for constructing the
7-
stages of an experiment (``Model``, ``Ensemble``, ``Orchestrator``, etc.)
8-
as well as an interface to interact with the entities created by the experiment.
6+
The Experiment acts as both a factory class for constructing the stages of an
7+
experiment (``Model``, ``Ensemble``, ``Orchestrator``, etc.) as well as an
8+
interface to interact with the entities created by the experiment.
99

10-
Users can initialize an :ref:`Experiment <experiment_api>` at the beginning of a Jupyter notebook,
11-
interactive python session, or Python file and use the ``Experiment`` to
12-
iteratively create, configure and launch computational kernels on the
13-
system through the specified launcher.
10+
Users can initialize an :ref:`Experiment <experiment_api>` at the beginning of a
11+
Jupyter notebook, interactive python session, or Python file and use the
12+
``Experiment`` to iteratively create, configure and launch computational kernels
13+
on the system through the specified launcher.
1414

1515
.. |SmartSim Architecture| image:: images/ss-arch-overview.png
1616
:width: 700
@@ -19,21 +19,21 @@ system through the specified launcher.
1919
|SmartSim Architecture|
2020

2121

22-
The interface was designed to be simple, with as little complexity
23-
as possible, and agnostic to the backend launching mechanism (local,
24-
Slurm, PBSPro, etc.).
22+
The interface was designed to be simple, with as little complexity as possible,
23+
and agnostic to the backend launching mechanism (local, Slurm, PBSPro, etc.).
2524

2625
Model
2726
=====
2827

2928
``Model(s)`` are subclasses of ``SmartSimEntity(s)`` and are created through the
30-
Experiment API. Models represent any computational kernel. Models are flexible enough
31-
to support many different applications, however, to be used with our clients
32-
(SmartRedis) the application will have to be written in Python, C, C++, or Fortran.
29+
Experiment API. Models represent any computational kernel. Models are flexible
30+
enough to support many different applications, however, to be used with our
31+
clients (SmartRedis) the application will have to be written in Python, C, C++,
32+
or Fortran.
3333

34-
Models are given :ref:`RunSettings <rs-api>` objects that specify how a kernel should
35-
be executed with regard to the workload manager (e.g. Slurm) and the available
36-
compute resources on the system.
34+
Models are given :ref:`RunSettings <rs-api>` objects that specify how a kernel
35+
should be executed with regard to the workload manager (e.g. Slurm) and the
36+
available compute resources on the system.
3737

3838
Each launcher supports specific types of ``RunSettings``.
3939

@@ -45,7 +45,8 @@ Each launcher supports specific types of ``RunSettings``.
4545
These settings can be manually specified by the user, or auto-detected by the
4646
SmartSim Experiment through the ``Experiment.create_run_settings`` method.
4747

48-
A simple example of using the Experiment API to create a model and run it locally:
48+
A simple example of using the Experiment API to create a model and run it
49+
locally:
4950

5051
.. code-block:: Python
5152
@@ -83,9 +84,9 @@ For example with Slurm
8384
8485
print(exp.get_status(model))
8586
86-
The above will run ``srun -n 32 -N 1 echo Hello World!``, monitor it's execution,
87-
and inform the user when it is completed. This driver script can be executed in
88-
an interactive allocation, or placed into a batch script as follows:
87+
The above will run ``srun -n 32 -N 1 echo Hello World!``, monitor its
88+
execution, and inform the user when it is completed. This driver script can be
89+
executed in an interactive allocation, or placed into a batch script as follows:
8990

9091
.. code-block:: bash
9192
@@ -108,18 +109,18 @@ An ``Ensemble`` can be constructed in three ways:
108109
2. Replica creation (by specifying ``replicas`` argument)
109110
3. Manually (by adding created ``Model`` objects) if launching as a batch job
110111

111-
Ensembles can be given parameters and permutation strategies that
112-
define how the ``Ensemble`` will create the underlying model objects.
112+
Ensembles can be given parameters and permutation strategies that define how the
113+
``Ensemble`` will create the underlying model objects.
113114

114115
Three strategies are built in:
115116
1. ``all_perm``: for generating all permutations of model parameters
116117
2. ``step``: for creating one set of parameters for each element in `n` arrays
117118
3. ``random``: for random selection from predefined parameter spaces
118119

119120
Here is an example that uses the ``random`` strategy to intialize four models
120-
with random parameters within a set range. We use the ``params_as_args``
121-
field to specify that the randomly selected learning rate parameter should
122-
be passed to the created models as a executable argument.
121+
with random parameters within a set range. We use the ``params_as_args`` field
122+
to specify that the randomly selected learning rate parameter should be passed
123+
to the created models as a executable argument.
123124

124125
.. code-block:: bash
125126
@@ -145,11 +146,11 @@ be passed to the created models as a executable argument.
145146
exp.start(ensemble, summary=True)
146147
147148
148-
A callable function can also be supplied for custom permutation strategies.
149-
The function should take two arguments: a list of parameter names, and a list of lists
150-
of potential parameter values. The function should return a list of dictionaries that
151-
will be supplied as model parameters. The length of the list returned will determine
152-
how many ``Model`` instances are created.
149+
A callable function can also be supplied for custom permutation strategies. The
150+
function should take two arguments: a list of parameter names, and a list of
151+
lists of potential parameter values. The function should return a list of
152+
dictionaries that will be supplied as model parameters. The length of the list
153+
returned will determine how many ``Model`` instances are created.
153154

154155
For example, the following is the built-in strategy ``all_perm``:
155156
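Note on the custom permutation strategy described in the doc/experiment.rst hunk above: the callable contract (a list of parameter names and a list of lists of values in, a list of dictionaries out) can be illustrated without SmartSim itself. The sketch below is an assumption-laden example, not SmartSim's implementation of any built-in strategy. ``pairwise_strategy`` is a hypothetical name, and it simply pairs the i-th candidate value of every parameter.

```python
def pairwise_strategy(param_names, param_values):
    """Pair up the i-th value of every parameter (zip-style).

    param_names:  list of parameter names, e.g. ["learning_rate", "batch_size"]
    param_values: list of lists of candidate values, one inner list per name
    Returns a list of dicts; per the docs, each dict would be supplied as the
    parameters of one Model, so the list length sets the number of models.

    Hypothetical strategy for illustration only.
    """
    return [dict(zip(param_names, combo)) for combo in zip(*param_values)]


if __name__ == "__main__":
    names = ["learning_rate", "batch_size"]
    values = [[0.1, 0.01, 0.001], [16, 32, 64]]
    for params in pairwise_strategy(names, values):
        print(params)
```

With three candidate values per parameter, the function returns three dictionaries, so three ``Model`` instances would be created. How the callable is handed to the ensemble-creation call is not shown in this hunk, so it is left out here.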

doc/index.rst

+4 -2
@@ -7,11 +7,12 @@
77
versions
88

99
.. toctree::
10-
:maxdepth: 2
10+
:maxdepth: 3
1111
:caption: Getting Started
1212

1313
overview
14-
installation
14+
installation_instructions/basic
15+
installation_instructions/platform
1516
community
1617
contributing_examples
1718

@@ -54,6 +55,7 @@
5455
changelog
5556
code_of_conduct
5657
developer
58+
testing
5759

5860

5961
Indices and tables
