Add options docstrings to the docs #493

Merged 2 commits on Apr 3, 2024
4 changes: 2 additions & 2 deletions docs/src/api-dagger/types.md
@@ -14,14 +14,14 @@ EagerThunk
```

## Task Options Types
```
```@docs
Options
Sch.ThunkOptions
Sch.SchedulerOptions
```

## Data Management Types
```
```@docs
Chunk
Shard
```
4 changes: 2 additions & 2 deletions docs/src/scheduler-internals.md
@@ -69,7 +69,7 @@ execution (called "firing"). Once all tasks are either waiting or running, the
scheduler may sleep until actions need to be performed

When fired tasks have completed executing, an entry will exist in the inbound
queue signaling the task's result and other metadata. At this point, the most
queue signalling the task's result and other metadata. At this point, the most
recently-queued task is removed from the queue, "finished", and placed in the
"finished" state. Finishing usually unlocks downstream tasks from the waiting
state and allows them to transition to the ready state.
@@ -117,7 +117,7 @@ outdated, or when its estimates about the task's behavior are inaccurate. To
minimize the possibility of workload imbalance, the worker schedulers'
processors will attempt to steal tasks from each other when they are
under-occupied. Tasks will only be stolen if the task's [scope](scopes.md) is
compatibl with the processor attempting the steal, so tasks with wider scopes
compatible with the processor attempting the steal, so tasks with wider scopes
have better balancing potential.

## Core: Finishing
36 changes: 21 additions & 15 deletions docs/src/task-spawning.md
@@ -1,3 +1,7 @@
```@meta
CurrentModule = Dagger
```

# Task Spawning

The main entrypoint to Dagger is `@spawn`:
@@ -8,8 +12,8 @@ or `spawn` if it's more convenient:

`Dagger.spawn(f, Dagger.Options(options), args...; kwargs...)`

When called, it creates an `EagerThunk` (also known as a "thunk" or "task")
object representing a call to function `f` with the arguments `args` and
When called, it creates an [`EagerThunk`](@ref) (also known as a "thunk" or
"task") object representing a call to function `f` with the arguments `args` and
keyword arguments `kwargs`. If it is called with other thunks as args/kwargs,
such as in `Dagger.@spawn f(Dagger.@spawn g())`, then, in this example, the
function `f` gets passed the results of executing `g()`, once that result is
@@ -18,21 +22,23 @@ waits on `g()` to complete before executing.

An important observation to make is that, for each argument to
`@spawn`/`spawn`, if the argument is the result of another `@spawn`/`spawn`
call (thus it's an `EagerThunk`), the argument will be computed first, and then
call (thus it's an [`EagerThunk`](@ref)), the argument will be computed first, and then
its result will be passed into the function receiving the argument. If the
argument is *not* an `EagerThunk` (instead, some other type of Julia object),
argument is *not* an [`EagerThunk`](@ref) (instead, some other type of Julia object),
it'll be passed as-is to the function `f` (with some exceptions).

## Options

The `Options` struct in the second argument position is optional; if provided,
it is passed to the scheduler to control its behavior. `Options` contains a
`NamedTuple` of option key-value pairs, which can be any of:
- Any field in `Dagger.Sch.ThunkOptions` (see [Scheduler and Thunk options](@ref))
- `meta::Bool` -- Pass the input `Chunk` objects themselves to `f` and not the value contained in them
The [`Options`](@ref Dagger.Options) struct in the second argument position is
optional; if provided, it is passed to the scheduler to control its
behavior. [`Options`](@ref Dagger.Options) contains a `NamedTuple` of option
key-value pairs, which can be any of:
- Any field in [`Sch.ThunkOptions`](@ref) (see [Scheduler and Thunk options](@ref))
- `meta::Bool` -- Pass the input [`Chunk`](@ref) objects themselves to `f` and
not the value contained in them.

There are also some extra optionss that can be passed, although they're considered advanced options to be used only by developers or library authors:
- `get_result::Bool` -- return the actual result to the scheduler instead of `Chunk` objects. Used when `f` explicitly constructs a Chunk or when return value is small (e.g. in case of reduce)
- `get_result::Bool` -- return the actual result to the scheduler instead of [`Chunk`](@ref) objects. Used when `f` explicitly constructs a [`Chunk`](@ref) or when return value is small (e.g. in case of reduce)
- `persist::Bool` -- the result of this Thunk should not be released after it becomes unused in the DAG
- `cache::Bool` -- cache the result of this Thunk such that if the thunk is evaluated again, one can just reuse the cached value. If it’s been removed from cache, recompute the value.

@@ -68,9 +74,9 @@ The final result (from `fetch(s)`) is the obvious consequence of the operation:
### Eager Execution

Dagger's `@spawn` macro works similarly to `@async` and `Threads.@spawn`: when
called, it wraps the function call specified by the user in an `EagerThunk`
object, and immediately places it onto a running scheduler, to be executed once
its dependencies are fulfilled.
called, it wraps the function call specified by the user in an
[`EagerThunk`](@ref) object, and immediately places it onto a running scheduler,
to be executed once its dependencies are fulfilled.

```julia
x = rand(400,400)
@@ -181,8 +187,8 @@ Note that, as a legacy API, usage of the lazy API is generally discouraged for m
While Dagger generally "just works", sometimes one needs to exert some more
fine-grained control over how the scheduler allocates work. There are two
parallel mechanisms to achieve this: Scheduler options (from
`Dagger.Sch.SchedulerOptions`) and Thunk options (from
`Dagger.Sch.ThunkOptions`). These two options structs contain many shared
[`Sch.SchedulerOptions`](@ref)) and Thunk options (from
[`Sch.ThunkOptions`](@ref)). These two options structs contain many shared
options, with the difference being that Scheduler options operate
globally across an entire DAG, and Thunk options operate on a thunk-by-thunk
basis.
7 changes: 7 additions & 0 deletions src/eager_thunk.jl
@@ -23,6 +23,13 @@ function Base.fetch(t::ThunkFuture; proc=OSProc(), raw=false)
end
Base.put!(t::ThunkFuture, x; error=false) = put!(t.future, (error, x))

"""
Options(::NamedTuple)
Options(; kwargs...)

Options for thunks and the scheduler. See [Task Spawning](@ref) for more
information.
"""
struct Options
options::NamedTuple
end
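
The new docstring names two constructor forms, `Options(::NamedTuple)` and `Options(; kwargs...)`. A minimal standalone sketch of how such a pair can coexist is below. It does not load Dagger: `ToyOptions` is a hypothetical stand-in, and the keyword method is an assumption about how a wrapper like this is commonly written, not a copy of Dagger's source.

```julia
# Hypothetical stand-in for the struct in this hunk; not Dagger's definition.
struct ToyOptions
    options::NamedTuple
end

# Assumed keyword form: gather the keywords into a NamedTuple and wrap it.
ToyOptions(; kwargs...) = ToyOptions((; kwargs...))

from_tuple = ToyOptions((meta = true,))   # NamedTuple form
from_kwargs = ToyOptions(; meta = true)   # keyword form

# Both forms end up holding the same NamedTuple.
println(from_tuple.options == from_kwargs.options)
```

Under these assumptions the two forms are interchangeable, which is presumably why the docstring lists both signatures together.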
96 changes: 48 additions & 48 deletions src/sch/Sch.jl
@@ -153,23 +153,23 @@ Stores DAG-global options to be passed to the Dagger.Sch scheduler.

# Arguments
- `single::Int=0`: (Deprecated) Force all work onto worker with specified id.
`0` disables this option.
`0` disables this option.
- `proclist=nothing`: (Deprecated) Force scheduler to use one or more
processors that are instances/subtypes of a contained type. Alternatively, a
function can be supplied, and the function will be called with a processor as
the sole argument and should return a `Bool` result to indicate whether or not
to use the given processor. `nothing` enables all default processors.
processors that are instances/subtypes of a contained type. Alternatively, a
function can be supplied, and the function will be called with a processor as
the sole argument and should return a `Bool` result to indicate whether or not
to use the given processor. `nothing` enables all default processors.
- `allow_errors::Bool=true`: Allow thunks to error without affecting
non-dependent thunks.
non-dependent thunks.
- `checkpoint=nothing`: If not `nothing`, uses the provided function to save
the final result of the current scheduler invocation to persistent storage, for
later retrieval by `restore`.
the final result of the current scheduler invocation to persistent storage, for
later retrieval by `restore`.
- `restore=nothing`: If not `nothing`, uses the provided function to return the
(cached) final result of the current scheduler invocation, were it to execute.
If this returns a `Chunk`, all thunks will be skipped, and the `Chunk` will be
returned. If `nothing` is returned, restoring is skipped, and the scheduler
will execute as usual. If this function throws an error, restoring will be
skipped, and the error will be displayed.
(cached) final result of the current scheduler invocation, were it to execute.
If this returns a `Chunk`, all thunks will be skipped, and the `Chunk` will be
returned. If `nothing` is returned, restoring is skipped, and the scheduler
will execute as usual. If this function throws an error, restoring will be
skipped, and the error will be displayed.
"""
Base.@kwdef struct SchedulerOptions
single::Union{Int,Nothing} = nothing
@@ -186,52 +186,52 @@ Stores Thunk-local options to be passed to the Dagger.Sch scheduler.

# Arguments
- `single::Int=0`: (Deprecated) Force thunk onto worker with specified id. `0`
disables this option.
disables this option.
- `proclist=nothing`: (Deprecated) Force thunk to use one or more processors
that are instances/subtypes of a contained type. Alternatively, a function can
be supplied, and the function will be called with a processor as the sole
argument and should return a `Bool` result to indicate whether or not to use
the given processor. `nothing` enables all default processors.
that are instances/subtypes of a contained type. Alternatively, a function can
be supplied, and the function will be called with a processor as the sole
argument and should return a `Bool` result to indicate whether or not to use
the given processor. `nothing` enables all default processors.
- `time_util::Dict{Type,Any}`: Indicates the maximum expected time utilization
for this thunk. Each keypair maps a processor type to the utilization, where
the value can be a real (approximately the number of nanoseconds taken), or
`MaxUtilization()` (utilizes all processors of this type). By default, the
scheduler assumes that this thunk only uses one processor.
for this thunk. Each keypair maps a processor type to the utilization, where
the value can be a real (approximately the number of nanoseconds taken), or
`MaxUtilization()` (utilizes all processors of this type). By default, the
scheduler assumes that this thunk only uses one processor.
- `alloc_util::Dict{Type,UInt64}`: Indicates the maximum expected memory
utilization for this thunk. Each keypair maps a processor type to the
utilization, where the value is an integer representing approximately the
maximum number of bytes allocated at any one time.
utilization for this thunk. Each keypair maps a processor type to the
utilization, where the value is an integer representing approximately the
maximum number of bytes allocated at any one time.
- `occupancy::Dict{Type,Real}`: Indicates the maximum expected processor
occupancy for this thunk. Each keypair maps a processor type to the
utilization, where the value can be a real between 0 and 1 (the occupancy
ratio, where 1 is full occupancy). By default, the scheduler assumes that this
thunk has full occupancy.
occupancy for this thunk. Each keypair maps a processor type to the
utilization, where the value can be a real between 0 and 1 (the occupancy
ratio, where 1 is full occupancy). By default, the scheduler assumes that this
thunk has full occupancy.
- `allow_errors::Bool=true`: Allow this thunk to error without affecting
non-dependent thunks.
non-dependent thunks.
- `checkpoint=nothing`: If not `nothing`, uses the provided function to save
the result of the thunk to persistent storage, for later retrieval by
`restore`.
the result of the thunk to persistent storage, for later retrieval by
`restore`.
- `restore=nothing`: If not `nothing`, uses the provided function to return the
(cached) result of this thunk, were it to execute. If this returns a `Chunk`,
this thunk will be skipped, and its result will be set to the `Chunk`. If
`nothing` is returned, restoring is skipped, and the thunk will execute as
usual. If this function throws an error, restoring will be skipped, and the
error will be displayed.
(cached) result of this thunk, were it to execute. If this returns a `Chunk`,
this thunk will be skipped, and its result will be set to the `Chunk`. If
`nothing` is returned, restoring is skipped, and the thunk will execute as
usual. If this function throws an error, restoring will be skipped, and the
error will be displayed.
- `storage::Union{Chunk,Nothing}=nothing`: If not `nothing`, references a
`MemPool.StorageDevice` which will be passed to `MemPool.poolset` internally
when constructing `Chunk`s (such as when constructing the return value). The
device must support `MemPool.CPURAMResource`. When `nothing`, uses
`MemPool.GLOBAL_DEVICE[]`.
`MemPool.StorageDevice` which will be passed to `MemPool.poolset` internally
when constructing `Chunk`s (such as when constructing the return value). The
device must support `MemPool.CPURAMResource`. When `nothing`, uses
`MemPool.GLOBAL_DEVICE[]`.
- `storage_root_tag::Any=nothing`: If not `nothing`,
specifies the MemPool storage leaf tag to associate with the thunk's result.
This tag can be used by MemPool's storage devices to manipulate their behavior,
such as the file name used to store data on disk."
specifies the MemPool storage leaf tag to associate with the thunk's result.
This tag can be used by MemPool's storage devices to manipulate their behavior,
such as the file name used to store data on disk."
- `storage_leaf_tag::MemPool.Tag,Nothing}=nothing`: If not `nothing`,
specifies the MemPool storage leaf tag to associate with the thunk's result.
This tag can be used by MemPool's storage devices to manipulate their behavior,
such as the file name used to store data on disk."
specifies the MemPool storage leaf tag to associate with the thunk's result.
This tag can be used by MemPool's storage devices to manipulate their behavior,
such as the file name used to store data on disk."
- `storage_retain::Bool=false`: The value of `retain` to pass to
`MemPool.poolset` when constructing the result `Chunk`.
`MemPool.poolset` when constructing the result `Chunk`.
"""
Base.@kwdef struct ThunkOptions
single::Union{Int,Nothing} = nothing
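
Both options structs are declared with `Base.@kwdef`, which generates a keyword constructor that fills in the declared defaults. The sketch below illustrates that behavior with a hypothetical `ToyThunkOptions`. Only the `single` field is copied from the hunk above; the `allow_errors` field type and default are assumptions for illustration. Note that the docstrings list `single::Int=0`, while the field visible in the struct defaults to `nothing`.

```julia
# Hypothetical two-field struct; only `single` mirrors the line shown in the diff.
Base.@kwdef struct ToyThunkOptions
    single::Union{Int,Nothing} = nothing
    allow_errors::Union{Bool,Nothing} = nothing  # assumed field type and default
end

# Fields that are not passed keep their declared defaults.
opts = ToyThunkOptions(; single = 1)
println(opts.single)
println(opts.allow_errors === nothing)
```

A `nothing` default leaves room to distinguish "option not set" from an explicit value, which a default of `0` would not.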