Revise @simd documentation. #8704

Merged: 1 commit, Oct 23, 2014
18 changes: 14 additions & 4 deletions doc/manual/performance-tips.rst
@@ -603,14 +603,24 @@ properties of the loop:
 possibly causing different results than without ``@simd``.
 - No iteration ever waits on another iteration to make forward progress.
 
+A loop containing ``break``, ``continue``, or ``goto`` will cause a
+compile-time error.
+
 Using ``@simd`` merely gives the compiler license to vectorize. Whether
 it actually does so depends on the compiler. To actually benefit from the
 current implementation, your loop should have the following additional
 properties:
 
 - The loop must be an innermost loop.
-- The loop body must be straight-line code. This is why ``@inbounds`` is currently needed for all array accesses.
-- Accesses must have a stride pattern and cannot be "gathers" (random-index reads) or "scatters" (random-index writes).
-- The stride should be unit stride.
-- In some simple cases, for example with 2-3 arrays accessed in a loop, the LLVM auto-vectorization may kick in automatically, leading to no further speedup with ``@simd``.
+- The loop body must be straight-line code. This is why ``@inbounds`` is
+currently needed for all array accesses. The compiler can sometimes turn
+short ``&&``, ``||``, and ``?:`` expressions into straight-line code,
+if it is safe to evaluate all operands unconditionally. Consider using
+``ifelse`` instead of ``?:`` in the loop if it is safe to do so.
+- Accesses must have a stride pattern and cannot be "gathers" (random-index reads)
+or "scatters" (random-index writes).
+- The stride should be unit stride.
+- In some simple cases, for example with 2-3 arrays accessed in a loop, the
+LLVM auto-vectorization may kick in automatically, leading to no further
+speedup with ``@simd``.
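
For illustration only (this example is not part of the diff), the properties listed in the revised text can be sketched in a small loop. The function name ``sum_positive`` is hypothetical. The loop is innermost, reads ``x`` with unit stride, uses ``@inbounds`` so that no bounds-check branch interrupts the body, and uses ``ifelse`` in place of ``?:``. Whether the compiler actually vectorizes it is still up to the compiler.

```julia
# Hypothetical helper, not taken from the manual: sum the positive entries of x.
function sum_positive(x::Vector{Float64})
    s = 0.0
    # @simd gives the compiler license to reorder the floating-point additions.
    # @inbounds drops the bounds checks so the loop body is straight-line code.
    @inbounds @simd for i in 1:length(x)
        xi = x[i]  # unit-stride read, not a gather
        # ifelse evaluates both value arguments unconditionally, which is safe
        # here because neither one has side effects or can fail.
        s += ifelse(xi > 0.0, xi, 0.0)
    end
    return s
end
```

Calling ``sum_positive([1.0, -2.0, 3.0])`` adds ``1.0``, ``0.0``, and ``3.0``. Because ``@simd`` permits reassociation, results for longer inputs may differ in the last bits from a strictly sequential sum.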