Skip to content

Commit 052ac56

Browse files
committed
2 parents 9b9652e + 5a4ca3c commit 052ac56

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

44 files changed

+4632
-30
lines changed

CODEOWNERS

+1-1
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,7 @@ sigs/io/ @martinwicke @ewilderj @mrry @yongtang @dmitrievanthony
1414
sigs/jvm/ @martinwicke @ewilderj @sjamesr @karllessard @tzolov
1515
sigs/micro/ @martinwicke @ewilderj @petewarden
1616
sigs/mlir/ @ewilderj @pkanwar23
17-
sigs/swift/ @ewilderj @saeta @dynamicwebpaige
17+
sigs/swift/ @ewilderj @saeta @ematejska
1818
sigs/testing/ @ewilderj @dynamicwebpaige
1919

2020
# RFCs

governance/api-reviews.md

+268
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,268 @@
1+
# tensorflow/api-owners review practices
2+
3+
## Overview
4+
5+
This is an attempt to gather commonly discussed topics when doing API
6+
reviews. It’ll hopefully be a useful resource to both API owners and people
7+
proposing API changes. [TF API Owners](https://github.com/orgs/tensorflow/teams/api-owners)
8+
meet twice weekly to discuss changes. We try to get to PRs on the next meeting,
9+
but we don’t always make it all the way through. If your change is particularly
10+
urgent, please ping the PR to notify us of any urgency.
11+
12+
## Process
13+
14+
We only look at changes which have already been approved by other reviewers. If
15+
there are major outstanding comments, we will wait with API review until those
16+
are resolved. If there are questions for API owners, explicitly raise this in
17+
the comments to get an answer.
18+
19+
20+
## High level points
21+
22+
### Backward and forward compatibility
23+
We avoid backwards-incompatible API changes. We also avoid
24+
backwards-incompatible behavior changes, such as restricting the set of valid
25+
inputs to a function or extending the set of valid outputs of a function. Adding
26+
support for previously not supported behavior is okay, as are changes to
27+
explicitly experimental APIs (see section below). When needing to provide a new
28+
or different behavior, we strongly prefer a new version of the API over breaking
29+
backwards compatibility. Note that we are free to deprecate APIs; we just cannot
30+
break code which relies on their documented behavior. We need to worry about
31+
backward compatibility both of our python APIs and of the serialized GraphDefs,
32+
and in general breaking serialized GraphDefs is worse than breaking the python
33+
APIs.
34+
35+
Forward compatibility is more subtle: we should avoid changing the graph
36+
produced by currently correct python code without a three weeks notice. This
37+
comes up most frequently when adding new ops, but also applies to non-obvious
38+
things such as the graph emitted by gradients or pfor.
39+
40+
41+
### Docstrings
42+
43+
TF APIs should have comprehensive documentation in the form of docstrings. If at
44+
all possible these docstrings should have runnable examples, and these examples
45+
should form a doctest so they stay correct. The examples should demonstrate an
46+
end-to-end user workflow, such that it’s clear how to generate the necessary
47+
inputs for the API and what to do with the outputs. The docstring should be
48+
understandable by someone who is not familiar with TF. See the [guide to writing
49+
TF docstrings](https://www.tensorflow.org/community/contribute/docs_ref) for
50+
more information.
51+
52+
Our documentation generator for classes only sees methods, so prefer defining
53+
members as properties instead of assigning them in `__init__`.
54+
55+
Docstrings should only refer to other public TF API symbols (i.e. do not refer
56+
to other symbols defined in the same file as a function which is just now being
57+
made public) and should refer to public API symbols by their full exported name.
58+
59+
### Common names
60+
61+
Prefer keepdims over keep_dims. Prefer axis over dim. Data types are called
62+
dtype. name is a common last argument of ops but backward compatibility mandates
63+
that new arguments are added after the last existing argument, even if that
64+
results in name not being the last argument.
65+
66+
We generally prefer spelling things out over using abbreviations except when
67+
abbreviations are more standard than spelling things out (i.e. don’t spell out
68+
linalg or svd). When in doubt we ask domain experts or use web search to see
69+
what spelling is most common.
70+
71+
If possible we prefer to name things in a similar way to numpy (e.g., we would
72+
not pick einsum as a name, but numpy and others before it have, and that
73+
precedent is very strong).
74+
75+
We prefer experimental namespaces (i.e. tf.data.experimental.foobar) over
76+
experimental-prefixed names (i.e. tf.data.experimental_foobar) except when
77+
adding an experimental class method, or an experimental argument. Experimental
78+
endpoints should be deprecated in a minor release before they can be removed in
79+
the next. We would like new experimental symbols to be things which will
80+
eventually end up in core TF as opposed to things we expect will be phased out
81+
with no clear replacement. The best expectation to have for an experimental
82+
endpoint is that the “experimental” will simply be removed. If you don’t believe
83+
that’ll work, it should probably not be added in its current form.
84+
85+
### Style
86+
87+
Generally, follow Google style.
88+
89+
Avoid redundancy. Do not write arguments of the form `function(...,
90+
enable_feature=False, feature_config=None)` if you can also write `function(...,
91+
feature_config=None)`, where implicitly, `enable_feature = feature_config is not
92+
None`.
93+
94+
Try to embed well with the ambient language. Think about how your API interacts
95+
with language idioms (e.g., in Python: can it be hashed, i.e., used as a dict
96+
key? Is it iterable? Is it a mapping? Can it be equality compared?
97+
Ordered?). Think about how your API interacts with other pieces of the Python
98+
ecosystem as well— is there an analogue in Numpy or PyTorch that we should
99+
consider aligning with?
100+
101+
Use language-native constructs wherever you can. In Python, a tuple should be a
102+
tuple. The bar for custom configuration objects is relatively high, a dict or
103+
namedtuple goes a long way.
104+
105+
In particular, do not expose protobufs directly as part of an API. You can use
106+
protobufs for serialization or to encode network traffic. Protobufs should
107+
always be an implementation detail, and never visible on the API surface. Use
108+
language native constructs (dicts or classes for Python, structs for C/C++) if
109+
you need configuration objects.
110+
111+
Avoid global (or any non-local) state as much as possible (this includes Python
112+
'with' scopes). If you need global context, consider whether it can be
113+
thread-local. The TF API is supposed to be thread-safe. Avoid stateful operation
114+
(mutability) if you can. Both features make it hard to reason about code, and
115+
make composability harder to achieve.
116+
117+
We prefer strings ("auto", "never", etc) over enums (tf.namespace.AUTO,
118+
etc). Strings are easier to type, and forces us to document all possible values
119+
and their semantics in the docstrings of all places which accept the string, as
120+
opposed to only in the enum definition, which is a little friendlier.
121+
122+
### Orthogonality and integration with the existing APIs
123+
124+
Is the new API implementable in terms of existing APIs? If so, we might want to
125+
consider pointing users to using the existing APIs. Does the new API add enough
126+
value over a combination of existing APIs? Does the API solve only a specific
127+
problem (that’s usually a sign combinations of existing APIs would be
128+
preferred)?
129+
130+
If not, are existing APIs implementable in terms of the new API? If this is
131+
simple, we might want to steer users towards the new and away from the old API
132+
(possibly, old APIs should be deprecated along with introducing the new API).
133+
134+
If neither is the case, it might be possible that there is a more general API
135+
which makes both the existing API and the new API easy to express. We try to
136+
keep global consistency of our API in mind when reviewing new changes.
137+
138+
How will this API work together with others? Does it do something slightly
139+
differently than others? Does it expect inputs which match what other parts of
140+
TensorFlow produce? Does its output match what other parts of TensorFlow can
141+
consume?
142+
143+
Does it do things the same way other similar pieces in TensorFlow do it? E.g.,
144+
if a common pattern to achieve a behavior is an extra argument, don't use a
145+
function decorator to achieve the same in a different area of the API.
146+
147+
Two wrongs don’t make a right. That is, if a bad API already exists in TF, that
148+
does not give license to new APIs to be bad in the same way. Improvement must be
149+
balanced with consistency, however, and sometimes it’s okay to carry small
150+
imperfections into new APIs for the sake of consistency with old APIs.
151+
152+
### Optional arguments with default values
153+
154+
Many APIs have optional arguments with a default value. Our recommendation is to
155+
use `None` as the default value of any optional arguments and have the
156+
implementation be responsible for handling it as opposed to using a default
157+
value that directly represents the behavior (e.g. `aggregate='sum'`). The
158+
latter prevents the implementation from distinguishing between the caller not
159+
setting the argument vs. the caller setting the argument to the default value,
160+
which may be needed when the default behavior is changing.
161+
162+
### Does it belong in TF at all?
163+
164+
As TF evolves there’s a tendency to put everything inside of it, with costs
165+
compounding over the long term. If there is a reasonable home for a new API
166+
outside core TF (say in addons, io, TFP, or other projects entirely) that can be
167+
strongly preferrable. If new code can be released as independent libraries, it
168+
should be. This is especially true for APIs that are actively evolving; core TF
169+
imposes many restrictions, so it’s far better to trial new APIs outside of the
170+
core library.
171+
172+
## Adding new ops
173+
174+
Adding new ops to TF should be done with care. We generally prefer not adding
175+
new ops if possible, but performance, hardware compatibility, and other concerns
176+
often do require new ops.
177+
178+
When adding new ops, look for:
179+
180+
- closure under automatic differentiation (i.e. we avoid ops which are
181+
differentiable but not twice-differentiable, or which are technically
182+
differentiable but not marked as such)
183+
- performant kernels (it’s better not to have an op than to have an op with a
184+
suboptimal kernel; we need to make sure kernel experts have reviewed the
185+
code)
186+
- broadcasting (all numerical ops should broadcast using numpy rules)
187+
- does support for this op have to be added to pfor/vectorized_map?
188+
- dtype support (in general all numerical ops should support the common
189+
integer, floating point, and complex dtypes, if they all make sense; we need
190+
to watch out for int32 on GPUs though)
191+
- device support (cuda kernels should be implemented if possible, and similarly
192+
a tf/xla bridge entry should be added if it makes sense)
193+
- attributes versus inputs (anything which can be an input to an operation
194+
should be an input, and attributes should only be used to parametrize the
195+
operation in ways that affect the output dtypes or sometimes shapes)
196+
- state management (is the op stateful? Can it instead be made stateless and
197+
rely on optimizations like memory reuse for performance? Can it be made to
198+
keep its state using one of the existing mechanisms like variables? If not,
199+
its state should be encapsulated using resource handles if at all possible)
200+
- we generally don’t like ops which are supported only on a single device (be
201+
it CPU, GPU, XLA, TPU, etc) and prefer to have at least a plan for writing
202+
device-agnostic code
203+
- should the python layer for this operation support raggedtensor/sparsetensor?
204+
205+
## Experimental APIs
206+
207+
Experimental APIs are APIs which have the word 'experimental' somewhere in their
208+
name; for example `tf.experimental.foo`, or `tf.foo.experimental.Bar`, or
209+
`tf.foo(experimental_bar=True)` or `tf.Foo().experimental_bar()`. We generally
210+
prefer experimental namespaces when possible, so prefer
211+
`tf.foo.experimental.Bar` over `tf.foo.ExperimentalBar`.
212+
213+
Experimental APIs are APIs intended to be added to TensorFlow as-is, but which
214+
we reserve the right to change in backwards-incompatible ways if we have
215+
to. This is different from apis in `tensorflow/addons`, many of which are not
216+
necessarily intended to be added to core TF as they might have a more narrow use
217+
case initially (if APIs in `tensorflow/addons` do become widely useful they can
218+
"graduate" to core, either using experimental or not).
219+
220+
No temporary APIs should be added to experimental (i.e. "we just need this until
221+
certain bugfix or certain new feature becomes available" is not a valid reason
222+
to add an API with experimental in the name.)
223+
224+
No API with known deficiencies should be added to experimental. Experimental
225+
APIs should, to the best of our knowledge, not be expected to change in a known
226+
way (no argument with a known bad name, etc). Experimental can, however, be used
227+
for APIs which are a work-in-progress: it's fine to add experimental methods to
228+
a base class even if those methods are only implemented on some subclasses as
229+
long as we expect all classes to eventually implement those.
230+
231+
The same amount of due diligence required for a real API is required for an
232+
experimental API: this means tests, benchmarks, documentation, end-to-end
233+
examples, etc
234+
235+
Experimental APIs are not a license to break users. This means:
236+
1. we do not remove experimental APIs which are widely used without an effort
237+
to help migrate users away
238+
2. experimental APIs are not removed without warning and don't have
239+
backwards-incompatible changes made to them without warning (the warning can be
240+
a deprecation on version 2.x and removal on 2.x+1, but plain removal on 2.x
241+
with no notice on 2.x-1 is not ok)
242+
243+
Small changes which are mentioned in relnotes and have obvious fixes might be
244+
made (for example if adding a new argument to a long argument list and we
245+
believe there are few pass-by-position users we might allow the new argument to
246+
be added to the middle and not the end of the parameter list).
247+
248+
Large backwards-incompatible changes to experimental APIs still require an
249+
`experimental_foo_v2` or similar backwards-compatible evolution plan to avoid
250+
breaking users of the existing experimental API.
251+
252+
No API endpoint should stay in experimental forever. If a particular
253+
experimental API hasn't had major changes in two minor releases we should remove
254+
the experimental annotation from the API name or delete it. If we do want to
255+
delete it we need to have a deprecation plan that can migrate all users to some
256+
other API endpoint or composition of existing APIs. In rare cases experimental
257+
APIs can continue to be iterated on after many releases (see TPUStrategy); this
258+
only applies for fairly large API surfaces.
259+
260+
When removing the experimental annotation we should, if at all possible, allow
261+
escape routes to not break existing code. This means toplevel symbols
262+
`tf.experimental.foo` and methods like `tf.Class.experimental_foo` should get a
263+
deprecation warning on 2.x before deletion on 2.x+1; we should use the
264+
doc_controls decorators to not pollute API docs with deprecated "graduated"
265+
experimental APIs. For experimental function arguments we should consider
266+
catching `**kwargs` to raise the proper warnings for at least one version (note
267+
though that `**kwargs` is generally discouraged from our APIs; we prefer
268+
explicitly named keyword arguments if at all possible).

0 commit comments

Comments
 (0)