This repository was archived by the owner on Aug 9, 2024. It is now read-only.

keras-cv Scoping RFC. #23

Merged 4 commits on Sep 15, 2020

115 changes: 115 additions & 0 deletions rfcs/20200827-keras-cv-scoping-design.md
@@ -0,0 +1,115 @@
# Keras CV

| Status | Proposed |
| :-------------- | :---------------------------------------------------- |
| **Author(s)** | Zhenyu Tan ([email protected]), Francois Chollet ([email protected]) |
| **Updated** | 2020-08-27 |


## Objective

This document describes the scope of the [keras-cv](https://github.com/keras-team/keras-cv) package, especially:
- What use cases `keras-cv` should cover
- Boundaries between `keras-cv` and [TensorFlow Addons](https://github.com/tensorflow/addons)
- Boundaries between `keras-cv` and [TensorFlow model garden](https://github.com/tensorflow/models)
- Boundaries between `keras-cv` and [tf.keras.applications](https://keras.io/api/applications/)

## Motivation

Computer vision (CV) is a major application area for our users.
Keras on its own provides good support for image classification tasks, in particular via `tf.keras.applications`.
However, Keras-native modeling solutions for more advanced tasks,
such as object detection and instance segmentation, are still lacking.

As a result, the open-source community has rolled out many different solutions for these use cases,
made available via PyPI and GitHub. These third-party solutions are not always kept up to date, and
many still rely on the legacy multi-backend Keras. They also raise the issue of API standardization.

To fix this, we want machine learning engineers to have access to a standard Keras-native,
optimized, and well-tested set of components to build their advanced computer vision models.

This provides key user benefits:

- The package would be first-party and thus always up to date with modern best practices.
- High code quality and testing standards and strict quality control: same level of trust as core Keras
- A shared API standard across the community
- Ability for the open-source community to build more advanced solutions *on top* of this package instead of reinventing it

## Design Proposal

`keras-cv` will provide components that cover the following areas:

- Object Detection tasks.
- Instance Segmentation tasks.
- Semantic Segmentation tasks.
- Keypoint Detection tasks.
- Video Classification tasks.
- Object Tracking tasks.

Specifically, for Object Detection tasks, `keras-cv` will include most anchor-based modules:

- Common objects such as an anchor generator and a box matcher.
- Keras layer components such as an ROI generator and an NMS postprocessor.

@seanpmorgan seanpmorgan Sep 1, 2020

Similar question about custom op kernels as in the NLP RFC. The NMS postprocessor has a few custom op implementations in TF core, I believe, but there may be new custom ops needed in a CV repo like this.

Contributor Author

NMS is mostly supported (through 5 different versions). I agree there may be new custom ops needed. One thing we haven't mentioned, which @bhack asked about, is what should go to keras-cv and what should go to TF core. In this specific case, graduating TFA custom ops to TF core, not Keras, seems a better option IMO.

Contributor

@bhack bhack Sep 1, 2020

Main question, probably already raised in another ticket: are these repositories going to host C++ code?
The "Keras world" was historically Python-only.
More generally, see also our old thread about custom ops at tensorflow/addons#1752 (comment)

Contributor

@bhack bhack Sep 1, 2020

I am asking this also from an inference point of view. As we can see from the Google MediaPipe experience of putting end-to-end TF models "in production" on different platforms (e.g. on devices where the Python interpreter is not available), it still requires many C++ calculators, with implementations that sometimes depend on external C++ libraries like OpenCV.
As an example, see the C++ calculator code in the image folder: https://github.com/google/mediapipe/tree/master/mediapipe/calculators/image

Contributor Author

@tanzhenyu tanzhenyu Sep 1, 2020

I think the philosophy here is that we cannot guarantee everything has a TF op. That is not scalable, and not aligned with our future endeavors such as MLIR either. Instead we should allow the compiler to interpret them and break them down into simpler ops.
But this is a good point. Specifically regarding OpenCV, such calls can at least be executed wrapped in tf.numpy_function. For training this is good enough (it is part of data preprocessing, so lacking accelerator support is fine). For serving, I can imagine people have their own solutions for optimization.

Contributor Author

Sounds good. Meanwhile, will this discussion alter the proposal?

Contributor

It is hard to say, because I am not in a position to allocate TF team resources in a specific direction. 😉 But it would be nice if you could find, as a team, an internal consensus and the resources to bootstrap an MVP on this (if it really makes sense to create a CV dialect).

Contributor

@bhack bhack Sep 2, 2020

Also, as you know from some concrete examples we had on different tickets, coordination is hard in the wild with a classical approach.
E.g., just to mention something old and fresh at the same time: tensorflow/addons#914 (comment)

Contributor Author

Is this specifically related to this package being pure Python? (keras_cv is built on top of Keras, IMO.)

Contributor

@bhack bhack Sep 2, 2020

Related to the NMS postprocessor you mentioned and other future candidate operators like it. Being Python-only doesn't solve the runtime question of transformations and compilers.
Where can keras-cv run? Only on targets with a Python interpreter.
IMHO it is a little bit like when we used PIL preprocessing in Keras instead of relying on TF ops as in the new preprocessing. It is not really the case for NMS, because we are now at the v3 implementation in TensorFlow. I picked it only because you mentioned it as an example.

EDIT:
In the TF (MLIR) dialect NMS is at v5 https://www.tensorflow.org/mlir/tf_ops?hl=en#tfnonmaxsuppressionv5_tfnonmaxsuppressionv5op

- Keras backbone components that fill the gap from keras-applications.
- Keras losses and metrics, such as Focal loss and COCO metrics.
- Data loaders and preprocessing for different datasets, such as COCO.
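
To make the anchor-based components above more concrete, here is a minimal pure-Python sketch of a box matcher and a greedy NMS postprocessor. It is only an illustration of the techniques, not the proposed `keras-cv` API: the function names (`iou`, `match_anchors`, `greedy_nms`), the `(y_min, x_min, y_max, x_max)` box layout, and the default thresholds are all assumptions made for this example. A real implementation would operate on batched tensors.

```python
from typing import List, Sequence, Tuple

# Assumed box layout for this sketch: (y_min, x_min, y_max, x_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0.0 else 0.0


def match_anchors(
    anchors: Sequence[Box],
    gt_boxes: Sequence[Box],
    positive_threshold: float = 0.5,  # hypothetical default
) -> List[int]:
    """For each anchor, return the index of its best ground-truth box, or -1."""
    matches = []
    for anchor in anchors:
        best_index, best_iou = -1, 0.0
        for index, gt in enumerate(gt_boxes):
            overlap = iou(anchor, gt)
            if overlap > best_iou:
                best_index, best_iou = index, overlap
        # Anchors whose best overlap is too small are treated as unmatched.
        matches.append(best_index if best_iou >= positive_threshold else -1)
    return matches


def greedy_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,  # hypothetical default
) -> List[int]:
    """Greedy non-max suppression; returns kept indices, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

For example, with boxes `(0, 0, 10, 10)`, `(1, 1, 11, 11)` and `(20, 20, 30, 30)` scored `0.9`, `0.8` and `0.7`, the second box overlaps the first with an IoU of 81/119 and is suppressed, so `greedy_nms` keeps indices `[0, 2]`.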

For Semantic Segmentation tasks, `keras-cv` will include:

- Keras head components such as Atrous Spatial Pyramid Pooling (ASPP).
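
As a rough illustration of the idea behind ASPP, the sketch below applies a 1-D atrous (dilated) convolution at several rates and stacks the results per position. It is a simplification under stated assumptions: the names `atrous_conv1d` and `aspp_1d` are hypothetical, the same kernel is shared across branches, and the signal is 1-D. An actual ASPP head would use separately learned 2-D convolutions per branch. A kernel of size `k` at rate `r` spans `k + (k - 1) * (r - 1)` input samples, which is how larger rates widen the receptive field without adding weights.

```python
from typing import List, Sequence


def atrous_conv1d(
    signal: Sequence[float], kernel: Sequence[float], rate: int
) -> List[float]:
    """Zero-padded 'same' 1-D convolution whose taps are `rate` samples apart."""
    center = len(kernel) // 2
    output = []
    for position in range(len(signal)):
        total = 0.0
        for tap, weight in enumerate(kernel):
            index = position + (tap - center) * rate
            if 0 <= index < len(signal):  # samples outside the signal count as zero
                total += weight * signal[index]
        output.append(total)
    return output


def aspp_1d(
    signal: Sequence[float],
    kernel: Sequence[float],
    rates: Sequence[int] = (1, 2, 4),  # hypothetical default rates
) -> List[List[float]]:
    """Run one branch per dilation rate and stack the branch outputs per position."""
    branches = [atrous_conv1d(signal, kernel, rate) for rate in rates]
    return [list(features) for features in zip(*branches)]
```

Each output position ends up with one feature per rate, which mirrors the way ASPP concatenates multi-scale context along the channel axis.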

### Success criteria for `keras-cv`

- Cover all modeling tasks listed above
- Easy-to-use API
- Models run on CPU/GPU/TPU seamlessly
- State of the art performance
- Models can be readily deployed to production

### Boundaries between keras-cv and keras-applications

- keras-applications will be improved to include basic building blocks, such as the MobileNet bottleneck, that
include feature maps.
- keras-cv will depend on keras-applications for importing backbones.

### Boundaries between keras-cv and TensorFlow Addons

- Highly experimental modeling, layers, losses, etc., live in Addons.
- Components from Addons will graduate to keras-cv once they see sufficient usage
and work on CPU/GPU/TPU. The API interface will remain experimental after graduation.

### Boundaries between keras-cv and Model Garden

- End-to-end modeling workflows and model-specific details live in Model Garden.
- Model Garden will re-use most of the building blocks from keras-cv and TensorFlow Addons.
- Components from Model Garden can graduate to keras-cv once they are widely accepted
and perform well on CPU/GPU/TPU. The API interface should remain stable after graduation.

## Dependencies

- TensorFlow version >= 2.4

Could we elaborate on why 2.4 is the minimum version to utilize this library?

Contributor Author

Same response as NLP scoping doc.

- TensorFlow Datasets
- Keras-applications

## Backwards compatibility

We propose to guarantee major release backwards compatibility.

## Maintenance & development process

The `keras-cv` codebase will be primarily maintained by the Keras team at Google,
with help and contributions from the community. The codebase will be developed
on GitHub as part of the `keras-team` organization. The same process for tracking
issues and reviewing PRs will be used as for the core Keras repository.

## Performance benchmark

We will set up Keras benchmark utilities to help users contribute to this repository.

## Detailed Design

Detailed design will be shared in a separate document (this document only focuses on scope).

## Questions and Discussion Topics

Please share any questions or suggestions.