This guide explains how to contribute to SQLFlow models. For background, see the design doc, Define SQLFlow Models.
- Open the SQLFlow models repo in your web browser, and fork the official repo to your account.
- Clone the forked repo to your host:
    > git clone https://github.com/<Your Github ID>/models.git
- Set up your local Python environment by running
    make setup && source venv/bin/activate
  If you are using PyCharm, you can simply run make setup and then import the models folder as a new project.
- Add a new model definition Python script under the sqlflow_models folder. For example, add a new Python script mydnnclassifier.py:
    `-sqlflow_models
      |- dnnclassifier.py
      `- mydnnclassifier.py
- You can choose whatever name you like for your model. Your model definition should be a Keras subclass model:
    import tensorflow as tf

    class MyDNNClassifier(tf.keras.Model):
        def __init__(self, feature_columns, hidden_units=[10,10], n_classes=2):
            ...
            ...
- Import MyDNNClassifier in sqlflow_models/__init__.py:
    ...
    from .mydnnclassifier import MyDNNClassifier
- You can test your MyDNNClassifier by adding a new Python unit test script, tests/test_mydnnclassifier.py, with the content below, and running it with python tests/test_mydnnclassifier.py:
    from sqlflow_models import MyDNNClassifier
    from tests.base import BaseTestCases

    import tensorflow as tf
    import unittest


    class TestMyDNNClassifier(BaseTestCases.BaseTest):
        def setUp(self):
            self.features = {...}
            self.label = [...]
            feature_columns = [...]
            self.model = MyDNNClassifier(feature_columns=feature_columns)

    if __name__ == '__main__':
        unittest.main()
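The test script above subclasses BaseTestCases.BaseTest. As a rough sketch of how that pattern can work, the following standard-library example nests the shared base test inside a container class, so that unittest does not collect and run the base class on its own. This is an assumption about the intent of the pattern: the real tests/base.py may differ, and FakeModel and test_predict_shape are hypothetical stand-ins that are not part of the SQLFlow models repo.

```python
import unittest


class BaseTestCases:
    """Container class: BaseTest is not a module-level name, so unittest skips it."""

    class BaseTest(unittest.TestCase):
        # Subclasses must set self.features, self.label and self.model in setUp.
        def test_predict_shape(self):
            # Hypothetical shared check: the model returns one prediction per label.
            predictions = self.model.predict(self.features)
            self.assertEqual(len(predictions), len(self.label))


class FakeModel:
    """Hypothetical stand-in for a Keras model, used only in this sketch."""

    def predict(self, features):
        # Return a constant class for every input row.
        n_rows = len(next(iter(features.values())))
        return [0] * n_rows


class TestFakeModel(BaseTestCases.BaseTest):
    def setUp(self):
        self.features = {"sepal_length": [5.1, 4.9, 6.3]}
        self.label = [0, 0, 1]
        self.model = FakeModel()


def run_sketch():
    """Run only the concrete test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFakeModel)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

With this layout, each new model only needs a small subclass that fills in setUp, and the checks defined once in the base class run against it.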
If you have developed a new model, please perform the integration test with the SQLFlow gRPC server to make sure it works well with SQLFlow.
- Launch an SQLFlow all-in-one Docker container:
    > cd ./models
    > docker run --rm -it -v $PWD:/models -e PYTHONPATH=/models -p 8888:8888 sqlflow/sqlflow
- Open a web browser and go to localhost:8888 to access the Jupyter Notebook. Use your custom model by setting the TRAIN parameter of the SQLFlow extended SQL statement to sqlflow_models.MyDNNClassifier:
SELECT * from iris.train
TRAIN sqlflow_models.MyDNNClassifier
WITH n_classes = 3, hidden_units = [10, 20]
COLUMN sepal_length, sepal_width, petal_length, petal_width
LABEL class
INTO sqlflow_models.my_dnn_model;
- When you need to update the model and test again, just modify the model Python file on your host, then run the SQL statement in the notebook one more time.
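When you rerun the TRAIN statement with different model parameters, it can help to assemble the statement text from its parts. The sketch below is only an illustration: build_train_statement is a hypothetical helper, not part of SQLFlow, and it only produces a string that you can paste into the notebook.

```python
def build_train_statement(select, model, attributes, columns, label, into):
    """Assemble a SQLFlow extended SQL TRAIN statement as plain text.

    Hypothetical helper for illustration; it performs no validation or escaping.
    """
    with_clause = ", ".join(f"{key} = {value}" for key, value in attributes.items())
    column_clause = ", ".join(columns)
    return "\n".join([
        select,
        f"TRAIN {model}",
        f"WITH {with_clause}",
        f"COLUMN {column_clause}",
        f"LABEL {label}",
        f"INTO {into};",
    ])


statement = build_train_statement(
    select="SELECT * from iris.train",
    model="sqlflow_models.MyDNNClassifier",
    attributes={"n_classes": 3, "hidden_units": [10, 20]},
    columns=["sepal_length", "sepal_width", "petal_length", "petal_width"],
    label="class",
    into="sqlflow_models.my_dnn_model",
)
print(statement)
```

Called with the values from the example above, the helper reproduces the same six-line statement, so changing hidden_units or n_classes only requires editing the attributes dictionary.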
Once you have tested your code, please create a pull request and invite other developers to review it. If one of the developers approves your pull request, you can merge it into the develop branch.
Travis CI builds the SQLFlow all-in-one Docker image with the latest models code every night and pushes it to Docker Hub with the tag sqlflow/sqlflow:nightly, so you can find the latest models in it the next day.