Provide python or CL interface to generate StudyJob yaml and/or StudyJob #240

Closed
cwbeitel opened this issue Nov 11, 2018 · 5 comments

@cwbeitel

It would be convenient if users could define and create a Katib StudyJob YAML from a Python interface. It would also be nice to be able to submit the job and poll for its status from Python.
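As a rough illustration of the first half of this request, a StudyJob manifest could be assembled as a plain Python dict and serialized. This is only a sketch: the helper `make_studyjob` is hypothetical, the field names are assumed from the v1alpha1 StudyJob examples and should be checked against the CRD in use, and the `workerSpec` (the trial pod template) is left out. The output is JSON, which is also valid YAML, so only the standard library is needed.

```python
import json


def make_studyjob(name, params, objective="Validation-accuracy",
                  algorithm="random", namespace="kubeflow"):
    """Build a StudyJob manifest as a plain dict (hypothetical helper).

    params maps a command-line flag to a (min, max) range,
    e.g. {"--lr": (0.1, 0.3)}.
    Field names are assumed from the v1alpha1 StudyJob examples.
    """
    return {
        "apiVersion": "kubeflow.org/v1alpha1",
        "kind": "StudyJob",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "studyName": name,
            "owner": "crd",
            "optimizationtype": "maximize",
            "objectivevaluename": objective,
            # One entry per tunable flag; bounds are written as strings.
            "parameterconfigs": [
                {
                    "name": flag,
                    "parametertype": "double",
                    "feasible": {"min": str(low), "max": str(high)},
                }
                for flag, (low, high) in params.items()
            ],
            "suggestionSpec": {"suggestionAlgorithm": algorithm},
            # workerSpec (the template for trial pods) is omitted here.
        },
    }


manifest = make_studyjob("mnist-lr", {"--lr": (0.1, 0.3)})
print(json.dumps(manifest, indent=2))
```

Keeping the manifest as a dict until the last moment means the same object could be written to a file or handed directly to an API client.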

In the case of kubeflow/examples#322 it looks like we can easily launch a kubeflow/pipelines pipeline for a single train_mnist call. But when we then want to tune the same training with Katib, a YAML file needs to be written by hand.

  • One approach would be to enable Katib jobs to be configured and launched from a CLI. This could then be wrapped with a kfp ContainerOp (roughly as below) and thereby made more testable and easier to include in a broader pipeline.
  • Another would be to enable Katib jobs to be triggered from a Python call. This could also be wrapped in a container op, but I'm guessing doing it this way would allow the result of the op to be more readily consumed.
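import kfp.dsl as dsl

def training_op(min_learningrate, max_learningrate, step_name='katib-studyjob'):
  # Wraps the proposed katib-studyjob-launcher CLI (which does not exist yet) in a pipeline step.
  # step_name was undefined in the original sketch; the default here is a placeholder.
  return dsl.ContainerOp(
    name=step_name,
    image='katib/mxnet-mnist-example',
    command=['katib-studyjob-launcher'],
    arguments=[
      '--cmd', 'python', '/mxnet/example/image-classification/train_mnist.py',
      # Hyperparameter to tune: flag name followed by its lower and upper bound.
      '--hparam="--lr","%s","%s"' % (min_learningrate, max_learningrate),
      # Everything after '--' is passed through to the training script unchanged.
      '--',
      '--batch-size', '64',
      # ... (remaining arguments elided)
    ]
  )

@dsl.pipeline(
  name='KatibStudyJob',
)
def kubeflow_training(
  min_learningrate: dsl.PipelineParam = dsl.PipelineParam(name='min_learningrate', value=0.1),
  max_learningrate: dsl.PipelineParam = dsl.PipelineParam(name='max_learningrate', value=0.3),
  # ... (remaining parameters elided)
):
  training = training_op(min_learningrate, max_learningrate)  # ... (remaining arguments elided)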

/cc @jlewi @texasmichelle

@janvdvegt
Contributor

What about submitting a StudyJob directly via the Kubernetes API in Python? That's what I was already working on for my own project, so we could fire off experiments via our GUI. The only dependency it creates is the Python kubernetes API package, and in my case I also need to create a Role and RoleBinding to allow the Pods that are running the Python package to interact with StudyJobs. I'm open to contributing this if it is something you are interested in.
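A minimal sketch of what such a submission and polling loop could look like is given below. It relies on `create_namespaced_custom_object` and `get_namespaced_custom_object` from the kubernetes package's `CustomObjectsApi`. Everything else is an assumption for illustration: the helper names, the `kubeflow.org` / `v1alpha1` / `studyjobs` coordinates, and the idea that the job reports `Completed` or `Failed` in `status.condition`. The API object is passed in rather than created inside the helpers, so they can be exercised with a stand-in.

```python
import time

# Assumed coordinates of the StudyJob custom resource.
GROUP, VERSION, PLURAL = "kubeflow.org", "v1alpha1", "studyjobs"


def submit_studyjob(api, body, namespace="kubeflow"):
    """Create a StudyJob from a manifest dict via a CustomObjectsApi-like object."""
    return api.create_namespaced_custom_object(
        group=GROUP, version=VERSION, namespace=namespace,
        plural=PLURAL, body=body,
    )


def wait_for_studyjob(api, name, namespace="kubeflow",
                      interval=10.0, timeout=3600.0, sleep=time.sleep):
    """Poll until the StudyJob reports a terminal condition, then return it.

    Assumes the controller writes "Completed" or "Failed" to status.condition.
    Raises TimeoutError if neither appears within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        obj = api.get_namespaced_custom_object(
            group=GROUP, version=VERSION, namespace=namespace,
            plural=PLURAL, name=name,
        )
        condition = obj.get("status", {}).get("condition")
        if condition in ("Completed", "Failed"):
            return condition
        if time.monotonic() >= deadline:
            raise TimeoutError("StudyJob %s did not finish in time" % name)
        sleep(interval)


# Inside a cluster, the real client would be wired in roughly like this:
#   from kubernetes import client, config
#   config.load_incluster_config()
#   api = client.CustomObjectsApi()
#   submit_studyjob(api, manifest)
#   print(wait_for_studyjob(api, manifest["metadata"]["name"]))
```

The Role mentioned above would need to grant at least the create and get verbs on the studyjobs resource for these two calls to succeed.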

@YujiOshima
Contributor

@janvdvegt That's great! I'm very interested in your project, and contributions are very welcome!
We could provide two levels of interface:
A high-level interface that manages StudyJobs, as in your project.
A low-level interface that calls the Katib API directly.

@andreyvelich
Member

We have created a Python SDK for Katib to run Experiments.

@issue-label-bot

Issue-Label Bot is automatically applying the labels:

Label Probability
kind/feature 0.98
