allegro/bigflow

BigFlow

Documentation

  1. What is BigFlow?
  2. Getting started
  3. Installing BigFlow
  4. Help me
  5. BigFlow tutorial
  6. CLI
  7. Configuration
  8. Project structure and build
  9. Deployment
  10. Workflow & Job
  11. Starter
  12. Technologies
  13. Development

Cookbook

What is BigFlow?

BigFlow is a Python framework for data processing pipelines on GCP.

The main features are:

Getting started

Start by installing BigFlow on your local machine. Next, go through the BigFlow tutorial.

Installing BigFlow

Prerequisites. Before you start, make sure you have the following software installed:

  1. Python = 3.8
  2. Google Cloud SDK
  3. Docker Engine
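
As a quick sanity check before installing, the prerequisites above can be looked up on your PATH. The following is a minimal sketch, not part of BigFlow: the function name `check_prerequisites` is hypothetical, and it assumes that Google Cloud SDK and Docker Engine expose the `gcloud` and `docker` executables, which are the commands used later in this guide.

```python
import shutil
import sys

# Assumption: the prerequisites install these executables on PATH.
REQUIRED_TOOLS = ("gcloud", "docker")


def check_prerequisites(tools=REQUIRED_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    # Report the running interpreter so it can be compared with the listed version.
    print("Python {}.{}".format(*sys.version_info[:2]))
    for tool, path in check_prerequisites().items():
        print("{}: {}".format(tool, path if path else "NOT FOUND"))
```

A tool reported as NOT FOUND is either not installed or not on the PATH of the current shell.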

You can install the bigflow package globally, but we recommend installing it locally with venv, in your project's folder:

python -m venv .bigflow_env
source .bigflow_env/bin/activate

Install the bigflow PIP package:

pip install 'bigflow[bigquery,dataflow]'

Test it:

bigflow -h

Read more about the BigFlow CLI.

To interact with GCP you need to set a default project and log in:

gcloud config set project <your-gcp-project-id>
gcloud auth application-default login

Finally, check that Docker is running:

docker info
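
The non-interactive checks from this section (`bigflow -h` and `docker info`) can also be run in one go. The script below is a rough sketch under stated assumptions: `command_succeeds` is a hypothetical helper, not a BigFlow API, and it treats a zero exit code as success. It deliberately skips `gcloud auth application-default login`, which opens an interactive login flow.

```python
import subprocess
import sys

# Commands taken from this guide; both are expected to exit with code 0
# when the setup is healthy (an assumption of this sketch).
SETUP_CHECKS = (
    ["bigflow", "-h"],
    ["docker", "info"],
)


def command_succeeds(args, timeout=30):
    """Return True if the command runs and exits with code 0."""
    try:
        completed = subprocess.run(
            args,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # The executable is missing from PATH, or it hung past the timeout.
        return False
    return completed.returncode == 0


if __name__ == "__main__":
    for args in SETUP_CHECKS:
        status = "ok" if command_succeeds(args) else "FAILED"
        print("{}: {}".format(" ".join(args), status))
```

Run it inside the activated `.bigflow_env` environment so that the `bigflow` command resolves to the package you just installed.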

Help me

You can ask questions on our Gitter channel or on Stack Overflow.