# Azure Service Data Integration with MLW Template #405

Open · wants to merge 1 commit into `master`
11 changes: 1 addition & 10 deletions infra/modules/providers/azure/data-factory/README.md
@@ -17,15 +17,13 @@ An instance of the `data-factory` module deploys the _**Data Factory**_ in order
- Ability to provision a single Data Factory instance
- Ability to provision a configurable Pipeline
- Ability to configure Trigger
- Ability to configure SQL server Dataset
- Ability to configure SQL server Linked Service


## Out Of Scope

The following are not supported at this time:

- Creating Multiple pipelines
- Only SQL server Dataset/Linked Service are implemented.

## Definition

@@ -35,8 +33,6 @@ Terraform resources used to define the `data-factory` module include the following:
- [azurerm_data_factory_integration_runtime_managed](https://www.terraform.io/docs/providers/azurerm/r/data_factory_integration_runtime_managed.html)
- [azurerm_data_factory_pipeline](https://www.terraform.io/docs/providers/azurerm/r/data_factory_pipeline.html)
- [azurerm_data_factory_trigger_schedule](https://www.terraform.io/docs/providers/azurerm/r/data_factory_trigger_schedule.html)
- [azurerm_data_factory_dataset_sql_server](https://www.terraform.io/docs/providers/azurerm/r/data_factory_dataset_sql_server_table.html)
- [azurerm_data_factory_linked_service_sql_server](https://www.terraform.io/docs/providers/azurerm/r/data_factory_linked_service_sql_server.html)

## Usage

@@ -60,11 +56,6 @@ module "data_factory" {
data_factory_trigger_name = "adftrigger"
data_factory_trigger_interval = 1
data_factory_trigger_frequency = "Minute"
data_factory_dataset_sql_name = "adfsqldataset"
data_factory_dataset_sql_table_name = "adfsqldatasettable"
data_factory_dataset_sql_folder = ""
data_factory_linked_sql_name = "adfsqllinked"
data_factory_linked_sql_connection_string = "Server=tcp:adfsql..."
}
```
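Since this change removes the SQL Server dataset and linked service from the module, a consumer who still needs them can declare the equivalent `azurerm` resources alongside the module call. A minimal sketch, assuming the resource group and factory names match the module's inputs (the names below reuse the values deleted from this usage block; the connection string is a placeholder):

```HCL
# Sketch only: resource group and factory names are assumed to match the
# module's inputs; the connection string is a placeholder.
resource "azurerm_data_factory_linked_service_sql_server" "main" {
  name                = "adfsqllinked"
  resource_group_name = "adf-resource-group" # assumed
  data_factory_name   = "adf"                # assumed
  connection_string   = "Server=tcp:adfsql..."
}

resource "azurerm_data_factory_dataset_sql_server_table" "main" {
  name                = "adfsqldataset"
  resource_group_name = "adf-resource-group" # assumed
  data_factory_name   = "adf"                # assumed
  linked_service_name = azurerm_data_factory_linked_service_sql_server.main.name
  table_name          = "adfsqldatasettable"
}
```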

8 changes: 0 additions & 8 deletions infra/modules/providers/azure/data-factory/datasets.tf

This file was deleted.

This file was deleted.

10 changes: 0 additions & 10 deletions infra/modules/providers/azure/data-factory/output.tf
@@ -28,16 +28,6 @@ output "trigger_interval" {
value = azurerm_data_factory_trigger_schedule.main.interval
}

output "sql_dataset_id" {
description = "The ID of the SQL server dataset created"
value = azurerm_data_factory_dataset_sql_server_table.main.id
}

output "sql_linked_service_id" {
description = "The ID of the SQL server Linked service created"
value = azurerm_data_factory_linked_service_sql_server.main.id
}

output "adf_identity_principal_id" {
description = "The ID of the principal(client) in Azure active directory"
value = azurerm_data_factory.main.identity[0].principal_id
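With the SQL outputs removed, `adf_identity_principal_id` remains the module's main integration point. As a sketch of how a consumer might use it — assuming the module instance is named `data_factory` as in the README usage block, and a hypothetical storage account as the target — the factory's managed identity can be granted data-plane access with a role assignment:

```HCL
# Sketch only: the storage account is hypothetical; the output name comes
# from this module's output.tf.
resource "azurerm_role_assignment" "adf_blob_access" {
  scope                = azurerm_storage_account.example.id # hypothetical target
  role_definition_name = "Storage Blob Data Reader"
  principal_id         = module.data_factory.adf_identity_principal_id
}
```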
@@ -2,10 +2,7 @@ resource_group_name = ""
data_factory_name = ""
data_factory_runtime_name = ""
data_factory_pipeline_name = ""
data_factory_dataset_sql_name = ""
data_factory_dataset_sql_table_name = ""
data_factory_linked_sql_name = ""
data_factory_linked_sql_connection_string = ""
data_factory_trigger_name = ""
vnet_integration = {
vnet_id = ""
subnet_name = ""
@@ -25,16 +25,6 @@ func TestDataFactory(t *testing.T) {
"data_factory_name",
"pipeline_name",
),
VerifyCreatedDataset(subscription,
"resource_group_name",
"data_factory_name",
"sql_dataset_id",
),
VerifyCreatedLinkedService(subscription,
"resource_group_name",
"data_factory_name",
"sql_linked_service_id",
),
},
}
integration.RunIntegrationTests(&testFixture)
18 changes: 7 additions & 11 deletions infra/modules/providers/azure/data-factory/tests/test.tfvars
@@ -1,13 +1,9 @@
resource_group_name = "adftest"
data_factory_name = "adftest"
data_factory_runtime_name = "adfrttest"
data_factory_pipeline_name = "testpipeline"
data_factory_trigger_name = "testtrigger"
data_factory_dataset_sql_name = "testsql"
data_factory_dataset_sql_table_name = "adfsqltableheba"
data_factory_linked_sql_name = "testlinkedsql"
data_factory_linked_sql_connection_string = "connectionstring"
resource_group_name = ""
data_factory_name = ""
data_factory_runtime_name = ""
data_factory_pipeline_name = ""
data_factory_trigger_name = ""
vnet_integration = {
vnet_id = "/subscriptions/resourceGroups/providers/Microsoft.Network/virtualNetworks/testvnet"
subnet_name = "default"
vnet_id = ""
subnet_name = ""
}
@@ -1,24 +1,11 @@
package unit

import (
"encoding/json"
"strings"
"testing"

"github.com/gruntwork-io/terratest/modules/random"
tests "github.com/microsoft/cobalt/infra/modules/providers/azure/data-factory/tests"
"github.com/microsoft/terratest-abstraction/unit"
)

// helper function to parse blocks of JSON into a generic Go map
func asMap(t *testing.T, jsonString string) map[string]interface{} {
var theMap map[string]interface{}
if err := json.Unmarshal([]byte(jsonString), &theMap); err != nil {
t.Fatal(err)
}
return theMap
}

func TestTemplate(t *testing.T) {

expectedDataFactory := map[string]interface{}{
@@ -53,27 +40,16 @@ func TestTemplate(t *testing.T) {
"frequency": "Minute",
}

expectedDatasetSQL := map[string]interface{}{
"name": "testsql",
}

expectedLinkedSQL := map[string]interface{}{
"name": "testlinkedsql",
"connection_string": "connectionstring",
}

testFixture := unit.UnitTestFixture{
GoTest: t,
TfOptions: tests.DataFactoryTFOptions,
PlanAssertions: nil,
ExpectedResourceCount: 6,
ExpectedResourceCount: 4,
ExpectedResourceAttributeValues: unit.ResourceDescription{
"azurerm_data_factory.main": expectedDataFactory,
"azurerm_data_factory_integration_runtime_managed.main": expectedDFIntRunTime,
"azurerm_data_factory_pipeline.main": expectedPipeline,
"azurerm_data_factory_trigger_schedule.main": expectedTrigger,
"azurerm_data_factory_dataset_sql_server_table.main": expectedDatasetSQL,
"azurerm_data_factory_linked_service_sql_server.main": expectedLinkedSQL,
},
}

30 changes: 0 additions & 30 deletions infra/modules/providers/azure/data-factory/variables.tf
@@ -68,34 +68,4 @@ variable "data_factory_trigger_frequency" {
description = "The trigger frequency. Valid values include Minute, Hour, Day, Week, Month. Defaults to Minute."
type = string
default = "Minute"
}

variable "data_factory_dataset_sql_name" {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the interface to this module is changing. Why? Is that needed for this PR?

description = "Specifies the name of the Data Factory Dataset SQL Server Table. Only letters, numbers and '_' are allowed."
type = string
default = ""
}

variable "data_factory_dataset_sql_table_name" {
description = "The table name of the Data Factory Dataset SQL Server Table."
type = string
default = ""
}

variable "data_factory_dataset_sql_folder" {
description = "The folder that this Dataset is in. If not specified, the Dataset will appear at the root level."
type = string
default = ""
}

variable "data_factory_linked_sql_name" {
description = "Specifies the name of the Data Factory Linked Service SQL Server. Changing this forces a new resource to be created."
type = string
default = ""
}

variable "data_factory_linked_sql_connection_string" {
description = "The connection string in which to authenticate with the SQL Server."
type = string
default = ""
}
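One way a consumer might replace the removed `data_factory_linked_sql_connection_string` input is to keep the secret out of plain variables entirely. A sketch, assuming a pre-existing Key Vault secret and hypothetical variable names:

```HCL
# Sketch only: the secret name and variables are hypothetical.
data "azurerm_key_vault_secret" "sql_connection" {
  name         = "adf-sql-connection-string" # hypothetical secret
  key_vault_id = var.key_vault_id            # hypothetical variable
}

resource "azurerm_data_factory_linked_service_sql_server" "main" {
  name                = "adfsqllinked"
  resource_group_name = var.resource_group_name # hypothetical variable
  data_factory_name   = var.data_factory_name   # hypothetical variable
  connection_string   = data.azurerm_key_vault_secret.sql_connection.value
}
```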
12 changes: 12 additions & 0 deletions infra/templates/az-svc-data-integration-mlw/.env.template
@@ -0,0 +1,12 @@
export ARM_ACCESS_KEY=
export ARM_CLIENT_ID=
export ARM_CLIENT_SECRET=
export ARM_SUBSCRIPTION_ID=
export ARM_TENANT_ID=
export BUILD_BUILDID=1
export GO_VERSION=1.12.5
export TF_VAR_remote_state_account=
export TF_VAR_remote_state_container=
export TF_VERSION=0.12.4
export TF_WARN_OUTPUT_ERRORS=1
export TF_VAR_resource_group_location=eastus
139 changes: 139 additions & 0 deletions infra/templates/az-svc-data-integration-mlw/README.md
@@ -0,0 +1,139 @@
# Azure Application Services

> **Review comment:** Is this README correct? It looks the same as some of the app service environments, but I don't see a reference to any data factory integration in the document.


The `az-svc-data-integration-mlw` template is intended to be a reference for running a set of app services.


## Use-Case

This particular template creates an Azure environment with a small set of fully managed microservices.


## Scenarios this template should avoid

This template is an adequate solution where the service count is less than 10. For Azure customers interested in provisioning more than 10 services, we recommend AKS: with Kubernetes you can maximize cluster node CPU cores, which helps minimize cloud resourcing costs.

## Technical Design
Template design [specifications](docs/design/README.md).

## Architecture
![Template Topology](docs/design/images/deployment-topology.jpg "Template Topology")


## Prerequisites

1. Azure Subscription
2. An available Service Principal with API permissions granted admin consent within Azure app registration. The required Azure Active Directory Graph app role is `Application.ReadWrite.OwnedBy`

![image](https://user-images.githubusercontent.com/7635865/71312782-d9b91800-23f4-11ea-80ee-cc646f1c74be.png)

3. Terraform and Go are locally installed
4. Azure Storage Account is [setup](https://docs.microsoft.com/en-us/azure/terraform/terraform-backend) to store Terraform state
5. Set up your Local environment variables by creating a `.env` file that contains the following information:

```
ARM_SUBSCRIPTION_ID="<az-service-principal-subscription-id>"
ARM_CLIENT_ID="<az-service-principal-client-id>"
ARM_CLIENT_SECRET="<az-service-principal-auth-secret>"
ARM_TENANT_ID="<az-service-principal-tenant>"
ARM_ACCESS_KEY="<remote-state-storage-account-primary-key>"
TF_VAR_remote_state_account="<tf-remote-state-storage-account-name>"
TF_VAR_remote_state_container="<tf-remote-state-storage-container-name>"
```

## Cost

Azure environment cost ballpark [estimate](https://azure.com/e/92b05a7cd1e646368ab74772e3122500). This is subject to change and is driven from the resource pricing tiers configured when the template is deployed.

## Deployment Steps

1. Execute the following commands to set up your local environment variables:

*Note for Windows users using WSL*: We recommend running the dos2unix utility on the environment file (`dos2unix .env`) before sourcing your environment variables, to strip trailing newline and carriage-return characters.

```bash
# these commands setup all the environment variables needed to run this template
DOT_ENV=<path to your .env file>
export $(cat $DOT_ENV | xargs)
```

2. Execute the following command to configure your local Azure CLI.

```bash
# This logs your local Azure CLI in using the configured service principal.
az login --service-principal -u $ARM_CLIENT_ID -p $ARM_CLIENT_SECRET --tenant $ARM_TENANT_ID
```

3. Navigate to the `terraform.tfvars` file. Here's a sample of the `terraform.tfvars` file for this template.

```HCL
resource_group_location = "centralus"
prefix = "test-services"

# Targets that will be configured to also setup AuthN with Easy Auth
app_services = [
{
app_name = "tf-test-svc-1"
image = null
app_settings = {
"one_sweet_app_setting" = "brilliant"
}
},
{
app_name = "tf-test-svc-2"
image = null
app_settings = {
"another_sweet_svc_app_setting" = "ok"
}
}
]
```

4. Execute the following commands to set up your terraform workspace.

```bash
# This configures terraform to leverage a remote backend that will help you and your
# team keep consistent state
terraform init -backend-config "storage_account_name=${TF_VAR_remote_state_account}" -backend-config "container_name=${TF_VAR_remote_state_container}"

# This command configures terraform to use a workspace unique to you. This allows you to work
# without stepping over your teammate's deployments
TF_WORKSPACE="az-micro-svc-$USER"
terraform workspace new $TF_WORKSPACE || terraform workspace select $TF_WORKSPACE
```

5. Execute the following commands to orchestrate a deployment.

```bash
# See what terraform will try to deploy without actually deploying
terraform plan

# Execute a deployment
terraform apply
```

6. Optionally execute the following command to teardown your deployment and delete your resources.

```bash
# Destroy resources and tear down deployment. Only do this if you want to destroy your deployment.
terraform destroy
```

## Automated Testing

### Unit Testing

Navigate to the template folder `infra/templates/az-svc-data-integration-mlw`. Unit tests can be run using the following command:

```bash
go test -v $(go list ./... | grep "unit")
```

### Integration Testing

Please confirm that you've completed the `terraform apply` step before running the integration tests as we're validating the active terraform workspace.

Integration tests can be run using the following command:

```bash
go test -v $(go list ./... | grep "integration")
```