
Commit 83d0b0a

remove history
0 parents  commit 83d0b0a


46 files changed (+2128, -0 lines changed)

Diff for: .gitignore

+39
@@ -0,0 +1,39 @@
./aws/base/.terraform/*
./aws/base/.terraform.lock.hcl
./aws/base/terraform.tfstate
./aws/base/terraform.tfstate.backup
./aws/base_customer_vpc/.terraform/*
./aws/base_customer_vpc/.terraform.lock.hcl
./aws/base_customer_vpc/terraform.tfstate
./aws/base_customer_vpc/terraform.tfstate.backup
./aws/security/.terraform/*
./aws/security/.terraform.lock.hcl
./aws/security/terraform.tfstate
./aws/security/terraform.tfstate.backup
./aws/fs_lakehouse/.terraform/*
./aws/fs_lakehouse/.terraform.lock.hcl
./aws/fs_lakehouse/terraform.tfstate
./aws/fs_lakehouse/terraform.tfstate.backup
./azure/base/.terraform/*
./azure/base/.terraform.lock.hcl
./azure/base/terraform.tfstate
./azure/base/terraform.tfstate.backup
./azure/managed_vnet/.terraform/*
./azure/managed_vnet/.terraform.lock.hcl
./azure/managed_vnet/terraform.tfstate
./azure/managed_vnet/terraform.tfstate.backup
./gcp/.terraform/*
./gcp/.idea/*
./gcp/terraform.tfstate
./gcp/terraform.tfstate.backup
./gcp/.terraform.lock.hcl
./azure/.idea/*
./azure/terraform.tfstate
./azure/terraform.tfstate.backup
./azure/.terraform.lock.hcl

.idea
./.idea
./.idea/*
/azure/managed_vnet/.terraform/
/azure/.terraform/

Diff for: LICENSE

+44
@@ -0,0 +1,44 @@
Databricks FS Lakehouse templates

Copyright (2022) Databricks, Inc.

This library (the "Software") may not be used except in connection with the Licensee's use of the Databricks Platform Services pursuant
to an Agreement (defined below) between Licensee (defined below) and Databricks, Inc. ("Databricks"). The Object Code version of the
Software shall be deemed part of the Downloadable Services under the Agreement, or if the Agreement does not define Downloadable Services,
Subscription Services, or if neither are defined then the term in such Agreement that refers to the applicable Databricks Platform
Services (as defined below) shall be substituted herein for “Downloadable Services.” Licensee's use of the Software must comply at
all times with any restrictions applicable to the Downloadable Services and Subscription Services, generally, and must be used in
accordance with any applicable documentation. For the avoidance of doubt, the Software constitutes Databricks Confidential Information
under the Agreement.

Additionally, and notwithstanding anything in the Agreement to the contrary:
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR
IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
* you may view, make limited copies of, and may compile the Source Code version of the Software into an Object Code version of the
Software. For the avoidance of doubt, you may not make derivative works of the Software (or make any changes to the Source Code
version of the Software unless you have agreed to separate terms with Databricks permitting such modifications (e.g., a contribution license
agreement)).

If you have not agreed to an Agreement or otherwise do not agree to these terms, you may not use the Software or view, copy or compile
the Source Code of the Software.

This license terminates automatically upon the termination of the Agreement or Licensee's breach of these terms. Additionally,
Databricks may terminate this license at any time on notice. Upon termination, you must permanently delete the Software and all
copies thereof (including the Source Code).

Agreement: the agreement between Databricks and Licensee governing the use of the Databricks Platform Services, which shall be, with
respect to Databricks, the Databricks Terms of Service located at www.databricks.com/termsofservice, and with respect to Databricks
Community Edition, the Community Edition Terms of Service located at www.databricks.com/ce-termsofuse, in each case unless Licensee
has entered into a separate written agreement with Databricks governing the use of the applicable Databricks Platform Services.

Databricks Platform Services: the Databricks services or the Databricks Community Edition services, according to where the Software is used.

Licensee: the user of the Software, or, if the Software is being used on behalf of a company, the company.

Object Code: the version of the Software produced when an interpreter or a compiler translates the Source Code into recognizable and
executable machine code.

Source Code: the human readable portion of the Software.

Diff for: NOTICE

+4
@@ -0,0 +1,4 @@
Databricks FS Lakehouse Blueprints
Copyright 2022 Databricks, Inc.

This Software includes software developed at Databricks (https://www.databricks.com/) and its use is subject to the included LICENSE file.

Diff for: README.md

+47
@@ -0,0 +1,47 @@
## Deploy Your Financial Services Lakehouse Architecture

### Purpose:

This set of Terraform templates is designed to get FS practitioners and DevOps teams started quickly, standing up a financial services best-practice environment along with high-value FS-focused libraries, directly in your own cloud account.

<p align="center">
<img src="fs_blueprints.jpg" width="700px"/>
</p>

### Architecture:

### Details on What is Packaged:

What's included in this Terraform package?

1. Hardened Cloud Environment (restricted root bucket) for AWS
2. Basic Example of Permissions using Databricks ACLs and Groups for AWS
3. Pre-installed Libraries for Creating Common Data Models & Time Series Analytics (AWS | Azure | GCP)
4. Example Job with Financial Services Quickstarts (AWS | Azure | GCP)
5. PrivateLink Automation for AWS
6. Customer-Managed VPC with GCP
7. NPIP Architecture & Workspace Creation with Azure

### AWS

3 main modules:

* Workspace from scratch (new)
* Managed VPC - Private Link workspace
* Managed VPC - Pre-installed FS libraries, Groups to protect PII, Private Link

### Azure

* Workspace from scratch (new)
* Managed VNET - No public IPs in VNET with private NSGs

### GCP

* Bring-your-own-VPC configuration with GCP
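
To deploy any of these modules, the standard Terraform workflow applies: populate the module's variables, then run `terraform init` and `terraform apply` from the module directory. As a minimal sketch for the AWS `aws/base` module, a `terraform.tfvars` might look like the following (the variable names come from `aws/base/vars.tf`; all values are illustrative placeholders):

```hcl
# terraform.tfvars -- illustrative placeholder values only
databricks_account_username = "admin@example.com"                    # hypothetical account login
databricks_account_password = "change-me"                            # hypothetical; prefer TF_VAR_* env vars over files
databricks_account_id       = "00000000-0000-0000-0000-000000000000" # hypothetical account ID
region                      = "us-east-1"
cidr_block                  = "10.4.0.0/16"
tags = {
  Environment = "dev"     # hypothetical tag
  Owner       = "fs-team" # hypothetical tag
}
```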

Diff for: aws/base/cross-account-role.tf

+27
@@ -0,0 +1,27 @@
data "databricks_aws_assume_role_policy" "this" {
  external_id = var.databricks_account_id
}

resource "aws_iam_role" "cross_account_role" {
  name               = "${local.prefix}-crossaccount"
  assume_role_policy = data.databricks_aws_assume_role_policy.this.json
  tags               = var.tags
}

data "databricks_aws_crossaccount_policy" "this" {
}

resource "aws_iam_role_policy" "this" {
  name   = "${local.prefix}-policy"
  role   = aws_iam_role.cross_account_role.id
  policy = data.databricks_aws_crossaccount_policy.this.json
}

resource "databricks_mws_credentials" "this" {
  provider         = databricks.mws
  account_id       = var.databricks_account_id
  role_arn         = aws_iam_role.cross_account_role.arn
  credentials_name = "${local.prefix}-creds"
  depends_on       = [aws_iam_role_policy.this]
}

Diff for: aws/base/init.tf

+24
@@ -0,0 +1,24 @@
terraform {
  required_providers {
    databricks = {
      source = "databrickslabs/databricks"
    }
    aws = {
      source  = "hashicorp/aws"
      version = "3.49.0"
    }
  }
}

provider "aws" {
  region = var.region
}

// initialize provider in "MWS" mode to provision a new workspace
provider "databricks" {
  alias    = "mws"
  host     = "https://accounts.cloud.databricks.com"
  username = var.databricks_account_username
  password = var.databricks_account_password
}
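
One note on the provider source above: `databrickslabs/databricks` was the provider's original registry namespace, and newer releases are published under `databricks/databricks`. A version-pinned variant of the `required_providers` block (the version constraint is illustrative, not from the original) might look like:

terraform {
  required_providers {
    databricks = {
      source  = "databricks/databricks" // current registry namespace for the provider
      version = "~> 1.0"                // illustrative pin; choose a release you have tested
    }
  }
}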

Diff for: aws/base/root-bucket.tf

+34
@@ -0,0 +1,34 @@
resource "aws_s3_bucket" "root_storage_bucket" {
  bucket = "${local.prefix}-rootbucket"
  acl    = "private"
  versioning {
    enabled = false
  }
  force_destroy = true
  tags = merge(var.tags, {
    Name = "${local.prefix}-rootbucket"
  })
}

resource "aws_s3_bucket_public_access_block" "root_storage_bucket" {
  bucket             = aws_s3_bucket.root_storage_bucket.id
  ignore_public_acls = true
  depends_on         = [aws_s3_bucket.root_storage_bucket]
}

data "databricks_aws_bucket_policy" "this" {
  bucket = aws_s3_bucket.root_storage_bucket.bucket
}

resource "aws_s3_bucket_policy" "root_bucket_policy" {
  bucket     = aws_s3_bucket.root_storage_bucket.id
  policy     = data.databricks_aws_bucket_policy.this.json
  depends_on = [aws_s3_bucket_public_access_block.root_storage_bucket]
}

resource "databricks_mws_storage_configurations" "this" {
  provider                   = databricks.mws
  account_id                 = var.databricks_account_id
  bucket_name                = aws_s3_bucket.root_storage_bucket.bucket
  storage_configuration_name = "${local.prefix}-storage"
}
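
The public access block above only sets `ignore_public_acls`. For a fully locked-down root bucket, `aws_s3_bucket_public_access_block` supports three further arguments; a stricter variant (a sketch, not part of the original template) would be:

resource "aws_s3_bucket_public_access_block" "root_storage_bucket" {
  bucket                  = aws_s3_bucket.root_storage_bucket.id
  block_public_acls       = true // reject attempts to attach public ACLs
  block_public_policy     = true // reject public bucket policies
  ignore_public_acls      = true // treat any existing public ACLs as private
  restrict_public_buckets = true // limit access to the policy's named principals
  depends_on              = [aws_s3_bucket.root_storage_bucket]
}

The Databricks-generated bucket policy attached above is scoped to a specific principal rather than being public, so these settings should not interfere with it.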

Diff for: aws/base/vars.tf

+37
@@ -0,0 +1,37 @@
variable "databricks_account_username" {}
variable "databricks_account_password" {}
variable "databricks_account_id" {
  type        = string
  description = "Databricks account ID (typically supplied via an environment variable)"
}

variable "tags" {
  default = {}
}

variable "cidr_block" {
  default = "10.4.0.0/16"
}

variable "region" {
  default = "us-east-1"
}

locals {
  region_bucket_policy = (
    replace(var.region, "-", "_")
  )
}

resource "random_string" "naming" {
  special = false
  upper   = false
  length  = 5
}

locals {
  prefix = "fsi-ws"
}
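
One quirk worth flagging: `random_string.naming` is declared above, but `local.prefix` is hardcoded to "fsi-ws", so repeated deployments in the same account would collide on resource names. A variant that folds the random suffix into the prefix (a sketch, not in the original) would be:

locals {
  prefix = "fsi-ws-${random_string.naming.result}" // e.g. "fsi-ws-x7k2q"; unique per deployment
}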

Diff for: aws/base/vpc.tf

+81
@@ -0,0 +1,81 @@
data "aws_availability_zones" "available" {}

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "3.2.0"

  name = local.prefix
  cidr = var.cidr_block
  azs  = data.aws_availability_zones.available.names
  tags = var.tags

  enable_dns_hostnames = true
  enable_nat_gateway   = true
  single_nat_gateway   = true
  create_igw           = true

  public_subnets = [cidrsubnet(var.cidr_block, 3, 0)]
  private_subnets = [cidrsubnet(var.cidr_block, 3, 1),
    cidrsubnet(var.cidr_block, 3, 2)]

  manage_default_security_group = true
  default_security_group_name   = "${local.prefix}-sg"

  default_security_group_egress = [{
    cidr_blocks = "0.0.0.0/0"
  }]

  default_security_group_ingress = [{
    description = "Allow all internal TCP and UDP"
    self        = true
  }]
}

module "vpc_endpoints" {
  source  = "terraform-aws-modules/vpc/aws//modules/vpc-endpoints"
  version = "3.2.0"

  vpc_id             = module.vpc.vpc_id
  security_group_ids = [module.vpc.default_security_group_id]

  endpoints = {
    s3 = {
      service      = "s3"
      service_type = "Gateway"
      route_table_ids = flatten([
        module.vpc.private_route_table_ids,
        module.vpc.public_route_table_ids])
      tags = {
        Name = "${local.prefix}-s3-vpc-endpoint"
      }
    },
    sts = {
      service             = "sts"
      private_dns_enabled = true
      subnet_ids          = module.vpc.private_subnets
      tags = {
        Name = "${local.prefix}-sts-vpc-endpoint"
      }
    },
    kinesis-streams = {
      service             = "kinesis-streams"
      private_dns_enabled = true
      subnet_ids          = module.vpc.private_subnets
      tags = {
        Name = "${local.prefix}-kinesis-vpc-endpoint"
      }
    },
  }

  tags = var.tags
}

resource "databricks_mws_networks" "this" {
  provider           = databricks.mws
  account_id         = var.databricks_account_id
  network_name       = "${local.prefix}-network"
  security_group_ids = [module.vpc.default_security_group_id]
  subnet_ids         = module.vpc.private_subnets
  vpc_id             = module.vpc.vpc_id
}

Diff for: aws/base/workspace.tf

+49
@@ -0,0 +1,49 @@
resource "databricks_mws_workspaces" "this" {
  provider        = databricks.mws
  account_id      = var.databricks_account_id
  aws_region      = var.region
  workspace_name  = local.prefix
  deployment_name = local.prefix

  credentials_id           = databricks_mws_credentials.this.credentials_id
  storage_configuration_id = databricks_mws_storage_configurations.this.storage_configuration_id
  network_id               = databricks_mws_networks.this.network_id

  token {
    comment = "terraform"
  }
}

output "databricks_host" {
  value = databricks_mws_workspaces.this.workspace_url
}

// initialize provider in normal mode
provider "databricks" {
  // in a normal scenario you won't have to give providers aliases
  alias    = "created_workspace"
  host     = databricks_mws_workspaces.this.workspace_url
  username = var.databricks_account_username
  password = var.databricks_account_password
}

// create a PAT token to provision entities within the workspace
resource "databricks_token" "pat" {
  provider         = databricks.created_workspace
  comment          = "Terraform Provisioning"
  lifetime_seconds = 86400
}

// export token for integration tests to run on
output "databricks_token" {
  value     = databricks_token.pat.token_value
  sensitive = true
}

// export workspace ID for integration tests to run on
output "databricks_workspace_id" {
  value = databricks_mws_networks.this.workspace_id
}
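
Downstream configurations (for example, the modules that install the FS libraries) can consume these outputs without re-creating the workspace. A minimal sketch using `terraform_remote_state`, assuming the aws/base state sits in a local file at the path shown (adjust for your actual backend):

// read outputs from the aws/base state (assumed local backend and path)
data "terraform_remote_state" "base" {
  backend = "local"
  config = {
    path = "../base/terraform.tfstate"
  }
}

// authenticate to the new workspace with the exported URL and PAT
provider "databricks" {
  host  = data.terraform_remote_state.base.outputs.databricks_host
  token = data.terraform_remote_state.base.outputs.databricks_token
}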
