Provision Snowflake Infrastructure With Terraform

|ZSOMBOR FÖLDESI|

In this post, I want to show you how we provision Snowflake components with Terraform in our dbt projects.

Terraform is an open-source Infrastructure as Code (IaC) tool for managing cloud services with a declarative, version-controlled configuration format. It is cloud-agnostic, so it works with AWS, GCP, Azure, and many other cloud providers. One of its benefits is that you can write maintainable, reusable, modular pieces of code, so it fits really well with the mindset that makes so many of us love using dbt.

In the upcoming paragraphs I will focus on the Snowflake provider.

First, you need to install the Terraform CLI. To do this, follow the instructions here: install-cli

Note that this varies from OS to OS. On macOS, for example, I had to run these commands:

$ brew tap hashicorp/tap
$ brew install hashicorp/tap/terraform

After a successful installation, we need to initialize the Terraform project.
To do this, create a new directory, then create the main.tf file in it with the following content:

main.tf
terraform {
  required_providers {
    snowflake = {
      source  = "chanzuckerberg/snowflake"
      version = "0.25.28"
    }
  }
}

Then run the following command from that directory:

$ terraform init
demo0.gif

The project will consist of the following files:

Without Terraform, the Snowflake dependencies of a project need to be created manually. For example, one thing that you definitely need in the Snowflake tenant of a new project is at least one warehouse.

It looks like this in pure SQL:

CREATE WAREHOUSE TRANSFORMING WITH 
     WAREHOUSE_SIZE = 'XSMALL' 
     WAREHOUSE_TYPE = 'STANDARD' 
     AUTO_SUSPEND = 600 
     AUTO_RESUME = TRUE;

There is nothing fundamentally wrong with this approach, but there is no clear way to manage this kind of code in your dbt project. You can save it to a new repo or to some corner of your project, but there is no trivial way to manage these external dependencies or to make them parameterizable and reusable. Left unmanaged, these external resources can become technical debt.

The first step is to set up the required providers. This is necessary because the technical user on whose behalf we run the commands needs different roles for different operations. (This user must have at least the SYSADMIN and SECURITYADMIN roles in Snowflake.)

Add the following to the contents of main.tf:

provider "snowflake" {
  username = "<snowflake-username>"
  password = "<snowflake-password>"
  account  = "<snowflake-account-name>"
  region   = "<snowflake-region-name>"
  alias    = "sys_admin"
  role     = "SYSADMIN"
}

(Of course, sensitive information will not be stored in main.tf later on.)
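One possible way to keep credentials out of main.tf, and to cover the second role mentioned above, is sketched below. It assumes Terraform 0.14 or newer (for `sensitive`), and the variable names and the `security_admin` alias are hypothetical, chosen only for illustration. It is a sketch of the idea, not necessarily how the final project does it.

```hcl
# variables.tf (sketch): credentials become input variables.
# All variable names here are hypothetical.
variable "snowflake_username" {
  type = string
}

variable "snowflake_password" {
  type      = string
  sensitive = true # redacts the value in plan/apply output
}

variable "snowflake_account" {
  type = string
}

variable "snowflake_region" {
  type = string
}

# main.tf (sketch): one aliased provider configuration per role.
provider "snowflake" {
  alias    = "sys_admin"
  role     = "SYSADMIN"
  username = var.snowflake_username
  password = var.snowflake_password
  account  = var.snowflake_account
  region   = var.snowflake_region
}

# Hypothetical second alias for operations that need SECURITYADMIN.
provider "snowflake" {
  alias    = "security_admin"
  role     = "SECURITYADMIN"
  username = var.snowflake_username
  password = var.snowflake_password
  account  = var.snowflake_account
  region   = var.snowflake_region
}
```

The values can then be supplied outside version control, for example through environment variables such as TF_VAR_snowflake_password, which Terraform maps to the variable of the same name. Note that marking a variable as sensitive only hides it from CLI output; the value can still end up in the state file.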

Creating a Snowflake warehouse in Terraform is as simple as this:

resource "snowflake_warehouse" "w" {
  # use the aliased SYSADMIN provider configured above
  provider       = snowflake.sys_admin
  name           = "test"
  comment        = "foo"
  warehouse_size = "small"
}

It looks something like this:

demo1.gif

We successfully provisioned a Snowflake warehouse from Terraform.
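For comparison, the TRANSFORMING warehouse from the SQL statement earlier could be expressed roughly as follows. This is a sketch: the resource label `transforming` is made up for the example, and WAREHOUSE_TYPE is left out because I am not certain this provider version exposes it (STANDARD is Snowflake's default anyway).

```hcl
# Rough Terraform equivalent of the CREATE WAREHOUSE statement shown earlier.
# The resource label "transforming" is a hypothetical name.
resource "snowflake_warehouse" "transforming" {
  provider       = snowflake.sys_admin
  name           = "TRANSFORMING"
  warehouse_size = "XSMALL"
  auto_suspend   = 600  # seconds of inactivity before suspending
  auto_resume    = true # resume automatically when a query arrives
}
```

Unlike the SQL script, this definition lives in version control next to the rest of the configuration, and running terraform plan shows what would change before anything is applied.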

Now we are going to make the process more sophisticated. We will extract each value into a variable, which helps if you need more than one warehouse (and you probably do).

With the following DRY approach, it’s easy to automate the creation of warehouses in a configurable and reusable way.
You can simply add something like this to your variables.tf:


variables.tf
variable "warehouses" {
  type = map(object({
    name = string
    size = string
  }))
  default = {
    "test01" = {
      name = "TEST_WAREHOUSE"
      size = "small"
    }
    "test02" = {
      name = "TEST_WAREHOUSE_2"
      size = "small"
    }
  }
}

main.tf
resource "snowflake_warehouse" "warehouses" {
  for_each            = var.warehouses
  provider            = snowflake.sys_admin
  name                = upper(each.value.name)
  warehouse_size      = each.value.size
  auto_suspend        = 60
  initially_suspended = true
}

So, as described above, with Terraform we can dynamically generate the necessary resources to provision Snowflake components. I think this is a good way to demonstrate the power of Terraform.
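To illustrate how the `warehouses` variable can be reused, here is a small sketch of overriding its default per environment and exposing the created names. The file names follow common Terraform conventions, while the map keys, warehouse names, and the output name are hypothetical.

```hcl
# terraform.tfvars (sketch): overrides the default value of var.warehouses.
# Keys and names below are made-up examples.
warehouses = {
  "transforming" = { name = "transforming", size = "xsmall" }
  "reporting"    = { name = "reporting", size = "small" }
}

# outputs.tf (sketch): goes in a separate .tf file, not in terraform.tfvars.
# Maps each for_each key to the name of the warehouse that was created.
output "warehouse_names" {
  value = { for key, wh in snowflake_warehouse.warehouses : key => wh.name }
}
```

Because the resource uses for_each, adding or removing an entry in the map only creates or destroys that one warehouse; the others are left untouched.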
 

demo2-min.gif

Here is a high-level view of our final bootstrapping project:

This will create the following objects:

zs1.png

You can find the source code here: link

In this post, I showed you how to provision Snowflake infrastructure with Terraform.
Development of the dbt Cloud Terraform provider started 3 months ago (Oct 3, 2021), so in the future we will be able to put dbt Cloud resources under version control as well.

Author: Zsombor Földesi - Data Engineer
