|ZSOMBOR FÖLDESI|

Level Up Your CI Pipelines with Dagger

Streamline your CI pipelines with Dagger. Write portable pipelines in your favorite language and run them in any CI/CD environment.

As an Analytics Engineer, your development workflow likely involves various tools for linting models, testing Looker views, diffing datasets, and so on. However, managing all these tools can be time-consuming and challenging.

This post explores how Dagger helps you build better, custom CI pipelines that fit together like Lego pieces.

A typical Analytics Engineer's development workflow consists of many tools. 

We use sqlfluff to lint the models, git hooks to raise issues automatically, Spectacles for testing Looker, and the Datafold CLI to compare datasets. This list varies based on the different business needs/environments.

Often you want to execute these processes: 

pasted image 0.png

Tools tend to use different interfaces for invocation. You can use some of them as git hooks and others through fancy CLIs, but setting up and managing all of this can be time-consuming and challenging for less tech-savvy Analytics Engineers.


Previously, I wrapped most of these as GitHub Actions. If you're lucky, you can find GitHub Actions provided by the vendor or the community that you can plug and play, but in many cases, you need to create your own workflow. This setup works well, but not all CI tools have a community as thriving as GitHub's.


Once the action was properly set up, I used a tool called act to execute the CI pipelines locally. 

I liked this process because it let us evaluate changes faster, but it was still painful to reproduce these actions on different CI runners (e.g., when moving from GitHub Actions to GitLab CI).

render1679479519681 (1).gif

In the following paragraphs, I will show you how easy it is to solve these problems using Dagger.

So, what on earth is Dagger, and why should you care?

Dagger is an open-source programmable CI/CD engine created by Solomon Hykes, the founder of Docker. It makes it easy to develop portable CI pipelines in your favorite programming language that execute entirely in standard OCI containers...

CI Pipelines with Dagger explained in a picture
‘Dagger does not replace your CI: it improves it by adding a portable development layer on top of it.’ - Image source: Dagger

CI/CD as Code

Unlike conventional CI/CD tools, Dagger lets you write a pipeline as code. Instead of writing proprietary YAML, you can write your CI/CD code in CUE, Go, Python, or Node.js. This approach makes it easy to create dynamic pipelines and enables you to test your CI/CD just like any other project element.
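To make the "dynamic and testable" point a bit more concrete, here is a minimal sketch in plain Python. It does not use Dagger at all; it only shows how pipeline steps can be generated from data and checked with an ordinary unit test. The helper name `lint_commands` and the directory names are hypothetical, and the `--check` flag assumes the sqlfmt CLI.

```python
from typing import List


def lint_commands(model_dirs: List[str], check_only: bool = True) -> List[List[str]]:
    """Return one sqlfmt command per directory (hypothetical helper)."""
    commands = []
    # De-duplicate the input and keep a stable, sorted order.
    for path in sorted(set(model_dirs)):
        cmd = ["sqlfmt", path]
        if check_only:
            # Assumed sqlfmt flag: fail on unformatted files without rewriting them.
            cmd.append("--check")
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    for command in lint_commands(["models/staging", "models/marts"]):
        print(" ".join(command))
```

Because the steps are just lists of strings, the same function can feed a container step in a pipeline and be covered by a regular test suite.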

Run Anywhere

You can test and debug instantly on your local machine. There is no need to push your changes to trigger the CI pipeline: you can run it anytime in your own isolated environment. This gives you instant feedback on the impact of your changes.

Dagger executes your pipelines entirely as standard OCI containers, so it is compatible with most CI/CD runtime environments. As a result, you can execute the same CI process everywhere.

pasted image 0 (1).png

Performance

Caching across pipeline runs is one of Dagger's most potent but often overlooked features.

You can designate one or more directories as cache volumes in your pipeline, and their contents will persist across runs. Each pipeline run can then reuse the cached contents, which speeds up pipeline operations.
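As a rough sketch of what this can look like with the Python SDK, the helper below mounts a named cache volume at pip's cache directory. It assumes the SDK's `cache_volume` and `with_mounted_cache` methods, a container running as root (hence `/root/.cache/pip`), and a connected Dagger client; the function name `with_pip_cache` and the key `pip-cache` are made up for illustration.

```python
def with_pip_cache(client, container, key: str = "pip-cache"):
    """Return `container` with a persistent cache volume mounted for pip.

    `client` is assumed to be a connected Dagger client and `container`
    a Dagger container; both names are illustrative.
    """
    # Reusing the same key on a later run is what lets the volume be reused.
    cache = client.cache_volume(key)
    # Assumed path: pip's default cache directory for the root user on Linux.
    return container.with_mounted_cache("/root/.cache/pip", cache)
```

Any `pip install` step added after this call can then reuse packages downloaded in an earlier run instead of fetching them again.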

Demo - How to streamline your CI pipelines with Dagger

To demonstrate how easy it is to set up a CI process, let's write our first pipeline with Dagger. 

Our dummy pipeline will:

- initialize the Dagger client

- mount the dummy project's directory into the container

- install some Python dependencies

- run the linter by executing the sqlfmt command

pasted image 0 (2).png
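For readers who cannot see the screenshot, the steps listed above can be sketched roughly as follows. This is not the original code from the image: it is a minimal sketch that assumes the Dagger Python SDK's `dagger.Connection` API as it existed around the time of writing (newer releases may differ), a `python:3.11-slim` base image, and the `shandy-sqlfmt` package as the source of the `sqlfmt` command. The function names are hypothetical.

```python
import sys


def build_lint_container(client, project_dir: str = "."):
    """Chain the container steps; `client` is a connected Dagger client."""
    src = client.host().directory(project_dir)  # host directory to mount
    return (
        client.container()
        .from_("python:3.11-slim")  # assumed base image
        .with_mounted_directory("/src", src)  # mount the project into the container
        .with_workdir("/src")
        .with_exec(["pip", "install", "shandy-sqlfmt[jinjafmt]"])  # install dependencies
        .with_exec(["sqlfmt", ".", "--check"])  # run the linter
    )


async def main() -> None:
    # Imported lazily so the builder above can be tested without the SDK installed.
    import dagger

    config = dagger.Config(log_output=sys.stderr)
    # Initialize the Dagger client.
    async with dagger.Connection(config) as client:
        # Awaiting stdout() is what actually triggers execution of the steps.
        output = await build_lint_container(client).stdout()
    print(output)
```

To run it locally, you would call `anyio.run(main)` (assuming the `anyio` package that the SDK depends on) on a machine where Docker is available. Keeping the step chain in its own function means the same definition can be reused from any CI runner.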

It's as simple as that: we created a universal CI pipeline that can be executed on different CI runners.

You can execute the example locally:

render1678277760535.gif

Or wrap the same pipeline as a GitHub Actions workflow, like this:

pasted image 0 (3).png

This demonstrates how Dagger makes it easy to create custom pipelines that fit together like Lego pieces. My CI setup is now primarily built around Dagger for the build, test, and publish pipelines. With this approach, I can run my CI anywhere I can run Docker containers.

In conclusion, Dagger is a powerful tool that enables Analytics Engineers to create portable CI pipelines in their favorite programming language and execute them in any Docker-compatible runtime environment.
