Data Engineering & Infrastructure
Integrating powerful platforms and tools can drastically improve the efficiency and accuracy of data-driven decision-making. Two of these platforms—dbt and Sigma—are gaining traction for their complementary strengths.
Combining dbt's robust data transformation capabilities with Sigma's user-friendly BI interface and data application build features provides interactive, reliable, well-documented analytics, empowering organizations to make data-driven decisions confidently. In this post, we will explore the strengths of each platform and how their integration can elevate your data operations. We will also show how to set up the integration to enable its benefits.
Key Aspects of the dbt and Sigma Integration
Integrating dbt and Sigma combines the strengths of both tools to create a robust data stack. Sigma offers real-time data interaction and user-friendly BI capabilities, while dbt enhances data transformation, quality checks, and data lineage.
Key Aspects of the Integration:
- End-to-End Data Lineage: dbt extends Sigma's data lineage capabilities to cover the entire data pipeline, ensuring full traceability from source to dashboard.
- Data Freshness Visibility: Sigma can display dbt deployment metadata, showing when data was last updated for more accurate insights.
- Enhanced Discoverability: dbt allows for detailed documentation of tables and columns, accessible within Sigma, improving data discoverability and usability.
- Data Quality in the BI Layer: Sigma can surface dbt's data quality test results within dashboards, maintaining high data integrity throughout the BI layer.
Sigma: A New Era of BI
Sigma is a cloud-native BI and data application building platform for the modern data stack. Unlike legacy BI platforms, Sigma empowers users to interact directly with data in real time without needing deep technical expertise. Here's why Sigma stands out:
- User-Friendly Interface: Sigma's spreadsheet-like interface allows business users and analysts to interact easily with data sources. This accessibility reduces the dependency on technical teams, enabling faster insights and more agile decision-making.
- Real-Time Data Interaction: With Sigma, users can query live data without needing pre-aggregated datasets. This means the data you're working with is always up-to-date, helping you make decisions based on the most current information.
- Data Lineage Tracking: Sigma offers data lineage views, which helps users understand how data flows through their dashboards. This feature is critical for ensuring data accuracy and understanding how different data sources contribute to your reports.
- Ad-hoc Data Transformation: While Sigma offers some data transformation features, they're best for final-stage adjustments. For complex transformations and pipeline management, a platform specializing in these steps, like dbt, is more effective. A great way of using both platforms is to prototype in Sigma and then productionalize in dbt.
dbt: The Power-Up Your Data Stack Needs
dbt is a SQL-focused transformation workflow that enhances productivity and delivers higher-quality results. It allows you to modularize and centralize your analytics code while offering your data team the guardrails commonly found in software engineering practices. Here is what makes dbt an optimal choice:
- Powered-Up SQL: dbt uses SQL, a common language most data professionals already know. dbt simplifies writing complex transformations by allowing complex SQL queries to be broken down into more modular and reusable pieces. This modularity makes managing, understanding, and maintaining SQL code easier over time. Additionally, dbt leverages Jinja, a templating language that enables the dynamic generation of SQL code. You can create loops, conditionals, and reusable macros in SQL, which adds significant flexibility and reduces code redundancy.
- Version Control: dbt integrates with version control systems like Git, which facilitates collaboration among team members and ensures that all changes to the data models are versioned, reviewed, and managed just like any software code. This versioning capability is essential for maintaining the integrity and reliability of your data pipeline. This also enables you to define continuous integration/ continuous deployment pipelines, as it's straightforward to build CI/CD on top of version control, letting you deploy code into production safely whenever you need
- Data Lineage and Catalog Integration: dbt automatically generates a data lineage graph. This feature provides a clear view of where data comes from and how it is transformed, ensuring transparency and trust in your data.
- Data Documentation: dbt lets you add documentation that can be persisted in the catalogs of popular data warehouses. You can add custom descriptions to tables and columns, explaining their purpose and content.
- Data Quality Checks: One of dbt's standout features is its ability to embed data quality checks directly into the transformation process, enforcing rules like uniqueness, non-nullability, value ranges, and allowable values, ensuring that only high-quality data makes it through the pipeline.
- dbt Cloud: dbt offers a managed service that allows you to develop, manage, and automate most of what the tool has to offer. This service also lets you orchestrate and monitor your pipelines in a highly user-friendly interface.
dbt + Sigma: The Ultimate Power Couple
Individually, dbt and Sigma are powerful platforms, but as with any power couple, their true potential is unlocked when used together. Here is how they complement each other:
- End-to-End Data Lineage: While Sigma provides data lineage within the BI layer, dbt extends this visibility to the entire data pipeline. End-to-end lineage ensures that users can trace the origin of the data used in any dashboard back through every transformation step, providing unparalleled transparency.
- Data Freshness Visibility: By integrating dbt's deployment metadata with Sigma, users gain visibility into the freshness of their data. Sigma can display when data was last updated, helping users trust the timeliness of their insights.
- Enhanced Discoverability: A common challenge in many organizations is the lack of data discoverability—people don't use data because they don't know it exists or understand its purpose. With dbt, you can add descriptions to tables and columns, including data types and usage notes. These descriptions are then available in Sigma's interface, empowering dashboard developers to use the data better.
- Data Quality in the BI Layer: Sigma can surface dbt's data quality test results for each column, ensuring that the data displayed in dashboards meets the organization's standards and maintains high data integrity, even at the visualization stage.
Integration Tutorial
⚠️ This integration can only be set up using dbt Cloud
💡Note: This post includes screenshots of the UI of both Sigma and dbt at the time of writing (October 2024). Please be aware this is subject to change as these organizations are constantly updated and expanding functionalities.
Create a Service Token in dbt Cloud.
Personal tokens have limited permissions unless you are an owner, and service tokens are an excellent practice for keeping well-scoped access. The service token must have sufficient permissions to access the metadata of the desired project (if it does not work, check the token's permissions), such as Read-Only for Team plans.
Step 1. Navigate to the Account settings menu and click on the Service tokens tab under API tokens.
Step 2. Click the Create service token button, and a new window will appear. Please assign a name for the token to clarify its intended use. Assign the corresponding permissions and, if needed, limit the project to which it can access. Then click on Save.
Step 3. Copy the generated token. dbt will show you the token's value just once. Make sure to store it securely! Note that a service token will start with dbtc_.
Set up the dbt Integration on Sigma
Step 4. Go to Sigma and open the Admin Portal by selecting Administration in the user menu at the top right of your screen. On the Account tab, scroll to the Integrations section and click Add on dbt.
Step 5. Enter the service token you created. Depending on your dbt account, you may need to enter a value for the API host; for more details, visit the documentation here. If you leave that field empty, it will default to cloud.getdbt.com. Now click Save.
Through this integration, Sigma will fetch metadata from the deployment jobs you run on dbt cloud, meaning it will only be available once a job runs in dbt.
Step 6. Run a dbt job and look at a table affected by it; if everything was set up correctly, you should see a dbt tab at the right end of the table's tabs. Sigma will bring the following metadata for a given table:
- Table Description: According to the description assigned to the table in dbt.
- Column Description: According to the description assigned to each table column in dbt.
- Last Refresh Time: When the model was last run, indicating the most recent update to the data.
- Tests: Tests run on each of the columns of the table. For instance, Uniqueness will appear as a U and Non-null as N.
💡Note: For table and column description, enable the config persist_docs.
Maximize the Impact of Your Data Stack with dbt and Sigma
Integrating dbt and Sigma creates a powerful and reliable addition to any data stack. It enables organizations to make data-driven decisions confidently and complements their investments in platforms like Snowflake and Databricks. While dbt excels in data transformation and quality assurance, Sigma brings this high-quality data to life through intuitive, real-time exploration and visualization. Together, these tools provide an end-to-end solution bridging the gap between data engineering and business intelligence, ensuring that every decision is based on trustworthy, up-to-date, well-documented data.
At Aimpoint Digital, we help our clients select and integrate the data tools appropriate for their data and analytic aspirations. As a top Sigma partner and the dbt Labs Innovation Partner of the Year, our team of experts is ready to support your organization in adopting analytic capabilities to help you realize the total value of the modern data stack. If you would like to learn more about how Sigma and dbt can transform analytics for your organization, please click the "Meet and Expert" button, and we'll be in touch shortly!