Claude Agent Skill · by Wshobson

dbt Transformation Patterns

A solid collection of dbt patterns that actually follows modern analytics engineering practices. Sets up a proper medallion architecture with staging, intermediate, and marts layers.

Install
Terminal · npx
$ npx skills add https://github.com/wshobson/agents --skill dbt-transformation-patterns
Works with Paperclip

How dbt Transformation Patterns fits into a Paperclip company.

dbt Transformation Patterns drops into any Paperclip agent that handles this kind of work. Assign it to a specialist inside a pre-configured PaperclipOrg company and the skill becomes available on every heartbeat — no prompt engineering, no tool wiring.

Paired pack: SaaS Factory — a pre-configured AI company with 18 agents and 18 skills, one-time purchase at $27 (regularly $59).
Source file
SKILL.md · 556 lines
---
name: dbt-transformation-patterns
description: Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies. Use when building data transformations, creating data models, or implementing analytics engineering best practices.
---

# dbt Transformation Patterns

Production-ready patterns for dbt (data build tool) including model organization, testing strategies, documentation, and incremental processing.

## When to Use This Skill

- Building data transformation pipelines with dbt
- Organizing models into staging, intermediate, and marts layers
- Implementing data quality tests
- Creating incremental models for large datasets
- Documenting data models and lineage
- Setting up dbt project structure

## Core Concepts

### 1. Model Layers (Medallion Architecture)

```
sources/          Raw data definitions
staging/          1:1 with source, light cleaning
intermediate/     Business logic, joins, aggregations
marts/            Final analytics tables
```

### 2. Naming Conventions

| Layer        | Prefix         | Example                       |
| ------------ | -------------- | ----------------------------- |
| Staging      | `stg_`         | `stg_stripe__payments`        |
| Intermediate | `int_`         | `int_payments_pivoted`        |
| Marts        | `dim_`, `fct_` | `dim_customers`, `fct_orders` |

## Quick Start

```yaml
# dbt_project.yml
name: "analytics"
version: "1.0.0"
profile: "analytics"

model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]

vars:
  start_date: "2020-01-01"

models:
  analytics:
    staging:
      +materialized: view
      +schema: staging
    intermediate:
      +materialized: ephemeral
    marts:
      +materialized: table
      +schema: analytics
```

```
# Project structure
models/
├── staging/
│   ├── stripe/
│   │   ├── _stripe__sources.yml
│   │   ├── _stripe__models.yml
│   │   ├── stg_stripe__customers.sql
│   │   └── stg_stripe__payments.sql
│   └── shopify/
│       ├── _shopify__sources.yml
│       └── stg_shopify__orders.sql
├── intermediate/
│   └── finance/
│       └── int_payments_pivoted_to_customer.sql
└── marts/
    ├── core/
    │   ├── _core__models.yml
    │   ├── dim_customers.sql
    │   └── fct_orders.sql
    └── finance/
        └── fct_revenue.sql
```

## Patterns

### Pattern 1: Source Definitions

```yaml
# models/staging/stripe/_stripe__sources.yml
version: 2

sources:
  - name: stripe
    description: Raw Stripe data loaded via Fivetran
    database: raw
    schema: stripe
    loader: fivetran
    loaded_at_field: _fivetran_synced
    freshness:
      warn_after: { count: 12, period: hour }
      error_after: { count: 24, period: hour }
    tables:
      - name: customers
        description: Stripe customer records
        columns:
          - name: id
            description: Primary key
            tests:
              - unique
              - not_null
          - name: email
            description: Customer email
          - name: created
            description: Account creation timestamp

      - name: payments
        description: Stripe payment transactions
        columns:
          - name: id
            tests:
              - unique
              - not_null
          - name: customer_id
            tests:
              - not_null
              - relationships:
                  to: source('stripe', 'customers')
                  field: id
```
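The project tree above also lists a `_stripe__models.yml` next to the staging SQL files, but its contents are never shown. A minimal sketch of what it might contain, mirroring the testing conventions used elsewhere in this skill (the file body below is illustrative, not part of the original):

```yaml
# models/staging/stripe/_stripe__models.yml (illustrative sketch)
version: 2

models:
  - name: stg_stripe__customers
    description: One row per Stripe customer, lightly cleaned and renamed
    columns:
      - name: customer_id
        description: Primary key, renamed from the source id column
        tests:
          - unique
          - not_null
```

Testing the renamed key at the staging layer catches duplicate-load problems before any downstream model consumes the data.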
### Pattern 2: Staging Models

```sql
-- models/staging/stripe/stg_stripe__customers.sql
with source as (

    select * from {{ source('stripe', 'customers') }}

),

renamed as (

    select
        -- ids
        id as customer_id,

        -- strings
        lower(email) as email,
        name as customer_name,

        -- timestamps
        created as created_at,

        -- metadata
        _fivetran_synced as _loaded_at

    from source

)

select * from renamed
```

```sql
-- models/staging/stripe/stg_stripe__payments.sql
{{
    config(
        materialized='incremental',
        unique_key='payment_id',
        on_schema_change='append_new_columns'
    )
}}

with source as (

    select * from {{ source('stripe', 'payments') }}

    {% if is_incremental() %}
    where _fivetran_synced > (select max(_loaded_at) from {{ this }})
    {% endif %}

),

renamed as (

    select
        -- ids
        id as payment_id,
        customer_id,
        invoice_id,

        -- amounts (convert cents to dollars)
        amount / 100.0 as amount,
        amount_refunded / 100.0 as amount_refunded,

        -- status
        status as payment_status,

        -- timestamps
        created as created_at,

        -- metadata
        _fivetran_synced as _loaded_at

    from source

)

select * from renamed
```

### Pattern 3: Intermediate Models

```sql
-- models/intermediate/finance/int_payments_pivoted_to_customer.sql
with payments as (

    select * from {{ ref('stg_stripe__payments') }}

),

customers as (

    select * from {{ ref('stg_stripe__customers') }}

),

payment_summary as (

    select
        customer_id,
        count(*) as total_payments,
        count(case when payment_status = 'succeeded' then 1 end) as successful_payments,
        sum(case when payment_status = 'succeeded' then amount else 0 end) as total_amount_paid,
        min(created_at) as first_payment_at,
        max(created_at) as last_payment_at
    from payments
    group by customer_id

)

select
    customers.customer_id,
    customers.email,
    customers.created_at as customer_created_at,
    coalesce(payment_summary.total_payments, 0) as total_payments,
    coalesce(payment_summary.successful_payments, 0) as successful_payments,
    coalesce(payment_summary.total_amount_paid, 0) as lifetime_value,
    payment_summary.first_payment_at,
    payment_summary.last_payment_at

from customers
left join payment_summary using (customer_id)
```
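Generic schema tests (Pattern 5 below) cover single-column constraints; invariants that span columns can be checked with a singular test — a SQL file under the `tests/` path declared in `dbt_project.yml`, which fails if it returns any rows. A minimal sketch against the intermediate model above (the file name is illustrative):

```sql
-- tests/assert_successful_lte_total_payments.sql (illustrative)
-- Singular test: dbt marks this test failed if the query returns any rows.
-- A customer can never have more successful payments than total payments.
select
    customer_id,
    total_payments,
    successful_payments
from {{ ref('int_payments_pivoted_to_customer') }}
where successful_payments > total_payments
```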
### Pattern 4: Mart Models (Dimensions and Facts)

```sql
-- models/marts/core/dim_customers.sql
{{
    config(
        materialized='table',
        unique_key='customer_id'
    )
}}

with customers as (

    select * from {{ ref('int_payments_pivoted_to_customer') }}

),

orders as (

    select * from {{ ref('stg_shopify__orders') }}

),

order_summary as (

    select
        customer_id,
        count(*) as total_orders,
        sum(total_price) as total_order_value,
        min(created_at) as first_order_at,
        max(created_at) as last_order_at
    from orders
    group by customer_id

),

final as (

    select
        -- surrogate key
        {{ dbt_utils.generate_surrogate_key(['customers.customer_id']) }} as customer_key,

        -- natural key
        customers.customer_id,

        -- attributes
        customers.email,
        customers.customer_created_at,

        -- payment metrics
        customers.total_payments,
        customers.successful_payments,
        customers.lifetime_value,
        customers.first_payment_at,
        customers.last_payment_at,

        -- order metrics
        coalesce(order_summary.total_orders, 0) as total_orders,
        coalesce(order_summary.total_order_value, 0) as total_order_value,
        order_summary.first_order_at,
        order_summary.last_order_at,

        -- calculated fields
        case
            when customers.lifetime_value >= 1000 then 'high'
            when customers.lifetime_value >= 100 then 'medium'
            else 'low'
        end as customer_tier,

        -- timestamps
        current_timestamp as _loaded_at

    from customers
    left join order_summary using (customer_id)

)

select * from final
```

```sql
-- models/marts/core/fct_orders.sql
{{
    config(
        materialized='incremental',
        unique_key='order_id',
        incremental_strategy='merge'
    )
}}

with orders as (

    select * from {{ ref('stg_shopify__orders') }}

    {% if is_incremental() %}
    where updated_at > (select max(updated_at) from {{ this }})
    {% endif %}

),

customers as (

    select * from {{ ref('dim_customers') }}

),

final as (

    select
        -- keys
        orders.order_id,
        customers.customer_key,
        orders.customer_id,

        -- dimensions
        orders.order_status,
        orders.fulfillment_status,
        orders.payment_status,

        -- measures
        orders.subtotal,
        orders.tax,
        orders.shipping,
        orders.total_price,
        orders.total_discount,
        orders.item_count,

        -- timestamps
        orders.created_at,
        orders.updated_at,
        orders.fulfilled_at,

        -- metadata
        current_timestamp as _loaded_at

    from orders
    left join customers on orders.customer_id = customers.customer_id

)

select * from final
```

### Pattern 5: Testing and Documentation

```yaml
# models/marts/core/_core__models.yml
version: 2

models:
  - name: dim_customers
    description: Customer dimension with payment and order metrics
    columns:
      - name: customer_key
        description: Surrogate key for the customer dimension
        tests:
          - unique
          - not_null

      - name: customer_id
        description: Natural key from source system
        tests:
          - unique
          - not_null

      - name: email
        description: Customer email address
        tests:
          - not_null

      - name: customer_tier
        description: Customer value tier based on lifetime value
        tests:
          - accepted_values:
              values: ["high", "medium", "low"]

      - name: lifetime_value
        description: Total amount paid by customer
        tests:
          - dbt_utils.expression_is_true:
              expression: ">= 0"

  - name: fct_orders
    description: Order fact table with all order transactions
    tests:
      - dbt_utils.recency:
          datepart: day
          field: created_at
          interval: 1
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
      - name: customer_key
        tests:
          - not_null
          - relationships:
              to: ref('dim_customers')
              field: customer_key
```

### Pattern 6: Macros and DRY Code

```sql
-- macros/cents_to_dollars.sql
{% macro cents_to_dollars(column_name, precision=2) %}
    round({{ column_name }} / 100.0, {{ precision }})
{% endmacro %}

-- macros/generate_schema_name.sql
{% macro generate_schema_name(custom_schema_name, node) %}
    {%- set default_schema = target.schema -%}
    {%- if custom_schema_name is none -%}
        {{ default_schema }}
    {%- else -%}
        {{ default_schema }}_{{ custom_schema_name }}
    {%- endif -%}
{% endmacro %}

-- macros/limit_data_in_dev.sql
{% macro limit_data_in_dev(column_name, days=3) %}
    {% if target.name == 'dev' %}
        where {{ column_name }} >= dateadd(day, -{{ days }}, current_date)
    {% endif %}
{% endmacro %}

-- Usage in a model
select * from {{ ref('stg_orders') }}
{{ limit_data_in_dev('created_at') }}
```
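The `cents_to_dollars` macro above is defined but never applied in the models. Pattern 2's payments model could call it instead of repeating the division; a hedged sketch of the revised select, trimmed to the affected columns (the rewrite is illustrative, not the skill's original file):

```sql
-- Possible revision of stg_stripe__payments using the macro
select
    id as payment_id,
    customer_id,

    -- expands to round(<column> / 100.0, 2); note this adds rounding
    -- that the original bare division did not have
    {{ cents_to_dollars('amount') }} as amount,
    {{ cents_to_dollars('amount_refunded') }} as amount_refunded

from {{ source('stripe', 'payments') }}
```

Centralizing the cents-to-dollars conversion means a change to rounding or precision happens in one macro rather than in every staging model that touches a Stripe amount.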
### Pattern 7: Incremental Strategies

```sql
-- Delete+Insert (default for most warehouses)
{{
    config(
        materialized='incremental',
        unique_key='id',
        incremental_strategy='delete+insert'
    )
}}

-- Merge (best for late-arriving data)
{{
    config(
        materialized='incremental',
        unique_key='id',
        incremental_strategy='merge',
        merge_update_columns=['status', 'amount', 'updated_at']
    )
}}

-- Insert Overwrite (partition-based)
{{
    config(
        materialized='incremental',
        incremental_strategy='insert_overwrite',
        partition_by={
            "field": "created_date",
            "data_type": "date",
            "granularity": "day"
        }
    )
}}

select
    *,
    date(created_at) as created_date
from {{ ref('stg_events') }}

{% if is_incremental() %}
where created_date >= dateadd(day, -3, current_date)
{% endif %}
```

## dbt Commands

```bash
# Development
dbt run                          # Run all models
dbt run --select staging         # Run staging models only
dbt run --select +fct_orders     # Run fct_orders and its upstream
dbt run --select fct_orders+     # Run fct_orders and its downstream
dbt run --full-refresh           # Rebuild incremental models

# Testing
dbt test                         # Run all tests
dbt test --select stg_stripe     # Test specific models
dbt build                        # Run + test in DAG order

# Documentation
dbt docs generate                # Generate docs
dbt docs serve                   # Serve docs locally

# Debugging
dbt compile                      # Compile SQL without running
dbt debug                        # Test connection
dbt ls --select tag:critical     # List models by tag
```

## Best Practices

### Do's

- **Use staging layer** - Clean data once, use everywhere
- **Test aggressively** - Not null, unique, relationships
- **Document everything** - Column descriptions, model descriptions
- **Use incremental** - For tables > 1M rows
- **Version control** - dbt project in Git

### Don'ts

- **Don't skip staging** - Raw → mart is tech debt
- **Don't hardcode dates** - Use `{{ var('start_date') }}` (see the sketch after this list)
- **Don't repeat logic** - Extract to macros
- **Don't test in prod** - Use dev target
- **Don't ignore freshness** - Monitor source data
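To make the hardcoded-dates point concrete: the `start_date` var declared under `vars:` in `dbt_project.yml` above can be interpolated anywhere Jinja is rendered. A minimal sketch (the model file name is illustrative):

```sql
-- models/staging/stripe/stg_stripe__payments_filtered.sql (illustrative)
-- Pull the boundary from dbt_project.yml vars instead of hardcoding it;
-- compiles to: where created_at >= '2020-01-01'
select *
from {{ ref('stg_stripe__payments') }}
where created_at >= '{{ var("start_date") }}'
```

Changing the project-wide backfill window then means editing one line of YAML rather than hunting for date literals across models.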