A practical guide to the commands, prompts, patterns, and habits that make Cortex Code useful in real data work.
Cortex Code is a data-native coding agent that works directly against your data environment. It sees schemas, roles, grants, tags, lineage, query history, semantic models, and the live shape of the data you are trying not to break.
Your SQL lives in a repo. Your dbt models live in a repo. Your DAGs live in a repo. But the work usually fails or succeeds somewhere else: in the data itself, in the state of the platform, and in the operational rules wrapped around it.
That is where Cortex Code is useful. It can work against the same environment your code depends on instead of treating the repo as the whole system.
This post is a field guide for data engineers who want to use Cortex Code every day. It focuses on the commands, prompts, dbt workflows, Airflow patterns, AGENTS.md, and Skills that are worth keeping close at hand.
Start with the right mental model
The point of Cortex Code is not “SQL autocomplete, but fancier.”
The point is that it can work against your Snowflake environment, not just the files on disk.
That changes what is possible:
- You can ask for tables by business concept instead of object name.
- You can ask why a role cannot read an object instead of manually spelunking grants.
- You can edit a dbt model, run the smallest safe build, compare dev and prod, and leave behind a validation query.
- You can ask for cost and governance answers using account context.
- You can package a good workflow as a reusable Skill instead of repeating the same ten-step ritual forever.
If the job depends on Snowflake account state, Cortex Code should be in the loop.
Why Cortex Code is unusually good at data work
There are a few reasons it fits data engineering better than generic coding agents:
- It understands Snowflake objects, roles, schemas, and SQL semantics.
- It can search Snowflake objects and documentation directly.
- It has direct SQL execution instead of forcing you to copy SQL into another tool.
- It can work in Snowsight for discovery, admin, and workspace tasks.
- It can work in the CLI for local repo work, dbt, git, shell commands, Airflow, Skills, and automation.
- It works within Snowflake RBAC and the CLI’s approval modes. That matters when your day job includes production data and access boundaries.
The repo-only agents are still valuable. I use them. But for data engineering, repo context is only half the picture.
Get your setup right once
The daily experience gets much better if you spend ten minutes on setup instead of winging it.
Before you start, install the Snowflake CLI (snow). Cortex Code CLI shares the same Snowflake connection setup and ~/.snowflake/connections.toml file. Also make sure you are on a supported platform for Cortex Code CLI per the official docs.
Install the CLI
curl -LsS https://ai.snowflake.com/static/cc-scripts/install.sh | sh
If you are on Windows, use the PowerShell installer from the official docs:
irm https://ai.snowflake.com/static/cc-scripts/install.ps1 | iex
Start it with a specific connection and working directory:
cortex -c dev -w ~/src/analytics
Resume the last session:
cortex --continue
Run a one-off prompt from the shell:
Use -p for quick one-off prompts if your account supports print mode. Subscription and trial accounts block -p / --print, so in those environments you should start cortex -c dev and run the same request interactively.
cortex -c dev -p "List every table tagged PII = TRUE in ANALYTICS_DB"
Enable the right roles and model access
The exact role setup depends on which surface you use:
- Cortex Code CLI: start with SNOWFLAKE.CORTEX_USER.
- Cortex Code in Snowsight: grant SNOWFLAKE.COPILOT_USER plus SNOWFLAKE.CORTEX_USER or SNOWFLAKE.CORTEX_AGENT_USER.
If you use both, the simplest baseline is COPILOT_USER plus CORTEX_USER.
ALTER ACCOUNT SET CORTEX_ENABLED_CROSS_REGION = 'AWS_US';
GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE DATA_ENGINEER;
-- If you use Cortex Code in Snowsight:
GRANT DATABASE ROLE SNOWFLAKE.COPILOT_USER TO ROLE DATA_ENGINEER;
-- Optional, depending on your org's access model for agentic workflows:
GRANT DATABASE ROLE SNOWFLAKE.CORTEX_AGENT_USER TO ROLE DATA_ENGINEER;
The ALTER ACCOUNT statement must be run by ACCOUNTADMIN. Replace AWS_US with the region setting your account needs. If your organization restricts model access, make sure the required models are allowed and cross-region inference is enabled for the region you actually need.
One extra nuance: some accounts inherit SNOWFLAKE.CORTEX_USER through PUBLIC unless it has been revoked, so do not assume a missing explicit grant always means a missing effective permission.
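You can check both paths directly with SHOW statements. A sketch, using the DATA_ENGINEER role from the grants above; if SNOWFLAKE.CORTEX_USER appears in the PUBLIC output, every user already has it regardless of explicit grants:

```sql
-- Database roles granted directly to your working role:
SHOW GRANTS TO ROLE DATA_ENGINEER;

-- Database roles everyone inherits through PUBLIC:
SHOW GRANTS TO ROLE PUBLIC;
```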
Create separate connections for dev and prod-readonly
Do this once. Future you will thank you.
~/.snowflake/connections.toml
[dev]
account = "myorg-myaccount"
user = "you@example.com"
authenticator = "externalbrowser"
warehouse = "DEV_WH"
role = "DATA_ENGINEER"
database = "ANALYTICS"
schema = "PUBLIC"
[prod_readonly]
account = "myorg-myaccount"
user = "you@example.com"
authenticator = "externalbrowser"
warehouse = "PROD_WH"
role = "ANALYST"
database = "ANALYTICS"
schema = "PUBLIC"
Now you can be explicit:
cortex -c dev -w ~/src/analytics
cortex -c prod_readonly -w ~/src/analytics
That one habit prevents a surprising amount of nonsense.
Commands worth memorizing
These are the ones I would actually keep handy.
Startup patterns
cortex
cortex -c dev
cortex -c dev -w ~/src/analytics
cortex --continue
cortex --plan
cortex -p "Explain why this query is slow and optimize it"
CLI slash commands that matter for data engineers
These are Cortex Code CLI commands. In Snowsight, / is mainly for built-in and personal skills, not the full CLI slash-command surface.
/plan
/status
/sql
/sh
/conn
/diff
/fdbt
/lineage
/worktree
/skill
/mcp
Airflow commands worth knowing
cortex airflow health
cortex airflow dags list
cortex airflow dags get my_pipeline
cortex airflow dags source my_pipeline
cortex airflow runs trigger my_pipeline
cortex airflow runs list my_pipeline
cortex airflow tasks list my_pipeline
Make the CLI fit your team
You do not need a huge amount of local configuration, but a little goes a long way.
~/.snowflake/cortex/settings.json
{
"defaultViewMode": "compact",
"autoUpdate": true,
"theme": "dark"
}
~/.snowflake/cortex/settings.json
{
"permissions": {
"defaultMode": "ask",
"dangerouslyAllowAll": false
}
}
That keeps the CLI terse and keeps permission prompts on by default. permissions.json is better thought of as the remembered approval cache, not the main place to set those defaults.
If your team wants shared repo behavior, use AGENTS.md for project rules and Skills for repeatable workflows. If your organization wants harder controls, use managed settings instead of relying on everybody to configure the CLI the same way.
Snowsight versus CLI
Use both. They are not competing surfaces.
| Surface | Best for |
|---|---|
| Cortex Code in Snowsight | SQL authoring, quick catalog discovery, dbt Projects on Snowflake, permissions questions, governance, FinOps, quick query fixes, @-based object context, diff review in Workspaces |
| Cortex Code CLI | local dbt repos, local files, git, shell commands, validation workflows, Airflow, Skills, AGENTS.md, repeatable engineering work |
The easy rule is this:
- Start in Snowsight when you are exploring.
- Switch to the CLI when you need local files, source control, or repeatable workflows.
Daily workflow 1: find the right data faster
This is where Cortex Code immediately feels different from generic coding agents.
You are not asking it to write SQL from a blank prompt. You are asking it to navigate the warehouse with you.
Start with plain English:
Find all tables related to customers that I have write access to.
Then narrow:
List every table tagged PII = TRUE in ANALYTICS_DB and show the owning roles.
Then ask for lineage:
Show the lineage from RAW_DB.ORDERS to downstream dashboards.
In Snowsight, use @ mentions to attach objects directly to the chat context before asking follow-up questions.
Example:
@RAW_DB.ORDERS @ANALYTICS_DB.CUSTOMERS
Explain the likely join path between these objects and tell me which columns you would trust for churn analysis.
If you are in the CLI and want a quick fallback query, use /sql. In the interactive CLI, use Ctrl+J for multi-line SQL.
/sql
SELECT table_catalog, table_schema, table_name
FROM SNOWFLAKE.ACCOUNT_USAGE.TABLES
WHERE deleted IS NULL
AND (
table_name ILIKE '%CUSTOMER%'
OR table_schema ILIKE '%CUSTOMER%'
)
ORDER BY created DESC;
Generic agents are decent once you hand them the schema. Cortex Code is better when finding the schema is the job.
One caveat: SNOWFLAKE.ACCOUNT_USAGE.TABLES is a convenience fallback, not a universal or real-time discovery surface. It depends on your account visibility into ACCOUNT_USAGE, and those views can lag.
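When freshness matters more than account-wide scope, INFORMATION_SCHEMA is the fresher alternative. A sketch against the example database used above; it only covers one database and only objects your current role can see:

```sql
-- Near-real-time metadata, scoped to a single database
SELECT table_schema, table_name, row_count, created
FROM ANALYTICS_DB.INFORMATION_SCHEMA.TABLES
WHERE table_name ILIKE '%CUSTOMER%'
ORDER BY created DESC;
```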
Daily workflow 2: stop guessing about permissions
Permissions bugs waste a ridiculous amount of time because they usually look like code bugs at first.
Useful prompts:
What privileges does my role have on this database?
Why am I getting a permissions error on ANALYTICS_DB.CORE.CUSTOMERS?
Show the grants my current role can see for ANALYTICS_DB.CORE.CUSTOMERS, and tell me what extra visibility I would need for a full access audit.
Find all tables that have PII in them.
This is a good example of where generic coding agents usually stall out. They can explain the idea of RBAC. Cortex Code can help investigate your real Snowflake objects and whatever grants your current role can actually see.
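The agent's grant investigation boils down to statements you can also run yourself. A sketch using the example object above; remember that role hierarchies nest, so direct grants are not the whole story:

```sql
-- Who holds what on the object itself:
SHOW GRANTS ON TABLE ANALYTICS_DB.CORE.CUSTOMERS;

-- What the role in question holds directly:
SHOW GRANTS TO ROLE ANALYST;

-- USAGE on the database AND schema must exist
-- before any table-level privilege takes effect:
SHOW GRANTS ON DATABASE ANALYTICS_DB;
SHOW GRANTS ON SCHEMA ANALYTICS_DB.CORE;
```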
Daily workflow 3: use it as a SQL mechanic, not just a generator
One of the best ways to use Cortex Code is to hand it real SQL that already exists and ask it to make that SQL better.
Explain and optimize a query
What does this SQL script do?
Explain why this query is slow and optimize it without changing semantics.
WITH daily_revenue AS (
SELECT
customer_id,
CAST(order_date AS DATE) AS order_day,
SUM(amount) AS daily_revenue
FROM ANALYTICS_DB.MART.ORDERS
WHERE order_date >= DATEADD(day, -90, CURRENT_DATE())
GROUP BY 1, 2
)
SELECT
a.customer_id,
a.order_day,
a.daily_revenue,
(
SELECT SUM(b.daily_revenue)
FROM daily_revenue b
WHERE b.customer_id = a.customer_id
AND b.order_day BETWEEN DATEADD(day, -6, a.order_day) AND a.order_day
) AS rolling_7d_revenue
FROM daily_revenue a
ORDER BY a.customer_id, a.order_day;
Then push further:
Show me the tradeoffs in your rewrite.
If there are multiple good versions, rank them by readability, cost, and likely performance.
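The rewrite most likely to rank first replaces the correlated subquery with a window frame. A sketch; the `ROWS` frame assumes each customer has a row for every day in the window, so customers with gap days would need a date spine for strict equivalence:

```sql
WITH daily_revenue AS (
    SELECT
        customer_id,
        CAST(order_date AS DATE) AS order_day,
        SUM(amount) AS daily_revenue
    FROM ANALYTICS_DB.MART.ORDERS
    WHERE order_date >= DATEADD(day, -90, CURRENT_DATE())
    GROUP BY 1, 2
)
SELECT
    customer_id,
    order_day,
    daily_revenue,
    -- One pass over the data instead of one subquery per row
    SUM(daily_revenue) OVER (
        PARTITION BY customer_id
        ORDER BY order_day
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS rolling_7d_revenue
FROM daily_revenue
ORDER BY customer_id, order_day;
```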
In Snowsight, you can also use the built-in fix flow on failed SQL and then ask follow-up questions in the same context.
That matters. Good data work is almost never “write one query.” It is “write, inspect, fix, rerun, compare.”
Daily workflow 4: work a dbt project end to end
This is where Cortex Code starts earning permanent terminal space.
The strongest pattern is:
- Open the repo with Cortex Code in the right working directory.
- Ask for a plan.
- Make the model change.
- Run the narrowest possible dbt selector.
- Generate proof, not just code.
Start the session:
cortex -c dev -w ~/src/analytics
Then turn on planning mode:
/plan
Then give it a concrete task:
Update models/marts/fct_customer_revenue.sql to include refunded_amount.
Run the smallest safe dbt build for the affected graph.
If tests fail, fix the model or tests.
Then write a validation query comparing dev and prod totals for the last 30 days.
Save the validation SQL in analysis/refunded_amount_check.sql.
That is a better prompt than “add refunded amount” because it asks for the full workflow, including proof.
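The validation query it leaves behind can be a plain dev-versus-prod diff. A sketch with hypothetical database and column names (`ANALYTICS_DEV`, `ANALYTICS_PROD`, `order_day`, `total_revenue`); it compares a metric that exists in both environments, since the new column only exists in dev:

```sql
-- analysis/refunded_amount_check.sql (sketch)
WITH dev AS (
    SELECT order_day, SUM(total_revenue) AS revenue
    FROM ANALYTICS_DEV.MARTS.FCT_CUSTOMER_REVENUE
    WHERE order_day >= DATEADD(day, -30, CURRENT_DATE())
    GROUP BY 1
),
prod AS (
    SELECT order_day, SUM(total_revenue) AS revenue
    FROM ANALYTICS_PROD.MARTS.FCT_CUSTOMER_REVENUE
    WHERE order_day >= DATEADD(day, -30, CURRENT_DATE())
    GROUP BY 1
)
SELECT COALESCE(dev.order_day, prod.order_day) AS day,
       dev.revenue  AS dev_revenue,
       prod.revenue AS prod_revenue
FROM dev
FULL OUTER JOIN prod ON dev.order_day = prod.order_day
-- EQUAL_NULL treats NULL = NULL as a match, so only real diffs surface
WHERE NOT EQUAL_NULL(dev.revenue, prod.revenue)
ORDER BY day;
```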
Give Cortex Code repo rules with AGENTS.md
This is one of the highest leverage things you can do for dbt.
AGENTS.md
# analytics repo rules
- Always use fully qualified object names in generated SQL unless the model style already abstracts them.
- For changes to marts, metrics, or business logic, generate a validation query in `analysis/`.
- Prefer `TRY_TO_*` functions for raw ingestion cleanup.
- For model changes, start with `dbt build -s "<model>+"` to validate the changed model and downstream dependents. Use `dbt ls -s "+<model>"` when you want to inspect upstream ancestors first.
- Add or update schema tests for every new key, enum, or non-null business field.
- Never use production roles for write operations.
- When a change touches finance metrics, compare dev and prod for at least the last 30 days.
Now Cortex Code has your rules before it starts making changes.
Ask it to reason across model, graph, and warehouse
Good dbt prompts:
Create a new staging model for RAW.CUSTOMERS that handles duplicates, mixed-case email, malformed dates, and empty-string fields.
Analyze this dbt project and identify the smallest set of models I need to run after changing fct_customer_revenue.
Show me the upstream and downstream graph for this model and tell me where a breaking change would hurt the most.
Review the failing dbt tests and tell me whether the bug is in the model, test assumptions, or source data.
Keep shell access in the loop
Sometimes the best pattern is conversational reasoning plus direct shell execution:
/sh dbt build -s "fct_customer_revenue+"
/sh dbt ls -s "+fct_customer_revenue"
/sh dbt docs generate
The first command validates the changed model plus downstream dependents. The second shows upstream ancestry so you can inspect dependencies before you widen the blast radius.
If you want to stay in shell mode for a while, run /sh with no arguments. The CLI switches into terminal mode and treats subsequent input as shell commands until you exit that mode.
If you want lineage help, call it out:
/lineage fct_customer_revenue
Then ask:
Based on the lineage, which models should I validate manually before merging this change?
Daily workflow 5: use Cortex Code to tune slow dbt models
The internal Snowflake examples here are especially strong.
The useful prompt pattern is:
Analyze our dbt project, identify the slowest-running models, suggest specific optimizations, and flag any models that are not used downstream that we could drop.
That can surface things people usually do not revisit once the pipeline “works”:
- bad materialization choices
- unnecessary joins
- stale models with no downstream value
- incremental models that should not be full-refreshing
- models that should probably be split
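That last class of finding, incremental models quietly full-refreshing, is usually a config problem rather than a SQL problem. A minimal sketch of a correctly configured incremental model, with hypothetical source and column names:

```sql
{{ config(
    materialized='incremental',
    unique_key='order_id'
) }}

SELECT order_id, customer_id, order_date, amount
FROM {{ source('raw', 'orders') }}

{% if is_incremental() %}
-- Only scan rows newer than what the target table already holds
WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}
```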
The best move after a good optimization pass is to turn it into a Skill.
Daily workflow 6: convert one-off work into Skills
This is one of the differentiators that gets overlooked.
If you do the same debugging or optimization dance every week, package it.
For a local repo or CLI workflow, project skills live under .cortex/skills/. In Snowsight Workspaces, personal skills live under .snowflake/cortex/skills in that workspace and are only available there.
Create the directory:
mkdir -p .cortex/skills/dbt-slow-model-audit
Then add SKILL.md:
---
name: dbt-slow-model-audit
description: Find slow dbt models and propose safe optimizations
---
# When to use
- The user wants to reduce runtime or credits in a dbt project.
- The user suspects one or more models are bottlenecks.
# Instructions
1. Identify the slowest models using available dbt artifacts or warehouse history.
2. Inspect materialization strategy, joins, filters, and incremental logic.
3. Flag unused downstream models that may be removable.
4. Propose the smallest set of changes that will improve runtime safely.
5. Generate a validation plan and save findings in `analysis/dbt_slow_model_audit.md`.
This example keeps the frontmatter minimal on purpose. `tools:` is optional. If you want to lock a Skill to a specific tool set, use the exact tool identifiers from the current Cortex Code docs or a known working Skill repo. Also note that the CLI’s custom-skill model is the most fully documented path today. In Snowsight, stick to documented personal skills.
List available Skills:
cortex skill list
Inside a session:
/skill
This is how you turn “that smart thing the agent did once” into a team workflow.
Daily workflow 7: use the built-in Airflow support
If your world includes Airflow, Cortex Code gets more interesting than a generic Python helper.
This assumes you have already configured an Airflow instance for Cortex Code and installed uv, as Snowflake’s Using Apache Airflow with Cortex Code CLI docs require.
One implementation detail worth knowing: cortex airflow is a passthrough wrapper around the Airflow helper Snowflake ships with Cortex Code. Treat the examples below as practical command patterns, not as a separately documented native command grammar owned entirely by the main CLI.
Useful commands:
cortex airflow health
cortex airflow dags list
cortex airflow dags get daily_etl
cortex airflow dags source daily_etl
cortex airflow runs trigger daily_etl
cortex airflow runs list daily_etl
Useful prompts:
Why did my_pipeline fail last night?
Create a DAG that extracts from Snowflake and loads to S3 daily.
Set up my dbt project to run in Airflow using Cosmos.
Migrate my DAGs from Airflow 2 to Airflow 3.
This is a good example of Cortex Code’s advantage. A generic coding agent can write DAG code. Cortex Code can reason about the DAG, the Airflow instance, and the Snowflake side of the workflow in one place.
Daily workflow 8: use Snowsight when you want fewer tabs
Do not force everything through the CLI.
Snowsight is excellent for the stuff people usually break flow for:
- quick SQL authoring
- finding the right table or column
- fixing failed SQL
- checking user and role access
- cost and usage questions
- iterating on code in Workspaces with diff review
Example prompts that work well in Snowsight:
What databases do I have access to?
Write a query for top 10 customers by revenue and a 7-day moving average.
Which 5 service types are using the most credits? Show me a visualization and how to reduce costs.
Find all tables that have PII in them.
The strongest Snowsight feature for day-to-day engineering work might be the simple one: you stay inside the same environment where you are already writing and running the query.
Where Cortex Code is differentiated for Snowflake-heavy work
This is the part people care about, so here is the blunt version. These are practical workflow comparisons, not vendor benchmarks. What a generic coding agent can do depends heavily on how you have integrated it with shells, databases, MCP servers, and other tools.
When Cortex Code wins
| Task | Generic coding agent | Cortex Code advantage |
|---|---|---|
| Find the right table when you only know the business concept | Without Snowflake-aware integration, often needs schema pasted in or separate metadata queries | Can search Snowflake objects and docs directly, with tags and lineage surfaced when available |
| Debug a role or grant problem | Can explain RBAC patterns, but usually needs live Snowflake access to inspect real grants | Can help investigate grants and access questions using your current Snowflake visibility |
| Tune a query that is slow in your warehouse | Can reason from SQL text, but usually needs Snowflake access wired in for warehouse-specific answers | Can work from Snowflake context, docs, and account-aware patterns |
| Edit a dbt model and prove the change is safe | Good at code edits, but may be blind to live Snowflake state unless you wire queries and tools into the workflow | Can help with SQL, dbt, validation queries, and Snowflake-side proof in one workflow |
| Decide between model patterns on Snowflake | General advice unless it can inspect Snowflake objects and docs | Snowflake-aware guidance for semantic objects, platform behavior, and role-aware execution |
| Build a data workflow that touches SQL, files, shell, and account state | Works if you integrate shell, DB, and metadata tools; less native out of the box | Native fit for mixed data-engineering workflows |
| Airflow plus Snowflake debugging | Strong on Python, depends on Airflow and Snowflake integrations for live instance context | Built-in Airflow support plus Snowflake-aware reasoning |
The simplest summary still holds:
- If the answer depends on your Snowflake account, reach for Cortex Code.
- If the answer depends on a large non-Snowflake codebase or frontend stack, a generic coding agent may be a better first tool.
When generic agents still win
Be honest about this. Cortex Code is not the best tool for everything.
Use Cursor, Claude Code, or Codex first when you are doing:
- frontend work
- framework-heavy app scaffolding
- polyglot refactors across large non-Snowflake services
- local codebase reasoning where warehouse context does not matter
The happy path for most teams is not replacement. It is pairing.
Use your favorite generic coding agent for repo-wide engineering. Use Cortex Code when the work turns into data engineering instead of pure software engineering.
Safety habits
If you want to use Cortex Code seriously, a few habits go a long way.
Use planning mode on anything non-trivial
/plan
This is especially useful before schema changes, broad dbt runs, and anything that touches production objects.
Keep a readonly prod connection
Do not make the agent safe through vibes. Make it safe through connections and roles.
Review diffs and proposed writes
This sounds obvious, but it is the line between “useful accelerator” and “expensive story.”
Put repo rules in AGENTS.md
Do not rely on the model to guess your team’s validation and rollout standards.
Package sensitive workflows as Skills
That gives you repeatability and makes the steps inspectable.
Avoid routine ACCOUNTADMIN use
If you need elevated setup, do it deliberately. Do not turn that into the default development posture.
A prompt pack worth stealing
These are the prompts I would actually keep around.
Discovery and governance
Find all tables related to customers that I have write access to.
List every table tagged PII = TRUE in ANALYTICS_DB and show the owning roles.
Show the lineage from RAW_DB.ORDERS to downstream dashboards.
Why am I getting a permissions error on ANALYTICS_DB.CORE.CUSTOMERS?
SQL and performance
What does this SQL script do?
Explain why this query is slow and optimize it without changing semantics.
Write a query for top 10 customers by revenue and add a 7-day moving average.
dbt
Create a staging model for RAW.CUSTOMERS that handles duplicates, malformed dates, empty strings, and mixed-case emails.
Update models/marts/fct_customer_revenue.sql to include refunded_amount, run the smallest safe dbt selector, and generate a dev-versus-prod validation query.
Analyze this dbt project, identify the slowest-running models, and suggest the smallest safe changes to reduce runtime and credits.
Airflow
Why did my_pipeline fail last night?
Create a DAG that extracts from Snowflake and loads to S3 daily.
Set up my dbt project to run in Airflow using Cosmos.
Skills and repeatability
In a local repo, turn this repeated optimization workflow into a Skill and save it in .cortex/skills/dbt-slow-model-audit.
The real differentiator
The best way to think about Cortex Code is as workflow compression for data work.
That is what shows up again and again in the strong use cases:
- dbt changes plus tests plus validation
- schema discovery plus lineage plus governance
- query tuning plus account context
- Airflow plus Snowflake debugging
- repeated workflows converted into Skills
That is why it feels different from a repo-only agent. It is not just helping you write the step you are on. It is helping you get through the whole chain without losing context.
If you are a data engineer, that is the part worth caring about.
Further reading