← Back to Blogs
The Complete Redshift to Snowflake Migration Playbook: A Phased Approach for Enterprise Teams
DATA ENGINEERING

The Complete Redshift to Snowflake Migration Playbook: A Phased Approach for Enterprise Teams

A battle-tested 5-phase framework for migrating from Amazon Redshift to Snowflake, covering schema conversion, data migration, validation, and cutover.

March 4, 202610 min read

Migrating from Amazon Redshift to Snowflake is one of the most common — and most complex — data platform transitions enterprises undertake. After leading multiple large-scale migrations, we've developed a battle-tested 5-phase framework that minimizes risk and accelerates time-to-value.

Why Organizations Are Moving from Redshift to Snowflake

The most common drivers we see are: Redshift's cluster-based model requiring constant right-sizing and capacity planning, limited concurrency handling that creates bottlenecks during peak hours, growing maintenance overhead for vacuuming, distribution keys, and sort keys, and the desire for true separation of compute and storage with per-second billing.

Phase 1: Assessment and Discovery (Weeks 1–2)

Every successful migration starts with a thorough assessment. We catalog all databases, schemas, tables, views, stored procedures, and UDFs. We profile query patterns using STL_QUERY and STL_QUERYTEXT to understand workload characteristics. Key deliverables include: a complete inventory of database objects, query complexity analysis identifying Redshift-specific SQL that needs conversion, data volume and growth projections, dependency mapping between tables, views, and downstream applications, and a risk assessment highlighting the most complex migration components.

Phase 2: Schema Conversion (Weeks 3–4)

Redshift and Snowflake have significant SQL dialect differences that must be addressed during schema conversion. Distribution keys and sort keys have no direct equivalent in Snowflake — instead, Snowflake uses automatic micro-partitioning and clustering keys. Redshift's IDENTITY columns map to Snowflake's AUTOINCREMENT. ENCODE specifications are unnecessary in Snowflake since compression is automatic. Redshift-specific functions like LISTAGG, NVL2, and date functions often need syntax adjustments. We automate 80–90% of the conversion using custom scripts, then manually review and test the remaining edge cases.

Phase 3: Data Migration (Weeks 5–8)

For data migration, we use a parallel extraction approach: unload data from Redshift to S3 in compressed Parquet format, then use Snowflake's COPY INTO command for high-speed ingestion. For a major media and entertainment conglomerate, we migrated 800+ tables totaling 2.3 TB of compressed data. Our approach included parallel extraction with configurable concurrency (typically 8–16 threads), automatic retry logic for transient failures, progress tracking and resumability for long-running migrations, and incremental sync for tables that continue receiving writes during migration.

Phase 4: Validation and Testing (Weeks 9–10)

Validation is where many migrations fail. We implement three levels of validation: row count comparison between source and target (catches major issues), aggregate validation comparing SUM, MIN, MAX, and COUNT DISTINCT on key columns (catches data type conversion issues), and sample-based row-level comparison on a statistically significant subset (catches subtle transformation errors). We also run the top 50 most-frequent production queries against both systems and compare results. Any discrepancy triggers an investigation before proceeding to cutover.

Phase 5: Cutover and Decommission (Weeks 11–12)

The cutover phase is the highest-risk period. We minimize risk by running both systems in parallel for 1–2 weeks, redirecting read workloads to Snowflake first (lower risk than write workloads), maintaining a rollback plan with Redshift kept warm for 30 days post-cutover, and cutting over applications in priority order with monitoring at each step. After successful cutover and a stability period, we decommission Redshift clusters and clean up S3 staging data.

SQL Dialect Differences: A Quick Reference

Key syntax changes to watch for: Redshift's GETDATE() becomes Snowflake's CURRENT_TIMESTAMP(). DATEADD('day', 7, date_col) syntax is compatible but timezone handling differs. Redshift's APPROXIMATE COUNT(DISTINCT col) becomes Snowflake's APPROX_COUNT_DISTINCT(col). String concatenation with || works in both, but Redshift's CONCAT() only accepts two arguments while Snowflake accepts multiple. Redshift's UNLOAD syntax is replaced by Snowflake's COPY INTO for exports.

Cost Comparison: What to Expect

In our experience, organizations typically see 30–50% lower total cost of ownership after migrating to Snowflake, driven by: elimination of cluster management overhead, per-second billing vs. hourly billing, automatic optimization reducing the need for manual tuning, and reduced storage costs through automatic compression. However, Snowflake's consumption model means costs scale with usage — without governance, it's possible to spend more. We always implement cost monitoring and resource monitors as part of the migration.

Timeline Expectations

For mid-size deployments (100–500 tables, <5 TB), expect 8–12 weeks. For enterprise deployments (500+ tables, 5–50 TB), expect 12–20 weeks. For large-scale deployments (1000+ tables, 50+ TB), expect 20–30 weeks. These timelines assume dedicated migration resources and reasonable query complexity. Stored procedure conversion and application refactoring are typically the longest-lead items.

Frequently Asked Questions

How long does a Redshift to Snowflake migration typically take?

For mid-size deployments (100–500 tables), expect 8–12 weeks. Enterprise deployments with complex stored procedures and many downstream dependencies typically take 12–20 weeks.

Can we run Redshift and Snowflake in parallel during migration?

Yes, and we strongly recommend it. Running both systems in parallel during the validation and cutover phases allows you to compare query results, validate data integrity, and maintain a rollback option.

What are the biggest risks in a Redshift to Snowflake migration?

The top risks are: SQL dialect differences causing query failures, data type mismatches (especially with timestamps and numeric precision), stored procedure conversion complexity, and downstream application compatibility. A thorough assessment phase mitigates these risks.