Config File
Skippr uses one project config file: skippr.yaml. The same full engine schema is used by skippr and skipprd.
Example
skippr:
  workspace: mssql_migration
  tenant: _
  skipprd_el_storage_mode: local
pipelines:
  mssql_to_snowflake:
    data_source: data_sources.mssql
    data_sink: data_sinks.snowflake
    cdc:
      business_key_columns: [id]
data_sources:
  mssql:
    Mssql:
      connection_string: ${MSSQL_CONNECTION_STRING}
      tables: ["dbo.customers", "dbo.orders"]
  postgres_cdc:
    Postgres:
      connection_string: ${POSTGRES_CONNECTION_STRING}
      tables: ["public.orders"]
      cdc_mode: snapshot_then_cdc
data_sinks:
  snowflake:
    Snowflake:
      database: ANALYTICS
      schema: RAW
      warehouse: COMPUTE_WH
      role: ACCOUNTADMIN
schema_sinks: {}
runtime_plugins: {}
react:
  providers:
    warehouse:
      kind: snowflake
      database: ANALYTICS
      schema: RAW
      warehouse: COMPUTE_WH
      role: ACCOUNTADMIN
    catalog:
      enabled: true
      refresh_secs: 3600
      max_concurrency: 8
    dbt:
      enabled: true
      runner: host
      target: dev
      naming:
        target_schema: analytics
        silver_suffix: silver
        gold_suffix: gold
    vector:
      enabled: true

Top-Level Sections
| Section | Purpose |
|---|---|
| skippr | Workspace, tenant, and internal skipprd extract/load defaults |
| pipelines | Named pipelines and their source, sink, transform, CDC, and runtime settings |
| data_sources | Source plugin configuration keyed by name |
| data_sinks | Destination plugin configuration keyed by name |
| schema_sinks | Catalog/schema plugin configuration keyed by name |
| runtime_plugins | Optional explicit runtime plugin manifest paths |
| react | Modeling provider settings used by skippr model |
Storage Settings
skippr.skipprd_el_storage_mode is an internal development/testing setting that controls where skipprd extract/load state is stored (local or s3). It does not control dbt project storage, React thread logs, or vector storage for skippr model; authenticated modeling runs use the storage credentials returned by the Skippr API.
The equivalent environment variable for direct skipprd runs is SKIPPRD_EL_STORAGE_MODE.
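As a minimal sketch, switching extract/load state to S3 for a development run would change only this one key. The workspace and tenant values are copied from the example above; any bucket or credential settings that s3 mode may additionally require are not covered here.

```yaml
skippr:
  workspace: mssql_migration
  tenant: _
  # Internal development/testing setting; documented values are local or s3.
  skipprd_el_storage_mode: s3
```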
Pipelines
Each pipeline references registry entries by section-qualified name:
pipelines:
  ingest_orders:
    data_source: data_sources.postgres
    data_sink: data_sinks.iceberg

Use skippr discover --pipeline ingest_orders to persist metadata, then skippr sync --pipeline ingest_orders --once to load data. If metadata is missing, sync runs discovery automatically before loading. Run skippr model after sync when you are ready to generate and validate dbt assets.
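The example at the top of this page defines a postgres_cdc source that no pipeline references. A pipeline wired to it could look like the sketch below. The pipeline name postgres_to_snowflake is hypothetical, and the sketch assumes that more than one pipeline may point at the same data_sinks entry.

```yaml
pipelines:
  # Hypothetical pipeline name; both references use the section-qualified form.
  postgres_to_snowflake:
    data_source: data_sources.postgres_cdc   # entry under data_sources
    data_sink: data_sinks.snowflake          # entry under data_sinks
```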
Plugin Entries
Plugin sections use the plugin name as the single key under each named entry:
data_sources:
  postgres:
    Postgres:
      connection_string: ${POSTGRES_CONNECTION_STRING}
      tables: ["public.orders"]

Destination entries follow the same shape:
data_sinks:
  iceberg:
    Iceberg:
      table_namespace: analytics
      table_location_prefix: s3://my-bucket/warehouse
      catalog:
        type: glue
        warehouse: s3://my-bucket/warehouse
        database: analytics
        region: us-east-1

CDC-capable source plugins use cdc_mode to choose how source reads begin:
| Value | Behavior |
|---|---|
| snapshot | Bounded snapshot only. |
| snapshot_then_cdc | Full initial snapshot, then native CDC stream. |
| cdc_only | Native CDC stream only, with no initial snapshot. |
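For illustration, the same Postgres table could be registered under two entries that differ only in cdc_mode. The entry names below are hypothetical, and the sketch assumes the Postgres plugin accepts every value in the table; only snapshot_then_cdc is shown with Postgres elsewhere on this page.

```yaml
data_sources:
  # Hypothetical entry names; connection settings mirror the Postgres example.
  orders_backfill:
    Postgres:
      connection_string: ${POSTGRES_CONNECTION_STRING}
      tables: ["public.orders"]
      cdc_mode: snapshot   # bounded snapshot only
  orders_stream:
    Postgres:
      connection_string: ${POSTGRES_CONNECTION_STRING}
      tables: ["public.orders"]
      cdc_mode: cdc_only   # native CDC stream, no initial snapshot
```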
Modeling Settings
skippr model reads modeling provider settings from react.providers. Extract and load providers are not part of the modeling workflow; use discover and sync for those steps. By default, model resumes the latest modeling thread for the project; use skippr model --no-resume to start a fresh thread.
Environment Variables
Use ${ENV_VAR} syntax for secrets and deployment-specific values:
data_sources:
  mssql:
    Mssql:
      connection_string: ${MSSQL_CONNECTION_STRING}

Keep secure values in the environment or your secret manager, not in skippr.yaml.
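Deployment-specific values can follow the same pattern. The sketch below assumes that ${ENV_VAR} substitution applies to any string field, not only connection strings; the variable names are hypothetical and would be set per environment.

```yaml
data_sinks:
  snowflake:
    Snowflake:
      # Hypothetical variable names, one set per environment (dev, staging, prod).
      database: ${SNOWFLAKE_DATABASE}
      schema: ${SNOWFLAKE_SCHEMA}
      warehouse: ${SNOWFLAKE_WAREHOUSE}
      role: ${SNOWFLAKE_ROLE}
```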
