CDC Sources
Skippr reads native change logs from five source systems. Each source captures inserts, updates, and deletes and emits them as CDC events with mutation kind and order token metadata.
PostgreSQL
PostgreSQL CDC uses WAL logical replication with the pgoutput output plugin. Skippr creates a replication slot and publication, then streams row-level changes in real time.
Prerequisites
- Set wal_level = logical in postgresql.conf (requires restart)
- The replication user must have the REPLICATION attribute or be a superuser
- max_replication_slots must be at least 1 (default is usually 10)
Configuration
    source:
      kind: postgres
      host: localhost
      port: 5432
      user: replicator
      password: ${POSTGRES_PASSWORD}
      database: mydb
      cdc_enabled: true

| Field | Default | Description |
|---|---|---|
| cdc_enabled | false | Enable CDC via logical replication |
| replication_slot_name | skippr_slot | Name of the replication slot |
| publication_name | skippr_pub | Name of the publication |
What gets captured
- INSERT rows with mutation kind insert
- UPDATE rows with mutation kind update (full row after image)
- DELETE rows with mutation kind delete
Resume behavior
Skippr stores the committed LSN (Log Sequence Number) after each WAL segment is flushed to the destination. On restart, replication resumes from the stored LSN -- no data is re-read and no events are duplicated.
The replication slot is reused across restarts (not recreated), so PostgreSQL retains WAL segments only until Skippr has confirmed them.
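To make the stored LSN concrete: PostgreSQL prints an LSN as two hexadecimal numbers separated by a slash (for example 16/B374D848), the high and low 32 bits of a 64-bit WAL position. The sketch below shows one way stored and incoming LSNs could be compared numerically. The function names are hypothetical illustrations, not part of Skippr.

```python
def parse_lsn(text: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def format_lsn(value: int) -> str:
    """Convert a 64-bit integer back to PostgreSQL's 'X/X' text form."""
    return f"{value >> 32:X}/{value & 0xFFFFFFFF:X}"


def should_apply(event_lsn: str, committed_lsn: str) -> bool:
    """Hypothetical resume check: skip anything at or before the committed LSN."""
    return parse_lsn(event_lsn) > parse_lsn(committed_lsn)
```

Comparing the parsed integers rather than the raw strings matters: as text, "10/0" sorts before "F/FFFFFFFF", although it is the later position.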
MySQL
MySQL CDC uses binlog replication. Skippr connects as a replication client, reads row-level events from the binary log, and emits them as CDC mutations.
Prerequisites
- Set binlog_format = ROW in my.cnf
- Set binlog_row_image = FULL (ensures complete before/after images)
- The replication user must have REPLICATION SLAVE and REPLICATION CLIENT privileges
Configuration
    source:
      kind: mysql
      connection_string: mysql://replicator:${MYSQL_PASSWORD}@host:3306/mydb
      cdc_enabled: true

| Field | Default | Description |
|---|---|---|
| cdc_enabled | false | Enable CDC via binlog replication |
| server_id | auto-generated | MySQL server ID for the replication client |
What gets captured
- WRITE_ROWS events (inserts)
- UPDATE_ROWS events (updates with full row image)
- DELETE_ROWS events (deletes)
Resume behavior
Skippr stores the binlog filename and position after each committed segment. On restart, the binlog stream resumes from the stored position. The initial snapshot is skipped when a stored position exists.
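A binlog position is a pair: the file name, which ends in a numeric sequence such as mysql-bin.000003, and a byte offset within that file. The sketch below shows one way such pairs could be ordered when deciding whether an event lies past the stored position. The BinlogPosition type and is_after helper are hypothetical names used for illustration only, and GTID-based positioning is not covered.

```python
from typing import NamedTuple, Tuple


class BinlogPosition(NamedTuple):
    filename: str  # e.g. "mysql-bin.000003"
    offset: int    # byte offset within that file


def sort_key(pos: BinlogPosition) -> Tuple[int, int]:
    """Order by the numeric file suffix first, then by offset within the file."""
    index = int(pos.filename.rsplit(".", 1)[1])
    return (index, pos.offset)


def is_after(candidate: BinlogPosition, stored: BinlogPosition) -> bool:
    """True when candidate lies strictly past the stored position."""
    return sort_key(candidate) > sort_key(stored)
```

The suffix is parsed as an integer because a plain string comparison gives the wrong order once the sequence grows past six digits.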
MongoDB
MongoDB CDC uses change streams, which are backed by the oplog. Skippr opens a change stream on the target database and receives real-time notifications for document mutations.
Prerequisites
- MongoDB must be running as a replica set or sharded cluster (change streams require an oplog)
- The connection user must have read access on the target database
Configuration
    source:
      kind: mongodb
      connection_string: mongodb://user:${MONGO_PASSWORD}@host:27017/mydb
      cdc_enabled: true

| Field | Default | Description |
|---|---|---|
| cdc_enabled | false | Enable CDC via change streams |
What gets captured
- insert operations
- update operations (full document after image via fullDocument: updateLookup)
- delete operations
Resume behavior
Skippr stores the MongoDB resume token after each committed segment. On restart, the change stream resumes from the stored token using resume_after. No events are re-processed.
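Each change stream event is a document whose _id field is the resume token and whose operationType, documentKey, and fullDocument fields describe the mutation. The sketch below maps such an event to a mutation record. The output shape and the to_mutation name are assumptions made for illustration, and dropping every operation type other than the three listed under "What gets captured" is likewise an assumption rather than confirmed Skippr behavior.

```python
from typing import Any, Dict, Optional

# Operation types listed under "What gets captured".
CAPTURED = {"insert", "update", "delete"}


def to_mutation(event: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Map a change stream event to a hypothetical mutation record."""
    op = event.get("operationType")
    if op not in CAPTURED:
        return None  # assumption: other event types are ignored
    return {
        "kind": op,
        "key": event["documentKey"],
        # Absent on deletes; present on updates when updateLookup is used.
        "document": event.get("fullDocument"),
        # The event's _id is the token to persist for resume_after.
        "resume_token": event["_id"],
    }
```

Persisting resume_token only after the corresponding segment is committed is what lets a restarted stream continue from the right place.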
DynamoDB
DynamoDB CDC uses DynamoDB Streams to capture item-level changes. Skippr obtains a shard iterator for each stream shard and reads records through it; with the NEW_AND_OLD_IMAGES view type, each record carries the full before and after item state.
Prerequisites
- Enable DynamoDB Streams on the table with StreamViewType = NEW_AND_OLD_IMAGES
- The IAM role must have dynamodb:DescribeStream, dynamodb:GetShardIterator, and dynamodb:GetRecords permissions
Configuration
    source:
      kind: dynamodb
      table_name: my_table
      region: us-east-1
      cdc_enabled: true

| Field | Default | Description |
|---|---|---|
| cdc_enabled | false | Enable CDC via DynamoDB Streams |
What gets captured
- INSERT events (new items)
- MODIFY events (updated items, full new image)
- REMOVE events (deleted items)
Resume behavior
Skippr stores the sequence number of the last processed record per shard. On restart, each shard iterator starts from AT_SEQUENCE_NUMBER using the stored value. New shards start from TRIM_HORIZON.
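The per-shard choice described above can be sketched as a small function that builds the parameters for a GetShardIterator call. The parameter names (StreamArn, ShardId, ShardIteratorType, SequenceNumber) are those of the DynamoDB Streams API; the function name and the checkpoints mapping are hypothetical, and this is not Skippr's actual implementation.

```python
from typing import Dict


def shard_iterator_request(
    stream_arn: str, shard_id: str, checkpoints: Dict[str, str]
) -> Dict[str, str]:
    """Build GetShardIterator parameters for one shard.

    checkpoints maps shard IDs to the last stored sequence number
    (a hypothetical structure used only for this sketch).
    """
    request = {"StreamArn": stream_arn, "ShardId": shard_id}
    sequence_number = checkpoints.get(shard_id)
    if sequence_number is None:
        # No checkpoint: a new shard is read from its oldest available record.
        request["ShardIteratorType"] = "TRIM_HORIZON"
    else:
        # Known shard: resume at the stored sequence number.
        request["ShardIteratorType"] = "AT_SEQUENCE_NUMBER"
        request["SequenceNumber"] = sequence_number
    return request
```

With boto3, the result could be passed as keyword arguments to the dynamodbstreams client's get_shard_iterator method. In the Streams API, AT_SEQUENCE_NUMBER starts at the record with that exact sequence number, whereas AFTER_SEQUENCE_NUMBER starts just past it.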
Kafka
Kafka CDC consumes Debezium-formatted messages from Kafka topics. Skippr parses the Debezium envelope to extract mutation kind, key fields, and payload.
Prerequisites
- A Debezium connector must be running and publishing change events to the Kafka topic
- Messages must use the standard Debezium envelope format with op, before, and after fields
Configuration
    source:
      kind: kafka
      brokers: "localhost:9092"
      topic: dbserver1.public.customers
      cdc_enabled: true

| Field | Default | Description |
|---|---|---|
| cdc_enabled | false | Enable CDC via Debezium envelope parsing |
| debezium_format | true (when cdc_enabled) | Parse messages as Debezium envelopes |
| group_id | skippr-{project} | Kafka consumer group ID |
What gets captured
- op: c (create / insert)
- op: u (update)
- op: d (delete)
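The mapping from Debezium op codes to mutation kinds can be sketched as follows, assuming JSON-encoded messages. The parse_envelope name and the returned dictionary shape are hypothetical, and ignoring op values other than c, u, and d (such as snapshot reads) is an assumption that mirrors the list above rather than confirmed Skippr behavior.

```python
import json
from typing import Any, Dict, Optional

OP_TO_KIND = {"c": "insert", "u": "update", "d": "delete"}


def parse_envelope(raw: Optional[bytes]) -> Optional[Dict[str, Any]]:
    """Extract mutation kind and row image from a Debezium JSON message."""
    if not raw:
        return None  # tombstone records carry no value
    message = json.loads(raw)
    # When the JSON converter embeds schemas, the envelope sits under "payload".
    if "schema" in message and "payload" in message:
        envelope = message["payload"]
    else:
        envelope = message
    kind = OP_TO_KIND.get(envelope.get("op"))
    if kind is None:
        return None  # assumption: other op codes are skipped
    # Deletes only have a "before" image; inserts and updates use "after".
    row = envelope["before"] if kind == "delete" else envelope["after"]
    return {"kind": kind, "row": row}
```

For example, a message with op set to u and an after image of {"id": 1, "name": "Ann"} yields a mutation of kind update carrying that row.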
Resume behavior
Skippr uses a stable group_id derived from the project name. Kafka's consumer group offset tracking provides durable resume -- on restart, consumption resumes from the last committed offset.
