S3
Reads files from an Amazon S3 bucket.
Configuration
```yaml
source:
  kind: s3
  s3_bucket: my-bucket
  s3_prefix: data/
```

| Field | Default | Description |
|---|---|---|
| `s3_bucket` | (required) | S3 bucket name |
| `s3_prefix` | | Key prefix for filtering objects |
| `region` | | AWS region |
| `endpoint_url` | | Custom endpoint |
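As a rough sketch of how the optional fields might be combined, the snippet below writes a config that points the source at an S3-compatible endpoint. Only the field names come from the table above; the region value, the endpoint URL (a hypothetical local S3-compatible server), and the scratch directory are assumptions for illustration.

```bash
# Write to a scratch directory so an existing skippr.yaml is not overwritten.
config_dir="$(mktemp -d)"

cat > "$config_dir/skippr.yaml" <<'EOF'
source:
  kind: s3
  s3_bucket: my-bucket
  s3_prefix: data/
  region: eu-west-1                    # assumed example region
  endpoint_url: http://localhost:9000  # assumed local S3-compatible endpoint
EOF

# Show the generated file for review.
cat "$config_dir/skippr.yaml"
```

When reading from Amazon S3 itself, `endpoint_url` would presumably be left out.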
CLI
```bash
skippr connect source s3 \
  --bucket my-data-bucket \
  --prefix raw/
```

| Flag | Description |
|---|---|
| `--bucket` | S3 bucket name |
| `--prefix` | Key prefix to scan |
Config output
Running `skippr connect source s3` with the flags above writes the following to `skippr.yaml`:
```yaml
source:
  kind: s3
  s3_bucket: my-data-bucket
  s3_prefix: raw/
```

Authentication
S3 access uses standard AWS credentials. To supply static keys, set these environment variables:

| Variable | Description |
|---|---|
| `AWS_ACCESS_KEY_ID` | AWS access key |
| `AWS_SECRET_ACCESS_KEY` | AWS secret key |
| `AWS_DEFAULT_REGION` | AWS region (if not inferred from bucket location) |
```bash
export AWS_ACCESS_KEY_ID="AKIA..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_DEFAULT_REGION="us-east-1"
```

IAM policy
The IAM user or role needs at minimum:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-data-bucket",
        "arn:aws:s3:::my-data-bucket/*"
      ]
    }
  ]
}
```

Using an IAM role (EC2 / ECS / SSO)
If running on AWS infrastructure or using SSO, you can omit the access key variables entirely. The SDK picks up credentials automatically from the instance metadata service (EC2), the container credentials endpoint (ECS), or the SSO profile.
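To check which of the options above applies in a given shell, a small helper along these lines may be useful. It is only a sketch: it assumes the SDK follows the usual AWS resolution order (environment keys first, then a named profile, then the default chain), and `credential_source` is a hypothetical function name, not part of Skippr.

```bash
# credential_source: report which credential source the current environment
# most likely selects. Hypothetical helper; assumes the standard AWS order.
credential_source() {
  if [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "static keys from environment"
  elif [ -n "${AWS_ACCESS_KEY_ID:-}" ] || [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    # Only one of the two key variables is set, which cannot authenticate.
    echo "incomplete static keys: set both variables or unset both"
  elif [ -n "${AWS_PROFILE:-}" ]; then
    echo "named profile: $AWS_PROFILE"
  else
    echo "default chain (instance role, container role, or default profile)"
  fi
}

credential_source
```

A half-set pair of key variables is reported separately because it is an easy mistake to miss and typically surfaces only later as an authentication error.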
Supported file formats
Skippr auto-detects the format of files in the bucket:
- JSON / JSONL
- CSV / TSV / delimited
- Parquet
- Avro
Troubleshooting
| Symptom | Fix |
|---|---|
| Access Denied | Check that `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are set, and that the IAM policy allows `s3:GetObject` and `s3:ListBucket` on the bucket |
| NoSuchBucket | Verify the bucket name and region |
| No files found | Check that the `--prefix` value matches the actual key prefix in the bucket |
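For the "No files found" case, it can help to test a prefix against real object keys before rerunning. The sketch below assumes keys arrive one per line on standard input; `count_matching_keys` is a hypothetical helper, and the sample keys are made up for illustration. S3 compares prefixes literally, so a leading slash or a missing trailing slash changes what matches.

```bash
# count_matching_keys PREFIX
# Reads object keys (one per line) from stdin and prints how many start with
# PREFIX. The body runs in a subshell so its variables do not leak.
count_matching_keys() (
  prefix="$1"
  count=0
  while IFS= read -r key; do
    case "$key" in
      "$prefix"*) count=$((count + 1)) ;;  # quoted prefix is matched literally
    esac
  done
  printf '%s\n' "$count"
)

# Made-up sample keys, for illustration only.
sample_keys='raw/2024/a.json
raw/2024/b.csv
logs/app.jsonl'

printf '%s\n' "$sample_keys" | count_matching_keys "raw/"   # prints 2
printf '%s\n' "$sample_keys" | count_matching_keys "/raw/"  # prints 0

# With the AWS CLI installed, real keys could be piped in instead, e.g.:
#   aws s3api list-objects-v2 --bucket my-data-bucket \
#     --query 'Contents[].Key' --output text | tr '\t' '\n' \
#     | count_matching_keys "raw/"
```

A count of zero for the value passed to `--prefix` suggests the prefix, not the credentials, is the problem.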
