Intercom S3 Backups: Dealing with Empty Files
Today I learned the hard way that Intercom’s S3 backup integration creates empty files for every export window with no conversations.
After months of daily backups, my bucket had 17,569 files. Only 405 contained actual data.
The Problem
Intercom exports conversations to S3 as JSONL files, one file per time window (typically hourly). If no conversations happened during that window, you get a 0-byte file. Over time, this adds up:
Total files: 17,569
Non-empty files: 405 (2.3%)
Zero-byte files: 17,164 (97.7%)
That’s a lot of clutter to sift through when you actually need the data.
Finding and Deleting Empty Files
First, identify the scale of the problem:
# List all files with sizes
aws s3 ls s3://your-bucket/ --summarize | tail -3
# Count non-empty files
aws s3 ls s3://your-bucket/ | awk '$3 > 0 {count++} END {print count}'
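If you prefer scripting to one-liners, the same column-3 logic can be sketched in Python. This is an illustrative sketch, not the tool I used: the listing lines and filenames are made up, and it assumes the usual `aws s3 ls` output format of date, time, size, key.

```python
# Classify `aws s3 ls` output lines by size (assumed format:
# "2024-01-05 02:00:11   0 export.jsonl" -- size is column 3).
def split_by_size(ls_lines):
    """Return (empty_keys, non_empty_keys) from `aws s3 ls` output lines."""
    empty, non_empty = [], []
    for line in ls_lines:
        parts = line.split(None, 3)  # date, time, size, key
        if len(parts) < 4:
            continue  # skip summary lines like "Total Objects: ..."
        _date, _time, size, key = parts
        (non_empty if int(size) > 0 else empty).append(key)
    return empty, non_empty

# Made-up listing lines for illustration:
sample = [
    "2024-01-05 02:00:11          0 export-2024-01-05T01.jsonl",
    "2024-01-05 03:00:12       4821 export-2024-01-05T02.jsonl",
]
empty, non_empty = split_by_size(sample)
print(len(empty), len(non_empty))  # → 1 1
```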
To delete all zero-byte files, list them first, then use batch deletion:
# List empty files
aws s3 ls s3://your-bucket/ | awk '$3 == 0 {print $4}'
# Batch delete using s3api (much faster than individual s3 rm calls)
aws s3api delete-objects --bucket your-bucket \
--delete '{"Objects": [{"Key": "file1.jsonl"}, {"Key": "file2.jsonl"}]}'
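Hand-writing that JSON payload gets old quickly. One way to generate it, sketched in Python with placeholder filenames (the Quiet flag is part of the delete-objects request shape and suppresses per-key results in the response):

```python
import json

def delete_payload(keys, quiet=True):
    """Build the JSON for `aws s3api delete-objects --delete`."""
    return json.dumps({
        "Objects": [{"Key": k} for k in keys],
        "Quiet": quiet,  # don't echo every deleted key back in the response
    })

# Placeholder keys for illustration:
print(delete_payload(["file1.jsonl", "file2.jsonl"]))
```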
Why not aws s3 rm?
Each s3 rm call makes a separate API request. For 17,164 files, that’s 17,164 API calls. Using s3api delete-objects with batches of 100 keys, the same cleanup takes only 172 API calls.
That’s 100x fewer requests, which translates to faster execution and lower costs.
The s3api delete-objects command accepts up to 1,000 keys per call. For this cleanup, Claude Code and I wrote a Python script that batches deletions in groups of 100.
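The batching approach looks roughly like this. This is a sketch under assumptions, not the exact script: it assumes boto3 is installed and credentials are configured, and the bucket name and keys are placeholders.

```python
BATCH_SIZE = 100  # delete_objects accepts up to 1,000 keys per call

def chunked(keys, size=BATCH_SIZE):
    """Yield successive batches of at most `size` keys."""
    for i in range(0, len(keys), size):
        yield keys[i:i + size]

def delete_in_batches(bucket, keys):
    """One delete_objects call per batch instead of one s3 rm per key."""
    import boto3  # deferred so the chunking logic works without AWS deps
    s3 = boto3.client("s3")
    for batch in chunked(keys):
        s3.delete_objects(
            Bucket=bucket,
            Delete={"Objects": [{"Key": k} for k in batch], "Quiet": True},
        )

# Usage (placeholder bucket/keys): delete_in_batches("your-bucket", empty_keys)
```

At 100 keys per batch, 17,164 files collapse into 172 API calls, which is where the numbers above come from.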
The Setting I Wish I Knew About
After all that cleanup, I discovered Intercom has an option to skip empty exports entirely. In the Intercom data export settings, there’s a checkbox to exclude empty time windows from S3 backups.
Where to find it: Intercom Settings > Data & Privacy > Data export > Configure S3 export > “Skip empty export files”
This would have prevented all 17,164 empty files from being created in the first place.
Quick Reference
| Task | Command |
|---|---|
| Count all files | `aws s3 ls s3://bucket/ --summarize` |
| Count non-empty | `aws s3 ls s3://bucket/ \| awk '$3 > 0 {c++} END {print c}'` |
| List empty files | `aws s3 ls s3://bucket/ \| awk '$3 == 0 {print $4}'` |
| Batch delete | `aws s3api delete-objects --bucket X --delete '{"Objects":[...]}'` |
| Get total size | `aws s3 ls s3://bucket/ --summarize \| grep "Total Size"` |
What I Learned
- Intercom creates empty JSONL files for every export window with no conversations
- Check vendor settings before building cleanup scripts
- AWS S3 `ls` output format: date time size filename (size is column 3)
- `s3api delete-objects` handles up to 1,000 keys per call, far faster than individual `s3 rm` calls
- Always check "Total Objects" after cleanup to verify the operation worked
Dealing with S3 data exports? I’d love to hear about your cleanup strategies. Reach out on LinkedIn.