Deep Dive | April 2026 | 6 min read

Why Your Cronjob Backups Are Silently Failing

The False Sense of Security

You set up a cronjob. It runs pg_dump every night. The file appears in /backups. Everything looks fine until the day you actually need to restore.

This is the most common disaster recovery failure pattern: teams assume their backups work because the backup process ran without errors.

Five Ways Cronjob Backups Fail Silently

1. Disk Full and Dump Truncated

Your backup drive fills up. pg_dump writes a partial file and exits with an error that the cron script never checks. The file exists, has a recent timestamp, and looks normal. But it contains only part of your data, say 40%.
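One cheap guard against a truncated dump is to check that the file actually reaches its end. The following is a minimal sketch in Python, assuming an uncompressed plain-format dump, which pg_dump finishes with a "PostgreSQL database dump complete" comment. The function name and the 4 KB tail size are illustrative choices, not part of any tool; compressed or custom-format dumps would need a different check, such as listing the archive with pg_restore --list.

```python
import os

# Trailer comment that pg_dump writes at the end of a plain-format SQL dump.
COMPLETE_MARKER = b"PostgreSQL database dump complete"


def dump_looks_complete(path, tail_bytes=4096):
    """Return True if the end of the file contains pg_dump's completion comment.

    A truncated dump (for example after the disk filled up) will not have
    the marker near its end, even though the file exists and has a recent
    timestamp.
    """
    size = os.path.getsize(path)
    if size == 0:
        return False
    with open(path, "rb") as f:
        # Only the tail matters; avoid reading a multi-gigabyte dump.
        f.seek(max(0, size - tail_bytes))
        return COMPLETE_MARKER in f.read()
```

This only proves the dump was written to the end. It says nothing about whether the contents restore cleanly.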

2. Permission Changes

Someone rotates the database password or changes pg_hba.conf. The cronjob connects with the old credentials and fails. Cron sends an email to root, which nobody reads because the mailbox is full.
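A failed connection like this is easier to catch when the script checks the exit status itself instead of relying on cron mail. Below is a hedged sketch in Python of one way to do that: the dump goes to a temporary file and is renamed into place only on success. The function name run_backup and the ".partial" suffix are assumptions made for illustration.

```python
import os
import subprocess


def run_backup(cmd, dest):
    """Run a dump command, keeping its stdout at dest only if it succeeds.

    Output goes to a temporary file first, so a failed run never leaves a
    plausible-looking file at dest. Raises RuntimeError on a non-zero
    exit status.
    """
    tmp = dest + ".partial"
    with open(tmp, "wb") as out:
        result = subprocess.run(cmd, stdout=out, stderr=subprocess.PIPE)
    if result.returncode != 0:
        os.remove(tmp)
        raise RuntimeError(
            f"backup failed with exit code {result.returncode}: "
            f"{result.stderr.decode(errors='replace').strip()}"
        )
    os.replace(tmp, dest)  # atomic rename when tmp and dest share a filesystem
    return dest
```

A call might look like run_backup(["pg_dump", "--dbname", "mydb"], "/backups/mydb.sql"), where the database name and path are placeholders. The exception still has to reach a person, for example through a pager or chat alert, or it is no better than the unread mailbox.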

3. Schema Changes Break Restore

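Your app team ships a schema change, for example a new column with a NOT NULL constraint or a CHECK constraint that calls a function. The backup succeeds. But when you try to restore on a fresh instance, the restore can fail because the dump does not re-create objects in an order that satisfies their dependencies.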

4. Corruption Goes Undetected

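A disk controller silently corrupts data blocks. Your database keeps running because PostgreSQL only notices a damaged page when it reads it, and without data checksums enabled it may not notice even then. Your backup captures the corrupted data faithfully. With 30 days of retention, you now have 30 days of corrupted backups.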

5. The Backup Never Actually Ran

The server was rebooted and cron was not re-enabled. Or someone edited the crontab and accidentally deleted the backup line. Nobody checks because the monitoring only alerts on backup failure, and "never ran" is not the same as "failed."
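Catching "never ran" requires a check that alerts on the absence of a fresh backup, not on an error. Here is a minimal Python sketch, assuming backups land as files in a single directory. The function names and the 26-hour default threshold (a nightly schedule plus some slack) are illustrative assumptions.

```python
import os
import time


def newest_backup_age(backup_dir, now=None):
    """Return the age in seconds of the newest file in backup_dir, or None if empty."""
    now = time.time() if now is None else now
    mtimes = [
        entry.stat().st_mtime
        for entry in os.scandir(backup_dir)
        if entry.is_file()
    ]
    if not mtimes:
        return None
    return now - max(mtimes)


def backup_is_stale(backup_dir, max_age_seconds=26 * 3600, now=None):
    """True if no backup exists or the newest one is older than max_age_seconds."""
    age = newest_backup_age(backup_dir, now)
    return age is None or age > max_age_seconds
```

For this to help, the check should run from something other than the crontab that schedules the backup. Otherwise the same reboot or bad edit silences both.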

The Solution: Verified Backups

The only way to know a backup works is to restore it. Every single time.

BackupAgent does this automatically:

Every backup is restored in an isolated Docker container
Row counts are compared against the source database
Schema integrity is verified
Custom queries can validate business-critical data
If anything fails, you get an immediate alert
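To make the row-count step concrete, here is a small Python sketch of the comparison itself. It is not BackupAgent's implementation, which the article does not show. It assumes the per-table counts have already been collected from the source and from the restored copy, and the function name is hypothetical.

```python
def compare_row_counts(source_counts, restored_counts):
    """Compare per-table row counts and return a dict of discrepancies.

    Both arguments map table name -> row count. The result maps each
    problem table to (source_count, restored_count), with None standing in
    for a table that is missing on one side.
    """
    problems = {}
    for table in sorted(set(source_counts) | set(restored_counts)):
        src = source_counts.get(table)
        dst = restored_counts.get(table)
        if src != dst:
            problems[table] = (src, dst)
    return problems


# Example: "orders" lost rows during the backup or restore.
mismatches = compare_row_counts(
    {"users": 100, "orders": 250},
    {"users": 100, "orders": 90},
)
print(mismatches)  # {'orders': (250, 90)}
```

On a live database the source counts keep changing after the dump is taken, so a real check would need counts captured at dump time or some tolerance for drift.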

How to Audit Your Current Backups

Run this checklist today:

When was the last backup? Check the actual file, not the cron schedule.
Can you restore it? Actually try it on a test instance.
How long does restore take? This is your RTO.
How much data would you lose? The time since the last backup is your RPO.
Who gets alerted if a backup fails? Is anyone actually watching?
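The restore-time question is easy to answer with a measurement instead of an estimate. Below is a minimal Python sketch that times an arbitrary restore command. The function name is hypothetical, and the restore command itself is whatever you would normally run against a test instance.

```python
import subprocess
import time


def measure_restore_seconds(restore_cmd):
    """Run restore_cmd to completion and return elapsed wall-clock seconds.

    Raises subprocess.CalledProcessError if the restore exits non-zero,
    since a failed restore has no meaningful duration.
    """
    start = time.monotonic()
    subprocess.run(restore_cmd, check=True)
    return time.monotonic() - start
```

For a plain-format dump, the command might be something like ["psql", "--set", "ON_ERROR_STOP=1", "--dbname", "restore_test", "--file", "/backups/latest.sql"], where the database name and path are placeholders. Setting ON_ERROR_STOP matters here because without it psql carries on past SQL errors and can still exit with status 0.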

If you cannot confidently answer all five, your backups are at risk.

Ready to try BackupAgent?

AI-verified database backups in under 5 minutes. Free forever.