The story of the failed backup ;)

The story starts with a Check Point R55 firewall failing to perform the good old backup command (the web interface is just a... well, an interface to this command).

So our hero decided to run the command manually with debugging information (the -d modifier):

[Expert@fw]# backup -d --file /var/tmp/backup_test.tgz

Initializing CRemoteBackupUtils instance.

This is Local operation (0)
[…]
Are you sure you want to proceed? y
CBackupUtils::ReadSchedFile() starting
[… edited, web interface parameters]
Initializing CRemoteBackupUtils instance.
This is Local operation (0)
[…]
BACKUP operation started.
Establishing lock…
Lock established
Creating backup package…

Executing: '. /opt/CPshared/5.0/tmp/.CPprofile.sh ; cpbackup_util backup --file backup_fw_17_10_2010_10_33 --type all > /tmp/output.txt 2> /tmp/error.txt'

/BACKUP operation failed.

Details: Cannot complete the backup process due to package compression errors.   <--- same error we got in the web interface log
[…]
Backup operation failed.

Looks like backup is failing when performing the command cpbackup_util […] but what does “package compression error” mean?

Our hero misunderstood this and spent a lot of time checking the RPM package which contains cpbackup_util, checking integrity, etc.

Without success…

But the intrepid security engineer of this story didn’t give up and decided to check a recent backup log…

[Expert@fw]# tail -40 backups/backup_fw_10_9_2010_00_20_100910002000.backup.log

[10.09.10-00:20:00]: <<<<<< Start Backup >>>>>>
[10.09.10-00:20:00]: LOG FILE NAME = //var/CPbackup/log/backups/backup_fw_10_9_2010_00_20_100910002000.backup.log
[10.09.10-00:20:00]: pkg_file   backup_fw_10_9_2010_00_20
[10.09.10-00:20:00]: group      all
[10.09.10-00:20:01]: sh: /bin/tar: Argument list too long
[…]
[10.09.10-00:20:01]: Backup operation failed.

Hmmm….. He stared at the screen and mumbled…

“sh: /bin/tar: Argument list too long

YOU GOTTA BE F***ING KIDDING ME :)”

So this is what they mean by “package compression errors”.

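Strictly speaking the limit is not in “tar” itself: the kernel caps the total size of the arguments (plus the environment) that can be handed to any new process, and when that cap is exceeded the command is never started and the shell reports “Argument list too long”. The “cpbackup_util” command was executed with the “--type all” modifier, so it looked pretty obvious that somewhere in the middle of the backup operation something like “tar <pkg_file.tar> file1 file2 file3 … fileN” was being executed with a huge list of files…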
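For the curious, the limit behind “Argument list too long” can be inspected from the shell. What follows is a minimal sketch and not something taken from the Check Point tooling: `arg_max` and `args_bytes` are helper names made up for this post, and the byte count is only a rough estimate (it ignores the environment and per-argument overhead, and counts characters rather than bytes for non-ASCII names). On older Linux kernels (before 2.6.23) the cap was a fixed 128 KiB, which a couple of thousand full paths can exhaust; that this box ran such a kernel is an assumption.

```shell
#!/bin/sh

# Print the system limit on the combined size of arguments and environment
# that may be passed to a newly executed program.
arg_max() {
    getconf ARG_MAX
}

# Rough estimate of the space a list of names needs as arguments:
# each name costs its length plus one terminating NUL byte.
# This is a shell function, so calling it with a huge list does not
# itself trigger "Argument list too long".
args_bytes() {
    total=0
    for name in "$@"; do
        total=$((total + ${#name} + 1))
    done
    echo "$total"
}
```

Running something like `args_bytes /var/opt/CPfw1-R55/log/*` and comparing the result with `arg_max` gives a feeling for how close a single include pattern gets to the limit.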

So the question that remained was… what the f**k is included when “--type all” is specified?

To answer this question we need to check the SCHEMAS.

[Expert@fw]# cpbackup_util list_groups
GROUPS=snapshot all system cp_products

There exist several “groups” which are simply convenient ways to define what files should be backed up.

[Expert@fw]# pwd
/var/CPbackup/schemes
[Expert@fw]# ls -l

total 48
-r--r--r--    1 root     root          409 Jan 26  2005 cp_products.grpbak
-r--r--r--    1 root     root         1563 Jan 26  2005 dtps.cpbak
-r--r--r--    1 root     root         1671 Jan 26  2005 fg1.cpbak
-r--r--r--    1 root     root         1771 Jan 26  2005 fw1.cpbak
-r--r-----    1 root     root         1771 Aug 30  2007 fw1.cpbakold
-r--r--r--    1 root     root         1606 Jan 26  2005 ppak.cpbak
-r--r--r--    1 root     root         1628 Jan 26  2005 rt.cpbak
-r--r--r--    1 root     root         1541 Jan 26  2005 rtm.cpbak
-r--r--r--    1 root     root         1067 Jan 26  2005 snapshot.cpbak
-r--r--r--    1 root     root         1661 Jan 26  2005 svn.cpbak
-r--r--r--    1 root     root         2111 Jan 26  2005 system_configuration.cpbak
-r--r--r--    1 root     root         1610 Jan 26  2005 uag.cpbak

They all have the same structure, so let’s check one of them (fw1.cpbak)

[Expert@fw]# cat fw1.cpbak

<ID>
CPproducts_fw1
</ID>
<GROUPS>
cp_products
all           <--- muhahaha
</GROUPS>
<INCLUDE_FILES>
/var/opt/CPfw1-R55/conf/*
/var/opt/CPfw1-R55/database/*
/var/opt/CPfw1-R55/state/*
/var/opt/CPfw1-R55/log/*   <--- boom!
/opt/CPfw1-R55/lib/*.pf
/etc/fw.boot/
</INCLUDE_FILES>
<EXCLUDE_FILES>
/etc/fw.boot/modules/*.o
/var/opt/CPfw1-R55/log/fw.*
</EXCLUDE_FILES>

Our brave security engineer realized then that every <INCLUDE_FILES> would be … included in the backup if the group label was specified in <GROUPS>.

For example, all those firewall related files will be included only in backups with the group “cp_products” or “all”. It was easy to suspect that the group “all” was included in almost every schema…

[Expert@fw]# grep ^all$ -n *
dtps.cpbak:7:all
fg1.cpbak:7:all
fw1.cpbak:7:all
fw1.cpbakold:7:all
ppak.cpbak:7:all
rt.cpbak:7:all
rtm.cpbak:7:all
svn.cpbak:7:all
system_configuration.cpbak:7:all
uag.cpbak:7:all

And so it was!

What does that mean? Our backup (“all” group) will include ALL the files specified in ALL those schemas (a kind of merge of all of them).
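The merge described above can be approximated from the shell. This is only a sketch based on the file layout shown in fw1.cpbak; how cpbackup_util really parses these files is not known to the author, and `includes_for_group` and `count_matches` are helper names invented here.

```shell
#!/bin/sh

# Print the INCLUDE_FILES patterns of every schema file that lists GROUP
# in its <GROUPS> section.
# Usage: includes_for_group GROUP FILE...
includes_for_group() {
    group=$1
    shift
    for schema in "$@"; do
        awk -v group="$group" '
            /^<GROUPS>/        { section = "groups";  next }
            /^<INCLUDE_FILES>/ { section = "include"; next }
            /^</               { section = "";        next }  # any other tag
            section == "groups"  && $1 == group { member = 1 }
            section == "include" && NF          { patterns[++n] = $1 }
            END { if (member) for (i = 1; i <= n; i++) print patterns[i] }
        ' "$schema"
    done
}

# Count how many paths a glob pattern expands to. Everything happens inside
# the shell, so the count itself cannot hit "Argument list too long".
count_matches() {
    set -- $1                       # unquoted on purpose: expand the glob
    [ -e "$1" ] || { echo 0; return; }
    echo "$#"
}
```

A loop such as `includes_for_group all /var/CPbackup/schemes/*.cpbak | while read -r p; do echo "$(count_matches "$p") $p"; done` then shows which include pattern contributes the most files.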

That is A LOT of files… especially the logs… they back up the logs!

[Expert@fw]# ls /var/opt/CPfw1-R55/log/ -l | wc -l
2567

The handsome security engineer realized that this was surely the problem and that he had to cut the number of files down. The size of the files was not an issue, since there was plenty of disk space; the real problem was how many of them there were. Having a lot of space, he decided to “tar” the 2009 logs into “logs_2009.tar” and remove the original files.
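The exact commands used that day were not recorded, so the following is a sketch of one way to do it safely. It assumes a tar that supports `-T` (read the member names from a file, as GNU tar does), a find with `-maxdepth`, and file names without whitespace; `archive_logs` and the `2009-*` pattern in the usage note are illustrative, not taken from the real system. Passing the names through a list file also keeps this step clear of the same “Argument list too long” trap.

```shell
#!/bin/sh

# Archive every regular file in DIR whose name matches PATTERN into ARCHIVE,
# then delete the originals only if tar succeeded.
# Usage: archive_logs DIR PATTERN ARCHIVE
archive_logs() {
    dir=$1 pattern=$2 archive=$3
    list=$(mktemp) || return 1

    # Collect the names (relative to DIR) in a file instead of on a
    # command line.
    ( cd "$dir" && find . -maxdepth 1 -type f -name "$pattern" ) > "$list"

    if [ -s "$list" ] && tar -cf "$archive" -C "$dir" -T "$list"; then
        # xargs splits the list into as many rm calls as needed.
        ( cd "$dir" && xargs rm -f -- < "$list" )
        status=0
    else
        status=1                    # nothing matched, or tar failed
    fi

    rm -f "$list"
    return "$status"
}
```

For example, `archive_logs /var/opt/CPfw1-R55/log '2009-*' /var/opt/CPfw1-R55/log/logs_2009.tar` would leave one archive inside the log directory, where the `log/*` include pattern still picks it up.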

After that operation:

[Expert@fw]# ls -lc | wc -l
1151

Wow, that’s a difference, mama!

He tried the backup again and… it worked! The problem was identified and solved, and the 2009 log files are still being backed up.

Our hero has surely earned something nice for lunch!

Ah! And to the Check Point engineer who programmed this:

I pity the fool!

I PITY THE FOOL!!!
