RMAN and CommVault Version 11

Posted: April 7, 2016 in Backup and Recovery, CommVault, ORA Errors

Good Afternoon, Oracle fanatics!

Today’s post deals with one of the nuances of CommVault and how it manages our beloved Recovery Manager, better known as RMAN.

For those of you who don’t know, CommVault provides a graphical user interface (GUI) that works with Recovery Manager for data protection, backup and recovery. It does much more, but I am only listing the piece I currently use. Our team manages a multitude of Oracle single-instance, two-instance RAC and four-instance RAC databases. CommVault not only shows the status of running jobs, it also interfaces seamlessly with RMAN for configuring, scheduling and running full, differential, cumulative and archive log backups, as well as recovery. It is a very useful tool for enterprise organizations.

Recently, our storage team upgraded CommVault to version 11. No RMAN outage was needed because the upgrade took place during a short maintenance window and no backups were scheduled during the CommVault outage. As you may know, RMAN is resilient enough to withstand a paused job anyway, so the possibility that backups might fail as a result of the upgrade was not fully investigated. The upgrade went smoothly. After the maintenance completed, CommVault came back up, and we were able to connect after tweaking our Java settings.

Each morning, we get a job summary report from CommVault showing all of the jobs that ran the previous day. The report after the upgrade listed more than 200 jobs. Of those, 12 backups failed, 9 were still running, 8 were delayed, 2 were killed, and 2 completed with errors. Uh oh. We had a major problem.

The script that CommVault created for backup used up to 4 channels spread across the various nodes of the instance. Each channel connects to RMAN and then connects to the database using…

allocate channel ch1 type 'sbt_tape' connect "sys/***************@rac_inst_name"
PARMS="SBT_LIBRARY=/opt/simpana/simpana/Base/libobk.so,BLKSIZE=1048576,ENV=(CV_mmsApiVsn=2,CV_channelPar=ch1,ThreadCommandLine= -chg 1:2 -rac 364 -cn xdt#dbadm01 -vm Instance001)"

All of the jobs that failed overnight failed to connect to the instance with…

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
ORA-01017: invalid username/password; logon denied
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of configure command at 04/08/2016 07:23:58
RMAN-06171: not connected to target database

ORA-01017 is the root cause.
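When dozens of jobs fail at once, it can help to pull the meaningful codes out of each error stack automatically. The following is only a minimal Python sketch, not anything CommVault or RMAN provides: the names `parse_error_stack` and `first_ora_error` are my own, and it assumes the log uses the same `CODE: message` layout as the stack shown above.

```python
import re

# Banner lines that frame every RMAN error stack and carry no diagnostic value.
BANNER_CODES = {"RMAN-00571", "RMAN-00569"}

# Matches lines such as "ORA-01017: invalid username/password; logon denied".
CODE_RE = re.compile(r"^(ORA|RMAN)-(\d{5}): (.*)$")

# The error stack from this post, used as sample input.
SAMPLE_LOG = """\
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
ORA-01017: invalid username/password; logon denied
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of configure command at 04/08/2016 07:23:58
RMAN-06171: not connected to target database
"""


def parse_error_stack(log_text):
    """Return (code, message) pairs from an RMAN log, skipping banner lines."""
    errors = []
    for line in log_text.splitlines():
        match = CODE_RE.match(line.strip())
        if not match:
            continue
        code = f"{match.group(1)}-{match.group(2)}"
        if code in BANNER_CODES:
            continue
        errors.append((code, match.group(3)))
    return errors


def first_ora_error(errors):
    """Return the first ORA- entry, which is usually the database-side cause."""
    for code, message in errors:
        if code.startswith("ORA-"):
            return code, message
    return None


if __name__ == "__main__":
    for code, message in parse_error_stack(SAMPLE_LOG):
        print(code, message)
```

Run against the stack above, the sketch drops the banner lines and leaves ORA-01017 followed by the two RMAN messages that are consequences of the failed logon.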

I tried to reproduce the error by starting RMAN and then running the same connect string against the database. I was able to connect. The failed jobs seemed completely random: 13 databases, including single instance and RAC, on both Solaris and other UNIX servers. We could not find anything they had in common, so we turned the problem back over to our storage team. After a call with the vendor, we realized there was actually a pattern. All of the jobs that failed had SYS passwords that included special characters, so the connect string had to include double quotes to protect them. The connect string CommVault generated did not include those double quotes, so the special characters could not be interpreted.
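To make the quoting problem described above concrete, here is a small Python sketch of how a script might decide when a password needs double quotes. This is not how CommVault builds its connect strings, which I have not seen. The helper names are hypothetical, and the rule for what counts as "special" is an assumption: anything other than a leading letter followed by letters, digits, `_`, `$` or `#`. It also assumes the quotes go around the password only, in the usual `user/"password"@alias` form. Any extra escaping needed when that string is itself wrapped in RMAN's double-quoted connect clause is not covered here.

```python
import re

# Assumption: a password that starts with a letter and contains only these
# characters can be passed without double quotes.
UNQUOTED_OK = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*$")


def needs_quoting(password):
    """Return True when the password contains characters outside the safe set."""
    return UNQUOTED_OK.match(password) is None


def build_connect_string(user, password, tns_alias):
    """Build user/password@alias, double-quoting the password when required."""
    if '"' in password:
        # A literal double quote cannot simply be wrapped; leave that case to a human.
        raise ValueError("password contains a double quote; handle it separately")
    if needs_quoting(password):
        password = f'"{password}"'
    return f"{user}/{password}@{tns_alias}"


if __name__ == "__main__":
    # Made-up passwords for illustration only.
    print(build_connect_string("sys", "Welcome1", "rac_inst_name"))
    print(build_connect_string("sys", "P@ss!word", "rac_inst_name"))
```

The same `needs_quoting` check could be run across an inventory of accounts before an upgrade to flag which databases are likely to be affected.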

The fix turned out to be simple, but it took days to get the storage team to look at CommVault as the culprit instead of RMAN or the database.

For the future, it makes sense to come up with a better test plan than simply checking whether CommVault comes back up and appears to be connected to the databases.
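One way to start such a test plan is to pick a small set of databases that covers every combination that matters, then run a test backup against each one right after the upgrade. The Python sketch below is only an illustration of that idea. The inventory, the database names and the `special_pw` flag are all hypothetical, and the three attributes are simply the ones that came up in this incident.

```python
# Hypothetical inventory; in practice this would come from your own records.
INVENTORY = [
    {"name": "db01", "topology": "single", "platform": "solaris", "special_pw": False},
    {"name": "db02", "topology": "single", "platform": "solaris", "special_pw": False},
    {"name": "db03", "topology": "rac2", "platform": "unix", "special_pw": True},
    {"name": "db04", "topology": "rac4", "platform": "unix", "special_pw": False},
    {"name": "db05", "topology": "single", "platform": "solaris", "special_pw": True},
]


def pick_smoke_test_targets(inventory):
    """Pick the first database seen for each (topology, platform, special_pw) combination."""
    chosen = {}
    for db in inventory:
        key = (db["topology"], db["platform"], db["special_pw"])
        # setdefault keeps the first database recorded for a combination.
        chosen.setdefault(key, db["name"])
    return sorted(chosen.values())


if __name__ == "__main__":
    for name in pick_smoke_test_targets(INVENTORY):
        print("run a post-upgrade test backup on", name)
```

With the sample inventory, db02 is skipped because db01 already covers the same combination, so four test backups cover everything, including both databases whose passwords contain special characters.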

Thanks for reading!




