Row row row your boat, gently down the Oracle Streams…

Posted: December 26, 2014 in Oracle Streams

Welcome back, data enthusiasts!! I hope you had a pleasant, relaxing holiday.

Before Oracle and I completely forget about Oracle Streams, I should probably document my short experience with this replication software. On a previous project, I was responsible for administering Oracle Streams replication to a few remote sites. I used several built-in PL/SQL packages to monitor Streams activity, along with some basic troubleshooting procedures to bounce Streams, restore the heartbeat to real time, or re-execute apply errors. Creating the Streams processes is outside the scope of this article; see http://docs.oracle.com/cd/B28359_01/server.111/b28322/man_gen_rep.htm#STREP013 for help.

If you haven’t picked up on it yet, one of the main purposes for this site is to document my experiences. You, the reader, get the added benefit of my experiences without having to feel my pain.

At a basic level, Oracle Streams is the flow of information from one database to another. Three basic processes and a few database objects are used to share data and messages: database changes are captured at the source, propagated to local or remote destinations, and applied at the "streamed" site. Hence the names of the processes: Capture, Propagate, and Apply (see http://en.wikipedia.org/wiki/Oracle_Streams). If monitored frequently, it is a great product that outperforms its predecessor, Oracle Advanced Replication. Its successor is Oracle GoldenGate, a product Oracle acquired in 2009; I have not used GoldenGate, but I look forward to the opportunity. Applications that benefit from Oracle Streams include data replication, data warehousing, and data distribution.

DBMS_CAPTURE_ADM

There are two basic ways Oracle Streams can capture database changes implicitly: capture processes and synchronous captures (see http://docs.oracle.com/cd/B28359_01/server.111/b28321/strms_over.htm#strms_what_can_streams_do). A capture process captures changes made to schemas, tables, or an entire database, while synchronous capture captures DML changes to tables. In both cases, rules determine which changes are captured.

Database changes are recorded in the redo log for the database. The capture process retrieves these changes from the redo log and formats each change into a logical change record (LCR). Some of the procedures I have used within DBMS_CAPTURE_ADM are as follows:

DBMS_CAPTURE_ADM.STOP_CAPTURE('CAP_PROCESS'); -- Stops the named capture process.

DBMS_CAPTURE_ADM.START_CAPTURE('CAP_PROCESS'); -- Starts the named capture process you have created.

To display information for each capture process in a database, run the following query:

COLUMN CAPTURE_NAME HEADING 'Capture|Name' FORMAT A15
COLUMN STATE HEADING 'State' FORMAT A27
COLUMN STATE_CHANGED HEADING 'State|Change Time'
COLUMN CREATE_MESSAGE HEADING 'Last Message|Create Time'

SELECT CAPTURE_NAME,
       STATE,
       TO_CHAR(STATE_CHANGED_TIME, 'HH24:MI:SS MM/DD/YY') STATE_CHANGED,
       TO_CHAR(CAPTURE_MESSAGE_CREATE_TIME, 'HH24:MI:SS MM/DD/YY') CREATE_MESSAGE
  FROM V$STREAMS_CAPTURE;

Your output looks similar to the following:

Capture                                     State             Last Message
Name            State                       Change Time       Create Time
--------------- --------------------------- ----------------- -----------------
CAPTURE_SIMP    CAPTURING CHANGES           13:24:42 11/08/04 13:24:41 11/08/04

Staged Messages

Messages, whether LCRs or user messages, are staged in a queue. Capture processes enqueue messages into an ANYDATA queue, which can stage messages of different types.
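For illustration, such a queue is typically created with DBMS_STREAMS_ADM.SET_UP_QUEUE; the owner and object names below are placeholders, not requirements:

BEGIN
  DBMS_STREAMS_ADM.SET_UP_QUEUE(
    queue_table => 'strmadmin.streams_queue_table',  -- queue table that backs the queue
    queue_name  => 'strmadmin.streams_queue');       -- the ANYDATA queue itself
END;
/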

DBMS_PROPAGATION_ADM

Propagations "propagate" messages from one queue to another. I have used the following procedures within DBMS_PROPAGATION_ADM:

DBMS_PROPAGATION_ADM.STOP_PROPAGATION('PROP_PROCESS'); -- Stops the named propagation.

DBMS_PROPAGATION_ADM.START_PROPAGATION('PROP_PROCESS'); -- Starts the named propagation.

**Note** The DBA_PROPAGATION view provides helpful information about the propagations in a database.
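For example, a quick propagation health check might look like the following sketch (columns from the 11g data dictionary):

SELECT PROPAGATION_NAME,
       STATUS,        -- ENABLED, DISABLED or ABORTED
       ERROR_MESSAGE, -- most recent propagation error, if any
       ERROR_DATE
  FROM DBA_PROPAGATION;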

DBMS_APPLY_ADM

A message is consumed when it is dequeued from a queue. An apply process can dequeue messages implicitly; a user, application, or messaging client can dequeue messages explicitly. The database where messages are applied is called the destination database, and rules determine which messages are dequeued and processed by the apply process. Some of the procedures I have used in the DBMS_APPLY_ADM package are as follows:

DBMS_APPLY_ADM.STOP_APPLY('APPLY_PROCESS'); -- Stops the named apply process.

DBMS_APPLY_ADM.START_APPLY('APPLY_PROCESS'); -- Starts the named apply process.

DBMS_APPLY_ADM.EXECUTE_ALL_ERRORS; -- Re-executes the transactions in the error queue.

This last procedure is a troubleshooting step. In the Streams database I managed, a threshold was set that aborted the apply process when the error queue exceeded 1,000 errors. If I saw the error queue approaching 1,000, I would run the EXECUTE_ALL_ERRORS procedure until it had attempted to process all of the errors. I eventually set this up as a cron job that ran continually.
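A minimal sketch of that check, using the DBA_APPLY_ERROR dictionary view (the 1,000-error threshold was specific to my environment):

-- Count the errored transactions waiting in the error queue
SELECT COUNT(*) AS error_count
  FROM DBA_APPLY_ERROR;

-- If the count is creeping toward the threshold, retry everything
BEGIN
  DBMS_APPLY_ADM.EXECUTE_ALL_ERRORS;
END;
/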

Create Heartbeat tables

  1. Create a table in your source database that has a timestamp column. Enable supplemental logging for the table and instantiate the table at the source. Export the table from the source and import it at the destination with the instantiation option.
  2. Add a rule to capture changes for the heartbeat table at the source, and propagate the changes to the destination.
  3. Make sure the destination will apply changes to this table as well by adding a new apply rule.
  4. Set up an automated job to update this table at the source site periodically (a sketch follows this list).
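Here is a minimal sketch of steps 1 and 4; the schema, table name, and job name are assumptions, not requirements:

-- Hypothetical heartbeat table at the source site
CREATE TABLE strmadmin.heartbeat (
  site_name VARCHAR2(30),
  hb_time   TIMESTAMP
);

-- Supplemental logging so the capture process builds complete LCRs
ALTER TABLE strmadmin.heartbeat ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;

-- Seed the single row the job will update
INSERT INTO strmadmin.heartbeat VALUES ('SOURCE_DB', SYSTIMESTAMP);
COMMIT;

-- Touch the heartbeat once a minute
BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'HEARTBEAT_UPDATE',
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'BEGIN UPDATE strmadmin.heartbeat SET hb_time = SYSTIMESTAMP; COMMIT; END;',
    repeat_interval => 'FREQ=MINUTELY',
    enabled         => TRUE);
END;
/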

**Example** See http://oraprofessionals.blogspot.com/2008/08/oracle-streams-abcs-of-heartbeat-table.html

You can now create a "heartbeat" query that compares the local time, the source heartbeat time, and the heartbeat time applied at the destination. The difference is how far behind the apply process is.
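A minimal sketch of such a query, assuming the heartbeat table above and a database link to the destination named DEST_LINK:

SELECT SYSTIMESTAMP              AS local_time,
       src.hb_time               AS source_heartbeat,
       dst.hb_time               AS dest_heartbeat,
       src.hb_time - dst.hb_time AS apply_lag  -- an INTERVAL DAY TO SECOND
  FROM strmadmin.heartbeat src,
       strmadmin.heartbeat@DEST_LINK dst;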

If the Apply Process gets behind

I incorporated a "heartbeat" query, a capture query, and a propagation query into my morning checks. I considered a lag of 100 minutes or more, or a disabled process, a failed check. As long as there is lag between the databases, the archived redo logs that the capture process still needs will not be deleted. Left unchecked, this can completely fill your archive area and bring your source database to a halt.

I followed this basic guide to restoring the Streams processes…

Stop capture… stop propagation… stop apply… start in reverse order.

**Note** It is important that each step is executed in order and completes before executing the next step.

  • DBMS_CAPTURE_ADM.STOP_CAPTURE('CAP_PROCESS');
  • DBMS_PROPAGATION_ADM.STOP_PROPAGATION('PROP_PROCESS');

If the heartbeat difference is less than 100 minutes, you can sometimes leave the apply process alone and skip straight to the start-propagation and start-capture steps.

  • DBMS_APPLY_ADM.STOP_APPLY('APPLY_PROCESS');
  • DBMS_APPLY_ADM.START_APPLY('APPLY_PROCESS');
  • DBMS_PROPAGATION_ADM.START_PROPAGATION('PROP_PROCESS');
  • DBMS_CAPTURE_ADM.START_CAPTURE('CAP_PROCESS');
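Put together in one PL/SQL block (using the same placeholder process names), the full bounce looks like this; each procedure call completes before the next begins:

BEGIN
  DBMS_CAPTURE_ADM.STOP_CAPTURE('CAP_PROCESS');
  DBMS_PROPAGATION_ADM.STOP_PROPAGATION('PROP_PROCESS');
  DBMS_APPLY_ADM.STOP_APPLY('APPLY_PROCESS');
  DBMS_APPLY_ADM.START_APPLY('APPLY_PROCESS');
  DBMS_PROPAGATION_ADM.START_PROPAGATION('PROP_PROCESS');
  DBMS_CAPTURE_ADM.START_CAPTURE('CAP_PROCESS');
END;
/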

If all went well, the heartbeat lag will begin to shorten. It could take quite a while to completely catch up.

Querying the apply error table

SELECT LOCAL_TRANSACTION_ID,
       SOURCE_DATABASE,
       MESSAGE_NUMBER,
       MESSAGE_COUNT,
       ERROR_NUMBER,
       ERROR_MESSAGE
  FROM DBA_APPLY_ERROR;

I mentioned earlier that a threshold was set on the apply error table to abort the apply process if the error count exceeded 1,000. The error table collects much more information than just the count, though. I monitored the capture and propagation queues directly, but I had no direct access to the apply queues except through a database link as STRMADMIN. As a result, I could query the apply error table for the error message but could not fix the problem myself. Often, the message told me exactly why the apply process was getting behind or had shut down. A common case was a datafile filling up; since there was no dedicated DBA for the destination tables, no one was monitoring tablespace thresholds. A quick phone call to the destination site's IT team could usually resolve the issue.

Thanks for reading!!

Jason Brown
