
Good morning, RAC and Data Guard enthusiasts!!

I have already posted how to convert a single-instance database into a RAC database. This post will focus on the subtle differences of doing the same on a Data Guard standby. When I used Data Guard to migrate a RAC database to a new server, I created my standby as a single instance. To make a viable switchover possible, this standby must now be converted to a RAC standby. I must warn you up front: these are the steps I took on my two-node RAC Exadata server. The steps unique to a standby are in bold font. Test this in your own sandbox before you proclaim it as gospel truth.

The setup pieces can all be done either at the beginning or at the end. They are ordered as I performed them, so feel free to test and experiment. Some of these steps may seem obvious, but I listed them anyway.

  • Copy/create your password file, wallet, and adump directory from the source database to both nodes. The steps work the same, regardless of the number of nodes.
  • Create an oratab entry on both nodes (sketch after this list). vi /etc/oratab
  • Add tnsnames entries on both nodes and a static listener entry on the standby node (listener sketch after this list).
    • DB1 =
      (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = xdt**client01-vip)(PORT = ****))
      (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SID = db1)
      )
      )
    • DB2 =
      (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = xdt**client02-vip)(PORT = ****))
      (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SID = db2)
      )
      )
  • Shut down managed apply on the Data Guard instance.
    • Ensure all standby logs are applied and the standby is caught up (an alternative check is sketched after this list).
    • SQL> select * from v$recovery_progress; -- there is a better way to check this, but I don't want to publish someone else's work as my own
    • SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
  • Shut down the single-instance database.
    • SQL> shutdown immediate;
    • SQL> exit
  • Set environment to node 1.
    • $ . oraenv
    • ORACLE_SID = [db] ? db1
  • Edit the original pfile to make it a cluster database (full parameter sketch after this list). Remove the [*.interconnects] lines to prevent an attempt to connect to the source database.
    • *.cluster_database=true
  • Startup nomount with your original pfile.
    • SQL> startup nomount pfile='$ORACLE_HOME/dbs/initdb.ora'
  • Create the spfile from the pfile on ASM, and reference it with a one-line pointer file in the $ORACLE_HOME/dbs directory (pointer file contents shown after this list).
    • SQL> create spfile='+DATA_XDT**/DB/PARAMETERFILE/spfiledb.ora' from pfile='$ORACLE_HOME/dbs/initdb.ora';
  • Shut down the database.
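
Several of the steps above are easier to follow with concrete text, so here are a few hedged sketches. All paths, SIDs, ports, and diskgroup names are placeholders matching my masked environment; adjust them to yours.

The oratab entries (one per node) might look like:

db1:/u01/app/oracle/product/11.2.0.4/dbhome_1:N      # node 1
db2:/u01/app/oracle/product/11.2.0.4/dbhome_1:N      # node 2

The static listener entry on the standby node goes in listener.ora, something like this (appending _DGMGRL to the db_unique_name is the usual convention when the Data Guard broker is in play; adjust if you are not using the broker):

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = db_DGMGRL)
      (ORACLE_HOME = /u01/app/oracle/product/11.2.0.4/dbhome_1)
      (SID_NAME = db1)
    )
  )

As an alternative caught-up check before cancelling managed apply, the standard lag view works well:

SQL> select name, value from v$dataguard_stats where name in ('apply lag', 'transport lag');

The pfile edits usually need the per-instance RAC parameters in addition to cluster_database; a minimal sketch (the undo tablespace names are assumptions):

*.cluster_database=true
db1.instance_number=1
db2.instance_number=2
db1.thread=1
db2.thread=2
db1.undo_tablespace='UNDOTBS1'
db2.undo_tablespace='UNDOTBS2'

And the one-line pointer file (init<SID>.ora in $ORACLE_HOME/dbs on each node) contains nothing but the spfile location:

SPFILE='+DATA_XDT**/DB/PARAMETERFILE/spfiledb.ora'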

Create the srvctl setup (etlpc used as a concrete example):
-------------------------------
# ETLPC
##
srvctl remove database -d ETLPCDG
srvctl add database -d ETLPCDG -c RAC \
-o /u01/app/oracle/product/11.2.0.4/dbhome_1 \
-m int.carlsonwagonlit.com \
-p +DATA_XDT*/ETLPC/PARAMETERFILE/spfileETLPCDG.ora \
-s OPEN -t IMMEDIATE -n ETLPCDG

srvctl add instance -d ETLPCdg -i ETLPCdg1 -n xdt*dbadm01
srvctl add instance -d ETLPCdg -i ETLPCdg2 -n xdt*dbadm02
# srvctl add instance -d etlpc -i etlpc3 -n xdt*dbadm03
# srvctl add instance -d etlpc -i etlpc4 -n xdt*dbadm04

srvctl enable instance -d ETLPCdg -i "ETLPCdg1,ETLPCdg2"
# srvctl disable instance -d etlpc -i "etlpc1,etlpc2,etlpc3,etlpc4"

srvctl config database -d ETLPCdg
srvctl status database -d ETLPCdg
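
Since this is a physical standby, srvctl in 11.2 also lets you record the database role and a mount-only start policy when you add the database, which keeps a restart from trying to open the standby. A hedged variant of the add command (note that -n takes the DB_NAME, which on a standby is normally the primary's name, so ETLPC rather than ETLPCDG may be what you want there):

srvctl add database -d ETLPCDG -n ETLPC -c RAC \
-r PHYSICAL_STANDBY -s MOUNT -t IMMEDIATE \
-o /u01/app/oracle/product/11.2.0.4/dbhome_1 \
-p +DATA_XDT*/ETLPC/PARAMETERFILE/spfileETLPCDG.ora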

  • Start instance 1 in mount mode. This will start the instance mounted using the spfile on ASM.
    • $ srvctl start instance -d etlpcdg -i etlpcdg1 -o mount
  • Start managed apply on node 1.
    • SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;
  • Once the MRP0 process is running and standby logs are being applied (verification sketch below), start node 2 in mount mode:
    • $ srvctl start instance -d etlpcdg -i etlpcdg2 -o mount
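
To verify that MRP0 is running and applying before you bring up node 2 (and again afterwards), a standard v$managed_standby query does the trick; nothing here is specific to this procedure:

SQL> select process, status, thread#, sequence# from v$managed_standby where process like 'MRP%';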

I hope this clears up the differences between converting a normal database to RAC versus converting a standby database to RAC. Please let me know if you have any questions or comments.

Thanks for reading!

Jason

Great post by Adityanath's Oracle Blog! This helped me tremendously as well! I was getting an Archiver Hung error because my archivelog backups had been failing to delete archivelogs for over 60 days. Thanks!

Adityanath's Oracle Blog

We got an issue in one of our production environments: the RMAN archivelog backup was failing with the following errors:


kgefec: fatal error 0 
kgepop: no error frame to pop to for error 603

We were using the following command in the rcv file to back up archivelogs:


BACKUP archivelog all delete input;

After searching on Metalink, I found one note with the same scenario: RMAN Archivelog backup with DELETE INPUT fails kgepop: no error frame to pop to for error 603 (Doc ID 1921228.1).

According to this note, RMAN errors out when more than 5000 archivelogs are backed up in a single command, resulting in an excessively large PL/SQL block being executed.

Yes, in my case RMAN was processing more than 2 TB of archives in one go.


SELECT NAME,
       ROUND(SPACE_LIMIT / 1048576) SPACE_LIMIT_MB,
       ROUND(SPACE_USED / 1048576) SPACE_USED_MB,
       ROUND(((SPACE_USED / 1048576) * 100) / (SPACE_LIMIT / 1048576), 2) PRC_USED
FROM V$RECOVERY_FILE_DEST;

NAME                           SPACE_LIMIT_MB SPACE_USED_MB   PRC_USED
------------------------------ -------------- ------------- ----------
+RECO_FIN                             3145728       2085370      66.29

As…


Happy Tuesday, Data Evangelists and disciples alike! I have made it no secret that I am blown away by the features in Oracle Enterprise Manager 12c. There have been groundbreaking improvements in Performance Management, ASH Analytics, Security, RMAN management, and Tablespace Management. The EMCLI is also a very powerful tool in version 12c. I can't wait to upgrade to 13c, which was just recently released!

And here comes the rant… OEM “Critical” Alerts. Do I want to know about every critical database alert at the very moment that it occurs? Of course I do. You need to rethink your career choice if you don’t.

Merriam-Webster.com defines critical (as it relates to critical alerts) as "being or relating to an illness or condition involving danger of death <critical care> <a patient listed in critical condition>". Synonyms are crucial, decisive, indispensable, vital. A scientific definition is "of sufficient size to sustain a chain reaction <the reactor went critical>".

In other words, I want to know if my database is in critical condition. I want to know if my database is about to go nuclear. That's my definition of critical. At 2:25 AM on a Monday morning, I don't want to know that a connection to the TNS listener has exceeded 500 milliseconds. Yet if you, as the DBA, don't reduce your critical alerts and you set up paging for critical alerts, OEM will wake you up for this and many other "critical" alerts. OEM 12c comes with monitoring templates already enabled for almost every metric you can think of. If your organization owns an Exadata, it will alert you on metrics you don't even know exist. At this point, you have two choices: disable the default monitoring templates or, as my title suggests, reduce the number of metrics that trigger critical alerts.

Our DBA team's policy is that all warnings and critical alerts are sent to the Oracle DBA distribution list 24 hours a day, seven days a week, and all critical alerts are sent to the on-call DBA through a page to their cellphone. The on-call DBA is required to keep that cellphone charged and nearby throughout the on-call week. This was not always the case. When I started with my current company, no alert reduction had been implemented and there was no paging for the on-call DBA. We were getting over 1500 OEM emails per day. Thank God paging was not yet implemented! It was the very first thing I noticed when I got all of my accounts and access. The rest of the team had learned to ignore OEM emails and send them to an Outlook rule folder. I didn't do this right away, so it drove me bonkers. I eventually set up a rule, but I still tried to sift through the mail to find the alerts I might need. Within four weeks, I used a classic article by DBAKevlar to set up monitoring templates for our databases and Exadata servers. As you will find out, the default templates set up in OEM can't be modified. Although my original templates were eventually replaced, I laid the groundwork for an alert system that allowed us all to get more sleep and only be disturbed for truly critical situations.

Some of the alert metrics I adjusted were Agent metadata channel count, Page Processing Time (OMS), Number of failed logins, Checker found ** new persistent data failures, etc. I adjusted 14 settings in all, and I'm sure you will find at least that many in your environment.

  1. Alert Reduction: Once you have set up your own monitoring templates and disabled the default templates, you are ready to reduce alerts! Step one in my case was to go to the OEM spam folder I had set up in Outlook and sort the emails by subject. You will quickly see which subjects are creating the most noise. I was able to eliminate five or six alerts right away, which reduced email traffic by 90%. I can't remember which ones they were, but mentioning them really wouldn't help you; it is just good comic relief, and even less relevant than TNS listener connection time exceeding 500 ms. When you have eliminated the ridiculous alerts, take a deep breath. You are down to fewer than 200 emails per day to sift through. Your coworkers and your boss will thank you. But we're just getting started.
  2. Change Critical Alerts to Warnings: In my case, there were quite a few alerts that I wanted to know about but that could wait until morning. This way, our team still gets an email but no page is sent to the on-call DBA.
    1. Navigate to Enterprise –> Monitoring –> Monitoring Templates. Find the Monitoring Template you want to edit.
    2. Click on the blank box next to the Monitoring Template to highlight it. Click Edit and then Metric Thresholds.
    3. Find the metric and click on the pencils on the right side to edit. For our example, I clicked on Failed Login Count.
    4. Adjust the threshold settings. I kept the warning threshold at 150 and blanked out the critical threshold box; if that box is empty, OEM interprets it as having no critical threshold. I set the number of occurrences to 1, so a warning triggers as soon as the count hits 150 failed logins.
  3. Change Metric Settings and number of Occurrences to meet organizational needs: For this example, I adjusted the metric “Total Global cache block lost is **”.
    1. Follow steps 1-3 above. The Target Type is Database Instance and the metric is global cache blocks lost. Click on the edit pencil.
    2. In my case, I set the warning threshold to 10, blanked out the critical threshold, and set the number of occurrences to 12. The collection schedule is every five minutes, so twelve occurrences means the warning won't trigger until the threshold has been violated for a full hour. What I found was that this warning, when triggered, would clear within thirty minutes. This got rid of a huge number of warning emails and critical pages.

This has been an iterative process and will likely continue to be. Some of the metrics I changed at the cluster level were still being triggered at the database instance level; in that case, go in and change the metric at the instance level as well. Please send me a message if you run into any issues. I can help you work through some of the issues I've experienced.

Thanks for reading!

Jason

EM Incident: Critical:New: – An access violation detected in /u01/…/log.xml at time…

Good morning, Oracle fans across this great, somewhat green planet. I'm attempting to start a series on alert reduction for OEM 12c. For you savvy OEM experts who have already installed 13c, this may have less of an impact. Also, I am a little jealous, as our company is not ready to upgrade OEM. If OEM 13c improves the alerting function, I would love to hear that feedback!! Let's begin.

As part of our on-call rotation, we have configured OEM to send the on-call DBA a text message for critical alerts that come up during nights and weekends. To configure this, our team created a separate OEM user; if you are a one-person show, this may not be necessary. We will call this user ONCALL-DBA. After logging in as ONCALL-DBA, navigate to…

At the top right corner of the OEM screen, click the drop-down for ONCALL-DBA –> Enterprise Manager Password and Email.

You need to enter a line for your email and another for your cell phone. Each carrier has a different format; Verizon's is the phone number (without dashes) @vtext.com, e.g., 6194561234@vtext.com. Make sure the SELECT checkbox is checked for both lines.

Log out of ONCALL-DBA.

Now you will get pages at all hours of the night and on weekends for any critical alert that OEM has deemed a critical metric violation. Here's the problem: OEM 12c is preset to alert you for all kinds of metric violations that you may or may not deem critical. I have worked over the last two months to change most of the critical alerts to warnings. I have also eliminated a great number of alerts that just don't apply to our organization. We utilize Exadata, so there is an additional set of alerts related to the cell nodes, InfiniBand switches, etc. I thought about publishing this list, but these alerts may be needed in your situation. Plus, I would hate to spare you the great pleasure of configuring these alerts yourself. Each alert should be investigated as to why it triggered and whether you need to know about it.

This week, as the title suggests, I am focusing on an alert that triggers several times a week and sounds like a security violation of some sort.

EM Incident: Critical:New: – An access violation detected in /u01/…/log.xml at time…

On investigating the log where this violation occurred, I discovered the following…

msg_id='312383570' type='INCIDENT_ERROR' group='Access Violation'
level='1' host_id='xdt***.com' host_addr='***'
prob_key='ORA 7445 [kkorminl()+306]' upstream_comp='' downstream_comp='SQL_Costing'
ecid='' errid='27841'
<txt>Errors in file /u01/***.trc (incident=27841):
ORA-07445: exception encountered: core dump [kkorminl()+306] [SIGSEGV] [ADDR:0x7FFFB6B46FF8] [PC:0x9579FC4] [Address not mapped to object] []
</txt>
</msg>
<msg time='2016-02-29T22:00:31.609-08:00' org_id='oracle' comp_id='rdbms'
msg_id='dbgexProcessError:1205:3370026720' type='TRACE' level='16'
host_id='xdt***.com' host_addr='***'>
<txt>Incident details in: /u01/***.trc
</txt>
</msg>
<msg time='2016-02-29T22:00:31.609-08:00' org_id='oracle' comp_id='rdbms'
client_id='' type='UNKNOWN' level='16'
host_id='xdt***.com' host_addr='***' module='DBMS_SCHEDULER'
pid='77909'>
<txt>Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
The first thing I noticed was the module: DBMS_SCHEDULER. The trace file revealed this was a SQL Tuning Advisor job trying to run. The second thing, which a coworker pointed out, was the core dump that occurred every time the access violation did. It's not a security violation when a SQL Tuning Advisor job fails, but it could cause space issues for /u01 if a core dump happens several times per week. I did two things.

  1. I downgraded this OEM Alert to a warning.
    1. To change an alert setting, navigate to your monitoring templates. This is a separate subject I will address later. I am happy to help you set up a monitoring template, and others, such as DBAKevlar, have done a great job publishing articles that will walk you through it.
  2. I opened up an Oracle Support ticket.
    1. It turns out this is an 11g bug that is fixed in 12c. Our organization will begin upgrading to 12c in about seven months; again, I'm a little jealous of those of you who have already upgraded to Oracle 12c. Waiting that long to apply the bug fix is not an option.
    2. Support sent me a rolling-installable database patch. We have already applied it on our sandbox Exadata box. Imagine that, a sandbox Exadata server. Now it's time for you to get a little jealous. We will add this patch to our next quarterly bundle.
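
If you want to package the incident for the Support ticket in step 2, a minimal ADRCI session looks something like this; the incident number comes from the log above, and the output directory is an assumption (you may also need SET HOMEPATH first if the host has multiple ADR homes):

$ adrci
adrci> show incident
adrci> ips pack incident 27841 in /tmp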

Questions? I will try to address any issues in my next blog on this subject.

Thanks for reading!

Jason

I know there is more than one way to move a SQL plan, identified by its plan hash value, from one environment to another, but this way is simple and worked very efficiently for me. Give it a shot!

  • A few days ago, developers started complaining, "This query is running very slow in production, faster in development, and never finishing in test." Have you heard this complaint before?
  • If the query has run successfully in one environment (including the one you are executing it in), and the table structure is the same across environments, you can create a SQL profile that can be used in your other environments.
  • Step 1: Find the sql_id and plan hash value in the good environment you want to use to create your profile. There are a couple of ways to do this; in my case the sql_id was not in memory, so I had to use "display_awr"…

select * from table(dbms_xplan.display_awr('SQL_ID'));  -- run this if the sql_id is not in memory

select * from table(dbms_xplan.display_cursor('SQL_ID')); -- use this if the sql_id is still in memory
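
If you want to see which plan hash values exist for the statement, and how each performed, before picking the good one, a hedged helper against AWR works. This assumes a Diagnostics Pack license, and &sql_id is a SQL*Plus substitution variable you supply:

select sql_id, plan_hash_value,
       sum(executions_delta) execs,
       round(sum(elapsed_time_delta)/greatest(sum(executions_delta),1)/1e6, 3) avg_sec
from dba_hist_sqlstat
where sql_id = '&sql_id'
group by sql_id, plan_hash_value
order by avg_sec;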

  • Step 2: Create a SQL profile. SQLT has a great script to help create a SQL profile that can be used in your other environments. If you don't already have SQLT installed, see Oracle Support Doc ID 1614107.1. Be advised, this link will only work if you have an Oracle Support account and a CSI#.
  • Step 3: Run this script in SQL*Plus. It will prompt you for the sql_id and plan hash value.

@/home/oracle/sqlt/utl/coe_xfr_sql_profile.sql

This script will produce a file named along the lines of coe_xfr_sql_profile_<sql_id>_<plan_hash_value>.sql.

  • Step 4: Copy this file to the environment in which you want to create the profile and run it.

SQL> @coe_xfr_sql_profile_<sql_id>_<plan_hash_value>.sql

  • Step 5: Run your query again and verify the execution plan is now using the newly created sql profile!

select * from table(dbms_xplan.display_cursor('SQL_ID'));

Note
-----
- Degree of Parallelism is 8 because of hint
- SQL profile coe_xfr_sql_profile_<sql_id>_<plan_hash_value> used for this statement

Success!!

Please let me know if you have any difficulties with this. I was surprised by the simplicity and ease of use of this tool. SQLT has many more useful utilities for tuning, but a friend of mine showed me this little nugget and it made my day!

Thanks for reading!

Jason

Good Afternoon, Oracle Fans!

I was asked by my customer to create an OEM user that their development team and management could use to get a bird's-eye view of performance on the database. What I didn't want to do was show them all of the targets I can see, because that includes the entire Exadata stack. I found a great article from a few years ago that helped me find what I needed: http://dbastreet.com/blog/?p=395. I have modernized it for version 12c. Here goes!!

1) As SYSMAN, click on Setup –> Security –> Administrators.


2) From the Administrators page, click Create.


3) Enter all the standard info for creating your administrator on the Create Administrator: Properties page. Click Next


4) Remove the PUBLIC role from the default roles list. Click Next.


5) From the Target Privileges page, click on Connect to Any Viewable Target.


6) This is the important part: at the bottom of the Target Privileges page, add the targets you want the administrator to be able to view. Click on ADD.


Choose your targets and click SELECT.


There are other privilege options you can grant to the user, but my customer only needs View Target. Click Next.

Click REVIEW:

Review your choices and…

Click FINISH:


At this point, the user can view all the targets you granted to the administrator. If you want the administrator to be able to connect to the target databases, you will need to create a database user with the needed grants/privileges to do so.
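
If you prefer the command line, emcli can do roughly the same thing. This is a hedged sketch: the DEV_VIEWER user and target name are made up for illustration, and the grant_privs privilege-string format is worth verifying with emcli help grant_privs on your version before relying on it.

$ emcli login -username=sysman
$ emcli create_user -name="DEV_VIEWER" -password="SomePassword1"
$ emcli grant_privs -name="DEV_VIEWER" -privilege="VIEW_TARGET;TARGET_NAME=proddb:TARGET_TYPE=oracle_database"

You would still remove the PUBLIC role and create a database user separately, as described above.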

Thanks for reading!

Jason

This is very useful for quick EM commands. Thanks, Laurent!

Oracle ... as usual

If, like me, you are addicted to the command line, this post will help you perform useful operations in EM Cloud Control 12c using the emcli tool rather than the GUI.

You can use emcli when you need to implement batch scripts on your Oracle Management Server (OMS).

  • Login to EM
$ emcli login -username=sysman
Enter password :
Login successful

OR

$ emcli login -username=sysman -password=XXXXXX
Login successful
  • Synchronize emcli with the OMS
$ emcli sync
Synchronized successfully
  • Logout EM
$ emcli logout
Logout successful
  • List all the targets already promoted
$ emcli get_targets
Status ID  Status  Target Type          Target Name
-9         n/a     group                OFM
1          Up      host                 xxxdb-prod3.mydomain.com
1          Up      host                 xxxdb-prod2.mydomain.com
1          Up      host                 xxxlgc-prod2.mydomain.com
1          Up      host                 xxxdb-prod1.mydomain.com
1          Up      j2ee_application     /EMGC_GCDomain/GCDomain/EMGC_OMS1/emgc
1          Up      j2ee_application     /EMGC_GCDomain/GCDomain/EMGC_OMS1/OCMRepeater
1          Up      j2ee_application     /EMGC_GCDomain/GCDomain/EMGC_OMS1/empbs
-9         n/a     metadata_repository  /EMGC_GCDomain/GCDomain/EMGC_ADMINSERVER/mds-sysman_mds
-9         n/a     metadata_repository  /EMGC_GCDomain/GCDomain/EMGC_ADMINSERVER/mds-owsm
1          …
