Enterprise Manager Incident: Critical Alerts Part 1 (Access violation detected in /u01)

Posted: April 8, 2016 in OEM12c, ORA Errors, Uncategorized

EM Incident: Critical:New: – An access violation detected in /u01/…/log.xml at time…

Good morning, Oracle fans across this great, somewhat green planet. I’m attempting to start a series on alert reduction for OEM 12c. For you savvy OEM experts who have already installed 13c, this may have less of an impact. Also, I am a little jealous, as our company is not ready to upgrade OEM. If OEM 13c improves the alerting function, I would love to hear that feedback!! Let’s begin.

I am part of an on-call rotation, and we have configured OEM to send the on-call DBA a text message for critical alerts that come up during nights and weekends. To configure this, our team created a separate OEM user. If you are a one-person show, this may not be necessary. We will call this user ONCALL-DBA. After logging in as ONCALL-DBA, navigate to…

At the top right corner of the OEM screen, click the drop-down for ONCALL-DBA > Enterprise Manager Password and Email.

You need to enter one line for your email and one for your cell phone. Each carrier has a different format, but Verizon’s is the phone number (without dashes) followed by @vtext.com, e.g. 6194561234@vtext.com. Make sure the SELECT checkbox is checked for both lines.
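
The address format described above is easy to get wrong when typing it by hand, so here is a minimal sketch that builds it from a phone number. The function name `sms_gateway_address` and the 10-digit check are my own assumptions for illustration (they are not part of OEM); only the `vtext.com` domain comes from the Verizon example above, and other carriers use different domains.

```python
import re


def sms_gateway_address(phone_number, gateway_domain="vtext.com"):
    """Build an email-to-SMS address such as 6194561234@vtext.com.

    Hypothetical helper: strips dashes, spaces, and parentheses from the
    phone number, then appends the carrier's gateway domain.
    """
    digits = re.sub(r"\D", "", phone_number)  # keep digits only
    if len(digits) != 10:
        # Assumption: a 10-digit US number with no country code.
        raise ValueError(f"expected 10 digits, got {len(digits)}")
    return f"{digits}@{gateway_domain}"
```

Calling `sms_gateway_address("619-456-1234")` yields the same address as the example above, which is the value to paste into the second line of the ONCALL-DBA notification settings.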

Log out of ONCALL-DBA.

Now you will get pages at all hours of the night and on weekends for any alert that OEM has deemed a critical metric violation. Here’s the problem: OEM 12c is preset to alert you on all kinds of metrics that you may or may not deem critical. I have worked over the last two months to change most of the critical alerts to warnings. I have also eliminated a great many alerts that just don’t apply to our organization. We utilize Exadata, so there is an additional set of alerts related to the cell nodes, InfiniBand switches, etc. I thought about publishing this list, but these alerts may be needed in your situation. Plus, I would hate to deprive you of the great pleasure of configuring these alerts yourself. Each alert should be investigated to determine why it triggered and whether you need to know about it.

This week, as the title suggests, I am focusing on an alert that triggers several times a week and sounds like a security violation of some sort.

EM Incident: Critical:New: – An access violation detected in /u01/…/log.xml at time…

On investigating the log where this violation occurred, I discovered the following…

msg_id='312383570' type='INCIDENT_ERROR' group='Access Violation'
level='1' host_id='xdt***.com' host_addr='***'
prob_key='ORA 7445 [kkorminl()+306]' upstream_comp='' downstream_comp='SQL_Costing'
ecid='' errid='27841'
<txt>Errors in file /u01/***.trc (incident=27841):
ORA-07445: exception encountered: core dump [kkorminl()+306] [SIGSEGV] [ADDR:0x7FFFB6B46FF8] [PC:0x9579FC4] [Address not mapped to object] []
</txt>
</msg>
<msg time='2016-02-29T22:00:31.609-08:00' org_id='oracle' comp_id='rdbms'
msg_id='dbgexProcessError:1205:3370026720' type='TRACE' level='16'
host_id='xdt***.com' host_addr='***'>
<txt>Incident details in: /u01/***.trc
</txt>
</msg>
<msg time='2016-02-29T22:00:31.609-08:00' org_id='oracle' comp_id='rdbms'
client_id='' type='UNKNOWN' level='16'
host_id='xdt***.com' host_addr='***' module='DBMS_SCHEDULER'
pid='77909'>
<txt>Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.

The first thing I noticed was the module, DBMS_SCHEDULER. The trace file revealed that this was a SQL Tuning Advisor job trying to run. The second thing, which a coworker pointed out, was the core dump that was happening every time the access violation occurred. A failing SQL Tuning Advisor job is not a security violation, but it could cause space issues for /u01 if a core dump happens several times per week. I did two things.
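
If you want to see how often this is happening without reading log.xml by eye, here is a rough sketch of how the entries could be tallied. It assumes the file is a sequence of `<msg ...><txt>...</txt></msg>` entries with single-quoted attributes, as in the excerpt above; the function names `parse_messages` and `count_access_violations` are made up for this post, and the regular expressions are a shortcut rather than a real XML parser.

```python
import re
from collections import Counter

# One <msg ...> header followed by its <txt> body. The header may span
# several lines. Assumption: no attribute value contains a '>' character.
MSG_RE = re.compile(r"<msg\b(?P<attrs>[^>]*)>\s*<txt>(?P<txt>.*?)</txt>", re.DOTALL)
# name='value' pairs with straight single quotes.
ATTR_RE = re.compile(r"(\w+)='([^']*)'")


def parse_messages(log_text):
    """Return one dict per <msg> entry: its attributes plus the text body."""
    messages = []
    for match in MSG_RE.finditer(log_text):
        entry = dict(ATTR_RE.findall(match.group("attrs")))
        entry["txt"] = match.group("txt").strip()
        messages.append(entry)
    return messages


def count_access_violations(log_text):
    """Count INCIDENT_ERROR entries in the 'Access Violation' group by prob_key."""
    counts = Counter()
    for entry in parse_messages(log_text):
        if (entry.get("type") == "INCIDENT_ERROR"
                and entry.get("group") == "Access Violation"):
            counts[entry.get("prob_key", "unknown")] += 1
    return counts
```

Read the file into a string, pass it to `count_access_violations`, and you get a count per problem key. If the same `prob_key` keeps coming back, as `ORA 7445 [kkorminl()+306]` did for us, that is a good sign you are looking at one recurring bug rather than a string of unrelated events.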

  1. I downgraded this OEM Alert to a warning.
    1. To change an alert setting, navigate to your monitoring templates. This is a separate subject I will address later. I am happy to help you set up a monitoring template, and others, such as DBAKevlar, have done a great job publishing articles that will do just that.
  2. I opened up an Oracle Support ticket.
    1. It turns out this is an 11g bug that is fixed in 12c. Our organization will begin upgrading to 12c in about seven months. Again, I’m a little jealous of those of you who have upgraded to Oracle 12c. Waiting that long for the bug fix is not an option.
    2. Support sent me a rolling-installable database patch. We have already applied it on our sandbox Exadata box. Imagine that, a sandbox Exadata server. Now it’s time for you to get a little jealous. We will add this patch to our next quarterly bundle.
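
Until the patch reaches every database, the space concern I mentioned earlier is worth watching. Here is a small sketch that reports how full a mount point is and how much space a directory tree occupies. The helper names are my own, and you would pass in your own incident or core dump directory, since the real paths are masked in the excerpt above.

```python
import os
import shutil


def directory_size_bytes(root):
    """Sum the sizes of all regular files under root (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                pass  # file was removed while we were walking the tree
    return total


def percent_used(mount_point):
    """Return the percentage of the filesystem at mount_point that is in use."""
    usage = shutil.disk_usage(mount_point)
    return 100.0 * usage.used / usage.total
```

For example, `percent_used("/u01")` gives you a number to compare against whatever threshold your team is comfortable with, and `directory_size_bytes` on the dump directory shows how much of that the core dumps account for.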

Questions? I will try to address any issues in my next blog on this subject.

Thanks for reading!

Jason
