Oakies Blog Aggregator

Maintaining Tempfile in TEMP Tablespace of PDB$SEED in Oracle 12c (12.1.0.2)

While testing recovery procedures for one of my ongoing projects I wanted to test the "complete disaster" recovery scenario, in which I also had to recreate all ASM disks and restore everything from backup.
A full RMAN backup and subsequent restore of a pluggable 12c single-tenant database was the solution. I will not go into that here, as the main point of this post is quite different.

So the recovery was successful, but after restoring the CDB$ROOT and the PDB I found the following messages in the alert log:

Errors in file /u01/app/oracle/diag/rdbms/mydb/mydb/trace/mydb_dbw0_28973.trc:
ORA-01157: cannot identify/lock data file 202 - see DBWR trace file
ORA-01110: data file 202: '+DATA/MYDB/FD9AC20F64D244D7E043B6A9E80A2F2F/DATAFILE/pdbseed_temp012015-01-03_12-22-30-pm.dbf'
ORA-17503: ksfdopn:2 Failed to open file +DATA/MYDB/FD9AC20F64D244D7E043B6A9E80A2F2F/DATAFILE/pdbseed_temp012015-01-03_12-22-30-pm.dbf
ORA-15173: entry 'FD9AC20F64D244D7E043B6A9E80A2F2F' does not exist in directory 'MYDB'
Thu Feb 05 09:12:52 2015
Errors in file /u01/app/oracle/diag/rdbms/mydb/mydb/trace/mydb_dbw0_28973.trc:
ORA-01186: file 202 failed verification tests
ORA-01157: cannot identify/lock data file 202 - see DBWR trace file
ORA-01110: data file 202: '+DATA/MYDB/FD9AC20F64D244D7E043B6A9E80A2F2F/DATAFILE/pdbseed_temp012015-01-03_12-22-30-pm.dbf'

The problem was the temp file of the temporary tablespace TEMP in PDB$SEED, which is used for the creation of pluggable databases. Tempfiles are not part of an RMAN backup, as they can always be recreated. Although I could have lived with these error messages, since this is a single-tenant database with only one pluggable database, I wanted to deliver a "clean solution" to the client. Besides, I wanted to know how to solve such a problem.

So here is the solution:

$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Thu Feb 5 09:35:11 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options

SQL> alter session set container=pdb$seed;

Session altered.

SQL> select open_mode from v$database;

OPEN_MODE
--------------------
READ ONLY

SQL> select name from v$tempfile;

NAME
---------------------------------------------------------------------------------------------------
+DATA/MYDB/FD9AC20F64D244D7E043B6A9E80A2F2F/DATAFILE/pdbseed_temp012015-01-03_12-22-30-pm.dbf


SQL> alter pluggable database pdb$seed close;
alter pluggable database pdb$seed close
*
ERROR at line 1:
ORA-65017: seed pluggable database may not be dropped or altered

SQL> alter session set "_oracle_script"=TRUE;

Session altered.

SQL> alter pluggable database pdb$seed close;

Pluggable database altered.

SQL> alter pluggable database pdb$seed open read write;

Pluggable database altered.

SQL> alter tablespace temp add tempfile '+DATA';

Tablespace altered.
SQL> column name format a100
SQL> select name from v$tempfile;

NAME
-------------------------------------------------------------------------------------------------
+DATA/MYDB/FD9AC20F64D244D7E043B6A9E80A2F2F/DATAFILE/pdbseed_temp012015-01-03_12-22-30-pm.dbf
+DATA/MYDB/0BBEAC43B4250445E0530A0C6D0AEC65/TEMPFILE/temp.299.870860375

SQL>  alter tablespace temp drop tempfile '+DATA/MYDB/FD9AC20F64D244D7E043B6A9E80A2F2F/DATAFILE/pdbseed_temp012015-01-03_12-22-30-pm.dbf';

Tablespace altered.

SQL>  select name from v$tempfile;

NAME
----------------------------------------------------------------------------
+DATA/MYDB/0BBEAC43B4250445E0530A0C6D0AEC65/TEMPFILE/temp.299.870860375

After performing this change there were no more error messages during database startup.

The secret sauce for changing the state of PDB$SEED was setting "_oracle_script"=TRUE in the session. The idea came from this post about changing the open mode of the PDB$SEED database.
Hope this short post helps somebody resolve a problem with dropping and creating a new tempfile in PDB$SEED faster.

Parallel Execution 12c New Features Overview

Oracle 12c is the first release in a couple of years that adds significant new functionality in the area of Parallel Execution operators, plan shapes and runtime features. Although 11gR2 added the new Auto DOP feature along with In-Memory Parallel Execution and Statement Queueing, the 12c features are more significant because they introduce new operators that can change both execution plan shape and runtime behaviour.

Here is a list of new features that are worth noting (and not necessarily mentioned in the official documentation and white papers by Oracle):

- The new HYBRID HASH adaptive distribution method, that serves two purposes for parallel HASH and MERGE JOINs:

First, it allows the PX SEND / RECEIVE operation for the left (smaller estimated) row source of the hash join to decide dynamically at runtime, actually for each execution, whether to use a BROADCAST or a HASH distribution, and correspondingly whether the other row source should then use a ROUND-ROBIN or a HASH distribution. This is described, for example, in the corresponding white paper by Maria Colgan here.

Second, to some degree it allows Oracle to address data distribution skew in the case of HASH distributions (and only for parallel hash joins, not merge joins), which I've already described in detail in the past. A summary post that links to all other relevant articles regarding Parallel Execution Skew can be found here, an overview of the relevant feature can be found here and a detailed description can be found here.

I'll cover some aspects of this adaptive distribution that I haven't mentioned in the existing articles in a separate post.
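In the meantime, here is a minimal sketch of how to see the new distribution for yourself (T1 and T2 are hypothetical tables; this assumes a 12.1 database with parallel execution enabled):

select /*+ parallel(4) leading(t1) use_hash(t2) */ count(*)
from   t1, t2
where  t1.id = t2.id;

-- The cursor's runtime plan should show PX SEND HYBRID HASH operations,
-- with a STATISTICS COLLECTOR above the left (build) row source that
-- counts rows at runtime to choose between BROADCAST and HASH:
select * from table(dbms_xplan.display_cursor);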

- The new concurrent UNION ALL operator. This is officially documented here. It comes with a new operator, PX SELECTOR, whose generic functionality is to pick one of the available PX slaves to perform the child operations of this operator. Since the official documentation leaves a lot of details unclear about how this concurrent operation actually behaves at runtime, I'll cover some examples with runtime profiles in a separate post.
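As a first, hedged sketch (again with hypothetical tables T1 and T2), the concurrent behaviour can be requested explicitly via the PQ_CONCURRENT_UNION hint that 12c introduces:

select /*+ parallel(2) pq_concurrent_union */ * from t1
union all
select * from t2;

-- In a concurrent plan the UNION ALL branches are executed at the same
-- time; serial branches show up below PX SELECTOR operators instead of
-- being run one after another by the Query Coordinator.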

- The new PQ_REPLICATE feature that for simple parallel FULL TABLE SCAN row sources (I haven't tested yet if a parallel INDEX FAST FULL SCAN is eligible, too, but I assume so) can decide to run the scan entirely in each PX slave instead of running a distributed scan across the PX slaves in granules and distributing by BROADCAST afterwards. It's not entirely clear to me why this was implemented. Although it reduces the number of redistributions, and in some cases where no other parallel redistributions are required can reduce the number of parallel slave sets to one instead of two, BROADCAST distributions are typically used for smaller row sources, so eliminating this distribution doesn't sound like a huge improvement to justify the development effort. Jonathan Lewis describes the feature here along with some of his ideas, why this feature might be useful.

- The new parallel FILTER operator, an important and potentially huge improvement over the previously available serial-only FILTER operator. In the past, when a FILTER subquery was part of a parallel plan, the data of the "driving" row source of the FILTER (the first child operation) had to be passed to the Query Coordinator, and only then could the second to nth children be executed as many times as indicated by the first row source (depending on the efficiency of filter/subquery caching). Now the FILTER operator can run in the PX slaves, and there are a number of distribution variants possible with this new parallel operator; see the sketch below. I'll cover that in a separate post.
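A minimal sketch (hypothetical tables T1 and T2; the PQ_FILTER hint and its options SERIAL, NONE, HASH and RANDOM are what 12c exposes to control this, so treat the exact plan shape as an assumption until tested):

select /*+ parallel(2) pq_filter(hash) */ *
from   t1
where  exists (select null from t2 where t2.id = t1.id);

-- PQ_FILTER(HASH) asks for the FILTER to run in the PX slaves with a
-- HASH distribution of the driving row source; PQ_FILTER(SERIAL)
-- requests the pre-12c behaviour at the Query Coordinator.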

- The new PX SELECTOR operator that I already mentioned above as part of the new concurrent UNION ALL operator. As described above, the generic functionality of this operator is to pick one of the available PX slaves to perform the child operations of this operator. It is used in 12c for serial access operators that are part of a parallel plan (like a serial table or index scan). In the past these parts were performed by the Query Coordinator itself, but now one slave out of a slave set is selected to perform such operations. This has a number of implications and I'll cover them in a separate post.

- The new 1 SLAVE distribution method, which is a bit similar to the PX SELECTOR operator in that it uses just one slave of the slave set, but gets used for serial parts of the execution plan when data is redistributed from a parallel part of the plan to a part that needs to run serially because Oracle cannot parallelize the functionality, for example when evaluating ROWNUM or certain analytic function variants (such as LAG or LEAD with no partition clause). This new 1 SLAVE distribution seems to have two purposes: first, avoid activity of the query coordinator (like the PX SELECTOR above), and second, avoid the decomposition of the parallel plan into multiple DFO trees. I'll cover that in a separate post.

- 12c also changes the way some operations in the plan are marked as PARALLEL or not, which in my opinion can be pretty confusing (and is partly inconsistent with runtime behaviour) when just looking at the execution plan, since the runtime activity might look different from what the execution plan suggests. I'll cover that in a separate post, and it will also be picked up in the context of the other new functionality mentioned above as appropriate.

There is probably more that I haven't come across yet, but as you can see from the number of times I've mentioned "separate post" in this overview this is already enough material for a whole series of posts to follow.

Webinar Followup

Thanks to everyone who attended my recent webinar at AllThingsOracle.com.

The link to the webinar recording can be found here.

The presentation PDF can be downloaded here. Note that this site uses a non-default HTTP port, so if you're behind a firewall this might be blocked.

Thanks again to AllThingsOracle.com and Amy Burrows for hosting the event.

Scrutinizing Exadata X5 Datasheet IOPS Claims…and Correcting Mistakes

I want to make these two points right out of the gate:

  1. I do not question Oracle’s IOPS claims in Exadata datasheets
  2. Everyone makes mistakes

Everyone Makes Mistakes

Like me. On January 21, 2015, Oracle announced the X5 generation of Exadata. I spent some time studying the datasheets from this product family and also compared the information to prior generations of Exadata, namely the X3 and X4. Yesterday I graphed some of the datasheet numbers from these Exadata products and tweeted the graphs. I'm sorry to report that two of the graphs were faulty, the result of hasty cut and paste. This post will clear up the mistakes, but I owe an apology to Oracle for incorrectly graphing their datasheet information. Everyone makes mistakes. I fess up when I do. I am posting the fixed slides but will link to the deprecated slides at the end of this post.

We’re Only Human

Wouldn’t IT be a more enjoyable industry if certain IT vendors stepped up and admitted when they’ve made little, tiny mistakes like the one I’m blogging about here? In fact, wouldn’t it be wonderful if some of the exceedingly gruesome mistakes certain IT vendors make would result in a little soul-searching and confession? Yes. It would be really nice! But it’ll never happen–well, not for certain IT companies anyway. Enough of that. I’ll move on to the meat of this post. The rest of this article covers:

  • Three Generations of Exadata IOPS Capability
  • Exadata IOPS Per Host CPU
  • Exadata IOPS Per Flash SSD
  • IOPS Per Exadata Storage Server License Cost

Three Generations of Exadata IOPS Capability

The following chart shows how Oracle has evolved Exadata from the X3 to the X5 EF model with regard to IOPS capability. As per Oracle's datasheets on the matter these are, of course, SQL-driven IOPS. Oracle would likely show you this chart and nothing else. Why? Because it shows favorable, generational progress in IOPS capability. A quick glance shows that read IOPS improved just shy of 3x and write IOPS capability improved over 4x from the X3 to X5 product releases. These are good numbers. I should point out that the X3 and X4 numbers are the datasheet citations for 100% cached data in Exadata Smart Flash Cache. These models had 4 Exadata Smart Flash Cache PCIe cards in each storage server (aka cell). The X5 numbers I'm focused on reflect the performance of the all-new Extreme Flash (EF) X5 model. It seems Oracle has started to investigate the value of all-flash technology and, indeed, the X5 EF is the top dog in the Exadata line-up. For this reason I chose to graph X5 EF data as opposed to the more pedestrian High Capacity model, which has 12 4TB SATA drives fronted with PCIe Flash cards (4 per storage server).

[Chart: exadata-evolution-iops-gold-1]

The tweets I hastily posted yesterday with the faulty data points aimed to normalize these performance numbers to important factors such as host CPU, SSD count and Exadata Storage Server Software licensing costs. The following set of charts are the error-free versions of the tweeted charts.

Exadata IOPS Per Host CPU

Oracle's IOPS performance citations are based on SQL-driven workloads. This can be seen in every Exadata datasheet. All Exadata datasheets for generations prior to X4 clearly stated that Exadata IOPS are limited by host CPU. Indeed, anyone who studies Oracle Database with SLOB knows how all of that works. SQL-driven IOPS require host CPU. Sadly, however, Oracle ceased stating the fact that IOPS are host-CPU bound in Exadata as of the advent of the X4 product family. I presume Oracle stopped correctly stating the factual correlation between host CPU and SQL-driven IOPS for only the most honorable of reasons with the best of customers' intentions in mind. In case anyone should doubt my assertion that Oracle historically associated Exadata IOPS limitations with host CPU, I submit the following screenshot of the pertinent section of the X3 datasheet:

[Screenshot: X3-datasheet-truth]

Now that the established relationship between SQL-driven IOPS and host CPU has been demystified, I'll offer the following chart, which normalizes IOPS to host CPU core count:

[Chart: exadata-evolution-iops-per-core-gold]

I think the data speaks for itself but I'll add some commentary. Where Exadata is concerned, Oracle gives customers no choice of host CPU. If you adopt Exadata you will be forced to take the top-bin Xeon SKU with the most cores offered in the respective Intel CPU family. For example, the X3 product used 8-core Sandy Bridge Xeons, the X4 used 12-core Ivy Bridge Xeons, and finally the X5 uses 18-core Haswell Xeons. In each of these CPU families there are other processors of varying core counts at the same TDP. For example, the Exadata X5 processor is the E5-2699v3, which is a 145w 18-core part. In the same line of Xeons there is also a 145w 14-core part (E5-2697v3), but that is not an option for Exadata customers.

All of this is important since Oracle customers must license Oracle Database software by the host CPU core. The chart shows us that read IOPS per core improved 18% from X3 to X4, but from X4 to X5 we see only a 3.6% increase. The chart also shows that write IOPS/core peaked at X4 and has actually dropped some 9% in the X5 product. These important trends suggest Oracle's balance between storage plumbing and I/O bandwidth in the Storage Servers is not keeping up with the rate at which Intel is packing cores into the Xeon EP family of CPUs. The nugget of truth that is missing here is whether the 145w 14-core E5-2697v3 might in fact be able to improve this IOPS/core ratio. While such information would be quite beneficial to Exadata-minded customers, the 22% drop in expensive Oracle Database software licensing in such an 18-core versus 14-core scenario is not beneficial to Oracle, especially not while Oracle is struggling to subsidize its languishing hardware business with gains from traditional software.

Exadata IOPS Per Flash SSD

Oracle uses their own branded Flash cards in all of the X3 through X5 products. While it may seem like an implementation detail, some technicians consider it important to scrutinize how well Oracle leverages their own components in their Engineered Systems. In fact, some customers expect that adding significant amounts of important performance components, like Flash cards, should pay commensurate dividends. So, before you let your eyes drift to the following graph, please be reminded that the X3 and X4 products came with 4 Gen3 PCIe Flash cards per Exadata Storage Server whereas the X5 is fitted with 8 NVMe flash cards. And now, feel free to take a gander at how well Exadata architecture leverages a 100% increase in Flash componentry:

[Chart: exadata-evolution-iops-per-SSD-gold]

This chart helps us visualize the facts somewhat hidden in the datasheet information. From Exadata X3 to Exadata X4, Oracle improved IOPS per Flash device by just shy of 100% for both read and write IOPS. On the other hand, Exadata X5 exhibits nearly flat (5%) write IOPS and a troubling drop in read IOPS per SSD device of 22%. Now, all I can do is share the facts. I cannot change people's belief system; this I know. That said, I can't imagine how anyone can spin a per-SSD drop of 22%, especially considering the NVMe SSD product is so significantly faster than the X4 PCIe Flash card. By significant I mean the NVMe SSD used in the X5 model is rated at 260,000 random 8KB read IOPS whereas the X4 PCIe Flash card was only rated at 160,000 8KB read IOPS. So the X5 has double the SSDs, each of which is rated at 63% more IOPS capacity than the X4's, yet IOPS per SSD dropped 22% from the X4 to the X5. That means an architectural imbalance, somewhere. However, since Exadata is a completely closed system, you are on your own to find out why doubling resources doesn't double your performance. All of that might sound like taking shots at implementation details. If that seems like the case, then the next section of this article might be of interest.

IOPS Per Exadata Storage Server License Cost

As I wrote earlier in this article, both Exadata X3 and Exadata X4 used PCIe Flash cards for accelerating IOPS. Each X3 and X4 Exadata Storage Server came with 12 hard disk drives and 4 PCIe Flash cards. Oracle licenses Exadata Storage Server Software by the hard drive in X3/X4 and by the NVMe SSD in the X5 EF model. To that end, the license "basis" is 12 units for X3/X4 and 8 for X5. Already readers are breathing a sigh of relief because less license basis must surely mean less total license cost. Surely not! Exadata X3 and X4 list price for Exadata Storage Server Software was $10,000 per disk drive, for an extended price of $120,000 per storage server. The X5 EF model, on the other hand, prices Exadata Storage Server Software at $20,000 per NVMe SSD, for an extended price of $160,000 per Exadata Storage Server. With these values in mind, feel free to direct your attention to the following chart, which graphs the IOPS per Exadata Storage Server Software list price (IOPS/license$$):

[Chart: exadata-evolution-iops-per-license-cost-gold]

The trend in the X3 to X4 timeframe was a doubling of write IOPS/license$$ and just short of a 100% improvement in read IOPS/license$$. In stark contrast, however, the X5 EF product delivers only a 57% increase in write IOPS/license$$ and a troubling, tiny, 17% increase in read IOPS/license$$. Remember, X5 has 100% more SSD componentry compared to the X3 and X4 products.

Summary

No summary needed. At least I don’t think so.

About Those Faulty Tweeted Graphs

As promised, here are the links to the faulty graphs I tweeted:

Faulty / Deleted Tweet Graph of Exadata IOPS/SSD: http://wp.me/a21zc-1ek

Faulty / Deleted Tweet Graph of Exadata IOPS/license$$: http://wp.me/a21zc-1ej

References

Exadata X3-2 datasheet: http://www.oracle.com/technetwork/server-storage/engineered-systems/exadata/exadata-dbmachine-x3-2-ds-1855384.pdf

Exadata X4-2 datasheet: http://www.oracle.com/technetwork/database/exadata/exadata-dbmachine-x4-2-ds-2076448.pdf

Exadata X5-2 datasheet: http://www.oracle.com/technetwork/database/exadata/exadata-x5-2-ds-2406241.pdf

X4 SSD info: http://www.oracle.com/us/products/servers-storage/storage/flash-storage/f80/overview/index.html

X5 SSD info: http://docs.oracle.com/cd/E54943_01/html/E54944/gokdw.html#scrolltoc

Engineered Systems Price List: http://www.oracle.com/us/corporate/pricing/exadata-pricelist-070598.pdf and http://www.ogs.state.ny.us/purchase/prices/7600020944pl_oracle.pdf


A new blog post on using Snap Clone on EMC storage with ASM

Just a quick note here to say I’ve added a new blog post to the official Enterprise Manager blog site. You can find it here.

Working With AWR Reports From EM12c

I’ve had a recent request to write an update on working with AWR reports, so as promised, here it is!

The Automatic Workload Repository

The Automatic Workload Repository (AWR) was one of the best enhancements to Oracle back in release 10g. Quite a goal was put in front of the development group when they were asked to develop a product that:

1.  Provided significant performance recommendation and wait event data enhancements over its predecessor, Statspack.

2.  Was always on, meaning that data would be collected continually without manual intervention from the database administrator.

3.  Wouldn't impact current processing, having its own background processes and memory buffer and a designated tablespace (SYSAUX).

4.  Used a memory buffer written in the opposite direction from the direction users read, eliminating concurrency issues.

Along with many other requirements, all of the above was offered with the Automatic Workload Repository, and we end up with an architecture that looks something like this:

[Figure: AWR architecture]

Using AWR Data

AWR data is identified by the DBID (database identifier) and a SNAP_ID (snapshot identifier, which has a BEGIN_INTERVAL_TIME and END_INTERVAL_TIME to isolate the date and time of the data collection). Information about what is currently retained in the database can be queried from DBA_HIST_SNAPSHOT. AWR also contains ASH (Active Session History) samples along with the snapshot data; by default, about 1 out of every 10 ASH samples.
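For example, a quick sketch of such a query (snapshot IDs and retention will of course differ per database):

select snap_id, begin_interval_time, end_interval_time
from   dba_hist_snapshot
order by snap_id;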

Using AWR data effectively really comes down to the following:

1.  Have you identified a true performance issue as part of a performance review?

2.  Has there been a user complaint or a request to investigate a performance degradation?

3.  Is there a business challenge or question that needs to be answered that AWR can offer an answer to?  (we'll cover when to use AWR vs. other features…)

Performance Review

A performance review is where you have either identified a problem or have been assigned to investigate the environment for performance issues to solve. I have a couple of Enterprise Manager environments available to me, but I chose to go out to one in particular, crossing my fingers and hoping I would find some heavy processing to fit the requirements of this post.

The quickest way to see workload in your database environment from EM12c is to click on Targets –> Databases. Choose to view by load map and you will then see databases by workload. Upon going to a specific Enterprise Manager environment, I found out it was my lucky day!

[Screenshot: EM12c database load map]

I really don't know who Kurt is that has a database monitored on this EM12c cloud control environment, but boy, is he my favorite person today! :)

Hovering my cursor over the database name (kurt), you can view the workload he has running on his test database currently:

[Screenshot: database load detail]

Boy, is Kurt my favorite person today!

EM12c Database Home Page

Logging into the database, I can see the significant IO and resource usage for the database and host from the database home page:

[Screenshot: database home page]

Moving to Top Activity (Performance menu, Top Activity), I begin to view more details about the processing and the different wait events:

[Screenshot: Top Activity]

Kurt is doing all KINDS of inserts (seen by the different SQL_IDs, all with SQL Type "INSERT"). I can drill down into the individual statements and investigate them, but really, there are a TON of statements and SQL_IDs here; wouldn't it just be easier to view the workload with an AWR report?

Running the AWR Report

I chose to click on Performance, AWR, AWR Report. Now I have a choice: I could request a new snapshot to be performed immediately, or I could wait until the top of the hour, as the interval is set hourly in this database. I chose the latter for this demonstration, but if you want to create a snapshot immediately, you can do so easily from EM12c, or request a snapshot by executing the following from SQL*Plus as a user with execute privileges on the DBMS_WORKLOAD_REPOSITORY package:

BEGIN
DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT ();
END;
/
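If you prefer to pull the report itself from SQL*Plus rather than EM12c, here is a minimal sketch using the same package (substitute your own DBID and snapshot IDs from DBA_HIST_SNAPSHOT; the values shown are placeholders):

SELECT output
FROM   TABLE(DBMS_WORKLOAD_REPOSITORY.AWR_REPORT_TEXT(
         l_dbid     => 1234567890,   -- placeholder DBID
         l_inst_num => 1,
         l_bid      => 100,          -- placeholder begin snapshot
         l_eid      => 101));        -- placeholder end snapshot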

For this example I simply waited, as there was no hurry or concern here, and requested the report for the previous hour and latest snapshot:

[Screenshot: AWR report request]

I always start at the Top Ten Foreground Events and commonly look at those with high wait percentages:

[Screenshot: Top 10 Foreground Events]

Direct Path Write, that’s it.  Nothing else to see here… :)

Direct path write involves the following: inserts/updates, the objects being written to, the tablespaces being written to, and the datafiles that make up those tablespaces.
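Outside of the report, here is a quick sketch of tying active direct path write waits back to datafiles (for this event, P1 is the absolute file number; assumes suitable privileges on the V$ views):

select sid, p1 as file_id, p2 as first_dba, p3 as block_cnt
from   v$session_wait
where  event = 'direct path write';

You can then map FILE_ID to its tablespace via DBA_DATA_FILES.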

It's also IO, which we can quickly verify down in the Foreground Wait Class section:

[Screenshot: Foreground Wait Class]

Looking at the Top SQL by Elapsed Time confirms that we are dealing with a workload consisting of all inserts:

[Screenshot: Top SQL by Elapsed Time]

Clicking on the SQL ID takes me to the Complete List of SQL Text and shows me just what Bad Boy Kurt is doing to produce his testing workload:

[Screenshot: Complete List of SQL Text]

Wow, that Kurt is quite the rebel, eh? :)

Insert into one table from the same table in a loop, roll back, then end the loop; thanks for playing. He's kicking some tires and doing it with angst! Don't worry people, like I said, Kurt is doing his job, using a module called "Load Generator". I'd be a fool to see this as anything other than what it is: generating workload to test something. I just get the added benefit of having a workload to do a blog post on using AWR data… :)

Now, if this were a real issue and I were trying to find out what type of performance impact this kind of insert was creating on the environment, where should we go next in the AWR report? The top SQL by elapsed time is important, as it should be where you focus your efforts. The other sections broken down by SQL are nice to have, but always remember: "If you aren't tuning for time, you're wasting time." Nothing can come of an optimization exercise if no time savings are seen after you've completed the work. So, taking first the Top SQL by Elapsed Time and then looking at the statement, we can now see which objects are part of the statement (large_block149, 191, 194, 145).

We also know that the problem is IO, so we should jump down from the SQL detail information and go to the object-level information. These sections are identified as "Segments by xxx":

  • Segments by Logical Reads
  • Segments by Physical Reads
  • Segments by Read Requests
  • Segments by Table Scans

so on and so forth….

These all show a very similar pattern and percentage for the objects we see in our top SQL. Remember, Kurt was reading each of these tables, then inserting those same rows back into the table again, then rolling back. As this is a workload scenario, unlike most performance issues I see, there is no outstanding object showing an over-10% impact in any area.

[Screenshot: Segments sections]

As this is an Exadata system, there is a ton of information to help you understand offloading (smart scans), flash cache, etc. that will assist in relaying the information you need to make sure you are achieving the performance you desire with an engineered system. I'd like to save that for another post and just touch on a few of the IO reports here. As we were performing table scans, we do want to make sure those were being offloaded to the cell nodes (smart scans) rather than being performed on a database node.

We can start by looking at Top Database IO Throughput:

[Chart: Top Database IO Throughput]

And then view the Top Database Requests Per Cell Throughput (sans the cell node names) to see how they compare:

[Chart: Top Database Requests Per Cell Throughput]

Then we add this last graph, Top IO Reasons by Request, which tells us whether we are offloading enough of the workload to each of the cell nodes:

[Chart: Top IO Reasons by Request]

This has been one scenario trip through an AWR report. Hopefully it's been interesting, and next time I'll search for a different load type to take you through!

Happy Friday!

Copyright © DBA Kevlar [Working With AWR Reports From EM12c], All Rights Reserved. 2015.

Data Center of the Future – now

[Photo by youflavio]

In a recent blog post Eric D. Brown defined an Agile Data Center as:

"An Agile Data Center is one that allows organizations to efficiently and effectively add, remove and change services at the speed of the business, not the speed of technology." - Eric D. Brown

In a follow-up post he said that an Agile Data Center could be implemented as a Software Defined Data Center (SDDC), for example using machine virtualization to spin environments up and down.

"With SDDC, it is possible for companies to replace their data center's infrastructure with a virtualized environment and then deliver services and software as a service." - Eric D. Brown

The question arises: what technologies constitute an agile SDDC? What software should be leveraged to succeed at having an agile data center, an SDDC? The most important software to look at is software that addresses the top constraints in the data center. As the theory of constraints says, any improvement not made at the constraint is an illusion. So what are the top constraints in the data center? The top constraint, as found after working with 100s of companies and surveying 1000s of companies, is provisioning environments for development and QA. Why is that? It's because almost every industry now finds itself to be more and more a software industry, from stock trading to booksellers to taxi companies to hotels. The competitive advantage is more and more about the software used to provide and sell the service. To build that software requires development and QA, and thus development and QA environments. The hard part of an environment to provision is no longer the machine, thanks to machine virtualization. The hardest part of the environment to provision is the data. Data that represents the production system is required to develop applications that use, display and manage that data. Data is typically kept in large, complex databases such as Oracle, SQL Server, Sybase, Postgres and DB2.

Provisioning development and QA environments that rely on databases can be an expensive, slow endeavor. But like machine virtualization, there is a new technology, data virtualization, that instead of making full physical copies makes one copy of each unique data block on the source, including a stream of changed blocks. With this "time flow" of unique blocks from the source database, data virtualization can provide copies in minutes, not by actually making copies but by providing pointers back to the existing blocks. These existing blocks are read/writeable thanks to a redirect-on-write technology which saves modified blocks in a different location than the original. It all sounds a bit complex, but that's the beauty of data virtualization solutions: they take the complexity, wrap it up into an automated software stack, and provide a simple interface and APIs to provision full developer environments, from the binaries, to the code files, to the most complex and difficult part of environment provisioning, which is provisioning full running copies of the data. Most data virtualization solutions also include masking, since sensitive data is often required to be masked in development environments. Software defined data centers (SDDC) depend on machine virtualization and data virtualization.

What other technologies are also required?

Check out Software defined everything: Throwing more software into the mix.

Restrict the query optimizer through OPTIMIZER_FEATURES_ENABLE and Top-Frequency/Hybrid Histograms

To make upgrades easier, I regularly see people considering disabling query optimizer features by setting the OPTIMIZER_FEATURES_ENABLE initialization parameter to a non-default value. My general opinion about this “habit” is summarized in TOP with the following two sentences:

Changing the default value of the OPTIMIZER_FEATURES_ENABLE initialization parameter is only a short-term workaround. Sooner or later the application should be adapted (optimized) for the new database version.

The issue is that not all new features are disabled by the OPTIMIZER_FEATURES_ENABLE initialization parameter. For example, if you set it to 10.2.0.5 in version 12.1.0.2, you will not get exactly the 10.2.0.5 query optimizer. A particular case to be considered in 12.1 is related to the new types of histograms (top-frequency and hybrid). In fact, even though you set the OPTIMIZER_FEATURES_ENABLE initialization parameter to 10.2.0.5, the DBMS_STATS package can produce top-frequency and hybrid histograms. As a result, the estimations of the query optimizer can change. An example is provided by this script.
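For reference, here is what that setting looks like at the session level (it can also be set system-wide with ALTER SYSTEM, or per statement via the OPT_PARAM hint):

ALTER SESSION SET optimizer_features_enable = '10.2.0.5';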

Note that setting the COMPATIBLE initialization parameter to, for example, 11.0.0 with the 12.1.0.2 binaries also does not prevent the DBMS_STATS package from gathering top-frequency and hybrid histograms.

So, if you are really planning to upgrade to 12.1 and to change the default value of the OPTIMIZER_FEATURES_ENABLE initialization parameter, in my opinion you should make sure that the DBMS_STATS package does not create top-frequency and hybrid histograms. Unfortunately, with the METHOD_OPT parameter it is not possible to specify which types of histograms the DBMS_STATS package can create. The only input parameter of the GATHER_*_STATS procedures that affects this decision is ESTIMATE_PERCENT. When it is set to a numeric value greater than 0, the DBMS_STATS package does not create top-frequency and hybrid histograms. In other words, the DBMS_STATS package considers the creation of the new types of histograms only if the ESTIMATE_PERCENT parameter is set to the DBMS_STATS.AUTO_SAMPLE_SIZE constant (0).
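For example, here is a sketch of a gather call (table T is hypothetical) that rules out the new histogram types by using an explicit sampling percentage:

BEGIN
  dbms_stats.gather_table_stats(
    ownname          => user,
    tabname          => 'T',
    method_opt       => 'FOR ALL COLUMNS SIZE 254',
    estimate_percent => 100  -- any value > 0 disables top-frequency/hybrid histograms
  );
END;
/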

Another possibility for preventing the DBMS_STATS package from creating top-frequency and hybrid histograms is to set the undocumented preferences ENABLE_TOP_FREQ_HISTOGRAMS and ENABLE_HYBRID_HISTOGRAMS. For example, to set them globally, you can use the following PL/SQL block:

BEGIN
  dbms_stats.set_global_prefs(pname => 'ENABLE_HYBRID_HISTOGRAMS', pvalue => 0);
  dbms_stats.set_global_prefs(pname => 'ENABLE_TOP_FREQ_HISTOGRAMS', pvalue => 0);
END;
/

WARNING: Even though, according to my tests, the preferences do what I expect (i.e. disable top-frequency and hybrid histograms), I have never used them in a real case and I did not ask Oracle about their usage. Hence, if you want to use them, I would not only test them carefully, but also ask Oracle Support whether their utilization can lead to known problems.

Before setting them, I also advise you to get their (default) values with a query like the following one:

SELECT dbms_stats.get_prefs('ENABLE_HYBRID_HISTOGRAMS') AS hybrid,
       dbms_stats.get_prefs('ENABLE_TOP_FREQ_HISTOGRAMS') AS top_freq
FROM dual;

Finding Oracle VM Manager Core Management Port After OVM Manager Installation

Recently I was working on the integration of OVM Manager with Oracle EM Cloud Control 12cR4 and I had to enter the "Oracle VM Manager URL", a.k.a. the "Core management port", which is reported at the end of the OVM Manager installation. The default value for this port is 54321. As I was not in a position to get this post-installation report, because the installation was performed by another company, I had to find out the right port. The definition should be entered in the form tcp://<hostname>:<port>.

[Screenshot: EM Cloud Control and OVM Manager integration screen]
[Screenshot: OVM Manager installation report sample]

One of the first ideas was to check the local firewall configuration and see if there was anything for port 54321:

# iptables -L | grep 5432
ACCEPT     tcp  --  anywhere  anywhere   state NEW tcp dpt:54322

The search showed open port 54322, which could also have been a possibility.
Searching with netstat revealed no activity on port 54322:

# netstat -vatn | grep 54322

However, a lot of rows were returned when searching for port 54321:

# netstat -vatn | grep 54321
tcp        0      0 ::ffff:127.0.0.1:54321      :::*                        LISTEN
tcp        0      0 ::ffff:127.0.0.1:36797      ::ffff:127.0.0.1:54321      ESTABLISHED
tcp        0      0 ::ffff:127.0.0.1:51828      ::ffff:127.0.0.1:54321      ESTABLISHED
tcp        0      0 ::ffff:127.0.0.1:53096      ::ffff:127.0.0.1:54321      ESTABLISHED
tcp        0      0 ::ffff:127.0.0.1:55461      ::ffff:127.0.0.1:54321      ESTABLISHED
tcp        0      0 ::ffff:127.0.0.1:53017      ::ffff:127.0.0.1:54321      ESTABLISHED

But I was still not 100% sure whether iptables showed the right port, so further digging through the documentation revealed it. The definition of the "Core management port" is hidden in the .config file, under the name COREPORT, in the OVM Manager home directory: /u01/app/oracle/ovm-manager-3/.config

The contents of this file in my case was:

# cat  /u01/app/oracle/ovm-manager-3/.config

DBTYPE=MySQL
DBHOST=localhost
SID=ovs
LSNR=49500
OVSSCHEMA=ovs
APEX=8080
WLSADMIN=weblogic
OVSADMIN=admin
COREPORT=54321
UUID=0004fb000001000075e54bae74172d82
BUILDID=3.3.1.1065

The explanation of the configuration items can be found in chapter "5.1 Backing up and Restoring Oracle VM Manager", and obviously in my case the port used was actually the default one, 54321.
The .config file should be part of the backup.

Hope this post will shorten the search time for this kind of information for my readers.

10046.pl by Clive Bostock

#555555;">Note:

#555555; padding-left: 30px;">This is a reposting of an old blog post that was on dboptimizer.com but is no longer accessible

#555555; padding-left: 30px;">More trace file analyzer tools at  #2970a6;" href="http://ba6.us/node/177">http://ba6.us/node/177

#555555; padding-left: 30px;">Related blog post: Oracle “Physical I/O” ? not always physical with a 10046 parser specifically for I/O parsetrc.pl and readme


#555555;">Often when I have a 10046 trace file, especially when looking at I/O issues, I want a histogram of I/O response time.  To get I/O response time I’ve hacked out incomple awk scripts from time to time, always meaning to write a more complete one, well now I don’t have to. It’s already been done!

#555555;">Here is a cool perl script from Clive Bostock: #2970a6;" href="https://sites.google.com/site/oraclemonitor/10046-pl-readme">README.TXT    #2970a6;" href="https://sites.google.com/site/oraclemonitor/10046-pl-1">10046.pl

#555555;">(also checkout #2970a6;" href="http://www.oracledba.ru/orasrp/">orasrp which produces a more indepth report in HTML. I like both. I like 10046.pl as a short  easy portable script that I can modify, whereas orasrp is a binary and only works on some ports)

#555555;">For example, if I trace a session with 10046, and retrieve the tracefile, then I can run:

#555555;">$ 10046.pl -t  mytrace.trc

#555555;">and it will output  a header and three sections

#555555;">Header

    #555555;">
  • Summary of all events for tracefile
  • Events by object summary
  • Events by object histogram

#555555;">This looks like

#555555;">Header

#555555;">Trace file mytrace.trc
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u02/oracle
System name:    SunOS
Node name:      toto
Release:        5.10
Version:        Generic_142900-12
Machine:        sun4u
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 177
Unix process pid: 16553, image: oracle@toto
Trace input file : mytrace.trc

#555555;">Wait summary

#555555;">EVENT AGGREGATES
================
Wait Event              Count Elapsed(ms)   Avg Ela (ms)  %Total
~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~ ~~~~~~~~~~~~ ~~~~~~~~~~~~ ~~~~~~~~~~
db file sequential read  2715      11593              4    3.74
       direct path read  4484       4506              1    1.45
 db file scattered read   141        898              6    0.29
          log file sync     3          8              2    0.00
                              ~~~~~~~~~~~~
             Total Elapsed:       309821

#555555;">Wait Summary by object

#555555;">Object Id  : Wait Event                  Count Tot Ela (ms) %Total Avg Ela (ms)
~~~~~~~~~~ : ~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~ ~~~~~~~~~~~~ ~~~~~ ~~~~~~~~~~~~~
28581      : direct path read            4484         4506   1.45            1
1756433    : db file sequential read      725         1891   0.61            2
764699     : db file sequential read      332         1762   0.57            5
37840      : db file sequential read      200         1044   0.34            5
38018      : db file sequential read      108         1009   0.33            9
81596      : db file scattered read       140          887   0.29            6

#555555;">wait histogram by object

#555555;">EVENT HISTOGRAM BREAKDOWN
===========================
This section splits the event counts into elapsed time
buckets so that we can see if there are any suspicious
or anomalous response time / frequency patterns.
Object Id : Wait Event              <1ms <2ms <4ms <8ms <16ms <32ms <64ms <128ms <256ms <512ms >=1024ms
~~~~~~~~~ : ~~~~~~~~~~~~~~~~~~~~~~~ ~~~~ ~~~~ ~~~~ ~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~~ ~~~~~~ ~~~~~~ ~~~~~~~~
28581     : direct path read        7680   87  148  221  144    40     4     0      0       0        0
1756433   : db file sequential read  606  268   45   35   66     6     2     0      0       0        0
764699    : db file sequential read   74  119   11   78   78     9     0     0      0       0        0
37840     : db file sequential read   50   72    6   45   47     5     0     0      0       0        0
38018     : db file sequential read   12   38    7   10   30    12     5     0      0       0        0
81596     : db file scattered read    64    4   13   62   18     8     3     0      0       0        0
41995     : db file sequential read   20   39    0    7   16     8     4     0      0       0        0
108718    : db file sequential read   74   54    5   12   24     4     0     0      0       0        0
33490     : db file sequential read    0    5   11   25   19     4     0     0      0       0        0