Oakies Blog Aggregator

No Data Loss without Synchronous Network

I don’t usually write about specific products; but once in a while I encounter something worth talking about, especially if it addresses a very common issue anyone with datacenter management responsibilities will understand: avoiding the otherwise unavoidable loss of last-minute data changes in the database after a disaster, without paying for expensive synchronous replication. This product solves that problem with an innovative, out-of-the-box approach. The product is Phoenix Systems from Axxana.

Replication and Last Mile Transactions

Imagine this scenario: the datacenter is on fire. All components—servers, storage, network—are destroyed. You can’t salvage anything of value. What happens to the data you need to continue your business? Fortunately you have a disaster recovery plan: you replicate the database to a remote datacenter using either Data Guard (from Oracle) or a storage-level technology such as EMC’s SRDF. So the database comes up on the remote site; but there is a problem. Since you used asynchronous replication (such as maximum performance mode in Data Guard), not all changes have made it to the remote site. The final few changes, usually known as last-mile transactions, were yet to be shipped to the remote site when the disaster struck. These are lost forever in the fire. While you do have a database at the remote datacenter, and that database is almost complete—the operative word is “almost”; not 100%, mind you—some data is inevitably lost. What if you can’t afford to be “almost” complete? For organizations such as financial institutions, hospitals, manufacturing concerns and airlines, losing data is simply not an option; this is not a luxury they can afford. Even in organizations where data loss may be more tolerable, losing data creates a sense of uncertainty, since you don’t know exactly what was lost.

If you need to be 100% certain that all data is available at the remote site, what would you do?
It’s easy, using synchronous replication. All changes on the primary database are reflected at the remote database site in real time. If you used Data Guard, you would have run it in maximum protection mode. In that case, Oracle commits the changes in the remote database before confirming the commit at the local site. When the primary site is gone, the remote site is available with 100% of the data. Bye bye, “almost”.
Brilliant. So, why didn’t you do it? Better yet, why doesn’t everyone do it, if it is that simple? Data loss could be unacceptable in the worst case and unsettling at best. Why accept it?

Synchronous Replication

Well, there is a pesky little detail. Synchronous replication means the network connectivity has to be rock solid. Networks have three general characteristics: throughput (how much data they can pump; imagine a three-lane highway compared to a single-lane local road), latency (how much time it takes to get the data through, not the raw speed; the car on the highway may travel at 70 miles per hour, but it still spends a few minutes on the ramp to the highway) and reliability (does the data move from point A to point B with 100% accuracy, or does it have to be re-transmitted a few times?).

Synchronous replication requires the throughput to be very high (to support the large amount of change, at least during bursts of activity), the latency to be very low (otherwise the remote site receives the data late and responds to the primary even later, forcing the primary to hold off the commit) and the network to be extremely reliable (otherwise the primary may conclude that the remote site is not accessible and shut itself down to protect the data). If you run Data Guard in maximum protection mode, you are essentially telling Oracle to make sure that the remote site has absolutely, positively, undoubtedly (add any other adverb you can think of) received the change and committed it.
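For context, here is a minimal sketch (my own, not from the article) of putting Data Guard into maximum protection mode on the primary; the service name standby_tns and DB_UNIQUE_NAME standby_db are placeholders, and depending on the version the mode change may require the primary to be mounted rather than open:

-- Redo transport to the standby must be synchronous and acknowledged (SYNC AFFIRM).
-- Service name and DB_UNIQUE_NAME below are placeholders.
alter system set log_archive_dest_2 =
  'SERVICE=standby_tns SYNC AFFIRM VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=standby_db'
  scope=both;

-- Raise the protection mode: a commit is confirmed only after the standby has
-- the redo, and the primary stalls or shuts down if it cannot guarantee that.
alter database set standby database to maximize protection;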

If Oracle can’t ensure that for any reason, such as not getting a response in time over a less reliable network, what choices does it have? If it allows further data changes, those changes are not at the remote site yet; if the primary database then fails, the data is gone. Therefore Oracle has no choice but to shut down the primary database to stop any more changes coming in. It’s maximum protection, after all, and that’s what you instructed it to do.
So, if you decide to use maximum protection, you have to use a high-throughput, low-latency and extremely reliable network infrastructure. Most public commercial network infrastructures are not. Either you contract a commercial carrier to provide this elevated level of service, or you build your own, for example using dark fiber. The cost of the network infrastructure becomes exponentially higher as the remote site moves farther from the primary. In many cases, the cost of the network itself may be several times that of the database infrastructure it protects. Owing to the cost, you may be forced to locate the remote site close by, e.g. in New York City and Hoboken, NJ. It will still be exceedingly expensive, and it may not offer the degree of protection you expect: these two cities are close enough to sit in the same exposure area for disasters such as floods, hurricanes, war and so on. The farther away the remote site is, the better protected your data is, and the higher your cost. Could you accomplish no data loss without this expensive proposition?
Until now, the decision was really black and white: if you wanted no data loss at all, you had no choice but to go for a super-expensive network solution. Many companies can’t justify that sticker price and therefore settle for potential data loss. Many, in fact, make detailed plans, as part of their business continuity efforts, for handling this lost data.
It’s been assumed that zero data loss is synonymous with an expensive network: if you don’t have a high-throughput, low-latency, highly reliable network, you have to live with some data loss.
Here is the good news: that assumption no longer holds. It is now possible to run on cheaper, commoditized public network infrastructure and still have complete data protection. Allow me to explain.

Data Protection in Oracle

In Oracle, the data files are written asynchronously, at intervals unrelated to the data changes and commits. In other words, when you commit a change, the data files may not yet have that changed data. In fact the change occurs only in memory (in the buffer cache) and may not reach the data files for hours afterwards. Conversely, when you make a change but do not commit it, the data can still be persisted to the data files. Let me repeat that: the data files are updated with the changed data even if you haven’t committed yet. This is the reason why, if you have a storage or operating system level replication solution—even synchronous—replicating the data files, the remote site may or may not have the data, even hours after the change.
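As a small illustration of that point (my addition, not the author’s), you can see how many changed blocks are sitting in the buffer cache waiting to be written to the data files:

-- blocks in the buffer cache that have been changed but not yet written out
select count(*) as dirty_buffers
from   v$bh
where  dirty = 'Y';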
How does Oracle protect data that was changed and committed but exists only in memory, if the data files do not have it? It captures the pre- and post-change data and packages it into something called redo blocks. Remember, these have nothing to do with data blocks; they merely describe the changes made by activity in the database. This redo data—also known as the redo vector—is written to a special area of memory called the log buffer. When you commit, the relevant redo blocks in the log buffer are written to special files in the database called redo log files, also known as online redo log files. The commit waits until this write—known as redo flushing—has completed. You can check for Oracle sessions waiting on this flush by looking at the wait event “log file sync”. Since the changes—most importantly, committed changes—are recorded in the redo log files, Oracle does not need to rely on memory alone to know which changes are committed and which are not. In case of a failure, Oracle examines the redo logs to find these changes and updates the data files accordingly. Redo logs are very small compared to the data files.
By the way, redo flushing also occurs at other times: every three seconds, every 1 MB of log buffer filled, when a third of the log buffer is full, and on a few other events; but those additional flushes merely keep the redo log file up to date even when there is no commit.
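For example, a quick way to look at this wait through the standard dynamic performance views might be:

-- sessions currently waiting for their redo flush to complete
select sid, event, seconds_in_wait
from   v$session
where  event = 'log file sync';

-- cumulative picture since instance startup
select event, total_waits, time_waited
from   v$system_event
where  event = 'log file sync';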
As you can see, the redo log file becomes the most important thing in the data recovery process. When a disaster occurs, you may have copies of the data files at the remote site (thanks to the replication); but as you learned in the previous section those copies are not immediately usable, since they may be missing committed changes and may even contain uncommitted ones. In other words, this copy is not considered “consistent” by Oracle. After a disaster, Oracle needs to read the redo log files and apply the changes to make the data files consistent. This is known as “media recovery”, and you have to initiate it. With synchronous replication at the storage or operating system level, the redo logs are perfectly in sync with the primary site and Oracle has no trouble getting to the last committed transaction just before the failure; no data is lost as a result of the recovery process. With Data Guard in maximum protection mode, this is not even required, since the changes are already applied at the remote site. But what about your cheaper, commodity network with asynchronous replication? The redo logs at the remote site will not be up to date with the primary site’s redo logs. When you perform media recovery, you can’t get to the very last change before the failure, simply because you may not have it. You can recover only up to the last information available in the redo at the remote site. This is known as “incomplete” media recovery, as distinguished from the “complete” media recovery described earlier. To complete the media recovery you need the information in the redo log files at the primary site; but remember, the primary site is destroyed. That information is gone. You end up with data loss. Perhaps even worse, you won’t even know exactly how much you lost, since you don’t have access to the primary redo log files.
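To make the distinction concrete, here is a rough SQL*Plus sketch of the two outcomes at the remote site; the exact commands vary with how the copy was taken, and a Data Guard standby has its own failover procedure:

-- complete media recovery: every required redo/archived log is available,
-- so recovery rolls forward to the last committed change
recover database;

-- incomplete media recovery: the final primary redo is missing, so you roll
-- forward only as far as the redo you have, then open with RESETLOGS (data loss)
recover database until cancel;
alter database open resetlogs;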
Now, consider this situation carefully. All you need is the last redo log file from the primary database to complete the recovery. Unfortunately that file is not available, because it has been destroyed or because the site itself is inaccessible. This tiny little file is the only thing standing between you and a complete recovery. What if you somehow, magically, had access to this file even though the rest of the data center is gone? You would be able to complete the recovery with no data loss and look like a hero, and all without a synchronous replication solution and its super-expensive network.

Enter the Black Box

Oracle’s redo data can be written to multiple files at the same time. Such a set of files is called a redo log group and the files are called members. Redo flushing writes to all members before confirming the flush, so all members of a group contain the same data; multiple members exist only for redundancy. As long as one member of a group is available, Oracle can use it to perform recovery. This is where the new tool from Axxana comes in. It is a storage device—named Phoenix—on which you create the second member of each redo log group. The first member of the group stays on your normal storage as usual. When disaster strikes and nukes the primary site, you have a copy of the all-important redo logs in the Phoenix system.
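Setting that up is ordinary redo log multiplexing; a sketch, with the /axxana mount point purely illustrative:

-- add a second member of each redo log group on the Phoenix storage
alter database add logfile member '/axxana/redo_g1_m2.log' to group 1;
alter database add logfile member '/axxana/redo_g2_m2.log' to group 2;
alter database add logfile member '/axxana/redo_g3_m2.log' to group 3;

-- confirm the members of each group
select group#, member from v$logfile order by group#;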
This is where the first benefit of the Phoenix system comes in. The storage is not just any ordinary device: it’s encased in a bomb-proof, fire-proof and water-proof container which protects the internal storage from many calamities. The storage has normal connectivity, such as a network port to attach it to the network as a NAS and a fiber port to attach it to a fiber switch as a SAN. Under ordinary circumstances you use these ports to connect the storage to your infrastructure. After a disaster, thanks to the near-indestructible enclosure, this storage will most likely be intact. All you have to do is access the data on it and perform a complete recovery. No data needs to be lost.
But how do you get to the information on the Phoenix system? This is where the second benefit comes in. The Phoenix system transfers its data to another Phoenix component at the remote site, creating a replica of the required redo logs there. Since the only data on it is the set of small redo log files, the amount to transfer is very small and does not burden your other network infrastructure.
Note a very important point: the Phoenix system does not transfer any data during normal operation. Only during a disaster does it transport the required redo log files, allowing a 100% complete recovery at the remote site.


But it still depends on getting the data out of the Phoenix system that was at the primary site. What if the disaster site is physically inaccessible, or it is infeasible to physically transport the Phoenix system to a location where it can be connected to your network? That’s quite possible with floods, hurricanes and other natural disasters, or man-made ones like war or strikes. The network cables are also likely to be out of commission after a disaster. Without that access, how can the Phoenix system extract the needed last-mile transactions from the primary site, you might ask.

No worries; there is a cellular modem built into the Phoenix system that allows you to connect to it from the remote site and transfer the data wirelessly over a cellular network. The system also has its own battery, which keeps it operational even when external power is gone—a common occurrence in almost any calamity. What’s more, the transfer of data after the disaster can also use this cellular connectivity, so you may not even need to physically connect the storage at the (now defunct) primary site to your network. The data you need to perform the complete, no-data-loss recovery may already have been transferred to the system at the remote site and be waiting for you. In any case, you have access to the data. And all this comes without the need to invest in synchronous replication and expensive network infrastructure.

In summary, assuming the primary site is gone and you are using a cheaper asynchronous network, the remote site will be only partially up to date and the last-mile transactions would normally be lost along with the primary database. However, since you used the disaster-proof Phoenix system for an additional member of each redo log group, the last-mile transactions will be intact in that system; the remote site just won’t have them yet. At this point you have multiple options:

1) If the network infrastructure between the primary and remote sites is still operational (rare, but possible), the Phoenix data transfer creates the copy of the redo logs at the remote site automatically. With that in place, you can perform a complete database recovery. Data loss: 0%.

2) If the network infrastructure is not operational, Phoenix automatically engages the built-in cellular modem and initiates the data transfer to the remote site. It will be slow, but the amount of data to be transferred is small, so that won’t matter much. Once the transfer completes, your recovery will be complete with 0% data loss.

3) If you don’t have cellular access either, but do have access to the physical Phoenix device, you can mount it on your network and perform a complete recovery.

As I said at the start, this is a very simple and elegant out-of-the-box solution to a complex problem. It opens up the possibility of a no-data-loss recovery in case of a disaster, where that option didn’t even exist before. In a future post I will describe in detail my hands-on experience with the Phoenix system, with screenshots, scripts, etc. Stay tuned.
More about Axxana: www.axxana.com

Basic (newbie) install CoreOS on VirtualBox – Getting started with Docker

I got intrigued by this Dutch article mentioning Docker and CoreOS. So on this Saturday,…

Collaborate ’15

It’s time for another phenomenal Collaborate conference next week, made even more fabulous by the fact that I’ll be there for the first time! :) For some reason, Collaborate is one of the few big user group conferences I haven’t made it to before, so when my boss asked me to go and present at it I leapt at the chance.

Originally, I had one presentation timeslot, but then Alex Gorbachev of Pythian / Oak Table fame noticed I was attending and asked me to do a couple of slots at Oak Table World on the Wednesday. Then the organizers had a drop-out and asked Seth Miller and me to fill it in, so my dance card is starting to look a bit more full! The dates and times when I will be presenting currently stand at:

  • Enterprise Manager Cloud Setup and PDBaaS – Seth and me. Banyan B => Tue, April 14 12:15 PM – 12:45 PM
  • Knowledge Sharing – Why Do It? Mandalay Ballroom K => Wed, April 15 10:30 – 11:10 AM
  • SnapClone++. Mandalay Ballroom K => Wed, April 15, sometime between 2:00 PM and 3:00 PM
  • Data Clone and Refresh Made Easy with Enterprise Manager 12c Snap Clone. Palm D => Thu, April 16, 2015 11:00 AM – 12:00 PM

The first session is also part of the virtual conference. Seth will spend 15 minutes on a quick overview of setup required to support cloud in EM and then I’ll cover PDBaaS for the remaining time.

The Knowledge Sharing session is a different sort of talk for me. For a start, it’s non-technical. Part of it will cover the sorts of knowledge sharing I’ve been doing over the past 20+ years, and then I’ll branch into the pros and cons of a few different types of knowledge sharing. It should be an interesting experience!

The SnapClone++ session is a lightning talk. As Alex describes them, “Lightning Talks are 10-minute presentations done in a rapid-fire fashion. They are always a huge success—you’ve got to be there! They may be technical, motivational, or inspirational, but regardless they are always cool speeches. The sequence of the talks may change, but everything will be presented within the hour.”

The final one is back to the normal sorts of presentations I’ve been doing for many years. It will cover why you would want to use Snap Clone, and includes a demo of how the functionality actually works.

Apart from all that, I’ll be spending some time on the Demo Grounds, as well as venturing into some of the EM SIG meetings. Hopefully at some time you can catch up and have a chat, so come on by! As I’m still travelling in a wheelchair, it should be easy to spot me. :)

Counting

There’s a live example on OTN at the moment of an interesting class of problem that can require some imaginative thinking. It revolves around a design that uses a row in one table to hold the low and high values for a range of values in another table. The problem is then simply to count the number of rows in the second table that fall into the range given by the first table. There’s an obvious query you can write (a join with inequality) but if you have to join each row in the first table to several million rows in the second table, then aggregate to count them, that’s an expensive strategy.  Here’s the query (with numbers of rows involved) that showed up on OTN; it’s an insert statement, and the problem is that it takes 7 hours to insert 37,600 rows:


    INSERT INTO SA_REPORT_DATA
    (REPORT_ID, CUTOFF_DATE, COL_1, COL_2, COL_3)
    (
    SELECT 'ISRP-734', to_date('&DateTo', 'YYYY-MM-DD'),
           SNE.ID AS HLR
    ,      SNR.FROM_NUMBER||' - '||SNR.TO_NUMBER AS NUMBER_RANGE
    ,      COUNT(M.MSISDN) AS AVAILABLE_MSISDNS
    FROM
           SA_NUMBER_RANGES SNR          -- 10,000 rows
    ,      SA_SERVICE_SYSTEMS SSS        --  1,643 rows
    ,      SA_NETWORK_ELEMENTS SNE       --    200 rows
    ,      SA_MSISDNS M                  --    72M rows
    WHERE
           SSS.SEQ = SNR.SRVSYS_SEQ
    AND    SSS.SYSTYP_ID = 'OMC HLR'
    AND    SNE.SEQ = SSS.NE_SEQ
    AND    SNR.ID_TYPE = 'M'
    AND    M.MSISDN  >= SNR.FROM_NUMBER
    AND    M.MSISDN  <= SNR.TO_NUMBER
    AND    M.STATE  = 'AVL'
    GROUP BY
           SNE.ID,SNR.FROM_NUMBER||' - '||SNR.TO_NUMBER
    )  

The feature here is that we are counting ranges of MSISDNs: we take 10,000 number ranges (SNR) and join with inequality to a 72M row table. It’s perfectly conceivable that at some point the data set expands (not necessarily all at once) to literally tens of billions of rows that are then aggregated down to the 37,600 that are finally inserted.

The execution plan shows the optimizer joining the first three tables before doing a merge join between that result set and the relevant subset of the MSISDNs table – which means the MSISDNs have to be sorted and buffered (with a probable spill to disc) before they can be used. It would be interesting to see the rowsource execution stats for the query – partly to see how large the generated set became, but also to see if the ranges involved were so large that most of the time went in constantly re-reading the sorted MSISDNs from the temporary tablespace.

As far as optimisation is concerned, there are a couple of trivial things around the edges we can examine: we have 10,000 number ranges but insert 37,600 results, and the last stages of the plan generated those results, so we’ve scanned and aggregated the sorted MSISDNs 37,600 times. Clearly we could look for a better table ordering that eliminated any unwanted number ranges early, did the minimal number of joins to MSISDN, aggregated, then scaled up to 37,600: with the best join order we might reduce the run time by a factor of 3 or more. (But that’s still a couple of hours run time.)

What we really need to do to make a difference is change the infrastructure in some way – preferably invisibly to the rest of the application. There are a number of specific details relating to workload, read-consistency, timing, concurrency, etc. that will need to be considered, but broadly speaking, we need to take advantage of a table that effectively holds the “interesting” MSISDNs in sorted order. I’ve kept the approach simple here; it needs a few modifications for a production system. The important bit of the report is the bit that produces the count, so I’m only going to worry about a two-table join – number ranges and msisdns; here’s some model data:


execute dbms_random.seed(0)

create table msisdns
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        trunc(dbms_random.value(1e9,1e10))      msisdn
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;

create table number_ranges
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        trunc(dbms_random.value(1e9,1e10))      from_number,
        trunc(dbms_random.value(1e9,1e10))      to_number
from
        generator       v1
where
        rownum  <= 1000
;

update number_ranges set
        from_number = to_number,
        to_number = from_number
where
        to_number < from_number
;

commit;

I’ve created a table of numbers with values between 1e9 and 1e10 to represent 1 million MSISDNs, and a list of 1,000 number ranges – making sure that the FROM number is not greater than the TO number. Now I need a “summary” table of the MSISDNs, which I’m going to create as an index-organized table:


create table tmp_msisdns (
        msisdn,
        counter,
        constraint tmp_pk primary key (msisdn, counter)
)
organization index
as
select
        msisdn,
        row_number() over(order by msisdn)      counter
from
        msisdns
;

This is only a demonstration, so I haven’t bothered with production-like code to check that the MSISDNs I had generated were unique (they were); and I’ve casually included the row_number() as part of the primary key as a performance fiddle, even though it’s something that could, technically, allow some other program to introduce bad data if I made the table available for public use rather than task-specific.

Finally we get down to the report. To find out how many MSISDN values there are between the FROM and TO numbers in a range I just have to find the lowest and highest MSISDNs from tmp_msisdns in that range, take the difference between their counter values, and add 1. And there’s a very fast way to find the lowest or highest value when you have the appropriate index – the min/max range scan – but you have to access the table twice, once for the low, once for the high. Here’s the necessary SQL, with execution plan from 12.1.0.2:


select
        nr.from_number, nr.to_number,
--      fr1.msisdn, fr1.counter,
--      to1.msisdn, to1.counter,
        1 + to1.counter - fr1.counter range_count
from
        number_ranges   nr,
        tmp_msisdns     fr1,
        tmp_msisdns     to1
where
        fr1.msisdn = (
                select min(msisdn) from tmp_msisdns where tmp_msisdns.msisdn >= nr.from_number
        )
and     to1.msisdn = (
                select max(msisdn) from tmp_msisdns where tmp_msisdns.msisdn <= nr.to_number
        )
;

-------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |               |       |       |  4008 (100)|          |
|   1 |  NESTED LOOPS                   |               |  1000 | 38000 |  4008   (1)| 00:00:01 |
|   2 |   NESTED LOOPS                  |               |  1000 | 26000 |  2005   (1)| 00:00:01 |
|   3 |    TABLE ACCESS FULL            | NUMBER_RANGES |  1000 | 14000 |     2   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN             | TMP_PK        |     1 |    12 |     2   (0)| 00:00:01 |
|   5 |     SORT AGGREGATE              |               |     1 |     7 |            |          |
|   6 |      FIRST ROW                  |               |     1 |     7 |     3   (0)| 00:00:01 |
|*  7 |       INDEX RANGE SCAN (MIN/MAX)| TMP_PK        |     1 |     7 |     3   (0)| 00:00:01 |
|*  8 |   INDEX RANGE SCAN              | TMP_PK        |     1 |    12 |     2   (0)| 00:00:01 |
|   9 |    SORT AGGREGATE               |               |     1 |     7 |            |          |
|  10 |     FIRST ROW                   |               |     1 |     7 |     3   (0)| 00:00:01 |
|* 11 |      INDEX RANGE SCAN (MIN/MAX) | TMP_PK        |     1 |     7 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("FR1"."MSISDN"=)
   7 - access("TMP_MSISDNS"."MSISDN">=:B1)
   8 - access("TO1"."MSISDN"=)
  11 - access("TMP_MSISDNS"."MSISDN"<=:B1)

Execution time – with 1 million MSISDNs and 1,000 ranges: 0.11 seconds.

For comparative purposes, and to check that the code is producing the right answers, here’s the basic inequality join method:


select
        nr.from_number, nr.to_number, count(*) range_count
from
        number_ranges   nr,
        msisdns         ms
where
        ms.msisdn >= nr.from_number
and     ms.msisdn <= nr.to_number
group by
        nr.from_number, nr.to_number
order by
        nr.from_number
;

-----------------------------------------------------------------------------------------------
| Id  | Operation             | Name          | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |               |       |       |       |   472K(100)|          |
|   1 |  HASH GROUP BY        |               |   707K|    14M|  6847M|   472K (17)| 00:00:19 |
|   2 |   MERGE JOIN          |               |   255M|  5107M|       | 13492  (77)| 00:00:01 |
|   3 |    SORT JOIN          |               |  1000 | 14000 |       |     3  (34)| 00:00:01 |
|   4 |     TABLE ACCESS FULL | NUMBER_RANGES |  1000 | 14000 |       |     2   (0)| 00:00:01 |
|*  5 |    FILTER             |               |       |       |       |            |          |
|*  6 |     SORT JOIN         |               |  1000K|  6835K|    30M|  3451   (7)| 00:00:01 |
|   7 |      TABLE ACCESS FULL| MSISDNS       |  1000K|  6835K|       |   245  (14)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - filter("MS"."MSISDN"<="NR"."TO_NUMBER")
   6 - access("MS"."MSISDN">="NR"."FROM_NUMBER")
       filter("MS"."MSISDN">="NR"."FROM_NUMBER")

The two queries produced the same results (apart from ordering); but the second query took 2 minutes 19.4 seconds to complete.

 

Update:

In a moment of idle curiosity I recreated the data with 40 million rows in the MSISDNs table to get some idea of how fast the entire report process could go when re-engineered (remember the OP has 72M rows, but selects only the subset flagged as ‘AVL’). It took 1 minute 46 seconds to create the IOT – after which the report for 1,000 number ranges still took less than 0.2 seconds.

 

 

 

 

 

Avoiding the COMMIT bomb!

I’m an Oracle dinosaur, so I like using SQL Plus.  It’s simple, fast, comes with every version, installation and platform, and I’m very familiar with it.  (And who knows, it might still be at the forefront of the Oracle development teams!  http://www.slideshare.net/hillbillyToad/sqlcl-overview-a-new-command-line-interface-for-oracle-database )

But there is one important thing I always take care of when I’m using SQL Plus, and it’s easiest to explain with an example.

You start off by wanting to delete a couple of rows from a critical table as part of a patching process.  You type this:

[screenshot: a DELETE statement issued against the critical table, with no WHERE clause]

Uh oh….. I forgot the WHERE clause.  That’s not so good…

Now, if I keep my cool, I can let that command finish and then type rollback, and the damage is limited to potentially blocking other users for a little while.

But of course, what normally happens is that you see your command, your jaw drops to the floor, and you grab your mouse, race up to the top left or right corner and click to close that window:

[screenshot: closing the SQL Plus window while the DELETE is still running]

 

Phew!  Crisis averted – your statement never finished, so Oracle will happily clean up after you and roll all the deletions back.

But that is a very, very risky thing to do… because what if, JUST WHEN YOU CLICK, the delete completes, and the screen looks like this:

[screenshot: the DELETE has completed just as the window is closed]

Well… by default, when you exit SQL Plus, it will commit any outstanding changes.  And since your delete has completed… you just wiped out your table, and you’re off to LinkedIn looking for a new place to work :-(

There is a better way.  In your login.sql file for SQL Plus, make sure you always have the following:

set exitcommit off

You never, ever want to be committing (with any tool) unless you explicitly request it.
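For reference, a minimal login.sql sketch (assuming the file sits somewhere on your SQLPATH so SQL Plus picks it up automatically):

-- login.sql: runs automatically at SQL Plus startup
set exitcommit off     -- closing or exiting SQL Plus rolls back instead of committing
-- you can verify the current setting interactively with:  show exitcommit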

Video Tutorial: XPLAN_ASH Active Session History - Part 5

The next part of the video tutorial explaining the XPLAN_ASH Active Session History functionality, continuing the actual walk-through of the script output.

More parts to follow.

DBAKevlar at Collaborate 2015

Here’s a quick guide to my schedule of sessions at IOUG Collaborate 2015 for this year.  I’m looking forward to seeing everyone next week, (I’ll be arriving on Saturday, so I really should say in a couple days) and have no doubt we’re up for another phenomenal conference at Las Vegas, this year at the Mandalay!

Additionally, there are some great opportunities in professional development at the IOUG Strategic Leadership Program.  I’ll be speaking with James Lui on how to improve your value with personal brands and social media.

I’m the WIT luncheon speaker on Monday, so for those of you who signed up for this great yearly event for women in tech before it sold out, I’m looking forward to meeting as many as possible of the powerful women in our industry (and hopefully a few men, too!) before and after the event.

There are also a few SIGs that I’ll be attending that aren’t on the schedule:

Tue, 12:30 p.m. – 1:00 p.m. – Cloud Computing and Virtualization SIG Meeting (ID: 943)

Wed, 12:30 p.m. – 1:00 p.m. – Oracle Enterprise Manager SIG Meeting (ID: 949)

Session ID: Title (Product Line(s) | Track). Room => Date and Time

  • 976: Everything I Needed to Know About Enterprise Manager I Learned at COLLABORATE – Hands-on Lab (Oracle Enterprise Manager | Manageability). Palm D => Sun, Apr. 12, 9:00 am – 1:00 pm
  • 0: Zero to Manageability in One Hour: Build a Solid Foundation for Oracle Enterprise Manager 12c (Oracle Cloud – DBaaS/PaaS | Oracle Database | Oracle Enterprise Manager | Manageability). Banyan B => Mon, Apr. 13, 9:15 – 10:15 am
  • 112: The Power of the AWR Warehouse (Oracle Enterprise Manager | Manageability). Banyan B => Mon, Apr. 13, 3:15 – 4:15 pm
  • 967: IOUG Strategic Leadership Program: Staying on Top of Key Trends and Growing Your Personal Brand with Social Media (Applicable to All | Professional Development). Banyan E => Mon, Apr. 13, 4:30 – 5:30 pm
  • 1003: OakTable: TED-Style Lightning Talks (Applicable to All | Database). Mandalay K => Wed, Apr. 15, 2:00 – 3:00 pm
  • 986: Twitter 101 for Oracle Professionals (Applicable to All | Professional Development). Banyan D => Mon, Apr. 13, 12:45 – 1:45 pm

Thanks to everyone at IOUG for the opportunity to be so involved with this great conference and see everyone soon!




Data Cloning and Refresh

For some time now, I’ve been creating blog posts that walk you through using some of the Enterprise Manager technology to perform specific tasks. For example, there’s one here on setting up Chargeback in EM 12.1.0.4. I’ve never been really satisfied with the way these blog posts turn out, as documenting a step-by-step process like this takes lots of screenshots and you end up with a post that’s a mile long. It also gives the impression that doing something can be quite complex, when in fact it might only take a few minutes to perform all those steps.

In this blog post, I thought I’d take a different approach. Recently, I recorded a couple of screenwatches for use on OTN. One was on creating a test master database with continuous flow using a Data Guard standby database (available here) and the other on how to use Snap Clone outside the cloud environment (available here).

Both those URLs stream off YouTube. If you want to download them and play them back off your own computer, here are the relevant links:

Continuous Flow – the relevant checksums if you want to make sure it has downloaded properly are:

CRC-32: f4fb51db
MD4: 8c3fbefff8cb8c464bf946739d30121b
MD5: a40a2a8232e53265b04ce789f8fa5738
SHA-1: ceafcb9e2d395a7756f25799fb58e4a7c11dbd10

Snap Clone outside the cloud environment – the relevant checksums if you want to make sure it has downloaded properly are:

CRC-32: 9e0cea20
MD4: dd9d87ac4fc78a65d1ef79cb93c3b690
MD5: 9101f17ed289714aeee6b42974abc085
SHA-1: 63c2e2228b4e33a7a0552484903b51883fc80da5

The screenwatches both start with some explanation of what you can achieve using EM to clone data in the way referred to in the video, and then walk you through the process of actually doing it. Of course, the only problem with loading videos (apart from the size) instead of screenshots is you will need to manually pause them if you want to stop and read the details of the screen. In any case, try them out and let me know (post a comment below if you like) which way you prefer – long step by step screenshot posts, or these shorter video versions.

Tuning Database XQuery Statements (2)

So we concluded the post “Tuning Database XQuery Statements (1)” with the following SQL statement…

Oaktable World Las Vegas April 15, 2015 at Collaborate


Wednesday April 15 at Collaborate 2015 Las Vegas room Mandalay K
For more information see Pythian’s Blog post.


What is Oaktable World? Oaktable World is a day organized by members of the Oaktable network, a network of Oracle database tuning geeks. Among the members are Tanel Poder, Jonathan Lewis, Cary Millsap and others. The Oaktable network was created by Mogens Nørgaard back in 2001 or so, and Mogens started organizing underground get-togethers, known as Oracle Closed World, during Oracle Open World in 2007, usually at Mogens’ “office” at Chevys. In 2010 Oracle Closed World became a bit more organized, with T-shirts as shown below. In 2012 Oracle kindly informed us we were not allowed to use “Oracle” in the group’s meeting name, so it was changed to Oaktable World. Since 2012, Oaktable Worlds have been hosted at Oracle Open World, UKOUG, Collaborate and DOAG.

 

[Image: Oracle Closed World 2010 T-shirt]

[Image: Oaktable World sponsor logos]