Search

Top 60 Oracle Blogs

Recent comments

Oakies Blog Aggregator

sqlldr, direct path loads and concurrency in 12.2 and later

In my previous post I showed you that Oracle’s SQL loader (sqlldr) utility has a built-in timeout of 30 seconds waiting for locked resources before returning SQL*Loader-951/ORA-604/ORA-54 errors and failing to load data. This can cause quite some trouble! Before showing you the enhancement in 12.2 and later, here is the gist of the previous post.

Concurrency in Oracle sqlldr 12.1 and earlier

To show you how sqlldr times out I need to simulate an exclusive lock on the table in sqlplus for example. That’s quite simple:

SQL> set time on
10:17:00 SQL> lock table t2 in exclusive mode;

Table(s) Locked.

10:17:07 SQL>

Next I started a sqlldr process in another session. Please refer to the previous post for details, or take my word that I’m using a direct path insert strategy. The only difference is the size of the input file – I had to inflate it considerably to buy some time running standard diagnostic tools:

$ date; sqlldr /@loader control=t2_2.ctl ; date
Tue 23 Jul 10:22:35 BST 2019

SQL*Loader: Release 19.0.0.0.0 - Production on Tue Jul 23 10:22:35 2019
Version 19.4.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

Path used:      Direct
SQL*Loader-951: Error calling once/load initialization
ORA-00604: error occurred at recursive SQL level 1
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired
Tue 23 Jul 10:23:06 BST 2019

The “date” commands reveal a timeout of 30 seconds. Setting ddl_lock_timeout has no bearing on the timeout: the database is waiting on kpodplck wait before retrying ORA-54:

10:22:36 SQL> r
  1* select event, state from v$session where program like 'sqlldr%'

EVENT                                                            STATE
---------------------------------------------------------------- -------------------
kpodplck wait before retrying ORA-54                             WAITING

1 row selected.

This was where I left off with the previous post until I noticed there is another option!

Oracle 12.2 and later

In 12.2 and later you can instruct sqlldr to wait until the lock is released. There is a new parameter named direct_path_lock_timeout:

$ sqlldr | egrep 'Version|direct_path'
Version 19.4.0.0.0
direct_path_lock_wait -- wait for access to table when currently locked  (Default FALSE)

Interestingly there are no hits for direct_path_lock_wait in My Oracle Support’s knowledgebase. There are merely a few documentation references. So what does this parameter do? While the table is still locked in exclusive mode, let’s start the sqlldr process with the new option:

$ date; sqlldr /@loader control=t2_2.ctl direct_path_lock_wait=true; date
Fri 26 Jul 10:13:54 BST 2019

SQL*Loader: Release 19.0.0.0.0 - Production on Fri Jul 26 10:13:54 2019
Version 19.4.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

Path used:      Direct
[ ... waiting ... ]

The process now sits there and waits … and it does so for more than 30 seconds. And instead of kpodplck wait before retrying ORA-54 it waits on … drums please: enq: TM contention!

10:20:11 SQL> select seq#, event, state, round(seconds_in_wait/60, 2) mins_waiting
10:20:21   2  from v$session where program like 'sqlldr%';

      SEQ# EVENT                    STATE               MINS_WAITING
---------- ------------------------ ------------------- ------------
       119 enq: TM - contention     WAITING                     6.53

10:20:26 SQL> 

This is great news if your sqlldr processes compete for TM enqueues and your load process takes a little longer than the previously hard coded timeout of 30 seconds. The process eventually completed successfully after the enqueue was released:

$ date; sqlldr /@loader control=t2_2.ctl direct_path_lock_wait=true; date
Fri 26 Jul 10:13:54 BST 2019

SQL*Loader: Release 19.0.0.0.0 - Production on Fri Jul 26 10:13:54 2019
Version 19.4.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

Path used:      Direct

Load completed - logical record count 950936.

Table T2:
  950924 Rows successfully loaded.

Check the log file:
  t2_2.log
for more information about the load.
Fri 26 Jul 10:21:32 BST 2019

Caveats

As with every pro, there are usually cons associated. The downside to waiting (for a potentially very long time) is that you might not notice load processes beginning to stack up unless proper instrumentation and monitoring are in place. Waiting too long for data to be loaded is equally bad as not loading at all because the end result is identical. As with many features in the database Oracle gives you plenty of options, and it’s up to the developers and architects to make the correct decisions on how to use them.

Summary

Beginning with sqlldr 12.2 Oracle introduced the option to wait for enqueues on segments to be released instead of aborting after 30 seconds.

In the next post I’ll write about another possibility to prevent exclusive table locks in the first place when running multiple concurrent sqlldr sessions.

sqlldr, direct path loads and concurrency in 12.1 and earlier

I have recently come across an interesting issue related to concurrent data loading into the Oracle database using sqlldr’s direct path mode. Although I investigated the situation on 12.1.0.2, I found that the same holds true in 19.4 as well when using the defaults. I reconstructed the case, although it is simplified a little to bring the point home.

The environment I used to put this post together is Oracle Restart 19.4.0 on Oracle Linux 7.6.

Test overview

For this test I am running concurrent sqlldr sessions to demonstrate the case. I am conscious of that fact that I could have used external tables, but then I wouldn’t have been able to write this post :)

Assume there’s a table named t2:

SQL> create table t2 as select * from dba_objects where 1 = 0;

Table created.

I am planning on populating the table with data in CSV format. Using sqlcl it is dead easy to create an input file, simply specify the /*csv*/ hint in your query and spool the output to a file: job done. The resulting input file is named t2.dat.

Next up is the control file. I am a creature of habit and tend to use a control file although I could have tried the same process using express mode.

Using the documentation I figured the following control file should do. Since it’s easy enough to forget passing direct=true to the command line I added it to the options clause. When using sqlcl to create a CSV file it adds the column headings in line 1, and they won’t load so I’m skipping these records. Here is the full control file:

options (direct=true, skip=1)
load data 
infile 't2.dat'
into table t2
append
fields csv without embedded 
date format 'dd-mon-yy'
trailing nullcols
(
OWNER,                                                                          
OBJECT_NAME,                                                                    
SUBOBJECT_NAME,                                                                 
OBJECT_ID,                                                                      
DATA_OBJECT_ID,                                                                 
OBJECT_TYPE,                                                                    
CREATED date,                                                                        
LAST_DDL_TIME date,                                                                  
TIMESTAMP,                                                                      
STATUS,                                                                         
TEMPORARY,                                                                      
GENERATED,                                                                      
SECONDARY,                                                                      
NAMESPACE,                                                                      
EDITION_NAME,                                                                   
SHARING,                                                                        
EDITIONABLE,                                                                    
ORACLE_MAINTAINED,                                                              
APPLICATION,                                                                    
DEFAULT_COLLATION,                                                              
DUPLICATED,                                                                     
SHARDED,                                                                        
CREATED_APPID,                                                                  
CREATED_VSNID,                                                                  
MODIFIED_APPID,                                                                 
MODIFIED_VSNID
)

With the prerequisites at hand I’m ready to perform some testing.

Test #1: concurrent sessions using defaults

First of all, what happens when starting 5 concurrent sessions? Will they complete? Let’s try this in bash:

for i in $(seq 1 5) ; do
  echo starting iteration $i
  sqlldr /@loader control=t2.ctl log=session_${i}.log &
done

This ran to completion without any visible errors, but I prefer to run sanity checking anyway. The first thing to do is to see if all records have been loaded. The input file contains 73148 data records by the way.

$ grep 'Rows successfully loaded' session*log
session_1.log:  73148 Rows successfully loaded.
session_2.log:  73148 Rows successfully loaded.
session_3.log:  73148 Rows successfully loaded.
session_4.log:  73148 Rows successfully loaded.
session_5.log:  73148 Rows successfully loaded.

So this looks ok, what about the run times?

$ grep '^Run' *.log
session_1.log:Run began on Mon Jul 22 21:32:31 2019
session_1.log:Run ended on Mon Jul 22 21:32:44 2019
session_2.log:Run began on Mon Jul 22 21:32:31 2019
session_2.log:Run ended on Mon Jul 22 21:32:40 2019
session_3.log:Run began on Mon Jul 22 21:32:31 2019
session_3.log:Run ended on Mon Jul 22 21:32:46 2019
session_4.log:Run began on Mon Jul 22 21:32:31 2019
session_4.log:Run ended on Mon Jul 22 21:32:48 2019
session_5.log:Run began on Mon Jul 22 21:32:31 2019
session_5.log:Run ended on Mon Jul 22 21:32:42 2019

All in all these loads completed in a reasonably short time. Some took longer to finish than others though. That made me curious and I logged into the database to see what was going on. I never consciously saw this event:

SQL> select sid, event, state from v$session where program like 'sqlldr%';

       SID EVENT                                                            STATE
---------- ---------------------------------------------------------------- -------------------
        44 kpodplck wait before retrying ORA-54                             WAITING
        52 kpodplck wait before retrying ORA-54                             WAITING
        54 kpodplck wait before retrying ORA-54                             WAITING
       290 Data file init write                                             WAITING
       291 kpodplck wait before retrying ORA-54                             WAITING

SQL> 

Using the output from my above query I could see that one session was active, all the others queued up behind it. Interestingly there is no “enq: TM – contention” which you’d see with direct path inserts in sqlplus for example:

-- in session 1
SQL> insert /*+ append */ into t2 select * from dba_objects;

73157 rows created.

-- notice there is no commit! 

-- Session 2 waits ...
SQL> insert /*+ append */ into t2 select * from dba_objects;

-- the DBA sees it all:
SQL> select sid, event, state from v$session where username = 'MARTIN'

       SID EVENT                                                            STATE
---------- ---------------------------------------------------------------- -------------------
        51 enq: TM - contention                                             WAITING
       270 SQL*Net message from client                                      WAITING

What’s even more interesting is this: ddl_lock_timeout which I thought might have helped, has no effect on sqlldr operations:

SQL> show parameter ddl_lock_timeout

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
ddl_lock_timeout                     integer     0
SQL> 

The parameter isn’t set at all, yet I could see “kpodplck wait before retrying ORA-54” which is strange. ORA-54 is of course the dreaded “resource busy and acquire with nowait specified” error. I managed to trace kpodp back to the direct load path functionality.

Without the ability to control waiting via a database parameter I am aware of, there must be some other timeout. And sure enough, there is:

$ time sqlldr /@loader control=t2.ctl

SQL*Loader: Release 19.0.0.0.0 - Production on Mon Jul 22 21:46:59 2019
Version 19.4.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

Path used:      Direct
SQL*Loader-951: Error calling once/load initialization
ORA-00604: error occurred at recursive SQL level 1
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired

real    0m32.038s
user    0m0.011s
sys     0m0.011s

It appears as if sqlldr times out waiting for a locked resource after round about 30 seconds, regardless of ddl_lock_timeout.

And this is the end of the story if you aren’t on 12.2 or later. In the next post I’ll show you how you can work around this problem with more current releases.

Oracle 19c Automatic Indexing: Methodology Introduction (After Today)

For the past month or so I’ve been playing around extensively with the new Oracle 19c “Automatic Indexing” feature, so I thought it was time to start blogging about it. Considering it’s only in “Version 1” status, my initial impression is very positive in that it works extremely well doing at what it’s initially designed […]

Sometimes the Simplest Things Are the Most Important

I didn’t ride in First Class a lot during my career, but I can remember one trip, I was sitting in seat 1B. Aisle seat, very front row. Right behind the lavatory bulkhead. The lady next to me in 1A was very nice, and we traded some stories.

I can remember telling her during dinner, wow, just look at us. Sitting in an aluminum tube going 500 miles per hour, 40,000 feet off the ground. It’s 50º below zero out there. Thousands of gallons of kerosene are burning in huge cans bolted to our wings, making it all go. Yet here we sit in complete comfort, enjoying a glass of wine and a steak dinner. And just three feet away from us in that lavatory right there, a grown man is evacuating his bowels.

I said, you know, out of all the inventions that have brought us to where we are here today, the very most important one is probably that wall.

It’s back!

Yes indeed! Now that the dates and times are available for OpenWorld 2019, then it is naturally time for the best data searching, filtering and analysis tool on the planet to step up to the plate, enter the fray and …. hmmm… I’ve run out of metaphors </p />
</p></div>

    	  	<div class=

Ansible tips’n’tricks: checking if a systemd service is running

I have been working on an Ansible playbook to update Oracle’s Tracefile Analyser (TFA). If you have been following this blog over the past few months you might remember that I’m a great fan of the tool! Using Ansible makes my life a lot easier: when deploying a new system I can ensure that I’m also installing TFA. Under normal circumstances, TFA should be present when the (initial) deployment playbook finishes. At least in theory.

As we know, life is what happens when you’re making other plans, and I’d rather check whether TFA is installed/configured/running before trying to upgrade it. The command to upgrade TFA is different from the command I use to deploy it.

I have considered quite a few different ways to do this but in the end decided to check for the oracle-tfa service: if the service is present, TFA must be as well. There are probably other ways, maybe better ones, but this one works for me.

Checking for the presence of a service

Ansible offers a module, called service_facts since version 2.5 to facilitate working with services. I also tried the setup module but didn’t find what I needed. Consider the following output, generated on Oracle Linux 7.6 when gathering service facts:

TASK [get service facts] *******************************************************
 ok: [localhost] => {
     "ansible_facts": {
         "services": {
             "NetworkManager-wait-online.service": {
                 "name": "NetworkManager-wait-online.service", 
                 "source": "systemd", 
                 "state": "stopped"
             }, 
             "NetworkManager.service": {
                 "name": "NetworkManager.service", 
                 "source": "systemd", 
                 "state": "running"
             }, 
             "auditd.service": {
                 "name": "auditd.service", 
                 "source": "systemd", 
                 "state": "running"
             }, 

[ many more services ]

            "oracle-tfa.service": {
                 "name": "oracle-tfa.service", 
                 "source": "systemd", 
                 "state": "running"
             }, 

[ many more services ]

This looks ever so slightly complicated! And indeed, it took a little while to work the syntax out. My first attempt were all but unsuccessful.

Getting the syntax right

Thankfully I wasn’t the only one with the problem, and with a little bit of research ended up with this code:

---
 - hosts: localhost
   connection: local
   become: true

   tasks:
   - name: get service facts
     service_facts:

   - name: try to work out how to access the service
     debug:
       var: ansible_facts.services["oracle-tfa.service"]

Awesome! When running this on a system with TFA installed, it works quite nicely:

TASK [try to work out how to access the service] *******************************
 ok: [localhost] => {
     "ansible_facts.services[\"oracle-tfa.service\"]": {
         "name": "oracle-tfa.service", 
         "source": "systemd", 
         "state": "running"
     }
 }
 

 PLAY RECAP *********************************************************************
 localhost                  : ok=3    changed=0    unreachable=0    failed=0

The same code fails on a system without TFA installed:

TASK [try to work out how to access the service] *******************************
 ok: [localhost] => {
     "ansible_facts.services[\"oracle-tfa.service\"]": "VARIABLE IS NOT DEFINED!
      'dict object' has no attribute 'oracle-tfa.service'"
 }
 

 PLAY RECAP *********************************************************************
 localhost                  : ok=3    changed=0    unreachable=0    failed=0

Now the trick is to ensure that I’m not referencing an undefined variable. This isn’t too hard either, here is a useable playbook:

---
 - hosts: localhost
   connection: local 
 
   tasks:
   - name: get service facts
     service_facts:
 
   - name: check if TFA is installed
     fail:
       msg: Tracefile Analyzer is not installed, why? It should have been there!
     when: ansible_facts.services["oracle-tfa.service"] is not defined

The “tasks” include getting service facts before testing for the presence of the oracle-tfa.service. I deliberately fail the upgrade process to make the user aware of a situation that should not have happened.

Hope this helps!

Running pgBench on YugaByteDB 1.3

Running pgBench on YugaByte DB 1.3

My first test on this Open Source SQL distributed database.

Did you hear about YugaByteDB, a distributed database with an architecture similar to Google Spanner, using PostgreSQL as the query layer?

I started to follow when I’ve heard that Bryn Llewellyn, famous PL/SQL and EBR product manager, left Oracle to be their developer advocate. And YugaByteDB got more attention recently when announcing that their product license is now 100% Open Source.

I like to learn new things by trying and troubleshooting rather than reading the documentation. Probably because there’s more to learn aside of the documentation path. And also because troubleshooting is fun. So, one of the great features is that the query layer is compatible with PostgreSQL. Then I’ll try to run pgBench on YugaByteDB.

It is important to mention here that I’m running all nodes in a single lab VM, so performance is not representative. And I’m testing the YSQL query layer which is still in beta. The goal is to discover and learn about the distributed database challenges, rather than evaluating the product.

Install YugaByteDB

Nothing is easier than the installation of YugaByteDB, all documented:

Install YugaByte DB | YugaByte DB Docs

I install it in RHEL 7.6 (Actually OEL 7.6 as I’m running my lab in an Oracle Cloud VM). The install is just an un-tar followed by the post-install which patches the binaries so that we don’t have to set LD_LIBRARY_PATH. I set PATH to the YugaByte bin directory:

wget -O- https://downloads.yugabyte.com/yugabyte-1.3.0.0-linux.tar.gz | tar -zxvf - 
export PATH=$PATH:$PWD/yugabyte-1.3.0.0/bin
post_install.sh

I create a 3 nodes cluster on this host:

yb-ctl --rf 3 create
yb-ctl status

Those nodes are created as 127.0.0.1 to 127.0.0.3 here. I access remotely and I tunnel the interesting ports with my ssh_config file:

Host yb
HostName 130.61.59.66
User opc
ForwardX11 yes
DynamicForward 8080
#YSQL JDBC
LocalForward 15433 127.0.0.1:5433
LocalForward 25433 127.0.0.2:5433
LocalForward 35433 127.0.0.3:5433
#YCQL API
LocalForward 19042 127.0.0.1:9042
LocalForward 29042 127.0.0.2:9042
LocalForward 39042 127.0.0.3:9042
# Web UI
LocalForward 17000 127.0.0.1:7000
LocalForward 27000 127.0.0.2:7000
LocalForward 37000 127.0.0.3:7000

I create a database. The “psql” equivalent here is “ysqlsh”:

ysqlsh
\timing on
drop database if exists franck;
create database franck;
\q

ysqlsh (11.2)

Install PgBench

As I tunneled the 5433 port I can use pgBench from my laptop:

pgbench --host localhost --port 15433 --username postgres franck

But the full postgres installation is also there in ./yugabyte-1.3.0.0/postgres

then, I add this in my path:

export PATH=$PATH:$PWD/yugabyte-1.3.0.0/bin:$PWD/yugabyte-1.3.0.0/postgres/bin

Initialize pgBench

I run the initialization:

pgbench --initialize --host localhost -p 5433 -U postgres franck

ERROR: DROP multiple objects not supported yet
HINT: See https://github.com/YugaByte/yugabyte-db/issues/880

Ok, I have an error because pgBench uses the multi-table drop statement which is not supported yet. But what’s really nice is the error message containing a link to the GitHub issue about this.

No problem, I don’t need to drop the tables and pgBench has an --init-steps option to choose the steps: drop tables, table creation, generate data, vacuum, primary key creation, foreign key creation.

pgbench --initialize --init-steps=tgvpf -h localhost -p 5433 -U postgres franck

ERROR: VACUUM not supported yet
HINT: Please report the issue on https://github.com/YugaByte/yugabyte-db/issues

There’s no FILLFACTOR and no VACUUM per-se (the storage engine has a transparent garbage collector). YugaByteDB uses the PostgreSQL query layer, but not the same storage layer. My tables are created and data is generated:

ysqlsh franck
\dt
select count(*) from pgbench_branches;
select count(*) from pgbench_accounts;
select count(*) from pgbench_tellers;
select count(*) from pgbench_history;

Let’s continue without vacuum, only the primary and foreign key definition:

pgbench --initialize --init-steps=pf -h localhost -p 5433 -U postgres franck

ERROR: This ALTER TABLE command is not yet supported.

pgBench adds the constraints with an ALTER TABLE but YugaByteDB supports only inline declaration in the CREATE TABLE. You can check the DDL from a PostgreSQL database (initialize with the ‘foreign key’ step which is not the default):

pg_dump --schema-only -h localhost -p 5433 -U postgres franck

Basically, here is what is missing in my YugaByteDB database:

alter table pgbench_branches add primary key (bid);
"alter table pgbench_tellers add primary key (tid)",
"alter table pgbench_accounts add primary key (aid)"
"alter table pgbench_tellers add constraint pgbench_tellers_bid_fkey foreign key (bid) references pgbench_branches",
"alter table pgbench_accounts add constraint pgbench_accounts_bid_fkey foreign key (bid) references pgbench_branches",
"alter table pgbench_history add constraint pgbench_history_bid_fkey foreign key (bid) references pgbench_branches",
"alter table pgbench_history add constraint pgbench_history_tid_fkey foreign key (tid) references pgbench_tellers",
"alter table pgbench_history add constraint pgbench_history_aid_fkey foreign key (aid) references pgbench_accounts"

Re-Create with Primary and Foreign Keys

Finally, here is what I want to run to get everything in a supported way:

ysqlsh franck
 drop table if exists pgbench_history;
drop table if exists pgbench_tellers;
drop table if exists pgbench_accounts;
drop table if exists pgbench_branches;
CREATE TABLE pgbench_branches (
bid integer NOT NULL
,bbalance integer
,filler character(88)
,CONSTRAINT pgbench_branches_pkey PRIMARY KEY (bid)
);
CREATE TABLE pgbench_accounts (
aid integer NOT NULL
,bid integer references pgbench_branches
,abalance integer
,filler character(84)
,CONSTRAINT pgbench_accounts_pkey PRIMARY KEY (aid)
);
CREATE TABLE pgbench_tellers (
tid integer NOT NULL
,bid integer references pgbench_branches
,tbalance integer
,filler character(84)
,CONSTRAINT pgbench_tellers_pkey PRIMARY KEY (tid)
);
CREATE TABLE pgbench_history (
tid integer references pgbench_tellers
,bid integer references pgbench_branches
,aid integer references pgbench_accounts
,delta integer
,mtime timestamp without time zone
,filler character(22)
);
\q

This creates the tables without any error. Next step is to generate data

Generate data

Now the only “ --initialize” step I have to do is the generation of data.

Note that this step is doing the truncate with multi-table syntax and this one is already implemented.

So here is the “generate data” step:

pgbench --initialize --init-steps=g -h localhost -p 5433 -U postgres franck

ERROR: Operation only supported in SERIALIZABLEisolation level
HINT: See https://github.com/YugaByte/yugabyte-db/issues/1199

PostgreSQL runs by default in “read committed” isolation level. As the GitHub issue mentions, YugaByteDB support for Foreign Keys requires “Serializable”. The reason is that the storage engine (DocDB) has no explicit row locking yet to lock the referenced row. But with referential integrity, inserting in a child table must lock the parent row in share mode (like a SELECT FOR KEY SHARE) to ensure that it is not currently being deleted (or the referenced columns updated). Then the no-lock solution is to run in a true Serializable isolation level.

About the support of Foreign Keys, read Bryn Llewellyn blog post:

Relational Data Modeling with Foreign Keys in a Distributed SQL Database - The Distributed SQL Blog

Note that I said “true” Serializable isolation level because I’m used to Oracle Database where this term is used for “Snapshot Isolation”- more about this:

Oracle serializable is not serializable - Blog dbi services

Serializable transaction isolation

So, the current transaction isolation level is “Read Committed” where repeatable reads, which is required by foreign keys, needs SELECT FOR KEY SHARE:

ysqlsh franck
select current_setting('transaction_isolation');
\q

Then, in order to be able to insert in a table that has some foreign keys, I set the default isolation level to Serializable for my database, and re-connect to check it:

ysqlsh franck
alter database franck set default_transaction_isolation=serializable;
\c franck postgres localhost 5433
select current_setting('transaction_isolation');
\q


Update 27-JUL-2019

If you don’t want to change the default, you can also set:

PGOPTIONS='-c default_transaction_isolation=serializable'

Generate data with serializable transactions

Ok, let’s try to generate data now:

pgbench --initialize --init-steps=g -h localhost -p 5433 -U postgres franck

That is taking a long time so I attach the debugger on it:

gdb $(pgrep pgbench)

https://github.com/YugaByte/yugabyte-db/blob/master/src/postgres/src/bin/pgbench/pgbench.c#L3704

pgbench.c:3704 is the call to PQendcopy() which is waiting for the asynchronous COPY completion, COPY is used by pgBench for the large table “pgbench_accounts”. On YugaByteDB side, it seems that only one Tablet Server is doing the work, 100% in CPU:

The Tablet Server is running code from librocksdb.so (the YugaByteDB storage engine, DocDB, is based on RocksDB):

perf top

While I was waiting I generated a Brendan Gregg flamegraph on the busy Tablet Server, but there’s no visible bottleneck.

sudo perf record -e cpu-cycles -o /tmp/perf.out -F 99 -g
^C
git clone https://github.com/brendangregg/FlameGraph.git
sudo perf script -i /tmp/perf.out | ./FlameGraph/stackcollapse-perf.pl | ./FlameGraph/flamegraph.pl /dev/stdin > /tmp/perf.folded.svg

This goes beyond the goal of this post and, anyway, looking at performance is probably not relevant in this version. [21-JUL-2019: the support for COPY has just been added since this test - more info here]

Finally, the initialization of 100000 tuples finished after one hour:

time pgbench --initialize --init-steps=g -h localhost -p 5433 -U postgres franck

So, finally, I have my pgBench schema loaded, with all referential integrity and data.

pgBench simple-update in single-session

I’m running the “simple-update” which basically updates the “abalance” for a row in “pgbench_accounts”, and inserts into “pgbench_history”.

pgbench --no-vacuum --builtin=simple-update --protocol=prepared --time 30 -h localhost -p 5433 -U postgres franck

pgbench --no-vacuum --builtin=simple-update --protocol=prepared --time 30 -h localhost -p 5433 -U postgres franck

At least I know that this basic OLTP application can run without any change on YugaByteDB and that’s a very good point for application transparency. I’ll explain later why I start here the “simple update” workload rather than the default. The Transaction Per Second rate is not amazing (it is 150x higher on a plain PostgreSQL on the same platform), but this is not what I am testing here.

pgBench simple-update in multi-sessions

Obviously, a distributed database should be scalable when multiple users are working in parallel. For this I run pgBnech with 10 clients from 10 threads:

pgbench --no-vacuum --protocol=prepared --builtin=simple-update --time 30 --jobs=10 --client=10 -h localhost -p 5433 -U postgres franck

pgbench — no-vacuum — protocol=prepared — builtin=simple-update — time 30 — jobs=10 — client=10 -h localhost -p 5433 -U postgres franck

Good. I said that we should not look at the elapsed time, but the comparison with the previous run shows the scalability as 10 sessions can run about 10x more transactions per second. I run the 3 YugaByteDB nodes on a 24 core virtual machine here. The servers are multi-threaded:

nTH: number of threads (LWP), P: last used CPU (SMP)

pgBench TCPB-like in multi-sessions

Demystifying Benchmarks: How to Use Them To Better Evaluate Databases

Actually, the first test I did was the default pgBench workload, which is the “TPC-B (sort of)”. In addition to the “simple update”, each transaction updates, in addition to “pg_account”, the balance “pgbench_tellers” and “pgbench_branches”.

This is more tricky because multiple clients will have to concurrently update the same records. And there’s a high probability of contention given the cardinalities. There, pessimistic locking would be better than optimistic. I’m still with my Foreign Keys here and Serializable isolation level (which is optimistic locking).

pgbench --no-vacuum --protocol=prepared --builtin=simple-update --time 30 --jobs=10 --client=10 -h localhost -p 5433 -U postgres franck

ERROR: could not serialize access due to concurrent update

Quickly I can see that 9 out of the 10 clients failed with “Conflicts with higher priority transaction”. This is the equivalent to the PostgreSQL “ERROR: could not serialize access due to concurrent update”. With optimistic locking, the application must be ready to re-try a failed transaction, but pgBench has no option for that. This is not special to YugaByteDB: you will get the same in PostgreSQL when running pgBench default workload in a Serializable transaction isolation level.

Anyway, given the high probability of collision here, the solution is pessimistic locking.

Read Committed transaction isolation

If I set back the isolation level to “Read Committed” the UPDATE will use pessimistic locking, and then I expect the transactions to be serialized, waiting to see committed changes rather than failing when encountering a concurrent change.

ysqlsh franck
alter database franck set default_transaction_isolation='read committed';
\q

But then, I cannot declare the foreign keys or I’ll get “ ERROR: Operation only supported in SERIALIZABLE isolation level” until the issue #1199 is fixed.

Without referential integrity constraints

I re-create the tables with the REFERENCES clause commented out:

ysqlsh franck
drop table if exists pgbench_history;
drop table if exists pgbench_tellers;
drop table if exists pgbench_accounts;
drop table if exists pgbench_branches;
CREATE TABLE pgbench_branches (
bid integer NOT NULL
,bbalance integer
,filler character(88)
,CONSTRAINT pgbench_branches_pkey PRIMARY KEY (bid)
);
CREATE TABLE pgbench_accounts (
aid integer NOT NULL
,bid integer --references pgbench_branches
,abalance integer
,filler character(84)
,CONSTRAINT pgbench_accounts_pkey PRIMARY KEY (aid)
);
CREATE TABLE pgbench_tellers (
tid integer NOT NULL
,bid integer --references pgbench_branches
,tbalance integer
,filler character(84)
,CONSTRAINT pgbench_tellers_pkey PRIMARY KEY (tid)
);
CREATE TABLE pgbench_history (
tid integer --references pgbench_tellers
,bid integer --references pgbench_branches
,aid integer --references pgbench_accounts
,delta integer
,mtime timestamp without time zone
,filler character(22)
);
\q

And run the initialize again, which is much faster (2 minutes instead of 1 hour):

time pgbench --initialize --init-steps=g -h localhost -p 5433 -U postgres franck

Then, ready to run my 10 clients TPC-B workload:

pgbench --no-vacuum --protocol=prepared --builtin=simple-update --time 30 --jobs=10 --client=10 -h localhost -p 5433 -U postgres franck

Operation failed. Try again.: Conflicts with higher priority transaction
ERROR: Operation failed. Try again.: Conflicts with committed transaction
ERROR: Error during commit: Operation expired: Transaction expired

This is much better. Among the 23574 transactions, I got only 3 errors and 7 clients were still running concurrently. I’ve also run it with only one client where I had TPS=144 and here with 7 clients remaining we reach TPS=785.

Of course, I would expect no “Try again” error when in Read Committed isolation level. Here I have the 3 out of the 5 transactional errors we can get from libpq calls (according to yugabyte pg_libpq-test.cc):

  • Transaction expired
  • Conflicts with committed transaction
  • Conflicts with higher priority transaction
  • Restart read required
  • Value write after transaction start

But that’s probably for another post. The important outcomes from my very first test of YugaByteDB are:

  • It is very easy to install and test, so… try it (and you will get a link to get a nice T-Shirt shipped to your home)
  • I was able to run pgBench, a simple application written for PostgreSQL, without any change for YugaByteDB
  • Not all DDL is supported yet, but easy to workaround and follow the Git issue.
  • Foreign Keys are supported, which is a challenge for a distributed database.
  • Transaction concurrency is managed and the issues that will be fixed in later releases are clearly documented as Git issues

Oracle DBA_SQL_PLAN_BASELINE SQL_ID and PLAN_HASH_VALUE

There are probably better ways, so please let me know (@FranckPachot). This is what I use when I want to get the SQL_ID and the PLAN_HASH_VALUE when looking at the SQL Plan Baselines.

The DBA_SQL_PLAN_BASELINES view does not provide them, probably because SQL Plan Management (SPM) is going from a statement and it’s execution plan to the SQL Plan Baselines, but doesn’t need to navigate in the other way. However, we need it when troubleshooting query performance.

SQL_ID

I think I got this from It Tony Hasler “Expert Oracle SQL: Optimization, Deployment, and Statistics” book. There’s an internal function available since 11gR2 which calculates the SQL_ID from an SQL_TEXT (as null-terminated C string):

dbms_sqltune_util0.sqltext_to_sqlid(sql_text||chr(0)) sql_id

PLAN_HASH_VALUE

I lazily use DBMS_XPLAN.DISPLAY_SQL_PLAN_BASELINE for this one, as it displays a line with it:

( select to_number(regexp_replace(plan_table_output,'^[^0-9]*')) 
from table(
dbms_xplan.display_sql_plan_baseline(sql_handle,plan_name)
) where plan_table_output like 'Plan hash value: %') plan_hash_value

Example

Here is an example where I query DBA_SQL_PLAN_BASELINE with those additional columns:

select dbms_sqltune_util0.sqltext_to_sqlid(sql_text||chr(0)) sql_id,
( select to_number(regexp_replace(plan_table_output,'^[^0-9]*'))
from table(dbms_xplan.display_sql_plan_baseline(sql_handle,plan_name))
where plan_table_output like 'Plan hash value: %') plan_hash_value
,plan_name,enabled,accepted,fixed,reproduced
,dbms_xplan.format_time_s(elapsed_time/1e6) hours,creator,origin,created,last_modified,last_executed
,sql_text
from dba_sql_plan_baselines b
where sql_text like '%&sql_text_pattern.%'
order by sql_id,hours desc
;

For the ELAPSED_TIME, I use the FORMAT_TIME_S for pretty formatting.
More info:

Oracle numbers in K/M/G/T/P/E

So here is the output of the previous query. I have seen some SQL_ID executed with different PLAN_HASH_VALUE and I know which one was good or not. This helps me to know exactly which ones I want to accept or disable:

Reviving an iPad and On-Premises lesson 2.

<< Introducing I.T. to an Elderly Relative
<<<< Preparing the device
<<<<<< First Lessons, Frustrations, & Funny Stuff

In my previous post we finished with my mother having bricked her iPad – that is, having turned it onto a useless lump. So I drove up to see her again to sort it out. {BTW if you think I am being a bit mean to my mum – yes I am. But I do love her and in the end the iPad has resulted in us being much more in touch. But I think it helps to share the frustrations of getting someone utterly unused to technology on line}.

The first thing I did was to get Mum to turn it on and put in the password. Martin with an I (not a Y – “as some people spell it like that!”). Mum was, very slowly, putting in the password correctly. And then staring at the screen until it flashed up an error. She had forgotten about pressing DONE. Now, if she’d called me when she had started having trouble… The thing is, that is so true across all of IT support. If only people called up when they first had a problem or did something wrong (like deleted all those rows…). If you call up quickly, there is much more chance the problem will be solved quickly. Anyway, I digress. I now knew what had gone wrong, she may or may not type the password correctly but it was timing out each time. Of course, by this point the iPad would no longer respond to the correct password, it was locked out.

You may not know this but if an iPad is locked out as it thinks it might be stolen (password put in wrongly too many times), you can’t just factory reset it. At least, I could not and google-fu mostly confirmed this. You have to plug it into another device with iTunes on it. And you can’t just plug it into the device you set it up on and refresh it, even if you backed it up to this device. At least, I could not. Maybe I am not very good at this tech lark. You have to download the latest version of the OS to your device, plug the switched off iPad into your device, turn the iPad on and then press certain buttons on it in a given way within a 0.731 second window that occurs at an unspecified time after turning the device on. I don’t know how often I tried to get the sodding iPad into recovery mode and recognised by iTunes, but it sure as hell amused my Mum to watch me try. I then re-set-up the iPad to be the same, simple set-up I had done so before. See post 2 for some hints on that. All the time Mum was making snide comments about “how simple this all is, Martyn!”. I think she was having revenge. Sue was keeping out the way.

After all the issues with “Martin with a Y or I”, I set the pass code to be a number. Yes, it’s less secure but I have the Apple ID details for her account – if she loses the iPad I can either track it or wipe it remotely. But we were up and running again, we had a working iPad and on-premises lesson 2 could begin. I’m not sure either of us was 100% happy about this…

Mum wants 2 main things from “the interweb”. She wants to be able to contact me (and, I presume, her other Son and her daughter-in-law) and she wants to be able to look things up. If she can do the former than I can help, remotely, with the latter.

So I showed her how to use messenger to contact us again. It’s been a week or so since the last lesson so I knew she would need a reminder. I pointed at one of the various icons and asked her what it looked like “It’s a phone!” So what will it do? “I don’t know, you are supposed to tell me!”. If it looks like a phone, it’s probably… “{blank look}”. You pressed this by accident last week and it made you scream? “Oh, it’s a phone!”. Excellent, we gave it a quick go.

What about this one next to it? What does it look like? “A box and a little box”. Fair point, but it looks a but like a tv camera? She agreed. So, what will it do? “blank look”. You know this one, we tested this with Sue in the kitchen last week… “the kettle?!?”. It was like Star-Trek… “Oh yes, she appeared on the iPad and I could talk to her. It’s just like Star Trek!”. We tried that one too. All good.

OK, let’s re-visit sending messages and using the keyboard. I show her me sending her a couple of messages again. Enter some text, any text. Press the icon to send the message. “Which one”. The one next to the message, it looks like a plane. “Which message?” The one you just typed. “So I press this one {points to the enter key}” No! No, the blue plane one. “This one!” No!!! that is a phone symbol, I explained that one 5 minutes ago and you seem to have no trouble hitting that one despite that it is in utterly the wrong place and no where near the message. “What message?” THE ONE…..The one you just typed, there, the one that says ‘GFRYTSB’. “So I click on your name?” NO! NO! THE FUCKING PLANE! TAP THE FUCKING PLANE!!!!

She taps the plane.

It sends the message “Oh. It did that before. How do I know who it sent it to?” It sent it to the Pope. “Why did it send it to the Pope?” {sigh}. How many people did we set this up for? Me, Sue, Steve, no Pope. But you see my name at the top of the conversation? You know, third child your bore? The name right above all the other messages? It sent it to me.

“But there are three names {moves finger} over here”. THAT IS OVER THERE!!!!!!! You have spent 10 minutes calling me, star- treking me, seeing messages from me, who the hell do you think it sent the message to?!? “Susan?” {I’m losing it…}

OK, send me another message. You know it’s me, my picture and name is above the conversation. Here, look at my screen your picture and name is above *my* conversation and those are the messages you have sent me.

She types something.
and stares at the screen.
And stares at the screen…
And looks at my screen…
And back at her screen…
“It’s not sent! Has it gone to someone else?”
The. Plane. Tap the Plane.
‘Ping’ – “Ohh! you got the message! How does it know where you are?” The bloody computer pixies know. They track everyone in the world. “Can I message anyone in the world?” I lie a little and say no. only the people in the list. “Does it know Steve is in Wales?” Yes. Look, do you ask the phone how it knows where I am? “No, but this is not the phone”.

We exchange a few more messages for practice and then I get her to tap on the other names, to change conversations. She swaps to Sue and Mum sends her a couple of messages. Once again Mum is asking how the computer knows where Sue is. I point out that as Sue is in the room, the iPad can see her – and then realise that was a really stupid thing to say as Mum did not get the joke. “So it CAN’T message Steve if it can’t see him?” No, it can, it can message anyone on her list.”Shall I message him?” No, he lives in Wales, life is hard enough for him already.

It’s time to go home. I make mum turn the iPad off, turn it on, put in the code and send me a message. She’s got it. “What about the internet?” The internet is not ready for you yet Mum, that will be lesson three. Read the book I got you and give it a go if you like. You can’t break… Actually, just wait until I come back over.

I have to say, since then Mum has been able to message me without issue and can turn the iPad on and off with no trouble, so you do get there eventually.

But I do seem to be buying a lot more wine these days…

CockroachDB… true distributed system-of-record for the cloud

After exploring the analytics end of distributed NewSQL databases, I decided to explore true system-of-record in the cloud. Without a doubt, the future is moving to the cloud. CockroachDB is at the fore-front of this journey to be the backbone database for new micro-services in the cloud. Escape the legacy database rats nest that wasn’t designed to be truly distributed or in the cloud.

With CockroachDB, you can domicile data for regulatory reasons and ensure the best possible performance by accessing data locally. CockroachDB allows you to use create a SINGLE distributed database across multiple cloud providers and on-prem. Our self healing not only helps keep your data consistent, but it automates cloud-to-cloud migration without downtime.

cockroach geo-distributed cluster

It should be an exciting new journey. If you are interested, take a look at our architecture, download, and take the roach out for a spin!