Top 60 Oracle Blogs

Recent comments

Oakies Blog Aggregator

Installing ZFS on OEL7 UEK4 for Docker storage

The Oracle Database is fully supported on Docker according that Linux is Red Hat Enterprise Linux 7 or Oracle Enterprise Linux 7 with Unbreakable Enterprise 4. This is documented in MOS Note 2216342.1. Given the size of the Oracle database in GigaBytes even empty, the way it is installed at build with many file updates, and the per-block modifications of the datafiles, a block level copy-on-write filesystem is a must and deduplication and compression are appreciated. This makes ZFS a good option for the Docker storage driver, but also the external volumes. By the way, the Docker documentation about the storage drivers mention that zfs is a good choice for high-density workloads such as PaaS and this of course includes Database as a Service.

I’ve run this example on OEL 7.2 created in the the Oracle Cloud:


We need to install the kernel headers. Of course, it is probably better to run a ‘yum update’ and reboot in order to run the latest kernel.
Here, I’m just installing the headers for the current kernel:

[root@localhost opc]# yum -y install kernel-uek-devel-$(uname -r)
kernel-uek-devel.x86_64 0:4.1.12-112.14.13.el7uek
Dependency Installed:
cpp.x86_64 0:4.8.5-28.0.1.el7_5.1 gcc.x86_64 0:4.8.5-28.0.1.el7_5.1 glibc-devel.x86_64 0:2.17-222.el7
glibc-headers.x86_64 0:2.17-222.el7 kernel-headers.x86_64 0:3.10.0-862.9.1.el7 libdtrace-ctf.x86_64 0:0.8.0-1.el7
libmpc.x86_64 0:1.0.1-3.el7 mpfr.x86_64 0:3.1.1-4.el7
Dependency Updated:
glibc.x86_64 0:2.17-222.el7 glibc-common.x86_64 0:2.17-222.el7 libgcc.x86_64 0:4.8.5-28.0.1.el7_5.1
libgomp.x86_64 0:4.8.5-28.0.1.el7_5.1


We need Dynamic Kernel Module Support to load ZFS modules. I had problems in the past with this so I install it step by step to verify that everything is ok. First, enable the EPEL repository:

[root@localhost opc]# yum install -y yum-utils
[root@localhost opc]# yum-config-manager --enable ol7_developer_EPEL

Then install DKMS:

[root@localhost opc]# yum -y install -y dkms
dkms.noarch 0:2.4.0-1.20170926git959bd74.el7
Dependency Installed:
elfutils-default-yama-scope.noarch 0:0.170-4.el7 elfutils-libelf-devel.x86_64 0:0.170-4.el7
kernel-debug-devel.x86_64 0:3.10.0-862.9.1.el7 zlib-devel.x86_64 0:1.2.7-17.el7
Dependency Updated:
elfutils-libelf.x86_64 0:0.170-4.el7 elfutils-libs.x86_64 0:0.170-4.el7 zlib.x86_64 0:1.2.7-17.el7

Install ZFS repository

There is a zfs-release package that installs the /etc/yum.repos.d/zfs.repo:

[root@localhost opc]# sudo rpm -Uvh
warning: /var/tmp/rpm-tmp.yvRURo: Header V4 RSA/SHA256 Signature, key ID f14ab620: NOKEY
Preparing... ################################# [100%]
Updating / installing...
1:zfs-release-1-5.el7_4 ################################# [100%]

Basically, all it contains is the following enabled section:

name=ZFS on Linux for EL7 - dkms

Install ZFS

This is the important part, installing ZFS:

[root@localhost opc]# sudo yum install -y zfs
Package Arch Version Repository Size
zfs x86_64 0.7.9-1.el7_4 zfs 413 k
Installing for dependencies:
kernel-devel x86_64 3.10.0-862.9.1.el7 ol7_latest 16 M
libnvpair1 x86_64 0.7.9-1.el7_4 zfs 30 k
libuutil1 x86_64 0.7.9-1.el7_4 zfs 35 k
libzfs2 x86_64 0.7.9-1.el7_4 zfs 130 k
libzpool2 x86_64 0.7.9-1.el7_4 zfs 591 k
lm_sensors-libs x86_64 3.4.0-4.20160601gitf9185e5.el7 ol7_latest 41 k
spl x86_64 0.7.9-1.el7_4 zfs 29 k
spl-dkms noarch 0.7.9-1.el7_4 zfs 456 k
sysstat x86_64 10.1.5-13.el7 ol7_latest 310 k
zfs-dkms noarch 0.7.9-1.el7_4 zfs 4.9 M

The most important is to check that the zfs module is installed correctly:

Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/4.1.12-112.14.13.el7uek.x86_64/extra/

I’ve seen cases where it was not and then the module cannot load. You can also check:

[root@localhost opc]# dkms status
spl, 0.7.9, 4.1.12-112.14.13.el7uek.x86_64, x86_64: installed
zfs, 0.7.9, 4.1.12-112.14.13.el7uek.x86_64, x86_64: installed

If you have a problem (such as “modprobe: FATAL: Module zfs not found” when loading the module), check the status and maybe re-install it with:

dkms remove zfs/0.7.9 --all
dkms --force install zfs/0.7.9

If everything is ok, you can load the module:

[root@localhost opc]# /sbin/modprobe zfs
[root@localhost opc]#

Create a ZFS filesystem

If the ZFS module was not loaded you have this error:

[root@localhost opc]# zpool list
The ZFS modules are not loaded.
Try running '/sbin/modprobe zfs' as root to load them.

If it has been loaded correctly, you have no ZFS Storage Pool yet:

[root@localhost opc]# zpool list
no pools available

First I need to add a disk to my machine. Here I have only one disk created when I created the Compute Service:

[root@localhost opc]# lsblk
xvdb 202:16 0 128G 0 disk
├─xvdb1 202:17 0 500M 0 part /boot
└─xvdb2 202:18 0 127.5G 0 part
├─vg_main-lv_root 249:0 0 123.5G 0 lvm /
└─vg_main-lv_swap 249:1 0 4G 0 lvm [SWAP]

I add a new disk in the Storage tab:
And attach it and attach it to my Cloud Instance:

Here is the new disk visible from the system:

[root@localhost opc]# lsblk
xvdb 202:16 0 128G 0 disk
├─xvdb1 202:17 0 500M 0 part /boot
└─xvdb2 202:18 0 127.5G 0 part
├─vg_main-lv_root 249:0 0 123.5G 0 lvm /
└─vg_main-lv_swap 249:1 0 4G 0 lvm [SWAP]
xvdc 202:32 0 120G 0 disk
[root@localhost opc]# ls -l /dev/xvdc /dev/block/202:32
lrwxrwxrwx 1 root root 7 Jul 19 15:05 /dev/block/202:32 -> ../xvdc
brw-rw---- 1 root disk 202, 32 Jul 19 15:05 /dev/xvdc

Here is where I add a ZFS Storage Pool for Docker:

[root@localhost opc]# zpool create -f zpool-docker -m /var/lib/docker /dev/xvdc
[root@localhost opc]# zpool status
pool: zpool-docker
state: ONLINE
scan: none requested
zpool-docker ONLINE 0 0 0
xvdc ONLINE 0 0 0
[root@localhost opc]# zpool list
zpool-docker 119G 118K 119G - 0% 0% 1.00x ONLINE -

And while I’m there I set some attributes to enable compression and deduplication. And as Docker writes to layers with 32k I/O I set the recordsize accordingly:

zfs set compression=on zpool-docker
zfs set dedup=on zpool-docker
zfs set recordsize=32k zpool-docker

Note that I attached if you reboot the instance you will have to attach the storage again and then run zpool import zpool-docker)

[root@localhost opc]# zpool import zpool-docker

Just to test that everything is ok, I install Docker as I did in a previous post:

[root@localhost opc]# yum-config-manager --add-repo
[root@localhost opc]# yum-config-manager --enable ol7_addons
[root@localhost opc]# yum -y install docker-ce
[root@localhost opc]# systemctl start docker

Docker layers

I pull a small image and start a container on it:

[root@localhost opc]# docker run oraclelinux:7-slim

Here is the image and the ZFS dataset for its layer, mounted under /var/lib/docker/zfs:

[root@localhost opc]# docker image ls
oraclelinux 7-slim b1af4ba0cf19 12 days ago 117MB
[root@localhost opc]# docker inspect oraclelinux:7-slim | jq -r .[0].GraphDriver
"Data": {
"Dataset": "zpool-docker/fe31ff466872588506b1a3a3575c64d458beeb94d15bea593e5048237abf4fcc",
"Mountpoint": "/var/lib/docker/zfs/graph/fe31ff466872588506b1a3a3575c64d458beeb94d15bea593e5048237abf4fcc"
"Name": "zfs"

And here is the container layer:

[root@localhost opc]# docker container ls -a
9eb7610c1fc5 oraclelinux:7-slim "/bin/bash" 6 minutes ago Exited (0) 6 minutes ago inspiring_shannon
[root@localhost opc]# docker inspect inspiring_shannon | jq -r .[0].GraphDriver
"Data": {
"Dataset": "zpool-docker/5d1022761b6dee28a25e21ec8c5c73d99d09863f11439bbf86e742856f844982",
"Mountpoint": "/var/lib/docker/zfs/graph/5d1022761b6dee28a25e21ec8c5c73d99d09863f11439bbf86e742856f844982"
"Name": "zfs"

If you don’t have jq just ‘yum install jq’. It is very convenient to filter and display the ‘inspect’ output.

We can see those datasets from ZFS list:

[root@localhost opc]# zfs list -o creation,space,snapshot_count,written -r | sort
Thu Jul 19 15:13 2018 zpool-docker 115G 126M 0B 964K 0B 125M none 964K
Thu Jul 19 15:38 2018 zpool-docker/fe31ff466872588506b1a3a3575c64d458beeb94d15bea593e5048237abf4fcc 115G 125M 0B 125M 0B 0B none 0
Thu Jul 19 15:39 2018 zpool-docker/5d1022761b6dee28a25e21ec8c5c73d99d09863f11439bbf86e742856f844982 115G 87K 0B 87K 0B 0B none 87K
Thu Jul 19 15:39 2018 zpool-docker/5d1022761b6dee28a25e21ec8c5c73d99d09863f11439bbf86e742856f844982-init 115G 46K 0B 46K 0B 0B none 0

Here, sorted by creation time, we see the datasets used by each layer. The initial files before having any image are less than 1MB. The image uses 125MB. The container creation has written 87KB and 46KB additional once running.


Cet article Installing ZFS on OEL7 UEK4 for Docker storage est apparu en premier sur Blog dbi services.

Oracle Security Training by Pete Finnigan in 2018

Are you worried about the data in your databases being stolen? GDPR has just become law across the EU and the UK and affects business in other countries that process EU citizens data. Maybe you store and process credit card....[Read More]

Posted by Pete On 19/07/18 At 02:04 PM

Google Cloud Spanner – inserting data

In a previous post I’ve created a Google Cloud Spanner database and inserted a few rows from the GUI. This is definitely not a solution fo many rows and here is a post about using the command line.

If I start the Google Shell from the icon on the Spanner page for my project, everything is set. But if I run it from elsewhere, using the as I did in A free persistent Google Cloud service with Oracle XE I have to set the project:

franck_pachot@cloudshell:~$ gcloud config set project superb-avatar-210409
Updated property [core/project].


I create my Spanner instance with 3 nodes across the world:
franck_pachot@superb-avatar-210409:~$ time gcloud spanner instances create franck --config nam-eur-asia1 --nodes=3 --description Franck
Creating instance...done.
real 0m3.940s
user 0m0.344s
sys 0m0.092s


and Spanner database – created in 6 seconds:

franck_pachot@superb-avatar-210409:~$ time gcloud spanner databases create test --instance=franck
Creating database...done.
real 0m6.832s
user 0m0.320s
sys 0m0.128s


The DDL for table creation can also be run from there:

franck_pachot@superb-avatar-210409:~$ gcloud spanner databases ddl update test --instance=franck --ddl='create table DEMO1 ( ID1 int64, TEXT string(max) ) primary key (ID1)'
DDL updating...done.

I’m now ready to insert one million rows. Here is my table:

franck_pachot@superb-avatar-210409:~$ gcloud spanner databases ddl describe test --instance=franck
--- |-
ID1 INT64,


The gcloud command line has a limited insert possibility:

franck_pachot@superb-avatar-210409:~$ time for i in $(seq 1 1000000) ; do gcloud beta spanner rows insert --table=DEMO1 --database=test --instance=franck --data=ID1=${i},TEXT=XXX${i} ; done
commitTimestamp: '2018-07-18T11:09:45.065684Z'
commitTimestamp: '2018-07-18T11:09:50.433133Z'
commitTimestamp: '2018-07-18T11:09:55.752857Z'
commitTimestamp: '2018-07-18T11:10:01.044531Z'
commitTimestamp: '2018-07-18T11:10:06.285764Z'
commitTimestamp: '2018-07-18T11:10:11.106936Z'

Ok, let’s stop there. Calling a service for each row is not efficient with a latency of 5 seconds.


I’ll use the API from Python. Basically, a connection is a Spanner Client:

franck_pachot@superb-avatar-210409:~$ python3
Python 3.5.3 (default, Jan 19 2017, 14:11:04)
[GCC 6.3.0 20170118] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from import spanner
>>> spanner_client = spanner.Client()
>>> instance = spanner_client.instance('franck')
>>> database = instance.database('test')

Batch Insert

With this I can send a batch of rows to insert. Here is the full Python script I used to insert one million, by batch of 1000 rows:

from import spanner
spanner_client = spanner.Client()
instance = spanner_client.instance('franck')
database = instance.database('test')
for j in range(1000):
for i in range(1000):
with database.batch() as batch:
batch.insert(table='DEMO1',columns=('ID1', 'TEXT',),values=records)

This takes 2 minutes:

franck_pachot@superb-avatar-210409:~$ time python3
real 2m52.707s
user 0m21.776s
sys 0m0.668s

If you remember my list of blogs on Variations on 1M rows insert that’s not so fast. But remember that rows are distributed across 3 nodes in 3 continents but here inserting with constantly increasing value have all batched rows going to the same node. The PRIMARY KEY in Google Spanner is not only there to declare a constraint but also determines the organization of data.


The select can also be run from there from a read-only transaction called ‘Snapshot’ because it is doing MVCC consistent reads:

frank_pachot@superb-avatar-210409:~$ python3
Python 3.5.3 (default, Jan 19 2017, 14:11:04)
[GCC 6.3.0 20170118] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from import spanner
>>> with spanner.Client().instance('franck').database('test').snapshot() as snapshot:
... results = snapshot.execute_sql('SELECT COUNT(*) FROM DEMO1')
... for row in results:
... print(row)

The advantage of the read-only transaction is that it can do consistent reads without locking. The queries executed in a read-write transaction have to acquire some locks in order to guarantee consistency when reading across multiple nodes.


So, you can look at the PRIMARY KEY as a partition by range, and we have also reference partitioning with INTERLEAVE IN PARENT. This reminds me of the Oracle CLUSTER segment that is so rarely used because storing the tables separately is finally the better compromise on performance and flexibility for a multi-purpose database.

Here is my creation of DEMO2 where ID1 is a foreign key referencing DEMO1

franck_pachot@superb-avatar-210409:~$ time gcloud spanner databases ddl update test --instance=franck --ddl='create table DEMO2 ( ID1 int64, ID2 int64, TEXT string(max) ) primary key (ID1,ID2), interleave in parent DEMO1 on delete cascade'
DDL updating...done.
real 0m24.418s
user 0m0.356s
sys 0m0.088s

I’m now inserting 5 detail rows per each parent row:

from import spanner
database = spanner.Client().instance('franck').database('test')
for j in range(1000):
for i in range(1000):
for k in range(5):
records.append((1+j*1000+i,k,u'XXX'+str(i)+' '+str(k)))
with database.batch() as batch:

This ran in 6 minutes.

Join (Cross Apply)

Here is the execution plan for


where I join the two tables and apply a filter on the join:

Thanks to the INTERLEAVE the join is running locally. Each row from DEMO1 (the Input of the Cross Apply) is joined with DEMO2 (the Map of Cross Apply) locally. Only the result is serialized. On this small number of rows we do not see the benefit from having the rows in multiple nodes. There are only 2 nodes with rows here (2 local executions) and probably one node contains most of the rows. The average time per node is 10.72 seconds and the elapsed time is 20.9 seconds, so I guess that one node ran un 20.9 seconds and the other in 1.35 only.

The same without the tables interleaved (here as DEMO3) is faster to insert but the join will be more complex where DEMO1 must be distributed to all nodes.
Without interleave, the input table of the local Cross Apply is a Batch Scan, which is actually like a temporary table distributed to all nodes (seems to have 51 chunks here), created by the ‘Create Batch’. This is called Distributed Cross Applied.

So what?

Google Spanner has only some aspects of SQL and Relational databases. But it is still, like the NoSQL databases, a database where the data model is focused at one use case only because the data model and the data organization have to be designed for specific data access.


Cet article Google Cloud Spanner – inserting data est apparu en premier sur Blog dbi services.

Power BI 101 – Log Files and Tracing

Knowing where log files are and how to turn on debugging is an essential part of any technical job and this goes for Power BI, too.  Remember, as I learn, so does everyone else….Come on, pretty please?

Power BI Desktop

Log files and traces can be accessed one of two ways-

  • Via the Power BI Application
  • Via File Explorer

In the Power BI application, go to File –> Options and Settings –> Options –> Diagnostics.

Crash and dump files are automatically stored with an option to disable them from this screen, but unsure why you’d ever want to do this.  If Power BI does crash, you would lose any valuable data on what the cause was. 300w, 768w, 1519w, 1300w" sizes="(max-width: 318px) 100vw, 318px" data-recalc-dims="1" />

To debug with a trace, you’ll need to enable it from this screen as well, as it’s not turned on by default.  Remember that tracing can be both resource and storage intensive, so only enable it when you actually need to diagnose something.  You can also choose to bypass tracing the geo code cache, as this is used to help map coordinates and it can be very chatty.

To view files, you can click on the open crash/dump file folder and this will open up a File Explorer to the traces directory on your pc.

Directly From File Explorer:

Ensure that File Explorer has viewing set to display hidden items.

C:\Users\\AppData\Local\Microsoft\Power BI Desktop\Traces

Log Files

These are all retained inside the Performance folder under the Traces directory

The file’s will be named with the following naming convention:


Locating the files that you need for your current process is easiest if you sort by Date Modified.  Verify that you’re working with the file that is being written to and not the file used to keep track of startup and shutdown log tracking, (0 KB): 300w, 768w, 1512w, 1300w" sizes="(max-width: 438px) 100vw, 438px" data-recalc-dims="1" />

The third log file in the list above, and also the one started before the executable for Power BI Desktop, (PID 13396) is the Microsoft Mashup Container, (Microsoft.Mashup.Container.NetFX40.exe) with its own PID of 16692.  It’s contains valuable information about calculations,  measures and other caching processes.  Take care to ensure the PID of the one used by Power BI in the logs matches the one you’re inspecting in the Task Manager-  Excel and other programs are also known to have a version of this executable, so there may be more than one listed for Power BI, as well as others for different Microsoft applications.

Log File Breakdown

Each file will contain entries providing information on high level processing, including start time, total size of cache allocated for the task, process information, Process ID, (PID), Transaction ID, (TID) and duration.

An example of an entry can be seen below:

ObjectCacheSessions/CacheStats/Size {"Start":"2018-07-19T01:42:24.9707127Z","Action":"ObjectCacheSessions/CacheStats/Size","entryCount":"1","totalSize":"24","ProductVersion":"2.59.5135.781 (PBIDesktop)","ActivityId":"00000000-0000-0000-0000-000000000000","Process":"PBIDesktop","Pid":13396,"Tid":8,"Duration":"00:00:00.0046865"}

We can easily match up the Process name and the PID with what is displayed in our Task Manager detail view: 300w, 768w, 1300w" sizes="(max-width: 460px) 100vw, 460px" data-recalc-dims="1" />

We’ve now identified the process, the amount of memory allocated to perform a task captured in the log, start time and the duration.  The information in these log files can assist when diagnosing if Power BI desktop crashes, but the data collected is quite rudimentary.

If you shut down Power BI Desktop, the PBIDesktop* log file writes to the startup file, which was once empty and it then empties and saves off the timestamp of the exit of the program.

The Microsoft Mashup file has much of the same information, but includes deeper level processing work by Power BI, such as work done in the Query Editor or when we create a measure or new column/table.

In the three examples from the file below, you can see a compile, a save and then an evaluate task.  Note that the Host Process ID is my Power BI Desktop we’ve seen earlier, but the interaction with the Microsoft Mashup Container is demonstrated as well:

SimpleDocumentEvaluator/GetResult/Compile {"Start":"2018-07-19T01:48:46.1441843Z","Action":"SimpleDocumentEvaluator/GetResult/Compile","HostProcessId":"13396","ProductVersion":"2.59.5135.781 (PBIDesktop)","ActivityId":"04248470-07e1-4862-b184-a32f186f26fd","Process":"Microsoft.Mashup.Container.NetFX40","Pid":16692,"Tid":1,"Duration":"00:00:00.4302569"}

ObjectCache/CacheStats/Size {"Start":"2018-07-19T01:48:47.3504705Z","Action":"ObjectCache/CacheStats/Size","HostProcessId":"13396","entryCount":"5","totalSize":"14564","ProductVersion":"2.59.5135.781 (PBIDesktop)","ActivityId":"04248470-07e1-4862-b184-a32f186f26fd","Process":"Microsoft.Mashup.Container.NetFX40","Pid":16692,"Tid":1,"Duration":"00:00:00.0000170"}

SimpleDocumentEvaluator/GetResult/Evaluate {"Start":"2018-07-19T01:48:46.5744678Z","Action":"SimpleDocumentEvaluator/GetResult/Evaluate","HostProcessId":"13396","ProductVersion":"2.59.5135.781 (PBIDesktop)","ActivityId":"04248470-07e1-4862-b184-a32f186f26fd","Process":"Microsoft.Mashup.Container.NetFX40","Pid":16692,"Tid":1,"Duration":"00:00:00.7780750"}

Another common file in the Performance directory will contain the msmdsrv* naming convention, which collect log information on the data source loader.  Duration information and cache/memory allocation could offer valuable information on poor performance during data loading processes.  First stop is always to check the settings for the desktop to see what has been set for memory allocation vs. assuming it’s the default.

If I just start the program and don’t open anything, only the high level processing of starting, basic memory allocation and stopping will be tracked in the PBIDesktop* file until I open up a PBIX file.  Then anything that needs to be updated and refreshed for the visuals, etc. will begin to write log data to the Microsoft Mashup log file and if a data refresh must be performed, the msmdsrv file.

Trace files

When you do turn on debugging, tracing, as shown in the beginning of this post, a file is created in the parent directory, TRACES.

When enabled and after a restart of the Power BI Desktop, you will receive not only similar information about PID, TID, the process and the duration, but also encounter granule information about Power BI and what’s going on behind the scenes:

  • Application graphics info
  • Settings
  • Parameters
  • Background processes
  • Caching
  • Extensions
  • Query edits
  • Changes applied

You’ll even see entries similar to the following:


A mutex is a small, efficient allocation of memory.  As mutexes have thread affinity, it means the mutex can only be released by the thread in Power BI that owns it.  If it’s released by another thread, an application exception will be thrown in the application and trapped in the trace file.

The interesting aspect of tracing in Power BI Desktop, the options are put back to default, with granule level tracing disabled when you restart the application.













Copyright © DBA Kevlar [Power BI 101 - Log Files and Tracing], All Right Reserved. 2018.

Speaking At DOAG 2018 Conference And IT Tage 2018

#333333; font-family: Verdana, Arial, sans-serif; font-size: 13px;">I will be speaking at the yearly DOAG#333333; font-family: Verdana, Arial, sans-serif;"> conference in December as well as at the IT Tage in November. My talk will be "Oracle Optimizer System Statistics Update 2018" where I summarize the history and current state of affairs regarding System Statistics and I/O calibration in recent Oracle versions like 12c and 18c.

Easy way to create large demo-tables in #Exasol and #Oracle 600w, 150w" sizes="(max-width: 300px) 100vw, 300px" />

If you need a large set of data to test or demonstrate something, this does the trick:

Create demo table in Oracle

create table t as
rownum as id, -- growing unique number
'Just some text' as textcol,
mod(rownum,5) as numcol1, -- numbers 0-4 round-robin
mod(rownum,1000) as numcol2 , -- numbers 0-999 round robin
5000 as numcol3, -- constant
to_date ('01.' || lpad(to_char(mod(rownum,12)+1),2,'0') || '.2018' ,'') as time_id -- dates for every month 
from dual connect by level<=1e4 -- 10000 rows 

Create demo table in Exasol

create table t as
rownum as id, -- growing unique number
'Just some text' as textcol,
mod(rownum,5) as numcol1, -- numbers 0-4 round-robin
mod(rownum,1000) as numcol2 , -- numbers 0-999 round robin
5000 as numcol3, -- constant
to_date ('01.' || lpad(to_char(mod(rownum,12)+1),2,'0') || '.2018' ,'') as time_id -- dates for every month 
from (select 1 from dual connect by level<=1e4) -- 10000 rows 

In Oracle you may get this error message for a high number of rows (like 10 Mio): ORA-30009: Not enough memory for CONNECT BY operation.
This way it works in spite of it:

SQL> alter session set workarea_size_policy=manual;

Session altered.

SQL> alter session set sort_area_size=1000000000;

Session altered.

SQL> create table t as
     rownum as id, 
     'Just some text' as textcol,
     mod(rownum,5) as numcol1, 
     mod(rownum,1000) as numcol2 , 
     5000 as numcol3, 
     to_date ('01.' || lpad(to_char(mod(rownum,12)+1),2,'0') || '.2018' ,'') as time_id  
     from dual connect by level<=1e7 ; 
Table created.
SQL> select count(*) from t;


SQL> select bytes/1024/1024 as mb from user_segments where segment_name='T';


In Exasol, this problem doesn’t surface:

SQL_EXA> create table t as
         rownum as id,
         'Just some text' as textcol,
         mod(rownum,5) as numcol1,
         mod(rownum,1000) as numcol2 ,
         5000 as numcol3,
         to_date ('01.' || lpad(to_char(mod(rownum,12)+1),2,'0') || '.2018' ,'') as time_id
         from (select 1 from dual connect by level<=1e7) ; 
EXA: create table t as... 
Rows affected: 10000000 
SQL_EXA> SELECT raw_object_size/1024/1024 as mb FROM SYS.EXA_ALL_OBJECT_SIZES where object_name='T';
EXA: SELECT raw_object_size/1024/1024 as mb FROM SYS.EXA_ALL_OBJECT_SIZES w...


1 row in resultset.

And the resulting table is smaller because Exasol stores it in columnar compressed format </p />

    	  	<div class=

Announcement: Venue Confirmed For Upcoming Brussels “Oracle Indexing Internals and Best Practices” Seminar

I can finally confirm the venue for my upcoming “Oracle Indexing Internals and Best Practices” seminar in beautiful Brussels, Belgium running on 27-28 September 2018. The venue will be the Regus Brussels City Centre Training Rooms Facility, Avenue Louise / Louizalaan 65, Stephanie Square, 1050, Brussels. Note: This will be the last public seminar I’ll run […]

Searching in Oracle Database documentation

Just a quick heads up with something I see from time to time in Chrome (but not in Firefox or any other browser).

Occasionally when doing a search, the results are not limited as per my criteria.  For example, if I am searching for information about Spatial in the Licensing Guide:


then when I click the Search button, the results might come back with a far broader search range:


If you experience this, there is an easy workaround – simply re-run the search. It only seems to occur (for me at least) on the first execution of the search.  Clicking the Search button again on the page, yielded the correct result second time around:


Let me know if you’ve seen similar behaviour – if it is not just me, I’ll pass it along to the Documentation team.

Happy searching!

Which Privileges Are Required to Use the ADWC Service Console?

The Autonomous Data Warehouse Cloud (ADWC) service provides a Service Console that can be used to monitor the service activity and to carry out a small number of administration tasks (e.g. changing some resource management rules).

The documentation specifically says to use the ADMIN user to login. But, actually, any unlocked database user having the CREATE SESSION privilege can be used to login. This is not only useful, but, in my opinion, necessary. In fact, developers should be able to see the service activity without knowing the password of the ADMIN user.

The documentation provides no information about what privileges are needed to take advantage of the provided functionalities. If you login with a user having only the CREATE SESSION privilege you get either empty charts or errors when trying to use some of the functionality. The following figure illustrates:

The Service Console displays no data when a user without the necessary privileges login.

In other words, the UI is static and privileges are only checked when a specific functionality is used. I would expect something dynamic…

By looking at the roles owned by the ADMIN user, I noticed three roles with a name that could match what I was looking for: CONSOLE_MONITOR, CONSOLE_OPERATOR and CONSOLE_ADMIN. Then, with some trial and error, I came up with the following table that summarizes which role provides which functionality:

Overview No data Data visible Data visible
Activity No data Data visible Data visible
Activity – Cancel execution Unauthorized Authorized Authorized
Administration – Download Client Credentials Unauthorized Authorized Authorized
Administration – Set Resource Management Rules Unauthorized Authorized Authorized
Administration – Set Administrator Password Unauthorized Unauthorized Authorized
Administration – Manage Oracle ML Users Unauthorized Unauthorized Authorized
Administration – Download Oracle Instance Client Authorized Authorized Authorized
Administration – Send Feedback to Oracle Authorized Authorized Authorized


  • A user with the CONSOLE_MONITOR role has the same privileges than a user without any role. To me this state of affairs do not look correct. I would expect that the CONSOLE_MONITOR role makes visible the Overview and Activity data.
  • With these roles it is not possible to provide only the privileges to see data without being able to modify something or kill sessions.
  • The “Cancel execution” functionality is actually a “kill session”. My guess is that as soon as the service is upgrade to 18c then new functionality to cancel executions will be used instead.

Rebuilding Indexes: Danger With Clustering Factor Calculation (Chilly Down)

Let me start by saying if you don’t already following Jonathan Lewis’s excellent Oracle blog, do yourself a favour. In a recent article, Jonathan highlighted a danger with rebuilding indexes (or indeed creating an index) when used in relation to collecting index statistics with the TABLE_CACHED_BLOCKS preference. I’ve discussed the importance of the TABLE_CACHED_BLOCKS statistics […]