Search

Top 60 Oracle Blogs

Recent comments

Oakies Blog Aggregator

PeopleTools Table Reference Generator

Like many other PeopleSoft professionals, I spend a lot of time looking at the PeopleTools tables because they contain meta-data about the PeopleSoft application. Much of the application is stored in PeopleTools tables. Some provide information about the Data Model. Many of my utility scripts reference the PeopleTools tables, some of them update them too. Therefore, it is very helpful to be able to understand what is in these tables. In PeopleSoft for the Oracle DBA, I discussed some tables that are of regular interest. I included the tables that correspond to the database catalogue and that are used during the PeopleSoft login procedure. The tables that are maintained by the process scheduler are valuable because they contain information about who ran what process when, and how long they ran for.
I am not the only person to have started to document the PeopleTools tables on their website or blog, most people have picked a few tables that are of particular interest. I wanted to produce a complete reference.  However, with over 4000 PeopleTools tables and views, it is simply not viable to do even a significant number of them manually.  So, I wrote some SQL and PL/SQL to dynamically generate a page for each PeopleTools table and views and put the generated pages on my website.  If you use Google to search for a PeopleTools name you will probably find the reference page.
I have now revisited that code and made a number of changes and improvements.

  • I have used a later version of PeopleTools to generate the reference on my website.  The list of PeopleTools tables is no longer defined in PeopleTools by object security, so I have used an independent and somewhat broader definition: Table or View records that are either owned by PPT or whose SQLTABLENAME is the same as the record name.
  • There is much less reliance on static pages.  There are now only 3 such pages everything else is generated.  Instead, additional data is loaded from static scripts into the PLAN_TABLE that should always be present in an Oracle database and so it doesn't have to be created.  It should be a global temporary table so there is no problem with debris being left behind or interference with other processes.  That data is then combined with data in the PeopleTools tables during the generation process.
  • The utility has been rewritten as a PL/SQL package that should be created in the PeopleSoft Owner ID schema (usually SYSADM).
  • The generated HTML is simply tidier, and more consistent.

The source is available on GitHub at https://github.com/davidkurtz/PTRef, so you can download and generate your own reference on your own current version of PeopleTools. An example of the generated output can be found on my website.
The idea of the original PeopleTools tables reference was the people could contribute additional descriptions and information that were not in the PeopleTools tables.  That can still happen by making changes to the scripts that load the additional data and uploading new versions to GitHub.

CPU percent

A recent post on the ODC General Database forum asked for an explanation of the AWR report values “%Total CPU” and “%Busy CPU” under the “Instance CPU” label, and how the “%Busy CPU “ could be greater than 100%.  Here’s a text reproduction of the relevant sample supplied:

Host CPU

CPUs Cores Sockets Load Average Begin Load Average End %User %System %WIO %Idle
2 2 1 0.30 1.23 10.7 5.6 5.3 77.7

Instance CPU

%Total CPU %Busy CPU %DB Time waiting for CPU (Resource Manager)
29.8 133.8 0.0

The answer is probably “It’s 12.1 and it’s a programmer error”.

  • Note that the Host CPU %Idle is not consistent with the three usage figures:  10.7 + 5.6 + 5.3 = 21.6 whereas 100 – 77.7 = 22.3.
  • So let’s run with 22.3% and see what else we can notice: 29.8 / 22.3 = 1.3363 – that’s pretty close (when expressed as a percentage) to 133.8%

Hypothesis:

Someone did the division the wrong way round when trying to work out the percentage of the host’s non-idle CPU that could be attributed to the instance. In this example the “%Busy CPU” should actually report 100 * 22.3 / 29.8 = 74.8%

Note – the difference between 133.8 and 133.63 can be attributed to the fact that the various figures reported in this bit of the AWR are rounded to the nearest 1 decimal place.

Note 2 – I don’t think this error is present in 11.2.0.4 or 12.2.0.1

 

 

 

 

 

Learning Kubernetes: persistent storage with Minikube

As part of my talk at (the absolutely amazing) Riga Dev Days 2019 I deployed Oracle Restful Data Services (ORDS) in Minikube as my application’s endpoint. I’ll blog about deploying ORDS 19 in docker and running it on Kubernetes later, but before I can do so I want to briefly touch about persistent storage in Minikube because I saw it as a pre-requisite.

In a previous post I wrote about using Minikube as a runtime environment for learning Kubernetes. This post uses the same versions as before – Docker 18.09 and Kubernetes 1.14 are invoked by Minikube 1.1.0.

Quick & dirty intro to storage concepts for Kubernetes

Container storage is – at least to my understanding – ephemeral. So, in other words, if you delete the container, and create a new one, locally stored data is gone. In “standalone” Docker, you can use so-called volumes to store things more permanently. A similar concept exists in Kubernetes in form of a persistent volume (PV) and a corresponding persistent volume claim (PVC).

In a nutshell, and greatly simplified, a persistent volume is a piece of persistent storage pre-allocated on a Kubernetes cluster that can be “claimed”. Hence the names …

There is a lot more to say about persistent data in Kubernetes, and others have already done that, so you won’t find an explanation of storage concepts in Kubernetes here. Please head over to the documentation set for more details.

Be careful though, as the use of persistent volumes and persistent volume claims is quite an in-depth topic, especially outside the developer-scoped Minikube. I specifically won’t touch on the subject of persistent storage outside of Minikube, I’d rather save that for a later post.

Persistent Volumes for my configuration files

ORDS requires you to run an installation routine first, in the cause of which it will create a directory containing its runtime configuration. The shell script I’m using to initialise my container checks for the existence of the configuration directory and skips the installation step if it finds one. It proceeds straight to starting the Tomcat container. This is primarily done to speed the startup process up. If I were not to use the PV/PVC combination the pods in my deployment would have to run the installation each time they start up, something I wanted to avoid.

A less complex example please

I didn’t want to make things more complicated than necessary, so instead I’ll use a much simpler example before writing up on my ORDS deployment. My example is based on the official Ubuntu image, and I’m writing a “heartbeat” file into the mounted volume and see if I can find it on the underlying infrastructure.

The Minikube documentation informs us that Minikube preserves data in /data and a few other locations. Anything you try to put elsewhere will be lost. With that piece of information at hand I proceeded with the storage creation.

Experienced Minikube users might point out at this stage that pre-creating a volume isn’t needed as Minikube supports dynamic volume creation. From what I can tell that’s correct, but not what I chose to do.

Creating a persistent volume

Based on the previously mentioned documentation I created a persistent volume and accompanying volume claim like this:

$ cat persistent-volumes.yml 

kind: PersistentVolume
apiVersion: v1
metadata:
  name: research-vol
  labels:
    type: local
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data/research-vol"
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: research-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  volumeName: research-vol
  resources:
    requests:
      storage: 1Gi

This approach is very specific to Minikube because I’m using persistent volumes of type HostPath.

Translated to plain English this means that I’m creating a 2 GB volume pointing to /data/research-vol on my Minikube system. And I’m asking for 1 GB in my persistent volume claim. The access mode (ReadWriteOnce) seems to be related to mounting the volume (concurrently) on multi-node clusters. Or it’s a bug because I successfully wrote to a single PVC from multiple pods as you can see in a bit… In my opinion the documentation wasn’t particularly clear on the subject.

Build the Docker image

Before I can deploy my example application to Minikube, I need to build a Docker image first. This is really boring, but I’ll show it here for the sake of completeness:

$ cat Dockerfile 
FROM ubuntu
COPY run.sh /usr/local/bin
RUN chmod +x /usr/local/bin/run.sh
ENTRYPOINT ["/usr/local/bin/run.sh"]

The shell script I’m running is shown here:

 $ cat run.sh 
 #!/usr/bin/env bash

 set -euxo pipefail

 if [[ ! -d $VOLUME ]]; then
   /bin/echo ERR: cannot find the volume $VOLUME, exiting
   exit 1
 fi

 while true; do
   /usr/bin/touch ${VOLUME}/$(hostname)-$(date +%Y%m%d-%H%M%S)
   /bin/sleep 10
 done

As you can see it is just an infinite loop writing heartbeat files into the directory indicated by the VOLUME environment variable. The image is available to Minikube after building it. Refer to my previous post on how to build Docker images for Minikube, or have a look at the documentation. I built the image as “research:v2”

Deploy

With the foundation laid, I can move on to defining the deployment. I’ll do that again with the help of a YAML file:

 $ cat research-deployment.yml 
 apiVersion: extensions/v1beta1
 kind: Deployment
 metadata:
   labels:
     app: research
   name: research
 spec:
   replicas: 2
   selector:
     matchLabels:
       app: research
   template:
     metadata:
       labels:
         app: research
     spec:
       containers:
       - image: research:v2
         env:
         - name: VOLUME
           value: "/var/log/heartbeat"
         name: research
         volumeMounts:
         - mountPath: "/var/log/heartbeat"
           name: research-vol
       volumes:
       - name: research-vol
         persistentVolumeClaim: 
           claimName: research-pvc
       restartPolicy: Always

If you haven’t previously worked with Kubernetes this might look daunting, but it isn’t actually. I’m asking for the deployment of 2 copies (“replicas”) of my research:v2 image. I am passing an environment variable – VOLUME – to the container image. It contains the path to the persistent volume’s mount point. I’m also mounting a PV named research-vol as /var/log/heartbeat/ in each container. This volume in the container scope is based on the definition found in the volumes: section of the YAML file. It’s important to match the information in volumes and volumeMounts.

Running

After a quick kubectl apply -f research-deployment.yml I have 2 pods happily writing to the persistent volume.

 $ kubectl get pods
 NAME                       READY   STATUS    RESTARTS   AGE
 research-f6668c975-98684   1/1     Running   0          6m22s
 research-f6668c975-z6pll   1/1     Running   0          6m22s

The directory exists as specified on the Minikube system. If you used the VirtualBox driver, use minikube ssh to access the VM and change to the PV’s directory:

# pwd
/data/research-vol
# ls -l *research-f6668c975-z6pll* | wc -l
53
# ls -l *research-f6668c975-98684* | wc -l
53

As you can see, both pods are able to write to the directory. The volume’s contents were preserved when I deleted the deployment:

$ kubectl delete deploy/research
deployment.extensions "research" deleted
$ kubectl get pods
No resources found.

This demonstrates that both pods from my deployment are gone. What does it look like back on the Minikube VM?

# ls -l *research-f6668c975-z6pll* | wc -l
63
# ls -l *research-f6668c975-98684* | wc -l
63

You can see that files are still present. The persistent volume lives up to its name.

Next I wanted to see if (new) Pods pick up what’s in the PV. After a short modification to the shell script and a quick docker build followed by a modification of the deployment to use research:v3, the container prints the number of files:

#!/usr/bin/env bash

set -euxo pipefail

if [[ ! -d $VOLUME ]]; then
  /bin/echo ERR: cannot find the volume $VOLUME, exiting
  exit 1
fi

/bin/echo INFO: found $(ls -l $VOLUME/* | wc -l) files in the volume

while true; do
  /usr/bin/touch ${VOLUME}/$(hostname)-$(date +%Y%m%d-%H%M%S)
  /bin/sleep 10
done

As proven by the logs:

 $ kubectl get pods
 NAME                        READY   STATUS        RESTARTS   AGE
 research-556cc9989c-bwzkg   1/1     Running       0          13s
 research-556cc9989c-chg76   1/1     Running       0          13s

 $ kubectl logs research-556cc9989c-bwzkg
 [[ ! -d /var/log/heartbeat ]]
 ++ wc -l
 ++ ls -l /var/log/heartbeat/research-7d4c7c8dc8-5xrtr-20190606-105014 
 ( a lot of output not shown )
 /bin/echo INFO: found 166 files in the volume
 INFO: found 166 files in the volume
 true
 ++ hostname
 ++ date +%Y%m%d-%H%M%S
 /usr/bin/touch /var/log/heartbeat/research-556cc9989c-bwzkg-20190606-105331
 /bin/sleep 10
 true
 ++ hostname
 ++ date +%Y%m%d-%H%M%S
 /usr/bin/touch /var/log/heartbeat/research-556cc9989c-bwzkg-20190606-105341
 /bin/sleep 10 

It appears as if I successfully created and used persistent volumes in Minikube.

PS360 enhancement: Added report of DDL models

I have written several blogs and presentations recently about how and how not to collect statistics in PeopleSoft.

  • Managing Cost-Based Optimizer Statistics for PeopleSoft recommends
    • If you are going to continue to use DBMS_STATS in the DDL model then
      • Do not specify ESTIMATE_PERCENT because it disables the hash-based number-of-distinct-values calculation, forcing it to go back to the COUNT(DISTINCT ...) method that requires a sort, and may not produce accurate values because it only samples data.
      • Do not specify METHOD_OPT='FOR ALL INDEXED COLUMNS SIZE 1' because this will not collect histograms on indexed columns, and will not update column statistics on unindexed columns.
      • Do specify FORCE=>TRUE so that you can lock and delete statistics on temporary records.
    • However, the alternative is to use GFCSTATS11 package to collects these statistics.  This package is controlled by a metadata table so you can defined statistics collection behaviour for specific records.
  • How Not to Collect Optimizer Statistics in an Application Engine Program
    • This blog explains why you should not explicitly code DBMS_STATS calls into Application Engine programs.

This has prompted me to add a new report to the PS360 utility that simply reports the various DDL models for Oracle.  Thus during a health check, I can see how statistics are collected during batch processes.

PS360 can be download from https://github.com/davidkurtz/ps360.

Scalar Subquery Costing

A question came up on Oracle-l list-server a few days ago about how Oracle calculates costs for a scalar subquery in the select list. The question included an example to explain the point of the question. I’ve reproduced the test below, with the output from an 18.3 test system. The numbers don’t match the numbers produced in the original posting but they are consistent with the general appearance.

rem
rem     Script:         ssq_costing.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2019
rem     Purpose:        
rem
rem     Last tested 
rem             18.3.0.0
rem             12.2.0.1
rem

create table t_1k ( n1 integer ) ;
create table t_100k ( n1 integer ) ;

insert into t_1k
  select
         level
    from dual
    connect by level <= 1e3;

insert into t_100k
  select level
    from dual
    connect by level <= 1e5;

commit ;

begin
  dbms_stats.gather_table_stats ( null, 'T_1K') ;
  dbms_stats.gather_table_stats ( null, 'T_100K') ;
end ;
/

explain plan for
select 
        /*+ qb_name(QB_MAIN) */
        (
        select /*+ qb_name(QB_SUBQ) */ count(*)
        from t_1k
        where t_1k.n1 = t_100k.n1
        )
from t_100k
;

select * from table(dbms_xplan.display);

-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |   100K|   488K|  1533K  (2)| 00:01:00 |
|   1 |  SORT AGGREGATE    |        |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| T_1K   |     1 |     4 |    17   (0)| 00:00:01 |
|   3 |  TABLE ACCESS FULL | T_100K |   100K|   488K|    36   (9)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T_1K"."N1"=:B1)

The key point to note is this – the scalar subquery has to execute 100,000 times because that’s the number of rows in the driving table. The cost for executing the scalar subquery once is 17 – so the total cost of the query should be 1,700,036 – not 1,533K (and for execution plans the K means x1000, not x1024). There’s always room for rounding errors, of course, but a check of the 10053 (CBO trace) file shows the numbers to be 17.216612 for the t_1k tablescan, 36.356072 for the t_100K tablescan, and 1533646.216412 for the whole query. So how is Oracle managing to get a cost that looks lower than it ought to be?

There’s plenty of scope for experimenting to see how the numbers change – and my first thought was simply to see what happens as you change the number of distinct values in the t_100K.n1 column. It would be rather tedious to go through the process of modifying the data a few hundred times to see what happens, so I took advantage of the get_column_stats() and set_column_stats() procedures in the dbms_stats package to create a PL/SQL loop that faked a number of different scenarios that lied about the actual table data.


delete from plan_table;
commit;

declare

        srec                    dbms_stats.statrec;
        n_array                 dbms_stats.numarray;

        m_distcnt               number;
        m_density               number;
        m_nullcnt               number;
        m_avgclen               number;


begin

        dbms_stats.get_column_stats(
                ownname         => user,
                tabname         => 't_100k',
                colname         => 'n1', 
                distcnt         => m_distcnt,
                density         => m_density,
                nullcnt         => m_nullcnt,
                srec            => srec,
                avgclen         => m_avgclen
        ); 

        for i in 1 .. 20 loop

                m_distcnt := 1000 * i;
                m_density := 1/m_distcnt;

                dbms_stats.set_column_stats(
                        ownname         => user,
                        tabname         => 't_100k',
                        colname         => 'n1', 
                        distcnt         => m_distcnt,
                        density         => m_density,
                        nullcnt         => m_nullcnt,
                        srec            => srec,
                        avgclen         => m_avgclen
                ); 


        execute immediate
        '
                explain plan set statement_id = ''' || m_distcnt || 
        '''
                for
                select
                        /*+ qb_name(QB_MAIN) */
                        (
                        select /*+ qb_name(QB_SUBQ) */ count(*)
                        from t_1k
                        where t_1k.n1 = t_100k.n1
                        )
                from t_100k
        ';
        
        end loop;       

end;
/

The code is straightforward. I’ve declared a few variables to hold the column stats from the t_100k.n1 column, called get_column stats(), then looped 20 times through a process that changes the number of distinct values (and corresponding density) recorded in the column stats, then used execute immediate to call “explain plan” for the original query.

You’ll notice I’ve given each plan a separate statement_id that corresponds to the num_distinct that generated the plan. In the code above I’ve changed the num_distinct from 1,000 to 20,000 in steps of 1,000.

Once the PL/SQL block ends I’ll have a plan table with 20 execution plans stored in it and, rather than reporting those plans with calls to dbms_xplan.display(), I’m going to be selective about which rows and columns I report.

select
        statement_id, 
        io_cost,
        io_cost - lag(io_cost,1) over (order by to_number(statement_id)) io_diff,
        cpu_cost,
        cpu_cost - lag(cpu_cost,1) over (order by to_number(statement_id)) cpu_diff,
        cost
from 
        plan_table
where 
        id = 0
order by 
        to_number(statement_id)
;

I’ve picked id = 0 (the top line of the plan) for each statement_id and I’ve reported the cost column, which is made up of the io_cost column plus a scaled down value of the cpu_cost column. I’ve also used the analytic lag() function to calculate how much the io_cost and cpu_cost changed from the previous statement_id. Here are my results from 18c:


STATEMENT_ID                      IO_COST    IO_DIFF   CPU_COST   CPU_DIFF       COST
------------------------------ ---------- ---------- ---------- ---------- ----------
1000                                17033            1099838920                 17253
2000                                34033      17000 2182897480 1083058560      34470
3000                                51033      17000 3265956040 1083058560      51686
4000                                68033      17000 4349014600 1083058560      68903
5000                                85033      17000 5432073160 1083058560      86119
6000                               102033      17000 6515131720 1083058560     103336
7000                               119033      17000 7598190280 1083058560     120553
8000                               136033      17000 8681248840 1083058560     137769
9000                               153033      17000 9764307400 1083058560     154986
10000                              170033      17000 1.0847E+10 1083058560     172202
11000                              197670      27637 1.2608E+10 1760725019     200191
12000                              338341     140671 2.1570E+10 8962036084     342655
13000                              457370     119029 2.9153E+10 7583261303     463200
14000                              559395     102025 3.5653E+10 6499938259     566525
15000                              647816      88421 4.1287E+10 5633279824     656073
16000                              725185      77369 4.6216E+10 4929119846     734428
17000                              793452      68267 5.0565E+10 4349223394     803565
18000                              854133      60681 5.4431E+10 3865976350     865019
19000                              908427      54294 5.7890E+10 3459031472     920005
20000                              957292      48865 6.1003E+10 3113128324     969492

The first pattern that hits the eye is the constant change of 17,000 in the io_cost in the first few lines of the output. For “small” numbers of distinct values the (IO) cost of the query is (33 + 17 * num_distinct) – in other words, the arithmetic seems to assume that it will execute the query once for each value and then cache the results so that repeated executions for any given value will not be needed. This looks as if the optimizer is trying to match its arithmetic to the “scalar subquery caching” mechanism.

But things change somewhere between 10,000 and 11,000 distinct values. The point comes where adding one more distinct value causes a much bigger jump in cost than 17, and that’s because Oracle assumes it’s reached a point where there’s a value that it won’t have room for in the cache and will have to re-run the subquery multiple times for that value as it scans the rest of the table. Let’s find the exact break point where that happens.

Changing my PL/SQL loop so that we calculate m_distcnt as “19010 + i” this is the output from the final query:


-- m_distcnt := 10910 + i;

STATEMENT_ID                      IO_COST    IO_DIFF   CPU_COST   CPU_DIFF       COST
------------------------------ ---------- ---------- ---------- ---------- ----------
10911                              185520            1.1834E+10                187887
10912                              185537         17 1.1835E+10    1083059     187904
10913                              185554         17 1.1836E+10    1083058     187921
10914                              185571         17 1.1837E+10    1083059     187938
10915                              185588         17 1.1838E+10    1083058     187956
10916                              185605         17 1.1839E+10    1083059     187973
10917                              185622         17 1.1841E+10    1083059     187990
10918                              185639         17 1.1842E+10    1083058     188007
10919                              185656         17 1.1843E+10    1083059     188025
10920                              185673         17 1.1844E+10    1083058     188042
10921                              185690         17 1.1845E+10    1083059     188059
10922                              185707         17 1.1846E+10    1083058     188076
10923                              185770         63 1.1850E+10    4027171     188140
10924                              185926        156 1.1860E+10    9914184     188298
10925                              186081        155 1.1870E+10    9912370     188455
10926                              186237        156 1.1880E+10    9910555     188613
10927                              186393        156 1.1890E+10    9908741     188770
10928                              186548        155 1.1900E+10    9906928     188928
10929                              186703        155 1.1909E+10    9905114     189085
10930                              186859        156 1.1919E+10    9903302     189243

If we have 10,922 distinct values in the column the optimizer calculates as if it will be able to cache them all; but if we have 10,923 distinct values the optimizer thinks that there’s going to be one value where it can’t cache the result and will have to run the subquery more than once.

Before looking at this in more detail let’s go to the other interesting point – when does the cost stop changing: we can see the cost increasing as the number of distinct values grows, we saw at the start that the cost didn’t seem to get as large as we expected, so there must be a point where it stops increasing before it “ought” to.

I’ll jump straight to the answer: here’s the output from the test when I start num_distinct off at slightly less than half the number of rows in the table:


 -- m_distcnt := (50000 - 10) + i;

STATEMENT_ID                      IO_COST    IO_DIFF   CPU_COST   CPU_DIFF       COST
------------------------------ ---------- ---------- ---------- ---------- ----------
49991                             1514281            9.6488E+10               1533579
49992                             1514288          7 9.6489E+10     473357    1533586
49993                             1514296          8 9.6489E+10     473337    1533594
49994                             1514303          7 9.6490E+10     473319    1533601
49995                             1514311          8 9.6490E+10     473299    1533609
49996                             1514318          7 9.6491E+10     473281    1533616
49997                             1514325          7 9.6491E+10     473262    1533624
49998                             1514333          8 9.6492E+10     473243    1533631
49999                             1514340          7 9.6492E+10     473224    1533639
50000                             1514348          8 9.6493E+10     473205    1533646
50001                             1514348          0 9.6493E+10          0    1533646
50002                             1514348          0 9.6493E+10          0    1533646
50003                             1514348          0 9.6493E+10          0    1533646
50004                             1514348          0 9.6493E+10          0    1533646
50005                             1514348          0 9.6493E+10          0    1533646
50006                             1514348          0 9.6493E+10          0    1533646
50007                             1514348          0 9.6493E+10          0    1533646
50008                             1514348          0 9.6493E+10          0    1533646
50009                             1514348          0 9.6493E+10          0    1533646
50010                             1514348          0 9.6493E+10          0    1533646

The cost just stops changing when num_distinct = half the rows in the table.

Formulae

During the course of these experiments I had been exchanging email messages with Nenad Noveljic via the Oracle-L list-server (full monthly archive here) and he came up with the suggesion of a three-part formula that assumed a cache size and gave a cost of

  • “tablescan cost + num_distinct * subquery unit cost” for values of num_distinct up to the cache size;
  • then, for values of num_distinct greater than the cache_size and up to half the size of the table added a marginal cost representing the probability that some values would not be cached;
  • then for values of num_distinct greater than half the number of rows in the table reported the cost associated with num_distinct = half the number of rows in the table.

Hence:

  • for 1 <= num_distinct <= 10922, cost = (33 + num_distinct + 17)
  • for 10,923 <= num_distinct <= 50,000, cost = (33 + 10,922 * 17) + (1 – 10,922/num_distinct) * 100,000 * 17
  • for 50,000 <= num_distinct <= 100,000, cost = cost(50,000).

The middle line needs a little explanation: ( 1-10,922 / num_distinct ) is the probability that a value will not be in the cache; this has to be 100,000 to give the expected number of rows that will not be cached, and then multiplied by 17 as the cost of running the subquery for those rows.

The middle line can be re-arranged as 33 + 17 * (10,922 + (1 – 10,922/num_distinct) * 100,000)

Tweaking

At this point I could modify my code loop to report the calculated value for the cost and compare it with the actual cost to show you that the two values didn’t quite match. Instead I’ll jump forward a little bit to a correction that needs to be made to the formula above. It revolves around how Oracle determines the cache size. There’s a hidden parameter (which I mentioned in CBO Fundamentals) that controls scalar subquery caching. In the book I think I only referenced it in the context of subqueries in the “where” clause. The parameter is “_query_execution_cache_max_size” and has a default value of 131072 (power(2,7)) – so when I found that the initial formula didn’t quite work I made the following observation:

  • 131072 / 10922 = 12.00073
  • 131072 / 12 = 10922.666…

So I put 1092.66667 into the formula to see if that would improve things.

For the code change I added a variable m_cost to the PL/SQL block, and set it inside the loop as follows:

m_cost := round(33 + 17 * (10922.66667 + 100000 * (1 - (10922.66667 / m_distcnt))));

Then in the “execute immediate” I changed the “explain plan” line to read:

explain plan set statement_id = ''' || lpad(m_distcnt,7) || ' - ' || lpad(m_cost,8) ||

This allowed me to show the formula’s prediction of (IO)cost in final output, and here’s what I got for values of num_distinct in the region of 10,922:


STATEMENT_ID                      IO_COST    IO_DIFF   CPU_COST   CPU_DIFF       COST
------------------------------ ---------- ---------- ---------- ---------- ----------
  10911 -   183901                 185520            1.1834E+10                187887
  10912 -   184057                 185537         17 1.1835E+10    1083059     187904
  10913 -   184212                 185554         17 1.1836E+10    1083058     187921
  10914 -   184368                 185571         17 1.1837E+10    1083059     187938
  10915 -   184524                 185588         17 1.1838E+10    1083058     187956
  10916 -   184680                 185605         17 1.1839E+10    1083059     187973
  10917 -   184836                 185622         17 1.1841E+10    1083059     187990
  10918 -   184992                 185639         17 1.1842E+10    1083058     188007
  10919 -   185147                 185656         17 1.1843E+10    1083059     188025
  10920 -   185303                 185673         17 1.1844E+10    1083058     188042
  10921 -   185459                 185690         17 1.1845E+10    1083059     188059
  10922 -   185615                 185707         17 1.1846E+10    1083058     188076
  10923 -   185770                 185770         63 1.1850E+10    4027171     188140
  10924 -   185926                 185926        156 1.1860E+10    9914184     188298
  10925 -   186081                 186081        155 1.1870E+10    9912370     188455
  10926 -   186237                 186237        156 1.1880E+10    9910555     188613
  10927 -   186393                 186393        156 1.1890E+10    9908741     188770
  10928 -   186548                 186548        155 1.1900E+10    9906928     188928
  10929 -   186703                 186703        155 1.1909E+10    9905114     189085
  10930 -   186859                 186859        156 1.1919E+10    9903302     189243

The formula is only supposed to work in the range 10923 – 50,000, so the first few results don’t match; but in the range 10,923 to 10,930 the match is exact. Then, in the region of 50,000 we get:


STATEMENT_ID                      IO_COST    IO_DIFF   CPU_COST   CPU_DIFF       COST
------------------------------ ---------- ---------- ---------- ---------- ----------
  49991 -  1514281                1514281            9.6488E+10               1533579
  49992 -  1514288                1514288          7 9.6489E+10     473357    1533586
  49993 -  1514296                1514296          8 9.6489E+10     473337    1533594
  49994 -  1514303                1514303          7 9.6490E+10     473319    1533601
  49995 -  1514311                1514311          8 9.6490E+10     473299    1533609
  49996 -  1514318                1514318          7 9.6491E+10     473281    1533616
  49997 -  1514325                1514325          7 9.6491E+10     473262    1533624
  49998 -  1514333                1514333          8 9.6492E+10     473243    1533631
  49999 -  1514340                1514340          7 9.6492E+10     473224    1533639
  50000 -  1514348                1514348          8 9.6493E+10     473205    1533646
  50001 -  1514355                1514348          0 9.6493E+10          0    1533646
  50002 -  1514363                1514348          0 9.6493E+10          0    1533646
  50003 -  1514370                1514348          0 9.6493E+10          0    1533646
  50004 -  1514377                1514348          0 9.6493E+10          0    1533646
  50005 -  1514385                1514348          0 9.6493E+10          0    1533646
  50006 -  1514392                1514348          0 9.6493E+10          0    1533646
  50007 -  1514400                1514348          0 9.6493E+10          0    1533646
  50008 -  1514407                1514348          0 9.6493E+10          0    1533646
  50009 -  1514415                1514348          0 9.6493E+10          0    1533646
  50010 -  1514422                1514348          0 9.6493E+10          0    1533646

Again, the formula applies only in the range up to 50,000 (half the rows in the table) – and the match is perfect in that range.

Next steps

The work so far gives us some idea of the algorithm that the optimizer is using to derive a cost, but this is just one scenario and there are plenty of extra questions we might ask. What, as the most pressing one, is the significance of the number 12 in the calculation 131,072/12. From previous experience I guess that is was related to the length of the input and output values of the scalar subquery – as in “value X for n1 returns value Y for count(*)”.

To pursue this idea I recreated the data sets using varchar2(10) as the definition of n1 and lpad(rownum,10) as the value – the “breakpoint” dropped from 10,922 down to 5,461. Checking the arithmetic 131,072 / 5461 = 24.001456, then 131,072/24 = 5461.333… And that’s the number that made fhe formular work perfectly for the modified data set.

Then I set used set_column_stats() to hack the avg_col_,len of t_100K.n1 to 15 and the break point dropped to 4,096.  Again we do the two arithmetic steps: 131072/4096 = 32 (but then we don’t need to do the reverse step since the first result is integral).

Checking the original data set when n1 was a numeric the avg_col_len was 5, so we have three reference points:

  • Avg_col_len = 5. “Cache unit size” = 12
  • Avg_col_len = 11. Cache unit size = 24 (don’t forget the avg_col_len includes the length byte, so our padded varchar2(10) has a length of 11).
  • Avg_col_len = 15, Cache unit size = 32

There’s an obvious pattern here: “Cache unit size” = (2 x avg_col_len + 2).  Since I hadn’t been changing the t_1k.n1 column at the same time, that really does look like a deliberate factor of 2 (I’d thought intially that maybe the 12 was affected by the lengths of both columns in the predicate – but that doesn’t seem to be the case.)

The scientific method says I should now make a prediction based on my hypothesis – so I set the avg_col_len for t_100K.n1 to 23 and guessed that the break point would be at 2730 – and it was.  (131072 / (2 * 23 + 2) = 2730.6666…) .

The next question, of course, is “where does the “spare 2″ come from?” Trying to minimize the change in the code I modified my subquery to select sum(to_number(n1)) rather than count(*), then to avg(to_number(n1)) – remember I had changed n1 to a varchar2(10) that looked like a number left-padded with spaces. In every variant of the tests I’d done so far all I had to do to get an exact match between the basic formula and the optimizer’s cost calculation was to use “2 * avg_col_len + 22” as the cache unit size – and 22 is the nominal maximum length of an internally stored numeric column.

Bottom line: the cache unit size seems to be related to the input and output values, but I don’t know why there’s a factor of 2 applied to the input column length, and I don’t know why the length of count(*) is deemed to be 2 when other derived numeric outputs use have the more intuitive 22 for their length.

tl;dr

The total cost calculation for a scalar subquery in the select list is largely affected by:

  • a fixed cache size (131,072 bytes) possibly set by hidden parameter _query_execution_cache_max_size
  • the avg_col_len of the input (correlating) column(s) from the driving table
  • the nominal length of the output (select list) of the subquery

There is an unexplained factor of 2 used with the avg_col_len of the input, and a slightly surprising value of 2 if the output is simply count(*).

If the number N of distinct values for the driving column(s) is less than the number of possible cache entries the effect of the scalar subquery is to add N * estimated cost of executing the subquery once.  As the number of distinct values for the driving column(s) goes above the limit then the incremental effect of the subquery is based on the expected number of times an input value will not be cached. When the number of distinct values in the driving column(s) exceeds half the number of rows in the driving table the cost stops increasing – there is no obvious reason when the algorithm does this.

There are many more cases that I could investigate at this point – but I think this model is enough as an indication of general method. If you come across a variation where you actually need to work out how the optimizer derived a cost then this framework will probably be enough to get you started in the right direction.

 

PFCLATK - Audit Trail Toolkit - Checksums

We have a toolkit called PFCLATK that is used in customer engagements to assist our customers to create comprehensive and useful audit trails for their databases. The toolkit is used in consulting engagements at the moment but will be adding....[Read More]

Posted by Pete On 06/06/19 At 03:08 PM

Hibernate for Oracle DBAs

Warning: any smart developer may feel sick when reading this ;)

I am not a developer, but I like to discuss with developers: share my side of the IT (the database that we want rock stable and durable) and listen to their side (the application that they want easy to maintain and evolve). And, as I like to understand what I’m talking about, I often need to test some snippets.

Many DBAs complain about Hibernate when they come upon the queries generated by a wrong mapping. They think it was designed to be bad (who would do that?). And they are convinced that JDBC and SQL are sufficient to build applications. Actually, many DBAs I have seen are persuaded that they understand everything about coding because they have written some ugly PERL scripts to automate their job. And that anything going beyond has the only goal to break the database.

I didn’t go this way. As I like to understand before building my opinion, I have read “Hibernate In Action” and tested some object to relational mapping. I’m talking about Hibernate 3 times here. I’ve found those tests in a folder from 2008. I re-used this today for a short test and that’s the reason for this post.

Here I am showing how I’m doing those small tests with Hibernate. I’m a DBA and I cannot have an Eclipse environment taking all my screens, hiding those database top activity charts. And all my RAM is already eaten by SQL Developer and Chrome Grid Control windows, no room for Eclipse. And anyway, those mouse-focused IDEs are not friends with my carpal tunnel. I like the keyboard and tty.

So, for simple tests, I need simple things which can be reduced to a command line and 1 file that I can open with vi. The goal of this post is to show how it is easy to test some Hibernate thing in this case. Of course, any real developer will vomit when looking at this… don’t forget this is about short tests only.

Libraries

So, no Maven for me. I download the whole Hibernate .zip and build a CLASSPATH with everything I found in the required lib folder.

wget https://netix.dl.sourceforge.net/project/hibernate/hibernate-orm/5.4.3.Final/hibernate-release-5.4.3.Final.zip
unzip hibernate-release-5.4.3.Final.zip
for i in hibernate-release-5.4.3.Final/lib/required/*.jar 
do
CLASSPATH="${CLASSPATH}:$i"
done
export CLASSPATH=.:$ORACLE_HOME/jdbc/lib/ojdbc8.jar:$CLASSPATH

You can see that I’ve added the Oracle JDBC as I’ll connect to an Oracle database that I have locally (I use Oracle Cloud DBaaS here).

Compile

No Ant here. I compile the .java files I have in my folder (I don’t use packages and subfolders for simple tests). Note that, from a past admiration for makefiles, I add enough intelligence (like “test -nt”) to compile only when the code is newer than the source.

for i in *.java
do
if [ $i -nt $(basename $i .java).class ]
then
$ORACLE_HOME/jdk/bin/javac $i
fi
done

ORM Mapping

My goal was to quickly test the following mapping from @vlad_mihalcea:

The best way to map a Composite Primary Key with JPA and Hibernate - Vlad Mihalcea

So what do I have in those .java files? Testing Hibernate needs having many classes. And in Java, each class goes to its own file. But did I say that I want to open only 1 file? I use inner classes.

import java.io.*;
import java.sql.*;
import java.util.*;
import oracle.jdbc.*;
import org.hibernate.*;
import org.hibernate.cfg.*;
import javax.persistence.*;
public class Franck {
@Entity(name = "Company")
@Table(name = "company")
public class Company {
    @Id
@GeneratedValue(strategy=GenerationType.IDENTITY)
private Long id;
    private String name;
    public Long getId() {
return id;
}
public void setId(Long id) {
this.id = id;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (!(o instanceof Company)) return false;
Company company = (Company) o;
return Objects.equals(getName(), company.getName());
}
@Override
public int hashCode() {
return Objects.hash(getName());
}
}
@Embeddable
public class EmployeeId implements Serializable {
@ManyToOne
@JoinColumn(name = "company_id")
private Company company;
    @Column(name = "employee_number")
private Long employeeNumber;
    public EmployeeId() {
}
public EmployeeId(Company company, Long employeeId) {
this.company = company;
this.employeeNumber = employeeId;
}
public Company getCompany() {
return company;
}
public Long getEmployeeNumber() {
return employeeNumber;
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (!(o instanceof EmployeeId)) return false;
EmployeeId that = (EmployeeId) o;
return Objects.equals(getCompany(), that.getCompany()) &&
Objects.equals(getEmployeeNumber(), that.getEmployeeNumber());
}
@Override
public int hashCode() {
return Objects.hash(getCompany(), getEmployeeNumber());
}
}
@Entity(name = "Employee")
@Table(name = "employee")
public class Employee {
@EmbeddedId
private EmployeeId id;
    private String name;
    public EmployeeId getId() {
return id;
}
public void setId(EmployeeId id) {
this.id = id;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
}
public static void main(String[] args) throws SQLException {
SessionFactory sf=new Configuration()
.addAnnotatedClass(Employee.class)
.addAnnotatedClass(Company.class)
.setProperty("hibernate.connection.url","jdbc:oracle:thin:@//localhost/PDB1")
.setProperty("hibernate.connection.driver_class","oracle.jdbc.driver.OracleDriver")
.setProperty("hibernate.connection.username","demo")
.setProperty("hibernate.connection.password","demo")
.setProperty("hibernate.format_sql","true")
.setProperty("hibernate.show_sql","true")
.setProperty("hibernate.hbm2ddl.auto","create")
.buildSessionFactory();
}
}

The main class name (here Franck) matches the file name (Franck.java) and all my entities are inner classes here. After compilation here are my files:

-rwxr--r--. 1 oracle 4931 Jun  5 20:18 Franck.java
-rw-r--r--. 1 oracle 1304 Jun 5 20:19 Franck$Company.class
-rw-r--r--. 1 oracle 1398 Jun 5 20:19 Franck$EmployeeId.class
-rw-r--r--. 1 oracle 922 Jun 5 20:19 Franck$Employee.class
-rw-r--r--. 1 oracle 1557 Jun 5 20:19 Franck.class

Execution

$ORACLE_HOME/jdk/bin/java Franck

This generates the following:

That’s all I need to verify what my annotations generate with the Oracle 12c Dialect. Ugly on-file code, but sufficient for this goal.

Java as a Shell

Ok, now that I think that any real developer has stopped reading, I can confess that I add the following on the top of my .java file:

/*TAG-FOR-SHELL 2>/dev/null
CLASSPATH=.
# Oracle JDBC
CLASSPATH=${CLASSPATH}:$ORACLE_HOME/jdbc/lib/ojdbc8.jar
# Download Hibernate
[ -f /var/tmp/hibernate.zip ] || wget -O /var/tmp/hibernate.zip https://netix.dl.sourceforge.net/project/hibernate/hibernate-orm/5.4.3.Final/hibernate-release-5.4.3.Final.zip
# Unzip Hibernate
[ -d /var/tmp/hibernate*?/lib/required ] || unzip -d /var/tmp /var/tmp/hibernate.zip
# add libs to CLASSPATH
for l in /var/tmp/hibernate*?/lib/required/*.jar ; do CLASSPATH="${CLASSPATH}:$l" ; done ; export CLASSPATH
# compile all Java
for s in $(find -name "*.java"); do
s="$(basename $s .java)"
[ $s.java -nt $s.class ] && {
echo "Compiling $s..." >&2
$ORACLE_HOME/jdk/bin/javac $s.java || exit 1
}
done
# execute
$ORACLE_HOME/jdk/bin/java $(basename $0 .java)
exit
*/

And then I “chmod u+x” this .java file and run it as a shell. The shell part is included in Java comment so that this file can be compiled as Java source code. And it gets the libraries if not there already, builds CLASSPATH, compile what’s new in the directory, and run it as a Java program. all that with a simple:

./Franck.java

Any comments welcome on Twitter: @FranckPachot

Kscope Sunday will be awesome

Yeah yeah I know. What kind of dufus cut-pastes a giant image into their blog post. That would be ….. me Smile

But the reality is, there is just soooooo much going on this year on the Sunday before the “official” start of the Kscope conference, it was too hard to condense it into a few sentences. So I just dumped the image from the website, but you can read about it here.

It’s my first Kscope and it will be blast to be running a day of high quality but dirt cheap learning with Maria, Jeff, Steven and Blaine.

And we’ll even have some fun and games. How good is your knowledge of the Oracle Database and the Oracle community? Our Sunday quiz will let you shine! (Well, to be honest, we’ve designed the quiz so that anyone in the community has a chance to win, so come along to have some fun and maybe pick up some prizes as well!)

See you in Seattle!

 

image

So long …

… and thanks for any Douglas Adams references.

I decided on Friday that after more than 31 years using the Oracle software and more than 37 years of being self-employed in the computer industry – not to mention a further 8 years being an enthusiastic amateur – the time had finally come for a change of pace.  As from 1st June 2019 I am retired.

I have a couple of long term commitments that I’ll still be doing a little work for over the next few months, and I wouldn’t be surprised if someone occasionally tempts me into doing a few days solving interesting problems.  I’ll still be answering questions on the Oracle Developer Community forum and (if the email gets through) on the Oracle-L listserver , and I’ll still be writing blog notes about Oracle and submitting abstracts to conferences for a few years. Who knows, I may even get round to writing another book – after all, I’ve already downloaded Oracle 19 so I’ll still be experimenting, investigating, and trying to break things Some people retire to do the Times crossword or Sudoku, my retirement pastime is going to be playing with Oracle.

But now I’ve got 3 musical instruments to re-learn, a garden to re-engineer, apple trees to train, and a lot of rowing [U.S. “crew”] time to catch up on so that I can get fit enough to get back to playing squash (which might be an interesting experience now that my eyesight has been fixed). And for relaxation I’ll be sitting in the garden watching the birds, drinking fine wine and reading a good book.

Update (6th June 2019)

Thanks for all the appreciative comments to know. It’s nice to have a record of the fact that I have made a difference for so many people.

 

Ignoring Hints

One of the small changes (and, potentially big but temporary, threats) in 18.3 is the status of the “ignore hints” parameter. It ceases to be a hidden (underscore) parameter so you can now officially set parameter optimizer_ignore_hints to true in the parameter file, or at the system level, or at the session level. The threat, of course, it that some of your code may use the hidden version of the parameter (perhaps in an SQL_Patch as an opt_param() option rather than in its hint form) which no longer works after the upgrade.

But there’s more. The parameter (whether the old hidden version or the new revealed version) doesn’t make the optimizer ignore parallel() hints. But 18.3 now has a related parameter optimizer_ignore_parallel_hints to address this limitation. Here’s a quick demo – we start by creating a table and then running a query where the full tablescan is clearly the default strategy that the optimizer would take if we didn’t hint an indexed access path:

rem
rem     Script:         ignore_parallel_hints.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2019
rem 

create table t1
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        mod(rownum,10)                  n1,
        lpad(rownum,10,'0')             v1,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e4 -- > comment to avoid WordPress format issue
;

create index t1_i1 on t1(id);

set serveroutput off

prompt  =============
prompt  Baseline test
prompt  =============

select
        /*+ index(t1) */
        n1, sum(id)
from
        t1
where
        id > 0
group by
        n1
order by
        n1
;

select * from table(dbms_xplan.display_cursor);


SQL_ID  gudnnk7j7q5bz, child number 0
-------------------------------------
select  /*+ index(t1) */  n1, sum(id) from  t1 where  id > 0 group by
n1 order by  n1

Plan hash value: 356059923

----------------------------------------------------------------------------------------------
| Id  | Operation                            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |       |       |       |   198 (100)|          |
|   1 |  SORT GROUP BY                       |       |    10 |    70 |   198   (2)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1    | 10000 | 70000 |   196   (1)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | T1_I1 | 10000 |       |    22   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("ID">0)


Now we repeat the exercise with the version-specific “alter session” command below – and you should try each option with each version of Oracle if you want to do the complete test cycle – to see that the session will ignore hints and the plan will change (side note – using the underscore version  with 18.3 doesn’t raise an error, the statement is silently ignored):


alter session set "_optimizer_ignore_hints" = true;
alter session set "optimizer_ignore_hints" = true;

SQL_ID  gudnnk7j7q5bz, child number 1
-------------------------------------
select  /*+ index(t1) */  n1, sum(id) from  t1 where  id > 0 group by
n1 order by  n1

Plan hash value: 3946799371

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |    27 (100)|          |
|   1 |  SORT GROUP BY     |      |    10 |    70 |    27  (12)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| T1   | 10000 | 70000 |    25   (4)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("ID">0)

Then, of course, we have to test a query with a parallel() hint – or shared() hint, which is the internal equivalent you will notice occasionally in outlines or the “remote” statement for distributed execution plans – to show that we don’t yet ignore parallel queries – the plans following the code are from 18.3:


select
        /*+ parallel(t1 3) */
        n1, sum(id)
from
        t1
where
        id > 0
group by
        n1
order by
        n1
;

select * from table(dbms_xplan.display_cursor);

alter session set "optimizer_ignore_parallel_hints" = true;

select
        /*+ parallel(t1 3) */
        n1, sum(id)
from
        t1
where
        id > 0
group by
        n1
order by
        n1
;

select * from table(dbms_xplan.display_cursor);




SQL_ID  7jynurdtc48kv, child number 0
-------------------------------------
select  /*+ parallel(t1 3) */  n1, sum(id) from  t1 where  id > 0 group
by  n1 order by  n1

Plan hash value: 2919148568

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |          |       |       |    10 (100)|          |        |      |            |
|   1 |  PX COORDINATOR          |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (ORDER)     | :TQ10001 |    10 |    70 |    10  (10)| 00:00:01 |  Q1,01 | P->S | QC (ORDER) |
|   3 |    SORT GROUP BY         |          |    10 |    70 |    10  (10)| 00:00:01 |  Q1,01 | PCWP |            |
|   4 |     PX RECEIVE           |          |    10 |    70 |    10  (10)| 00:00:01 |  Q1,01 | PCWP |            |
|   5 |      PX SEND RANGE       | :TQ10000 |    10 |    70 |    10  (10)| 00:00:01 |  Q1,00 | P->P | RANGE      |
|   6 |       HASH GROUP BY      |          |    10 |    70 |    10  (10)| 00:00:01 |  Q1,00 | PCWP |            |
|   7 |        PX BLOCK ITERATOR |          | 10000 | 70000 |     9   (0)| 00:00:01 |  Q1,00 | PCWC |            |
|*  8 |         TABLE ACCESS FULL| T1       | 10000 | 70000 |     9   (0)| 00:00:01 |  Q1,00 | PCWP |            |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   8 - access(:Z>=:Z AND :Z<=:Z) -- > comment added to avoid wordpress format issue
       filter("ID">0)

Note
-----
   - Degree of Parallelism is 3 because of table property


Session altered.


SQL_ID  7jynurdtc48kv, child number 1
-------------------------------------
select  /*+ parallel(t1 3) */  n1, sum(id) from  t1 where  id > 0 group
by  n1 order by  n1

Plan hash value: 3946799371

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |    27 (100)|          |
|   1 |  SORT GROUP BY     |      |    10 |    70 |    27  (12)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| T1   | 10000 | 70000 |    25   (4)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("ID">0)


tl;dr

In 18.3 Oracle exposes the parameter optimizer_ignore_hints – any code using depending on the hidden version of this parameter will no longer behave as expected. 18.3 also introduces optimiser_ignore_parallel_hints to allow you to ignore parallel hints as well.