It’s been about two and a half years since Enkitec took delivery of our first Exadata. (I blogged about it here: Weasle Stomping Day) Getting our hands on Exadata was very cool for all of us geeks. A lot has changed since then, but we’re still a bunch of geeks at heart and so this week we indulged our geekdom once again with the delivery of our Big Data Appliance (BDA). In case you haven’t heard about it, Oracle has released an engineered system that is designed to host “Big Data” (which is not my favorite term, but I’ll have to save that for some other time). The Hadoop ecosystem has taken off in the last couple of years and this is Oracle’s initial foray into the arena.
We started on an interesting mad scientist kind of project a couple of days ago.
One of our long time customers bought an Exadata last month. They went live with one system last week and are in the process of migrating several others. The Exadata has an interesting configuration. The sizing exercise done prior to the purchase indicated a need for 3 compute nodes, but the data volume was relatively small. In the end, a half rack was purchased and all four compute nodes were licensed, but 4 of the 7 storage servers were not licensed. So it’s basically a half rack with only 3 storage servers.
This past week I spent some time setting up and running various Hadoop workloads on my CDH cluster. After some Hadoop jobs had been running for several minutes, I noticed something quite alarming: the system CPU percentages were extremely high.
This cluster consists of 2s8c16t Xeon L5630 nodes with 96 GB of RAM running CentOS Linux 6.2 with Java 1.6.0_30. The details of those are:
I am currently at a presentation by Patrick Schwanke, Quest Germany, regarding easy and high-speed connectivity between NoSQL and Oracle Databases. Not really what I planned, but as mentioned by Alex Nuijten in an earlier post, unstructured data and its handling is gaining ground, so I thought it would be a good start to start …
Oracle Big Data Appliance (BDA) is being announced at the Oracle OpenWorld keynote as I’m posting this. It will take some time for it to be actually available for shipment and some details will likely change but here is what we have so far about Oracle Big Data Appliance. A rack with InfiniBand, full of [...]
Many analysts are suggesting that a big data appliance will be announced at this OOW. Based on published Oracle OpenWorld focus sessions on oracle.com (PDF documents), the following technologies will most likely be the key — Hadoop, NoSQL, Hadoop data loader for Oracle, R Language. Want more details — you have to wait for them. [...]
In the previous test I compared shared-nothing Hadoop/Hive with shared-everything Oracle. Oracle won overwhelmingly: by my calculation, it would take a 100-node Hadoop environment just to match the speed of Oracle SE.
select ps_partkey,sum(ps_supplycost * ps_availqty) as value
from partsupp, supplier, nation
where ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'INDIA'
group by ps_partkey having