MariaDB CONNECT – Avoiding the pitfalls!

mariaDB

There will come a time when you need to make data available to your mariaDB application from other database management systems. The CONNECT functionality allows you to do this. This article will cover how to use it to access remote data and some of the challenges and pitfalls you may encounter.

In one of our recent projects, we needed to calculate some count statistics from two Oracle 11g database tables and store the results in our mariaDB 10.0.22 database. We were dealing with approximately 2 million rows on each of the Oracle tables and, as we were calculating set theory counts, we needed to compare the keys on both tables. The tables were indexed correctly and performance within Oracle was really good.

In order to access the Oracle tables we need to set CONNECT up. Having rushed through the CONNECT documentation, we set up two CONNECT tables in our mariaDB database, one for each of the remote Oracle tables.

The mariadb create table statements looked a bit  like this:

CREATE TABLE CONNECT_Remote_Data_TableA

ENGINE=CONNECT

TABLE_TYPE=ODBC

TABNAME=TableA

CONNECTION=’Driver={Oracle ODBC driver};Server=://xxx.xxx.xxx.xxx:1521/ORCL;UID=USERID;PWD=PASSWORD;’

 

CREATE TABLE CONNECT_Remote_Data_TableB

ENGINE=CONNECT TABLE_TYPE=ODBC

TABNAME=TableB

CONNECTION=’Driver={Oracle ODBC driver};Server=://xxx.xxx.xxx.xxx:1521/ORCL;UID=USERID;PWD=PASSWORD;’

When we ran these, the result was successful and the two tables were created. A quick test via “Select * from CONNECT_Remote_Data_TableA” proved that data was indeed flowing from Oracle to mariaDB.

We built our queries in mariaDB, referring to the CONNECT tables and started our unit testing. The results were good and we could insert the data returned from them into a mariaDB table. CONNECT was a success and we could now push on with the rest of the development, having built and tested this functionality.

Everything went well until we started to ramp up the volume in the Oracle tables. Then we witnessed an alarming degradation in performance that got worse as we added more and more data. At first we struggled to understand what the problem was – the tables were indexed after all and, therefore, access should be really quick. It was only when we started to think through what  CONNECT table actually was and did some more reading that we found the problem. The solution was based around where the actual SQL Query was being executed.

Here is a representation of what we had built:

maria_connect_pic1

In this configuration, our SQL query was running in mariaDB and drawing data from the Oracle tables. MariaDB inserted the result into the results table but it was very slow. Out of interest, we took the SQL query, converted it to Oracle PL/SQL and ran it in Oracle. The results were lightening quick as you’d expect them to be as the tables were correctly indexed. So, the problem was related to where the SQL ran:

  • In mariaDB – very slow
  • In Oracle – very fast

What’s the usual solution to make a slow query run quickly? Indexing. So we looked at that. In our rush to get this up and running, we had missed the fact that ODBC CONNECT tables cannot be indexed. In effect, all we had created was a conduit or “pipe” to the data which arrived in a stream of unindexed rows that mariaDB then had to work heroically to produce our results from.

So how could we make use of the Oracle indexing within our query and still get the results into mariaDB? It seemed that we needed to “push down” the SQL query to the Oracle end of the CONNECT “pipe”. To do this, we realised that we only needed a single mariaDB CONNECT table but that table would need the SRCDEF parameter adding to it. SRCDEF allows you to execute SQL on the remote database system instead of in mariaDB. The SRCDEF needed to contain a PL/SQL query as it would be running native to Oracle. Our new CONNECT statement looked like this:

CREATE TABLE CONNECT_Remote_Data_Count

ENGINE=CONNECT

TABLE_TYPE=ODBC

TABNAME=TableA

CONNECTION=’Driver={Oracle ODBC driver};Server=://xxx.xxx.xxx.xxx:1521/ORCL;UID=USERID;PWD=PASSWORD;’

SRCDEF=’…PL/SQL equivalent of PSEUDO SQL: Count the entries on TableA that are also on TableB…”

However, when we executed a “Select count(*) from CONNECT_Remote_Data_Count” we received a strange result – 1. The answer was returned very quickly, which was encouraging. However, we knew that this wasn’t the correct answer – we expected many thousands of entries to be on both tables. After a little more head scratching, we tried “Select * from CONNECT_Remote_Data_Count” and viola – our expected result was returned. In effect we were selecting the content of the CONNECT table’s query.

So we now had an Oracle PL/SQL query that was wrapped inside a mariaDB CONNECT “pipe” and being executed remotely in an Oracle database where it could make full use of the indexing. The result was then the only data item being sent down the “pipe” from Oracle to mariaDB.

The final solution looked like this:

maria_connect_pic2

So, as we can see, CONNECT is a powerful thing. It allowed us to build a solution that populated our mariaDB system with results from a query against two tables sat on an Oracle database. The full power of the indexing was utilised and the results were returned in a very fast time.

If you’d like to know more about how we are using CONNECT, then just get in touch.

 

This entry was posted in Connectivity, Data Flow, Databases, MariaDB, Oracle, Uncategorized and tagged , , , , , , , , , , . Bookmark the permalink.