Announcing the Oracle Database Source Data Integration

We are excited to announce that the Meroxa Platform now supports data integrations for Oracle Databases.

Oracle Database, also known as Oracle or Oracle DB, is a relational database management system (RDBMS) developed by Oracle Corporation. It is one of the most widely used databases in the world by large enterprise companies that require robust and dependable database solutions to store, process, and access data at a massive scale.

Oracle Database as a Source

Meroxa's Turbine application framework lets you write code naturally by using Meroxa Functions to alter incoming data records and events from an Oracle Database source data stream before arriving at any downstream destinations, whether another database or system. The Turbine application framework supports programming languages such as JavaScript, Python, Ruby, and Go.

When you deploy a Turbine streaming application with an Oracle Database source, the Meroxa Platform takes an initial snapshot of the source table. Once the snapshot is complete, it begins tracking new data records and events, including INSERT, UPDATE, and DELETE operations, and pushes them into the data stream. 

Real-Time Data Streaming with Change Data Capture

Using Change Data Capture (CDC), we can process Oracle Database data record events in real-time. We do this by creating a tracking table and a database trigger to track event records.

The tracking table and trigger name have the same names as a source table with the prefix MEROXA. The tracking table has all the same columns as the source table plus three additional columns:

Column name

Description

CONNECTOR_TRACKING_ID

The auto-increment index for the position.

CONNECTOR_OPERATION_TYPE

The operation type, whether INSERT, UPDATE, or DELETE.

CONNECTOR_TRACKING_CREATED_AT

The timestamp of event record creation in the tracking table.

An event record will be written in a tracking table when data is added, changed, or deleted from an Oracle Database table. The queries retrieving these event records from the tracking table are similar to those used in Snapshot mode but with CONNECTOR_TRACKING_ID as the ordering column.

An Ack method will collect the CONNECTOR_TRACKING_ID of the event records successfully applied and are later removed from the tracking table every 5 seconds or when the connection is closed.

Things to be aware of...

  • Changes sometimes need to be made to columns in an Oracle Database table. The changes must also be applied to the tracking table when this happens by your Oracle Database administrator.
  • All tracking information only exists within the Oracle Database. Upon deletion of the tracking table, the tracking process will restart from the beginning by initiating a new snapshot of the table, which could lead to unintended replication of data downstream.

Creating an Oracle Database Resource on the Meroxa Platform

Customers can create Resources for Oracle Databases using the Meroxa CLI or Dashboard. You must have a Meroxa account and be logged in to your account to get started.

Meroxa CLI

In the following example, we create an Oracle Database Resource named my-oracle-db. Resource names may contain lowercase letters, numbers, underscores, and hyphens. Use this name to reference your Oracle Database when writing your Turbine application code.

Using the Meroxa CLI, run the following command:

$ meroxa resource create my-oracle-db --type oracle –-url oracle://user:password@host.com:1571/database

Meroxa Dashboard

Below are the steps required to create an Oracle Database Resource using the Meroxa Dashboard:

  1. Navigate to the Resources tab.
  2. Click the Add a Resource button.
  3. Search for Oracle DB using the search bar.
  4. Click the Add Resource button for Oracle DB.
  5. Confirm you are on the Add a resource form with Oracle DB selected.
  6. Provide a valid Resource Name (e.g., my-oracle-db, myoracle, oracle123).
  7. Provide a valid Connection URL for your Oracle Database instance (.e.g., oracle://user:password@host.com:1571/database-name)
  8. Click the Save button.

A notification in the dashboard will appear once your Oracle Database Resource has been successfully created. 

Using Oracle Database as a Source with Turbine

Using Turbine, customers can use any Turbine-supported language such as JavaScript, Python, Ruby, or Go to stream and transform business-critical data from an Oracle Database table to any destination.

The following example demonstrates how to do this with TurbinePy using Python.

First, initialize your Turbine streaming app by running the following command in the Meroxa CLI:

$ meroxa app init my-first-app –-lang python

You should receive confirmation from the Meroxa CLI that you've initialized your Turbine streaming app, meaning the application files have been created locally in the current directory. 

From this point, run the following command to get to the root of the application code. 

$ cd my-first-app

Within your Turbine streaming app you will discover a main.py file. Open this with your preferred code editor. You will see self-documented boilerplate code with a custom function written against the example data record set provided in a fixtures directory. 

Look for the following code with the resources and records methods:

source = await turbine.resources("source_name")
records = await source.records("collection_name", {})

The resources method is used to specify a Resource on the Meroxa Platform. Replace source_name with the name of your Oracle Database Resource. In the following example, we’ll use the name we used when creating the Oracle Database Resource my-oracle-db.

source = await turbine.resources("my-oracle-db")

The records method is used to specify the respective table you wish to use as the source of data. In the following example, there is a table called transactions. Replace collection_name with the name of your Oracle Database table. 

In addition, you will need to indicate which field will be used for ordering rows. This column must contain unique values that can be used for sorting otherwise, the snapshot will not work properly. In the following example, we will use the id column.

records = await source.records("transactions",{"orderingColumn":"id"})

There are a few additional configurations for Oracle Database Source data integrations that can be defined in your Turbine application code. In the following example, we want to change the batchSize from its default value of 1000 to 2000. We do this by including another key value pairing in the configuration which is the second argument of  the records method.

records = await source.records("transactions",{"orderingColumn":"id",”batchSize”:”2000”})

Below is a list of the supported configurations for Oracle Database sources.

Configuration

Requirement

Description

Example value

orderingColumn

Required.

The column name that the connector will use for ordering rows. Column must contain unique values and be suitable for sorting, otherwise the snapshot won't work correctly.

id

snapshot

Optional, default value is true.

Enables or disables snapshots of the entire Oracle DB table before starting Change Data Capture (CDC) mode.

false

batchSize

Optional, default value is 1000.

Sets the size of the rows batch. Min is 1 and max is 100000.

100

keyColumns

Optional.

If the field is empty, the connector makes a request to the database and uses the received list of primary keys of the specified table. If the table does not contain primary keys, the connector uses the value of the orderingColumn field as the keyColumns value.

id,uuid

columns

Optional.

A list of column names that should be included in each record's payload, by default includes all columns.

id,name,age

You’re ready to start streaming with an Oracle Database as a source for your Turbine streaming app!

What's next?

All that's left for you to do is to write function code to transform the streaming data and event records to a downstream set of Resources or third-party APIs.

Need ideas for a Turbine app using Oracle Database as a source? Check out our example Turbine apps to get started. But don't let this example hinder you. The sky's the limit for what you and your team can achieve.

We can’t wait to see what you build! 🚀

As always,

Topics: Meroxa, Change Data Capture, Data migration, CDC, Oracle