Cdc

Read(195) Label: debezium, ogg,

1.  CdcCli external library offers Cdc external library functions. The directory containing files of Cdc external library is: installation directory\esProc\extlib\CdcCli. The Raqsoft core jar for this external library is [installation directory]\esProc\extlib\CdcCli\scu-cdc-cli-2.10.jar.

 

2.  A JRE version 1.7 or above is required. To use a custome JRE, users need to configure java_home in config.txt under installation directory\esProc\bin. Just skip this step if the JRE version is adequate.

 

3.  To use the CdcCli external library, you need to have the data sync tool like Debezium and Ogg. Here are uses of the two.

3.4.1 Debezium

The Debezium data sync tool is located in DebeziumCli directory ([installation directory]\esProc\extlib\DebeziumCli). This tool works through Debezium (support 3.0.0 version or above) to store the changes resulted from data addition, deletion or modification performed on the database (support MySQL and PostgreSQL for the time being) in the multizone composite table.

(1) The Raqsoft core jar for this external library is [installation directory]\esProc\extlib\DebeziumCli \scu-debezium-2.10.jar.

There are also the following files in DebeziumCli directory:

antlr4-runtime-4.10.1.jar

connect-api-3.8.0.jar

connect-json-3.8.0.jar

connect-runtime-3.8.0.jar

debezium-api-3.0.0.Final.jar

debezium-connector-binlog-3.0.0.Final.jar

debezium-connector-mysql-3.0.0.Final.jar

debezium-connector-postgres-3.0.0.Final.jar

debezium-core-3.0.0.Final.jar

debezium-ddl-parser-3.0.0.Final.jar

debezium-embedded-3.0.0.Final.jar

debezium-storage-file-3.0.0.Final.jar

debezium-storage-kafka-3.0.0.Final.jar

fastjson-1.2.76.jar

jackson-annotations-2.16.2.jar

jackson-core-2.16.2.jar

jackson-databind-2.16.2.jar

jackson-module-afterburner-2.16.2.jar

kafka-clients-3.8.0.jar

log4j-1.2.17.jar

mysql-binlog-connector-java-0.31.0.jar

mysql-connector-j-9.0.0.jar

postgresql-42.2.14.jar

slf4j-api-1.7.30.jar

slf4j-reload4j-1.7.36.jar

Note: The third-party jars are encapsulated in the compression package and users can choose appropriate ones for specific scenarios.

There are also another two other files:

log4j.properties

start_server.bat

(2) Modify the java path in start_server.bat. jre11 or a higher version is required. Below is the content of start_server.bat:

set EXECJAVA="D:\Java\jdk-11\bin\java"

%EXECJAVA% -Xms128m -Xmx1024m  -cp .;*;../../lib/*; com.scudata.lib.debezium.DebeziumCollect %1 %2

(3) In a directory on the same level as start_server.bat, create the configuration file mysql.properties or postgres.properties as needed. Below is content of the configuration file:

# Custom, unique connector name

name=testName

# Connector class name, which varies with different databases. Here we use MySQL as an example

connector.class=io.debezium.connector.mysql.MySqlConnector

# File-based offset persistence

offset.storage=org.apache.kafka.connect.storage.FileOffsetBackingStore

offset.storage.file.filename=F:/tmp/dbz/storage/mysql_offsets.dat

offset.flush.interval.ms=60000

# Database server name

database.server.name=myDatabase

# The current database’s numeric ID, which is random by default

database.server.id=59059

# Database IP address and port number

database.hostname=localhost

database.port=3306

# Database connection username and password

database.user=username

database.password=password

# Database name

database.dbname =myDBname

# Set time zone

database.connectionTimeZone=GMT+8

# List of monitored data tables separated by comma

table.include.list=myDBname.table1,myDBname.table2

# Name of custom table level topic

topic.prefix=sync

# Replication slot name

slot.name=slotName

# Specify the location holding the database frame history storage

schema.history.internal=io.debezium.storage.file.history.FileSchemaHistory

schema.history.internal.file.filename=F:/tmp/dbz/storage/schemahistory.dat 

# Set snapshot mode

snapshot.mode=always

# Use the third-party interface to handle data type

decimal.handling.mode=double

numeric.handling.mode=numeric

# Data conversion, for which time type format is specified

converters=scudataConverter

scudataConverter.type=com.scudata.lib.debezium.ScudataConverter

scudataConverter.schema.name=io.debezium.scudata.type.scudataConverter

scudataConverter.format.date=yyyy-MM-dd

scudataConverter.format.time=HH:mm:ss

scudataConverter.format.datetime=yyyy-MM-dd HH:mm:ss

scudataConverter.format.timestamp=yyyy-MM-dd HH:mm:ss

scudataConverter.format.timestamptz=yyyy-MM-dd HH:mm:ss

scudataConverter.format.timestamp.zone=GMT+8

scudataConverter.format.timestamptz.zone=UTC

(4)  Start MySQL or PostgreSQL database.

(5)  Start the data sync tool. The startup command and parameter explanation are as follows:

start_server.bat parameter 1 parameter 2

Parameter 1: The directory holding data files to be synced, which can be an absolute path or a relative path.

When data sync is finished, a dirout directory will be automatically generated in this path for storing the multizone composite table. The parameter value is the current directory by default, and this parameter cannot be omitted alone;

Parameter 2: The .properties configuration file to be connected, which is mysql.properties by default.

For example, start_server.bat F:/tmp/myTest postgres.properties means that configuration file postgres.properties is used and files generated from data sync will be stored in the specified directory F:/tmp/myTest/dirout.

(6) After the data sync tool is started, you can perform data addition, deletion or modification on data in the database. The updates will be recorded in the multizone composite table generated in dirout directory:

MYTEST.T1_2025_01.ctx records all updates of data in MYTEST database’s table T1 in January, 2025. A new multizone composite table will be created to record changes in the next month.

(7) The external library functions working on the multizone composite tables generated from data sync are cdc_collect() and cdc_merge(). Look them up in Help-Function referenceto find their uses.

3.4.2 Ogg

The Ogg data sync tool is located in OggCli directory ([installation directory]\esProc\extlib\OggCli). This tool works through Ogg to store the changes resulted from data addition, deletion or modification performed on the database (support Oracle for the time being) in the multizone composite table.

(1)  The Raqsoft core jar for this external library is [installation directory]\esProc\extlib\OggCli \scu-oggplug-2.10.jar.

There is also the following file in OggCli directory:

fastjson-1.2.2.jar

Note: The third-party jars are encapsulated in the compression package and users can choose appropriate ones for specific scenarios.

(2)  In the existing OGG environment, configure [OGG installation path]/dirprm/javaue.properties on the target side:

Set value of gg.handler.sample.type property as com.scudata.lib.cdc.SplHandler;

Load third-party jar files fastjson-1.2.7.jar and scu-oggplug-2.10.jar, which are located in  OggCli directory ([installation directory]\esProc\extlib\OggCli), and esproc-bin-********.jar, icu4j-60.3.jar and lz4-1.3.0.jar in esProc class path ([installation directory]\esProc\lib).

Below is content of javaue.properties:

gg.handler.sample.type=com.scudata.lib.cdc.SplHandler

javawriter.bootoptions=-Djava.class.path=.,ggjava/resources/classes,ggjava/resources/lib/ggcmdui-12.2.0.1.2.001.jar,/ggjava/resources/lib/*,ggjava/ggjava.jar,/home/esProc/extlib/OggCli/fastjson-1.2.7.jar,/home/esProc/extlib/OggCli/scu-oggplug-2.10.jar,/home/esProc/lib/icu4j-60.3.jar,/home/esProc/lib/esproc-bin-********.jar,/home/esProc/lib/lz4-1.3.0.jar

(3) Configure and start javaue process.

(4) After the javaue process is started, you can perform data addition, deletion or modification on data in the database. The updates will be recorded in the multizone composite table generated in dirout directory:

MYTEST.T1_2025_01.ctx records all updates of data in MYTEST database’s table T1 in January, 2025. A new multizone composite table will be created to record changes in the next month.

(5) The external library functions working on the multizone composite tables generated from data sync are cdc_collect() and cdc_merge(). Look them up in Help-Function referenceto find their uses.