Change Data Capture

IBM® InfoSphere® Change Data Capture integrates information across heterogeneous data stores in real time to support data migrations, application consolidation, data synchronisation, dynamic warehousing, MDM, SOA, business analytics and ETL or data quality processes.

What Does It Do?

  • CDC replicates important data events in real time without impacting system performance. This is useful to deliver data changes for master data management, event-driven data quality, dynamic data warehousing, and ETL process optimisation.
  • Through its integration with IBM InfoSphere DataStage, it provides real-time data feeds to ETL processes with transactional integrity and no staging required. This enables real-time validity checks of changed data against defined data rules, and data cleansing if necessary, during the transformation process.
  • Real-time data transactions can be packaged into XML documents and delivered to and from messaging middleware such as WebSphere MQ.
  • It provides an easy-to-use graphical user interface (GUI) for rapid deployment of data integration processes and comprehensive monitoring capabilities for increased visibility into the replication environment.
  • It has flexible implementation methods which enable unidirectional, bi-directional, many-to-one, and one-to-many delivery of data across the enterprise.
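
As a rough illustration of the XML packaging described above, the following Python sketch groups the changes of one transaction into a single XML document and places it on a queue. It is not InfoSphere CDC code: the element names, the package_transaction helper and the in-process queue.Queue standing in for messaging middleware are all assumptions made for the example.

```python
import queue
import xml.etree.ElementTree as ET


def package_transaction(txn_id, changes):
    """Build one XML document holding every change of a transaction.

    `changes` is a list of (operation, table, row) tuples, where `row`
    maps column names to values. The layout is hypothetical.
    """
    root = ET.Element("transaction", id=str(txn_id))
    for op, table, row in changes:
        change = ET.SubElement(root, "change", op=op, table=table)
        for column, value in row.items():
            ET.SubElement(change, "column", name=column).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Stand-in for a real message queue; a production setup would use
# the client library of the messaging middleware instead.
outbound = queue.Queue()

message = package_transaction(
    42,
    [
        ("INSERT", "orders", {"id": 7, "total": 19.5}),
        ("UPDATE", "customers", {"id": 3, "tier": "gold"}),
    ],
)
outbound.put(message)
```

Keeping the whole transaction in one message means a consumer sees either all of its changes or none of them, which is one simple way to preserve transactional grouping across a queue.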

How Does It Do It?

  • CDC monitors database logs, captures only the changed data, and transfers it from publisher to subscriber systems, which is more efficient than querying the database directly.
  • It provides multi-platform support with built-in transformation and filtering that require no programming: users can translate values, derive new calculated fields, join tables, and create, store and retrieve custom data transformations as macros.
  • Peer-to-peer integration connects databases directly, so no data staging or gateway technologies are required.
  • Extended update options such as “Adaptive apply” take operations that occurred on the source table and merge (upsert) the changes into a target table, regardless of whether the source row is absent from or present in the target table. The “Live audit” capability keeps a history of all changes made to source tables, including the identities of the user and program that made each change. Users can summarise numerical data from multiple rows in one or more source tables into a single row in a target table. Row consolidation lets users merge data from multiple source tables into one or more rows in a target table.
  • CDC can monitor transactional systems for specific transactions and package that information as a single XML message to target a specific message queue.
  • Changed data is efficiently transmitted, including large objects (LOBs) such as multimedia audio and video data.
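
The upsert and audit behaviour described above can be sketched in a few lines of Python. This is only a minimal model under stated assumptions, not the product's implementation: the Change record, the adaptive_apply function, the dictionary used as a target table and the choice to merge columns rather than replace the whole row are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Change:
    """One captured change (hypothetical shape)."""
    op: str                      # "INSERT", "UPDATE" or "DELETE"
    key: Any                     # primary key of the affected row
    row: Optional[dict] = None   # changed column values, if any
    user: str = ""               # who made the change
    program: str = ""            # which program made the change


def adaptive_apply(target: dict, audit: list, change: Change) -> None:
    """Apply a change so that it succeeds whether or not the row exists."""
    if change.op not in ("INSERT", "UPDATE", "DELETE"):
        raise ValueError(f"unknown operation: {change.op}")
    # Audit-style history: record every change with user and program.
    audit.append((change.op, change.key, change.user, change.program))
    if change.op == "DELETE":
        target.pop(change.key, None)          # missing row: nothing to do
    else:
        existing = target.get(change.key, {})  # missing row: start empty
        target[change.key] = {**existing, **(change.row or {})}


target_table = {}
audit_log = []
for change in [
    Change("UPDATE", 1, {"name": "Ada"}, "alice", "billing"),  # row absent: inserted
    Change("INSERT", 1, {"city": "Paris"}, "bob", "crm"),      # row present: updated
    Change("DELETE", 2, None, "bob", "crm"),                   # row absent: ignored
]:
    adaptive_apply(target_table, audit_log, change)
```

After the loop, the target holds a single merged row for key 1 and the audit log holds all three operations, including the delete that had no effect on the target.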

Why Is That Useful?

  • CDC improves operational efficiency and saves time, resources and network bandwidth by eliminating redundant data transfer.
  • With no programming required, users can make the most of existing systems, skills and resources, and integrate data across all supported platforms without changing the current environment.
  • Decision support is enhanced because CDC applies in-flight data transformations to meet specific business requirements, and information flows quickly and efficiently, directly between source and target systems.
  • Sensitive data can be delivered securely by making it accessible to authorised recipients only.
  • Extended update options provide more flexible replication capabilities in data warehousing environments, lowering processing power and storage capacity requirements and enabling easier querying and reporting.
  • With CDC, critical data can be packaged with additional information from other systems and passed to a message queue according to business rules.