CONflux

Real-Time Data Processing for the Greenplum DCA Appliance

Conflux can be thought of as a data-caching layer for the Greenplum Database that ingests tens of thousands of data points a second. Its innovative design lets your organization perform real-time processing on Conflux while simultaneously loading that data into Greenplum. Conflux is used in many ways, from the ODS layer of your data warehouse to website and network monitoring.

Features

Real-Time: Conflux performs advanced analytics in real time, enabling live dashboards and the recognition of patterns within one data stream or across multiple data streams.
Triggers: Conflux can trigger actions based on predefined rules or heuristics. It handles the most challenging data ingestion needs, analyzing well beyond 60,000 objects a second.
Resource Management: Conflux is an ideal solution for resource contention between near real-time analytics and business-insight workloads on your existing Greenplum environment.

Scalability, availability, and fault tolerance

Conflux leverages a shared-nothing environment built on open source technology. Conflux uses sharding to distribute data, automatically balancing data and workload between servers. It is the fundamental building block for large-scale real-time data processing and low-cost operations using the Greenplum DIA Module. Fault recovery is fully transparent to the application, reducing the complexity of the software stack you build. Just like your Greenplum Appliance, Conflux can grow with your demands. The Conflux analytics engine makes it easy to add new DIA modules to the cluster, so insert and application performance scale with it.
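As an illustration of how data can be spread across servers by key, here is a minimal Python sketch of hash-based distribution. It is not Conflux's actual mechanism, which this document does not describe. The function names, the "sensor-N" keys, and the shard count of 4 are all hypothetical.

```python
import hashlib
from collections import Counter


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index using a stable hash.

    Assumption: simple modulo placement, chosen only for illustration.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def distribute(keys, num_shards: int) -> Counter:
    """Count how many keys land on each shard."""
    return Counter(shard_for(key, num_shards) for key in keys)


# Hypothetical workload: 10,000 sensor keys spread over 4 servers.
keys = [f"sensor-{i}" for i in range(10_000)]
counts = distribute(keys, 4)
```

Because the hash is stable, the same key always maps to the same server, and a large set of keys spreads roughly evenly. Note that plain modulo placement moves most keys when the server count changes, so a system that grows over time would typically use a scheme that relocates less data.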

Low hardware and operational cost

Conflux brings the benefit of using your existing Greenplum Appliance DIA Modules. Conflux runs real-time data processing and analytics before data reaches your Greenplum system. This early data validation and reduction can help limit your data warehouse to only the data pertinent to your insight and reporting needs. Conflux also lets you easily port existing Greenplum MapReduce functions for real-time or batch processing, reducing the cost of migrating from your existing environment.
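To make the idea of early validation and reduction concrete, the following Python sketch drops malformed records and collapses the rest into one summary row per host before anything would be loaded into the warehouse. It is an assumption-laden illustration, not Conflux code: the record fields ("host", "latency_ms") and both function names are hypothetical.

```python
def is_valid(record: dict) -> bool:
    """Accept only records with a string host and a non-negative latency."""
    host = record.get("host")
    latency = record.get("latency_ms")
    return (
        isinstance(host, str)
        and isinstance(latency, (int, float))
        and latency >= 0
    )


def reduce_by_host(records) -> dict:
    """Validate records, then reduce them to a count and average per host."""
    totals = {}
    for record in records:
        if not is_valid(record):
            continue  # invalid data never reaches the warehouse
        entry = totals.setdefault(record["host"], {"count": 0, "total_ms": 0.0})
        entry["count"] += 1
        entry["total_ms"] += record["latency_ms"]
    return {
        host: {"count": e["count"], "avg_ms": e["total_ms"] / e["count"]}
        for host, e in totals.items()
    }


# Hypothetical input: six raw records, three of them unusable.
raw = [
    {"host": "web1", "latency_ms": 10},
    {"host": "web1", "latency_ms": 30},
    {"host": "web2", "latency_ms": 5},
    {"host": "web2", "latency_ms": -1},  # negative latency: rejected
    {"host": None, "latency_ms": 7},     # missing host: rejected
    {"latency_ms": 7},                   # no host field: rejected
]
summary = reduce_by_host(raw)
```

Here six incoming records become two summary rows, which is the kind of reduction that keeps the warehouse limited to data relevant for reporting.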

Flexibility, compatibility and easy deployment

Real-time analytics results are immediately available for querying and for real-time dashboards through partner technologies like Pentaho, as well as scripts and web applications. Data is then automatically archived into the Greenplum Database within seconds for further analysis with tools like Alpine Data Miner, R, and SAS.

Triggers

The Conflux data store lets you surface analytics results instantly in real-time dashboards accessible from a web browser. Within Conflux, a user can build automatic trigger actions. These let you set up custom events that detect data as it comes in, allowing for true event-driven processing. Included trigger types are:

  • Email/Web-based services
  • Queue building
  • Scripts (Python/Perl)
  • Greenplum SQL and Stored Procedures
  • MapReduce Functions
  • JSON functions
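
The rule-plus-action pattern behind triggers can be sketched in a few lines of Python. This is only an illustration of event-driven dispatch in general, not the Conflux trigger API, which is not shown in this document. The Trigger class, the dispatch function, the event fields, and the thresholds are all hypothetical, and the "email" and "queue" actions are stand-ins that append to a list instead of contacting a real service.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]


@dataclass
class Trigger:
    """A named rule (predicate) paired with the action to run when it matches."""
    name: str
    predicate: Callable[[Event], bool]
    action: Callable[[Event], None]


def dispatch(event: Event, triggers: List[Trigger]) -> List[str]:
    """Run every trigger whose rule matches the event; return the names fired."""
    fired = []
    for trigger in triggers:
        if trigger.predicate(event):
            trigger.action(event)
            fired.append(trigger.name)
    return fired


# Stand-in for real side effects such as sending email or enqueueing work.
alerts = []

triggers = [
    Trigger(
        name="high_latency",
        predicate=lambda e: e.get("latency_ms", 0) > 500,
        action=lambda e: alerts.append(("email", e["host"])),
    ),
    Trigger(
        name="server_error",
        predicate=lambda e: e.get("status", 200) >= 500,
        action=lambda e: alerts.append(("queue", e["host"])),
    ),
]

# An incoming event is checked against every rule as it arrives.
fired = dispatch({"host": "web1", "latency_ms": 900, "status": 200}, triggers)
```

Each event is evaluated on arrival, so an action runs as soon as matching data appears instead of waiting for a later batch query.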