MPP BI Architectural Principles

Datacentricity

The idea behind this principle is to run computations as close to the data as possible, in other words to minimize data movement and transformation between components (Hadoop takes a similar approach).

Main characteristics of the principle:

Extensive use of stored procedures in the database to implement the business logic of applications
Two-tier data-centric architecture at the core of the system: the application server and the main business logic reside inside the database
The shortest path from the database to the client: a single component, nginx
A unique optimized data model: the dataset is the unit of data storage in BI and holds not only data but also visualization settings, configuration settings, user interface logic, and code executed on the client
The system core is written in PL/pgSQL, a data-centric language
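
To make the "computations closer to the data" idea concrete, here is a minimal sketch that compares aggregating in the application with aggregating inside the database. It is an illustration only: SQLite stands in for PostgreSQL so the example runs anywhere, and the `sales` table, its columns, and both function names are hypothetical, not part of MPP BI.

```python
import sqlite3

# SQLite is used only so the sketch is self-contained; the table and
# column names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("north", 80.0), ("south", 50.0)],
)


def totals_client_side(conn):
    """Pull every row out of the database, then aggregate in the application."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def totals_in_database(conn):
    """Let the database aggregate; only one row per region leaves it."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    )
    return dict(rows)


print(totals_client_side(conn))
print(totals_in_database(conn))
```

Both functions return the same totals, but the second one transfers one row per region instead of every row. In a data-centric design, logic of this kind would presumably live in a stored procedure rather than in application code.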

Open Architecture

The idea behind this principle is to use open application programming interfaces (APIs) for:

interaction with external systems
interaction between system components
replacement of system components
changes to the API itself

The principle is implemented through a microservice architecture for developing additional functionality. Microservices can be replaced or extended. System components communicate over documented protocols and run as independent server processes.
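
The replaceability described above can be sketched as components that satisfy a shared contract. This is a simplified, in-process illustration under stated assumptions: in the real system the components are independent server processes, and the `ExportService` contract, both implementations, and `run_report` are hypothetical names invented for this example.

```python
import json
from typing import Protocol


class ExportService(Protocol):
    """Hypothetical contract that any export component must satisfy."""

    def export(self, rows: list[dict]) -> str: ...


class CsvExport:
    """One interchangeable implementation: renders rows as CSV text."""

    def export(self, rows: list[dict]) -> str:
        if not rows:
            return ""
        header = ",".join(rows[0].keys())
        lines = [",".join(str(v) for v in row.values()) for row in rows]
        return "\n".join([header, *lines])


class JsonExport:
    """A drop-in replacement: renders the same rows as JSON."""

    def export(self, rows: list[dict]) -> str:
        return json.dumps(rows)


def run_report(service: ExportService, rows: list[dict]) -> str:
    # The caller depends only on the contract, not on a concrete component.
    return service.export(rows)


rows = [{"region": "north", "total": 200}]
print(run_report(CsvExport(), rows))
print(run_report(JsonExport(), rows))
```

Because `run_report` relies only on the contract, swapping `CsvExport` for `JsonExport` requires no change to the caller, which is the property an open API is meant to give to separately deployed services.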

Main characteristics of the principle:

Freedom to choose how to work with data – with or without copying to a local database
Freedom to choose how to develop the system
Microservices: components are deployed in separate Docker containers
No binding to a specific OS: MPP BI runs on Linux and, when needed, on other Unix-like systems
No binding to a specific processor type
Source code can be provided

Concept of Three Data Layers

The idea behind this principle is to optimize queries and the load on computing resources by organizing data into three layers: hot, warm, and cold.

Main characteristics of the principle:

Implementation of a fast hot layer on massively parallel systems, for example ClickHouse
Storing the warm layer in a massively parallel DBMS, for example Greenplum or Oracle Exadata
Storing the cold layer in Hadoop
Fast response times thanks to native connectors for hot and warm data
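
One way to picture the three layers is a small routing function that decides which layer should serve a record. The sketch below assumes that data is assigned to layers by age, which the text above does not state; the thresholds `HOT_DAYS` and `WARM_DAYS` and the function `pick_layer` are hypothetical.

```python
from datetime import date

# Hypothetical thresholds: the assumption that layers are split by data
# age, and the specific values, are for illustration only.
HOT_DAYS = 30
WARM_DAYS = 365


def pick_layer(record_date: date, today: date) -> str:
    """Return the name of the layer that should serve a record of this age."""
    age_days = (today - record_date).days
    if age_days <= HOT_DAYS:
        return "hot"    # e.g. ClickHouse
    if age_days <= WARM_DAYS:
        return "warm"   # e.g. Greenplum or Oracle Exadata
    return "cold"       # e.g. Hadoop


today = date(2024, 6, 30)
for d in (date(2024, 6, 15), date(2024, 1, 1), date(2020, 1, 1)):
    print(d, pick_layer(d, today))
```

A real deployment might route on access frequency or data volume instead; the point is only that each query is sent to the cheapest layer able to answer it.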
