Data-Centricity

The idea behind this principle is to run computations as close to the data as possible, that is, to minimize data movement and data transformations (Hadoop takes a similar approach).

Main characteristics of the principle:

  • Extensive use of stored procedures in the database to implement the business logic of applications
  • Two-tier data-centric architecture at the core of the system: the application server and the main business logic reside inside the database
  • The shortest path from the database to the client: a single component, nginx
  • A unique, optimized data model. The dataset is the unit of data storage in BI: it stores not only data but also visualization settings, configuration settings, user interface logic, and code executed on the client
  • The system core is written in PL/pgSQL, a data-centric language
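The "compute close to the data" idea can be illustrated with a minimal sketch. It is not MPP BI code: Python's built-in sqlite3 module stands in for the real database, and the `sales` table and all function names are hypothetical. The first function pulls every row to the client and aggregates there; the second lets the database aggregate, so only the small result set travels.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a tiny, made-up sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("north", 10), ("south", 5), ("north", 7)],
    )
    return conn


def totals_client_side(conn):
    """Fetch every row to the client, then aggregate in application code."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0) + amount
    return totals


def totals_in_database(conn):
    """Aggregate inside the database; only one row per region is returned."""
    rows = conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
    return dict(rows)


if __name__ == "__main__":
    conn = make_demo_db()
    print(totals_client_side(conn))
    print(totals_in_database(conn))
```

Both functions return the same totals. The difference is where the work happens and how much data crosses the database boundary, which is the cost a stored-procedure approach aims to reduce.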

Open Architecture

The idea behind this principle is to use open application programming interfaces (APIs) for:

  • interaction with external systems
  • interaction between system components
  • replacement of system components
  • changes to the API itself

The principle is implemented through a microservice architecture for developing additional functionality. Individual microservices can be replaced or extended. System components communicate over documented protocols and run as independent server processes.
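As a rough illustration of replaceable components behind an agreed interface, here is a minimal in-process sketch. The `Exporter` interface and both implementations are hypothetical and are not part of the MPP BI API; a real deployment would talk to separate server processes over a network protocol rather than call Python objects directly.

```python
import json
from typing import Dict, List, Protocol


class Exporter(Protocol):
    """Hypothetical contract that every export component must satisfy."""

    def export(self, rows: List[Dict[str, object]]) -> str: ...


class CsvExporter:
    """One implementation of the contract: comma-separated text."""

    def export(self, rows):
        if not rows:
            return ""
        header = ",".join(rows[0].keys())
        lines = [",".join(str(v) for v in row.values()) for row in rows]
        return "\n".join([header, *lines])


class JsonExporter:
    """A drop-in replacement that honours the same contract."""

    def export(self, rows):
        return json.dumps(rows)


def run_report(exporter: Exporter, rows):
    """The caller depends only on the contract, not on a concrete component."""
    return exporter.export(rows)


if __name__ == "__main__":
    rows = [{"id": 1, "name": "a"}]
    print(run_report(CsvExporter(), rows))
    print(run_report(JsonExporter(), rows))
```

Because `run_report` knows only the contract, swapping `CsvExporter` for `JsonExporter` requires no change to the caller. The same property is what lets a microservice be replaced as long as its protocol stays the same.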

Main characteristics of the principle:

  • Freedom to choose how to work with data – with or without copying to a local database
  • Freedom to choose how to develop the system
  • Microservices: components are deployed in separate Docker containers
  • No binding to a specific OS: MPP BI runs on Linux and, when needed, on other Unix-like systems
  • No binding to a specific processor type
  • Source code can be provided

Concept of Three Data Layers

The idea behind this principle is to optimize queries and the load on computing resources by dividing data among three layers: hot, warm, and cold.

Main characteristics of the principle:

  • A fast hot layer implemented on massively parallel systems, for example ClickHouse
  • A warm layer stored in a massively parallel DBMS, for example Greenplum or Oracle Exadata
  • A cold layer stored in Hadoop
  • Maximum response speed thanks to native connectors for hot and warm data
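One way to picture the three layers is a small router that picks a layer for each query. The sketch below is only an assumption about how such routing could work: the text does not say how data is assigned to layers, so the age-based rule, the thresholds `HOT_DAYS` and `WARM_DAYS`, and the function name are all hypothetical.

```python
from datetime import date

# Assumed thresholds, chosen only for illustration.
HOT_DAYS = 30     # data this recent is served from the hot layer
WARM_DAYS = 365   # data up to a year old is served from the warm layer


def choose_layer(oldest_needed: date, today: date) -> str:
    """Pick a layer based on the age of the oldest data a query touches."""
    age_days = (today - oldest_needed).days
    if age_days <= HOT_DAYS:
        return "hot"
    if age_days <= WARM_DAYS:
        return "warm"
    return "cold"


if __name__ == "__main__":
    today = date(2024, 1, 31)
    for oldest in (date(2024, 1, 15), date(2023, 6, 1), date(2020, 1, 1)):
        print(oldest, choose_layer(oldest, today))
```

Under these assumed thresholds, queries over recent data go to the fast hot layer, while rarely needed historical data is read from the cheaper cold layer.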
