Contact us: +34 902 10 73 96, info@opensistemas.com

Areas of application


  • Smart Cities
  • Internet of Things
  • Massive data processing
  • Fraud and Risk

Sectors of application


  • Banking and Insurance
  • Public administration
  • Infrastructure
  • Industry and Energy

CHALLENGES

1. Capture, obtain and process huge amounts of information, both in batches and in real time.

2. Ensure the quality and adequacy of the information, avoiding loss or duplication.

3. Respect security requirements for integration with other systems, storage and confidentiality.


Solution

We use the component of the Spark suite that fits each case and rely on Kafka to ensure that no information is lost. In addition, we try to make solutions scalable and real-time whenever possible.


Benefits

The environment can grow by including new sources and capabilities in the analysis layer, generating valuable information through the implementation of models and algorithms.


Results

A big data architecture with horizontal scaling capabilities aimed at managing both real-time and batch information, based on Spark as a core element of the project.

SOLUTIONS

The solution uses the component of the Spark suite that fits each case: Spark Core for batch processing, Spark Streaming for real-time processing, and Spark SQL for connecting with other applications and exploring the data. We rely on Kafka to ensure that no information is lost: it receives the packages and distributes them according to the relevant criteria. We try to make the solutions scalable and, where possible, real-time and hot-swappable. One example is the Kafka cluster assembled with ZooKeeper, distributed across a set of nodes that can be expanded at any time with new nodes to scale the service.
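The delivery pattern described above (route each package by a criterion, and never lose one) can be illustrated with a minimal, standard-library Python sketch. It is not the Kafka API and not the project's code: `PartitionedLog`, `Consumer`, `partition_for`, the partition count and the `id` field are all hypothetical names chosen for illustration. The sketch assumes records carry a unique `id` and that the set of seen ids would live in durable storage in a real deployment.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count, for illustration only


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Route a record to a partition using a stable hash of its key."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


class PartitionedLog:
    """In-memory stand-in for a partitioned, append-only log (not Kafka itself)."""

    def __init__(self, num_partitions: int = NUM_PARTITIONS):
        self.num_partitions = num_partitions
        self.partitions = defaultdict(list)

    def append(self, key: str, value: dict):
        """Append a record and return its (partition, offset)."""
        partition = partition_for(key, self.num_partitions)
        self.partitions[partition].append((key, value))
        return partition, len(self.partitions[partition]) - 1

    def read(self, partition: int, offset: int):
        """Return every record in the partition from the given offset onward."""
        return self.partitions[partition][offset:]


class Consumer:
    """Advances its offset only after processing, so a crash causes a replay
    rather than a loss; replayed records are skipped by their id."""

    def __init__(self, log: PartitionedLog):
        self.log = log
        self.committed = defaultdict(int)  # partition -> next offset to read
        self.seen_ids = set()              # assumed durable in a real system
        self.processed = []

    def poll(self, partition: int, fail_before_commit: bool = False):
        start = self.committed[partition]
        records = self.log.read(partition, start)
        for _key, value in records:
            if value["id"] in self.seen_ids:
                continue  # duplicate delivered again after a replay
            self.seen_ids.add(value["id"])
            self.processed.append(value)
        if fail_before_commit:
            return  # simulated crash: the offset is not advanced
        self.committed[partition] = start + len(records)
```

Records with the same key always land in the same partition, which preserves their relative order. If a poll fails before the offset is committed, the next poll re-reads the same records, and the id check keeps them from being processed twice.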

[Figures: Spark and Kafka; Spark ecosystem; NoSQL databases]

RESULTS

The project delivers a big data architecture with horizontal scaling capabilities for managing large volumes of both real-time and batch information, with Spark as the core element. The architecture also scales with the number of machines and adds security and integrity guarantees to processing in environments where losing information is not acceptable. From this point on, the environment can grow to include new sources and new capabilities in the data analysis layer, supporting the work of the data science team and generating valuable information through the implementation of models and algorithms.

#spark #kafka #scala #osBrain #sapic #hbase #storm #flume #hadoop
