Hybrid Transactional Analytical Processing A Survey

Posted on November 18, 2022 3 minute read ∼ Filed in : A paper note

Introduction
- Goal
- Research Topics & Challenges
History
HTAP SOLUTIONS
- Single Engine for OLAP and OLTP
  - Separate Data Organization
  - Same Data Organization
- Separate OLTP and OLAP Engines
  - Decoupling the Storage for OLTP and OLAP
  - Using the Same Storage for OLTP and OLAP

Introduction

Goal

This tutorial is

Review the historical progression of OLTP and OLAP systems
Discuss the driving factors for HTAP
Provide a technical analysis of existing HTAP solutions.

After that it give some reserach challenges / topics.

Research Topics & Challenges

Most HTAP systems provide OLTP and OLAP seperately, but not support efficient processing of transactional and analytical requests within the same transactions.

To fully support HTAP, systems should allow analytics on recent data not only after the transaction that is ingesting or updating that data has committed, but also as part of the same request.
Most HTAP solutions today use multiple components to provide all the desired capabilities. These different components are usually maintained by different groups of people. Therefore, keeping these components compatible and providing the end-users with the illusion of a single system is a challenging task.
Distributed OLAP systems uses shared file systems that mainly optimized for scans but not point access such as accessing individual records or columns. How to support fast point access to shared file system and object stores is challenging.

History

Traditional DBs:

OLAP systems: BLU, Vertica, ParAccel, GreePlumDB, Vectorvise
In-memory OLTP systems: VOltDB, Hekaton, MemSQL.

New DB architectures are generated due to new hardwares, multi-core, various levels of memory caches, and large memories.

Big-data analysis tools

Voldemort, Cassandra, RocksDB offers fast insert and lookups, but lacks query capabilities.

They mainly based on columnar storage formats like Parquet.
Hive, BIgSQL, Impala, SparkSQL support the OLAP queries. But doens’t support OLTP query.

HTAP SOLUTIONS

Single Engine for OLAP and OLTP

The mainly adding support of another to one system, but still on a single engine.

And they mainly differes based on the data organization they use for their transactional and analytical requests.

Separate Data Organization

The data is stored at different format for OLTP and OLAP queries.

SAP HANA, Oracles TImesTen:

In-memory columnar processing for OLAP query + row store for OLTP

MemSQL:

In-memory OLTP + Convert to columnar format when data is written to disk.

Pelaton:

In-memory HTAP system. It provides adaptive data organizations which changes the data format at runtime based on the requrest type.

Same Data Organization

The data is stored in a single format, and the system support both queries.

Hive:

Support Txs ( insert, update, delete )

Separate OLTP and OLAP Engines

This is the system with different engines, but differs on sharing/ non-sharing storage.

Decoupling the Storage for OLTP and OLAP

The data is stored in key-value store, and are groomed into Parquet or ORC files on HDFS for queries.

Using the Same Storage for OLTP and OLAP

Some system combine their product with Spark ecosystems.

Hbase and Cassandra store data in key-value stores, While use Spark-SQL connector to allow Spark SQL to directly query HBase and Cassandra data, respectively.

Hybrid Transactional Analytical Processing A Survey

Introduction

Goal

Research Topics & Challenges

History

HTAP SOLUTIONS

Single Engine for OLAP and OLTP

Separate Data Organization

Same Data Organization

Separate OLTP and OLAP Engines

Decoupling the Storage for OLTP and OLAP

Using the Same Storage for OLTP and OLAP

Tags Cloud

Categories Cloud

It's the niceties that make the difference fate gives us the hand, and we play the cards.

Introduction

Goal

Research Topics & Challenges

History

HTAP SOLUTIONS

Single Engine for OLAP and OLTP

Separate Data Organization

Same Data Organization

Separate OLTP and OLAP Engines

Decoupling the Storage for OLTP and OLAP

Using the Same Storage for OLTP and OLAP

END OF POST

Tags Cloud

Categories Cloud

It's the niceties that make the difference fate gives us the hand, and we play the cards.