Towards a Unified Architecture for in-RDBMS Analytics

Posted on February 28, 2023

Summary

Introduction

Integrating analytics into a DBMS can introduce deployment overhead, because in-database analytics lacks a unified architecture that would let performance optimizations be applied generically across tasks. For example, two factors influence performance:

  • data ordering
  • the parallelization of computations in single-node multicore RDBMSs.

The paper investigates these two factors both theoretically and empirically, and then proposes a unified architecture for general in-database analytics. Its main contributions:

  • Integrate many data analytics tasks that can be formulated as Incremental Gradient Descent (IGD).
  • Develop a novel strategy to improve performance when it is limited by a bad data ordering in the RDBMS.
  • Adapt existing approaches so that they run in parallel.
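To make the first point concrete, here is a minimal sketch of why one IGD loop can serve several tasks: only the per-tuple gradient changes. The function names (`igd`, `lsq_grad`, `logit_grad`), the step size, and the toy rows are illustrative assumptions, not code from the paper.

```python
import math

def lsq_grad(w, x, y):
    # Gradient of 0.5 * (w.x - y)^2 with respect to w (least squares).
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [err * xi for xi in x]

def logit_grad(w, x, y):
    # Gradient of log(1 + exp(-y * w.x)) for labels y in {-1, +1}.
    z = y * sum(wi * xi for wi, xi in zip(w, x))
    coeff = -y / (1.0 + math.exp(z))
    return [coeff * xi for xi in x]

def igd(rows, grad, dim, step=0.1, epochs=50):
    # Generic IGD: one gradient step per tuple, one pass over the rows per epoch.
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in rows:
            g = grad(w, x, y)
            w = [wi - step * gi for wi, gi in zip(w, g)]
    return w

# Hypothetical toy data: y = 2 * x + 1, with a constant 1.0 as bias feature.
regression_rows = [((x, 1.0), 2.0 * x + 1.0) for x in (0.0, 0.5, 1.0, -0.5, -1.0)]
print(igd(regression_rows, lsq_grad, dim=2, epochs=200))

# Same loop, different task: a two-point classification problem.
classification_rows = [((1.0, 1.0), 1), ((-1.0, 1.0), -1)]
print(igd(classification_rows, logit_grad, dim=2, epochs=20))
```

Swapping `lsq_grad` for `logit_grad` changes the task without touching the loop, which is the property a unified architecture can exploit.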

Architecture

Data order

The paper shows that some data orderings allow the IGD algorithm to converge in fewer epochs than others.
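The effect of ordering can be seen on a toy problem. The sketch below runs IGD with a constant step size to estimate the mean of a table holding fifty +1 values and fifty -1 values, once in clustered order (all +1 rows first) and once after a single up-front shuffle. The data, the step size, and the use of a one-time shuffle as the "better" ordering are assumptions made for illustration.

```python
import random

def igd_mean(values, step=0.1, epochs=5):
    # IGD on f_i(w) = 0.5 * (w - y_i)^2; the gradient at tuple i is (w - y_i).
    w = 0.0
    for _ in range(epochs):
        for y in values:
            w -= step * (w - y)
    return w

# Clustered order: every +1 row is stored before every -1 row.
clustered = [1.0] * 50 + [-1.0] * 50

# Shuffle once before training, then reuse that order in every epoch.
shuffled = clustered[:]
random.Random(0).shuffle(shuffled)

optimum = sum(clustered) / len(clustered)  # the true minimizer, 0.0
err_clustered = abs(igd_mean(clustered) - optimum)
err_shuffled = abs(igd_mean(shuffled) - optimum)
print("clustered error:", err_clustered)
print("shuffled error: ", err_shuffled)
```

In the clustered run each epoch ends with a long streak of -1 rows, so the model is dragged close to -1 by the time the epoch finishes. The shuffled run mixes both values throughout and stays nearer the optimum.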

Parallelism

IGD lends itself naturally to parallel execution.

Shared memory outperforms the shared-nothing architecture in accuracy for parallel training, and the shared-memory variant can also run lock-free because of the nature of IGD algorithms.
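The two designs differ mainly in where the model lives. The following sketch contrasts them under stated assumptions: the shared-nothing variant trains one private model per partition and averages the results, while the shared-memory variant lets every worker update a single model with no locks. Function names and the toy data are hypothetical. Standard CPython threads do not give true multicore speedup because of the global interpreter lock, so this shows only the structure, not the performance.

```python
import threading

def grad(w, x, y):
    # Least-squares gradient for one tuple: (w.x - y) * x.
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [err * xi for xi in x]

def run_epochs(w, rows, step, epochs):
    # Plain IGD loop that updates the list w in place and takes no locks.
    for _ in range(epochs):
        for x, y in rows:
            g = grad(w, x, y)
            for j, gj in enumerate(g):
                w[j] -= step * gj

def run_workers(models, partitions, step, epochs):
    # Start one thread per partition and wait for all of them to finish.
    threads = [
        threading.Thread(target=run_epochs, args=(m, part, step, epochs))
        for m, part in zip(models, partitions)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

def shared_nothing(partitions, dim, step=0.1, epochs=100):
    # Each worker owns a private model; the models are averaged at the end.
    models = [[0.0] * dim for _ in partitions]
    run_workers(models, partitions, step, epochs)
    return [sum(m[j] for m in models) / len(models) for j in range(dim)]

def shared_memory(partitions, dim, step=0.1, epochs=100):
    # Every worker receives the same list object, so updates may interleave.
    w = [0.0] * dim
    run_workers([w] * len(partitions), partitions, step, epochs)
    return w

# Hypothetical toy data: y = 2 * x + 1, split into two partitions.
rows = [((x / 10.0, 1.0), 2.0 * (x / 10.0) + 1.0) for x in range(-10, 11)]
partitions = [rows[0::2], rows[1::2]]
w_avg = shared_nothing(partitions, dim=2)
w_shared = shared_memory(partitions, dim=2)
print("shared-nothing:", w_avg)
print("shared-memory: ", w_shared)
```

On this noise-free problem both variants should end up near the same answer, so the sketch does not reproduce the accuracy gap described above. It only illustrates why the lock-free version is tolerable: an occasional lost or stale update is just another small perturbation in a process that is already built from many noisy steps.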




