PostCENN: PostgreSQL with Machine Learning Models for Cardinality Estimation

Posted on February 16, 2023   1 minute read

Questions

  1. Why integrate ML into query processing? Are there any benefits beyond incurring more engineering problems?
  2. Does updating the model require restarting the database? Wouldn't that degrade performance?

Introduction

Background & Motivation

ML models predict query cardinalities more accurately than traditional estimators such as histograms.

Gap

But these ML estimators are rarely fully integrated into a DBMS, so they cannot provide insights into the actual query performance gains.

Challenge

Integrating the estimators into the DBMS is challenging because of the overhead it incurs.

Goal

This paper tries to enhance the PostgreSQL database system with an end-to-end integration of NNs for cardinality estimation.

Details

Create local context using defined SQL

  • Define a collection of tables, columns, and join attributes to identify a context.
  • Each model has exactly one context.
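
To make the idea of a context concrete, here is a minimal Python sketch. It is not PostCENN's actual SQL syntax or internal representation. The class name `Context`, its fields, the `covers` helper, and the example tables are all hypothetical and only illustrate that a context is a named slice of the schema that one model is responsible for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical description of a local context (not PostCENN's real structure)."""
    name: str
    tables: tuple   # table names covered by the context
    columns: tuple  # (table, column) pairs that predicates may reference
    joins: tuple    # pairs of (table, column) join attributes

    def covers(self, query_tables, query_columns):
        """Return True if all tables and predicate columns of a query lie in this context."""
        return (set(query_tables) <= set(self.tables)
                and set(query_columns) <= set(self.columns))


# Example context over two made-up tables.
ctx = Context(
    name="orders_customers",
    tables=("orders", "customers"),
    columns=(("orders", "total"), ("customers", "region")),
    joins=((("orders", "customer_id"), ("customers", "id")),),
)

print(ctx.covers(["orders"], [("orders", "total")]))  # query inside the context
print(ctx.covers(["lineitem"], []))                   # query outside the context
```

Since each model has one context, a check like `covers` is one plausible way to decide whether a given model can answer a query at all.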

Select a context and conduct training using defined SQL

  • PostCENN tightly couples the underlying PostgreSQL with a Python backend containing TensorFlow, as illustrated in a figure in the paper.
  • It uses pre-aggregation to optimize the execution of all example queries for a context.
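
The pre-aggregation idea can be sketched as follows. This is only an illustration of the general technique, assuming the training labels are `COUNT(*)` results of conjunctive example queries. The data, column names, and the functions `pre_aggregate` and `cardinality` are made up for this note and are not taken from the paper.

```python
from collections import Counter

# Made-up base data for a context with two columns.
rows = [
    {"region": "EU", "total": 10},
    {"region": "EU", "total": 10},
    {"region": "EU", "total": 25},
    {"region": "US", "total": 10},
    {"region": "US", "total": 40},
    {"region": "US", "total": 40},
]
columns = ("region", "total")


def pre_aggregate(rows, columns):
    """Scan the base data once and count rows per distinct value combination."""
    return Counter(tuple(row[c] for c in columns) for row in rows)


def cardinality(agg, columns, predicates):
    """Answer COUNT(*) for a conjunctive query from the aggregate alone.

    `predicates` maps a column name to a function value -> bool.
    """
    idx = {c: i for i, c in enumerate(columns)}
    return sum(
        count
        for key, count in agg.items()
        if all(pred(key[idx[c]]) for c, pred in predicates.items())
    )


agg = pre_aggregate(rows, columns)

# Every example query is now answered from the small aggregate, not the base table.
q1 = cardinality(agg, columns, {"region": lambda v: v == "EU",
                                "total": lambda v: v <= 20})
q2 = cardinality(agg, columns, {"total": lambda v: v >= 25})
print(q1, q2)
```

The base data is scanned once, and each example query then touches only the distinct value combinations, which is usually far fewer rows than the original tables.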

Model query

  • The query optimizer uses ML inference to obtain cardinality estimates.
  • It directly integrates TensorFlow's NN management capabilities into PostgreSQL using the TensorFlow C API to achieve the best performance.
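
As a rough picture of what one inference call computes, below is a plain-Python sketch. PostCENN itself runs the model through the TensorFlow C API inside PostgreSQL, so nothing here reflects its real code. The featurization, the tiny one-hidden-unit network, the hand-picked weights, and the choice of predicting the log of the cardinality are all assumptions made for illustration.

```python
import math


def featurize(lo, hi, col_min, col_max):
    """Encode a range predicate lo <= col <= hi as two values in [0, 1] (assumed encoding)."""
    span = col_max - col_min
    return [(lo - col_min) / span, (hi - col_min) / span]


def relu(x):
    return max(0.0, x)


def forward(features, w1, b1, w2, b2):
    """Forward pass of a one-hidden-layer network with ReLU activation."""
    hidden = [
        relu(sum(w * f for w, f in zip(weights, features)) + bias)
        for weights, bias in zip(w1, b1)
    ]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def estimate_cardinality(features, model):
    """Interpret the network output as log-cardinality and clamp to at least one row."""
    log_card = forward(features, *model)
    return max(1.0, math.exp(log_card))


# Placeholder weights chosen by hand, NOT a trained model.
model = ([[-1.0, 1.0]], [0.0], [5.0], 0.0)

wide = estimate_cardinality(featurize(0, 100, 0, 100), model)
narrow = estimate_cardinality(featurize(25, 50, 0, 100), model)
print(wide, narrow)
```

With these placeholder weights the hidden unit simply measures the width of the range, so a wider predicate yields a larger estimate. A trained model would learn this mapping from the example queries of its context.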



