Neo: A Learned Query Optimizer

Posted on February 15, 2024 · 1 minute read

Insights

Retrain every N seconds to update the model.

The whole process follows RL; exploitation and exploration are balanced via the plan-search procedure.

Introduction

This paper proposes Neo, which learns (based on RL) to make decisions about join ordering, join operators, and index usage.

Challenges (and how Neo addresses them):

  1. Use a value network to automatically capture intuitive optimization patterns.
  2. Featurize the query.
  3. Overcome reinforcement learning's infamous sample inefficiency by using a technique known as learning from demonstration.
    1. Sample inefficiency arises when an agent requires a large number of interactions (samples) with the environment to learn an effective policy, making the learning process slow and resource-intensive.

System Overview

Two phases:

  • initial phase: expertise is collected from an expert optimizer
    • Collect (query, plan, time)
    • Train a Value Network (TCNN), which predicts the execution time of a partial or complete plan.
      • Features: query-level (join graph) + plan-level (join order) information.
  • Running phase: queries are executed.
    • Given a query, it uses the value model to search for the best plan (with join order selection, join operators, and indexes.)
  • Model retraining: executed plans and their measured latencies are fed back into the training set, and the value network is periodically retrained (sketched below).
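Put together, the two phases plus retraining form a loop roughly like the sketch below. Every name in it (expert_plan, neo_plan, execute, ValueModel) is a stand-in I made up to show the control flow, not Neo's actual interface.

```python
# Rough sketch of Neo's bootstrap -> plan -> execute -> retrain loop.
# All helpers here are illustrative stubs, not the paper's real components.
import random

def expert_plan(query):           # initial phase: ask an expert optimizer (e.g., PostgreSQL)
    return f"expert plan for {query}"

def neo_plan(query, model):       # running phase: value-network-guided plan search
    return f"learned plan for {query}"

def execute(plan):                # measured latency acts as the reward signal
    return random.uniform(0.1, 1.0)

class ValueModel:                 # stands in for the TCNN value network
    def train(self, experience):  # fit on (query, plan, latency) tuples
        pass

workload = ["q1", "q2", "q3"]

# Initial phase: learning from demonstration sidesteps RL's sample inefficiency.
experience = []
for q in workload:
    plan = expert_plan(q)
    experience.append((q, plan, execute(plan)))

model = ValueModel()
model.train(experience)

# Running phase: plan with the model, execute, and retrain periodically
# (the "retrain every N seconds" note above).
for i, q in enumerate(workload * 2):
    plan = neo_plan(q, model)
    experience.append((q, plan, execute(plan)))
    if i % 3 == 2:
        model.train(experience)
```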

Query optimization as a Markov decision process (MDP).

  • State: partial query plan.
  • Action: a step in building a query plan in a bottom-up fashion.
  • Reward: the measured latency of the final, complete plan.
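As a toy illustration of this MDP, the sketch below treats a state as a forest of partial join trees, an action as merging two subtrees with a join operator, and an episode as finished when a single full plan remains. The relation names and operator list are assumptions made for the example.

```python
# Toy MDP view of bottom-up plan construction.
from itertools import combinations

JOIN_OPS = ["hash_join", "merge_join", "nested_loop"]   # illustrative operator set

def initial_state(relations):
    # Each base relation starts as its own leaf subtree.
    return frozenset((r,) for r in relations)

def actions(state):
    # Any two subtrees can be joined with any join operator.
    for left, right in combinations(state, 2):
        for op in JOIN_OPS:
            yield (op, left, right)

def apply(state, action):
    op, left, right = action
    return (state - {left, right}) | {(op, left, right)}   # combine two subtrees

state = initial_state(["A", "B", "C"])
while len(state) > 1:
    action = next(iter(actions(state)))   # a real agent would pick via the value network
    state = apply(state, action)
print(state)   # one complete join tree; its reward is the measured latency
```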

Query encoding.

Plan encoder + Query encoder.

  • The query encoder introduces row vectors, which map each row into a vector, analogous to how words in a sentence are embedded in NLP.
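As a rough sketch of the query-level encoding, the snippet below concatenates a flattened join-graph adjacency matrix with a per-column predicate indicator. The schema, column list, and one-hot predicate bits are simplifications chosen for illustration; the paper's row-vector variant would replace those bits with learned embeddings, trained much like word vectors.

```python
# Sketch of a query-level feature vector: join graph + predicate columns.
import numpy as np

RELATIONS = ["A", "B", "C"]        # illustrative schema
COLUMNS = ["A.x", "B.y", "C.z"]

def encode_query(join_edges, predicate_columns):
    n = len(RELATIONS)
    adj = np.zeros((n, n))
    for r1, r2 in join_edges:       # join graph: which relations are joined together
        i, j = RELATIONS.index(r1), RELATIONS.index(r2)
        adj[i, j] = adj[j, i] = 1.0
    pred = np.array([1.0 if c in predicate_columns else 0.0 for c in COLUMNS])
    return np.concatenate([adj.flatten(), pred])

# e.g. SELECT ... FROM A, B WHERE A.id = B.id AND A.x > 10
print(encode_query(join_edges=[("A", "B")], predicate_columns={"A.x"}))
```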

A tree convolutional neural network (TCNN) predicts the latency of each execution plan.
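To make the tree-convolution idea concrete, here is a minimal single-layer sketch that slides one (parent, left child, right child) filter over a binary plan tree. The dimensions, weights, and node features are random placeholders, not anything from the paper.

```python
# One tree-convolution layer over a binary plan tree (toy dimensions).
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_OUT = 4, 8
W = rng.normal(size=(3 * D_IN, D_OUT))    # one filter over [parent; left; right]

def tree_conv(node):
    """node = (features, left_subtree, right_subtree); children may be None for leaves."""
    feat, left, right = node
    zero = np.zeros(D_IN)
    l_feat = left[0] if left is not None else zero
    r_feat = right[0] if right is not None else zero
    out = np.maximum(0, np.concatenate([feat, l_feat, r_feat]) @ W)   # ReLU
    new_left = tree_conv(left) if left is not None else None
    new_right = tree_conv(right) if right is not None else None
    return (out, new_left, new_right)

leaf = lambda: (rng.normal(size=D_IN), None, None)
plan = (rng.normal(size=D_IN), (rng.normal(size=D_IN), leaf(), leaf()), leaf())
root_repr = tree_conv(plan)[0]            # pooling + an MLP would map this to a latency estimate
print(root_repr.shape)
```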

A min-heap drives the search process (best-first search), expanding partial plans all the way to a full query plan.
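A small sketch of that best-first search follows, with a toy cost function standing in for the value network's latency prediction; the join expansion mirrors the MDP sketch above.

```python
# Best-first plan search driven by a min-heap of (predicted cost, partial plan).
import heapq
from itertools import combinations

def predicted_cost(state):
    # Stand-in for the TCNN value network: here, just prefer fewer remaining subtrees.
    return len(state)

def children(state):
    for left, right in combinations(sorted(state, key=str), 2):
        yield (state - {left, right}) | {("join", left, right)}

def best_first_search(relations):
    start = frozenset((r,) for r in relations)
    heap = [(predicted_cost(start), 0, start)]
    counter = 1                              # tie-breaker so heapq never compares plans
    while heap:
        _, _, state = heapq.heappop(heap)
        if len(state) == 1:                  # a single tree left: the plan is complete
            return next(iter(state))
        for child in children(state):
            heapq.heappush(heap, (predicted_cost(child), counter, child))
            counter += 1

print(best_first_search(["A", "B", "C"]))
```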

Experiments

Evaluated on both synthetic and real-world datasets:

JOB, TPC-H, Corp




