Neo: A Learned Query Optimizer
1 minute read ∼ Filed in: A paper note
Insights
- Retrain the model every N seconds to keep it up to date.
- The whole process follows RL; exploitation and exploration are balanced via the plan search procedure.
Introduction
This paper proposes Neo, which learns (via RL) to make decisions about join ordering, join operators, and indexes.
Challenges, and how Neo addresses them:
- Use a value network to automatically capture intuitive patterns.
- Featurize the query.
- Overcome reinforcement learning's infamous sample inefficiency with a technique known as learning from demonstration (see the sketch after this list).
- Sample inefficiency arises when an agent requires a large number of interactions (samples) with the environment to learn an effective policy, making the learning process slow and resource-intensive.
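A minimal sketch of the learning-from-demonstration step, assuming a PyTorch-style value network and a pre-collected list of demonstrations, each a (feature vector, observed latency) pair gathered from an expert optimizer such as PostgreSQL; the architecture, dimensions, and function names are illustrative, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

# Hypothetical value network: maps an encoded (query, partial plan) feature
# vector to a predicted latency. The layer sizes are arbitrary.
class ValueNet(nn.Module):
    def __init__(self, feat_dim):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x):
        return self.mlp(x)

def pretrain_from_demonstrations(model, demos, epochs=10, lr=1e-3):
    """demos: list of (features, latency) pairs, where features is a
    FloatTensor of shape [feat_dim] and latency a FloatTensor of shape [1],
    collected by executing an expert optimizer's plans."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for features, latency in demos:
            loss = loss_fn(model(features), latency)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```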
System Overview
Two phases:
- Initial phase: expertise is collected from an expert optimizer.
  - Collect (query, plan, latency) triples.
  - Train a value network (a TCNN), which predicts the execution time of a partial or complete plan.
  - Features: query-level (join graph) plus plan-level (join order) information.
- Runtime phase: queries are executed.
  - Given a query, Neo uses the value model to search for the best plan (join order, join operators, and indexes).
  - Model retraining: latencies observed at runtime are fed back to retrain the value network (the loop sketched below ties these steps together).
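A rough sketch of the runtime loop described above, assuming hypothetical callables `search`, `execute`, `encode`, and `retrain` for the individual components; it only shows how plan search, execution, experience collection, and periodic retraining fit together.

```python
def answer_query(query, value_net, experience, search, execute, encode, retrain,
                 retrain_every=100):
    """Runtime phase (illustrative): pick the plan the value model likes best,
    run it, record the observed latency, and periodically retrain the value
    network on the accumulated (encoded plan, latency) experience."""
    plan = search(query, value_net)                    # best-first search over partial plans
    latency = execute(plan)                            # measured latency becomes the training label
    experience.append((encode(query, plan), latency))
    if len(experience) % retrain_every == 0:           # "retrain every N ..." from the insights above
        retrain(value_net, experience)
    return plan, latency
```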
Query optimization is framed as a Markov decision process (MDP):
- State: a partial query plan.
- Action: a step in building the query plan bottom-up, e.g., joining two subplans with a chosen physical operator (a toy sketch follows this list).
- Reward: the latency of the final complete plan.
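A toy sketch of this MDP, with an invented tuple-based plan representation: a state is a set of subplans, an action joins two of them with a physical operator, and the episode ends when a single complete plan remains (whose measured latency would be the reward).

```python
from itertools import combinations

JOIN_OPERATORS = ["hash_join", "merge_join", "nested_loop"]

def initial_state(tables):
    # Start with each base table as its own single-relation subplan.
    return frozenset((t,) for t in tables)

def legal_actions(state):
    # Any unordered pair of subplans combined with any join operator is a candidate action.
    for left, right in combinations(state, 2):
        for op in JOIN_OPERATORS:
            yield (left, right, op)

def apply_action(state, action):
    left, right, op = action
    joined = (op, left, right)
    return frozenset(s for s in state if s not in (left, right)) | {joined}

def is_terminal(state):
    # A complete plan joins all relations into a single tree; its latency is the reward.
    return len(state) == 1

# Build one (arbitrary) complete plan for three tables.
state = initial_state(["A", "B", "C"])
while not is_terminal(state):
    action = next(iter(legal_actions(state)))   # a real agent would pick via the value network
    state = apply_action(state, action)
print(next(iter(state)))
```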
Query encoding
Plan encoder + query encoder.
- The query encoder introduces row vectors, which map each row to a vector, analogous to word embeddings in NLP (a simplified query-level encoding is sketched below).
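A simplified query-level encoding in the spirit of the paper (a flattened join-graph adjacency vector plus a 0/1 indicator of columns with predicates); the tiny schema below is invented for the example, and the row-vector refinement mentioned above, which replaces the 0/1 predicate entries with learned embeddings, is not shown.

```python
import numpy as np

TABLES = ["A", "B", "C"]
COLUMNS = ["A.id", "A.x", "B.id", "B.a_id", "C.b_id"]

def encode_query(join_edges, predicate_columns):
    # Join-graph part: upper triangle of the table adjacency matrix, flattened.
    n = len(TABLES)
    adj = np.zeros((n, n))
    for t1, t2 in join_edges:
        i, j = TABLES.index(t1), TABLES.index(t2)
        adj[min(i, j), max(i, j)] = 1.0
    join_vec = adj[np.triu_indices(n, k=1)]

    # Predicate part: 1 for each column referenced by a selection predicate.
    pred_vec = np.array([1.0 if c in predicate_columns else 0.0 for c in COLUMNS])
    return np.concatenate([join_vec, pred_vec])

# "SELECT ... FROM A, B, C WHERE A.id = B.a_id AND B.id = C.b_id AND A.x > 10"
print(encode_query(join_edges=[("A", "B"), ("B", "C")], predicate_columns={"A.x"}))
```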
A tree convolutional neural network (TCNN) predicts the latency of each execution plan (simplified sketch below).
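A simplified tree-convolution-style latency predictor, assuming each plan node is already featurized as a fixed-length vector and a plan is a nested (features, left, right) tuple; the core idea shown is a shared filter over (node, left child, right child) triples followed by max pooling, but dimensions and details differ from the paper's TCNN.

```python
import torch
import torch.nn as nn

class TreeConvLatency(nn.Module):
    """One tree-convolution layer, dynamic max pooling, and a linear head (illustrative)."""

    def __init__(self, node_dim, hidden=64):
        super().__init__()
        self.node_dim = node_dim
        self.conv = nn.Linear(3 * node_dim, hidden)      # shared filter over (node, left, right)
        self.head = nn.Sequential(nn.ReLU(), nn.Linear(hidden, 1))

    def _convolve(self, node, outputs):
        feats, left, right = node                        # node = (features, left_child, right_child)
        zero = torch.zeros(self.node_dim)
        lfeat = self._convolve(left, outputs) if left is not None else zero
        rfeat = self._convolve(right, outputs) if right is not None else zero
        outputs.append(torch.relu(self.conv(torch.cat([feats, lfeat, rfeat]))))
        return feats

    def forward(self, plan):
        outputs = []
        self._convolve(plan, outputs)
        pooled = torch.stack(outputs).max(dim=0).values  # pool over all plan nodes
        return self.head(pooled)                         # predicted latency (scalar)

# Toy usage: a plan tree whose nodes carry 8-dimensional feature vectors.
leaf = lambda: (torch.rand(8), None, None)
plan = (torch.rand(8), leaf(), (torch.rand(8), leaf(), leaf()))
print(TreeConvLatency(node_dim=8)(plan))
```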
A min-heap drives the search process, from partial plans all the way to a complete query plan (see the best-first search sketch below).
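A minimal best-first search driven by a min-heap, reusing `JOIN_OPERATORS`, `initial_state`, `legal_actions`, `apply_action`, and `is_terminal` from the MDP sketch above; `predicted_latency` is a toy heuristic standing in for the value network's estimate of the best complete plan reachable from a partial plan.

```python
import heapq
import itertools

def predicted_latency(state):
    # Toy stand-in for the value network: prefer states with fewer remaining
    # subplans and fewer nested-loop joins.
    def count_nested_loops(plan):
        if isinstance(plan, tuple) and len(plan) == 3 and plan[0] in JOIN_OPERATORS:
            return (plan[0] == "nested_loop") + count_nested_loops(plan[1]) + count_nested_loops(plan[2])
        return 0
    return len(state) + sum(count_nested_loops(p) for p in state)

def best_first_search(tables):
    counter = itertools.count()              # tie-breaker so heapq never compares states
    start = initial_state(tables)
    heap = [(predicted_latency(start), next(counter), start)]
    while heap:
        _, _, state = heapq.heappop(heap)    # always expand the most promising partial plan
        if is_terminal(state):
            return next(iter(state))         # first complete plan popped wins
        for action in legal_actions(state):
            child = apply_action(state, action)
            heapq.heappush(heap, (predicted_latency(child), next(counter), child))

print(best_first_search(["A", "B", "C"]))
```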
Exps
Both synthetic and real-world datasets: JOB, TPC-H, and Corp.