Database Meets AI survey. AI Meets Database: AI4DB and DB4AI
Introduction
The paper introduces the problems and open questions of DB4AI and AI4DB.
DB4AI
Problems with keeping the database and AI separate:
- Requires engineering skills to define the complete execution logic.
- Loading and exporting data is costly.
- SQL lacks some complex processing patterns.
Declarative language model
Extends the SQL model to support AI models.
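As a rough illustration of calling a model from SQL, the sketch below uses Python's built-in `sqlite3` module to register a model as a scalar SQL function. This is not the syntax proposed in the paper. The table `houses`, the function name `predict_price`, and the model weights are all made up for the example.

```python
import sqlite3

# Hypothetical "trained" model: price = 50 + 3 * area (weights are made up).
WEIGHT, BIAS = 3.0, 50.0

def predict_price(area):
    """Toy linear model standing in for a real trained model."""
    return BIAS + WEIGHT * area

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE houses (id INTEGER PRIMARY KEY, area REAL)")
conn.executemany(
    "INSERT INTO houses (id, area) VALUES (?, ?)",
    [(1, 10.0), (2, 20.0), (3, 40.0)],
)

# Register the model as a SQL function so inference runs inside the query.
conn.create_function("predict_price", 1, predict_price)

rows = conn.execute(
    "SELECT id, predict_price(area) AS price FROM houses "
    "WHERE predict_price(area) > 100 ORDER BY id"
).fetchall()
print(rows)  # [(2, 110.0), (3, 170.0)]
```

The data never leaves the database connection, but the engine treats `predict_price` as a black box it cannot optimize, which is the UDF limitation listed under Challenges.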
Data Governance
It improves data quality by discovering, cleaning, integrating, and labeling the data.
Model Training
Train the model using in-database optimizations including model storage, model update and parallel training.
Targets
- Feature Selection:
- Model Selection: Train candidate models in parallel to increase throughput.
- Model Management: Design a model management system to track, store, and search ML models.
- Hardware acceleration
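To make the model management target concrete, here is a minimal sketch of a table that tracks, stores, and searches model versions, again using `sqlite3`. The schema, the helper names `save_model` and `load_best`, and the example metrics are assumptions for illustration, not a design taken from the paper.

```python
import json
import pickle
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE models ("
    " name TEXT, version INTEGER, metrics TEXT, payload BLOB,"
    " PRIMARY KEY (name, version))"
)

def save_model(name, model, metrics):
    """Store a new version of a model together with its metrics."""
    (latest,) = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM models WHERE name = ?", (name,)
    ).fetchone()
    version = latest + 1
    conn.execute(
        "INSERT INTO models VALUES (?, ?, ?, ?)",
        (name, version, json.dumps(metrics), pickle.dumps(model)),
    )
    return version

def load_best(name, metric):
    """Search the stored versions and return the one with the highest metric."""
    rows = conn.execute(
        "SELECT version, metrics, payload FROM models WHERE name = ?", (name,)
    ).fetchall()
    version, _, payload = max(rows, key=lambda r: json.loads(r[1])[metric])
    # Only unpickle payloads you wrote yourself; pickle is unsafe on untrusted data.
    return version, pickle.loads(payload)

save_model("churn", {"weights": [0.1, 0.2]}, {"accuracy": 0.81})
save_model("churn", {"weights": [0.3, 0.1]}, {"accuracy": 0.86})
best_version, best_model = load_best("churn", "accuracy")
print(best_version, best_model)  # 2 {'weights': [0.3, 0.1]}
```

Storing the metrics as JSON keeps the sketch short. A real system would more likely use separate columns so the search can be pushed into SQL.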
Challenges
- In-database training:
  - How to store a model in the database so that multiple tenants can train and use it securely and privately.
  - How to update the model when the data is updated.
  - How to train in parallel.
- Training acceleration:
  - How to accelerate training with feature selection and sample selection (picking the important features and samples).
- AI optimizer:
  - UDFs are not effectively optimized; the model needs to be embedded inside the database.
  - This requires a logical-to-physical mapping (logic => physical).
- Fault-tolerant learning:
  - Use the error-tolerance techniques of database systems to improve robustness.
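One way to think about the "update the model when the data is updated" challenge is to keep sufficient statistics next to the data, so newly inserted rows can be folded in without rescanning old ones. The sketch below does this for a one-variable least-squares fit. The class name and the data are hypothetical, and this is only one possible technique, not the method described in the paper.

```python
class IncrementalLinearRegression:
    """Fits y = slope * x + intercept from running sums only."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def update(self, rows):
        """Fold newly inserted (x, y) rows into the sufficient statistics."""
        for x, y in rows:
            self.n += 1
            self.sx += x
            self.sy += y
            self.sxx += x * x
            self.sxy += x * y

    def coefficients(self):
        """Closed-form least squares computed from the accumulated sums."""
        denom = self.n * self.sxx - self.sx ** 2
        if denom == 0:
            raise ValueError("need at least two distinct x values")
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return slope, intercept

model = IncrementalLinearRegression()
model.update([(1.0, 3.0), (2.0, 5.0)])  # initial rows, on the line y = 2x + 1
first = model.coefficients()
model.update([(3.0, 7.0), (4.0, 9.0)])  # rows inserted later
second = model.coefficients()
print(first, second)
```

This sketch only handles inserts. Deletes could in principle be supported by subtracting the same terms, but most models have no such closed-form update, which is part of why this remains a challenge.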
Model Inference
Infer the results using a trained model with in-database optimizations such as operator support, operator selection, and execution acceleration.
- Operator support: How to define operators that meet the optimization requirements.
- Operator selection:
- Execution acceleration: in-memory computation & distributed methods.
Open questions
- How to enhance AI training inside the database.
- How to reduce errors with error-tolerant techniques.
- How to build a database-like AI optimizer.