Hybrid In Database Inference for Declarative Information Extraction
1 minute read ∼ Filed in : A paper noteSummary
This paper proposes a way to integrate the MCMC and Gibbs sampling algorithm using SQL language. Further, it analyzes those models with some characterizations and proposes a rule to choose the different algorithms for different documents in a single query.
Introduction
Background & Motivation
In-database inference offers significant speed-up.
Gap
Existing work can only deal with simple CPF models, not non-linear ones such as skip-chain CRF models.
Goal
This paper tries to explore the in-database implementation of a few inference algorithms, such as MCMC, Gibbs, and MCMC-MH.
Details
Implement various algorithms and identify a set of parameters and rules for choosing inference algorithms.
MCMC algorithm:
Using some UDF can implement the MCMC. But it iteratively calls UDF a million times, leading to a slow process.
The paper tries to re-implement the MCMC such that one query can finish all iterations.
Model Selection based on model.
The paper model the characterizations of a few algorithms and then propose a set of rules to choose from among various inference algorithms.
Hybrid Inference
Then the paper shows the inference algorithm choice is not only model-dependent but also query and text-dependent. Thus it uses a hybrid approach to choose the different algorithms for different documents in a single query.