Learned Cardinality Estimation for Similarity Queries

Posted on December 11, 2023

Introduction

Objective: use a DNN to estimate the cardinality of similarity queries. Two query types are considered:

  • similarity search: given a dataset D, a query q, and a distance threshold τ, provide an estimate, card, of the number of objects in D whose distance to q is not greater than τ.
  • similarity join: given a set of queries Q, provide an estimate, card, of the total number of pairs (q, p), with q in Q and p in D, whose distance is not greater than τ.
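To make the two definitions concrete, here is a minimal brute-force sketch of the exact quantities that the learned estimator is meant to approximate. It assumes Euclidean distance and points stored as tuples. The function names and the threshold name `tau` are illustrative choices, not taken from the paper.

```python
import math

def search_cardinality(data, q, tau):
    """Exact number of objects in `data` whose distance to `q` is <= tau."""
    # Euclidean distance is an assumption; any distance function could be used.
    return sum(1 for p in data if math.dist(q, p) <= tau)

def join_cardinality(data, queries, tau):
    """Exact number of pairs (q, p), q in `queries` and p in `data`, with distance <= tau."""
    # A join is the sum of the search cardinalities of its individual queries.
    return sum(search_cardinality(data, q, tau) for q in queries)

# Toy dataset D and query set Q (made up for illustration).
D = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
Q = [(0.0, 0.0), (3.0, 3.0)]

print(search_cardinality(D, (0.0, 0.0), 1.0))  # 2
print(join_cardinality(D, Q, 2.0))             # 4
```

The brute-force scan costs one distance computation per object for every query. That cost is the reason to want a learned estimate instead.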

Problems/Insights:

  • a single large model has difficulty capturing the distribution of distances between the data and an arbitrary query.
  • designing small models that avoid this problem is challenging.

Solutions: The paper improves accuracy and reduces the amount of training data needed, using two techniques:

  • query segmentation: divides a query into multiple segments and trains a module E1 to produce an embedding zq of the query representation xq.
  • data segmentation: divides the data D into n segments and trains one model per segment. Each model has three DNNs, one each for embedding the query, the distance, and the segment of D.
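The structure of the two segmentation steps can be sketched as follows. This is only an illustration under stated assumptions. Query segments are taken to be contiguous slices of the query vector. The data is partitioned round-robin, which is an arbitrary choice. Each per-segment "model" is an exact counter standing in for the three-DNN model described above. The global estimate is taken to be the sum of the per-segment estimates, which holds exactly whenever the segments partition D. All function names are hypothetical.

```python
import math

def split_query(x_q, num_segments):
    """Split a query vector into `num_segments` contiguous, near-equal slices."""
    base, extra = divmod(len(x_q), num_segments)
    segments, start = [], 0
    for i in range(num_segments):
        # The first `extra` segments get one additional element.
        end = start + base + (1 if i < extra else 0)
        segments.append(x_q[start:end])
        start = end
    return segments

def split_data(data, n):
    """Partition `data` into n disjoint segments (round-robin, an arbitrary choice)."""
    return [data[i::n] for i in range(n)]

def make_local_estimator(segment):
    """Stand-in for a per-segment learned model: it counts exactly instead of predicting."""
    def estimate(q, tau):
        return sum(1 for p in segment if math.dist(q, p) <= tau)
    return estimate

def global_estimate(local_estimators, q, tau):
    """Combine per-segment estimates by summation (assumed combination rule)."""
    return sum(est(q, tau) for est in local_estimators)

D = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0), (1.0, 1.0), (2.0, 2.0)]
estimators = [make_local_estimator(seg) for seg in split_data(D, 2)]

print(split_query([1, 2, 3, 4], 3))                  # [[1, 2], [3], [4]]
print(global_estimate(estimators, (0.0, 0.0), 2.0))  # 4
```

Because the segments are disjoint and cover D, summing per-segment counts reproduces the count over all of D. Each local model therefore only has to capture the distance distribution of its own segment.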



