Posted on February 22, 2025   1 minute read

It enhances the TPC-DS benchmark with complex data distribution and challenging yet semantically meaningful query templates

the joins in the TPC-DS benchmark are mostly between a primary key and a foreign key or between a primary key and another primary key.

Needs

  1. distinct query instances
  2. the distribution of query templates
  3. more evaluation metrics

the paper preserves the realistic database schema from TPC-DS, and it enriches the TPC-DS benchmark in a number of ways:

  • more skews and correlations to individual columns and multiple columns within a table and across table
  • new and semantically meaningful query templates (non primary-key-foreign-key)
  • fne-grained data slicing. This also increases the total number of distinct query instances

Thus, the DSB benchmark enhances the data generation of the TPC-DS benchmark with additional skews and correlations





END OF POST




Tags Cloud


Categories Cloud




It's the niceties that make the difference fate gives us the hand, and we play the cards.