DSB enhances the TPC-DS benchmark with complex data distributions and challenging yet semantically meaningful query templates.
The joins in the TPC-DS benchmark are mostly between a primary key and a foreign key, or between two primary keys.
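To make the distinction between join kinds concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The tables `customer`, `store`, and `sales` and their columns are hypothetical toy names, not the actual TPC-DS schema; the point is only to contrast a key-based join with a join on a non-key column.

```python
import sqlite3

# Toy schema (hypothetical, NOT the real TPC-DS tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (c_id INTEGER PRIMARY KEY, c_zip TEXT);
CREATE TABLE store    (s_id INTEGER PRIMARY KEY, s_zip TEXT);
CREATE TABLE sales (
    sale_id INTEGER PRIMARY KEY,
    c_id INTEGER REFERENCES customer(c_id),
    s_id INTEGER REFERENCES store(s_id)
);
INSERT INTO customer VALUES (1, '10001'), (2, '10001'), (3, '94105');
INSERT INTO store    VALUES (1, '10001'), (2, '60601');
INSERT INTO sales    VALUES (1, 1, 1), (2, 2, 2), (3, 3, 1);
""")

# Primary-key-foreign-key join: every sales row matches exactly one customer.
pk_fk_rows = conn.execute(
    "SELECT COUNT(*) FROM sales JOIN customer ON sales.c_id = customer.c_id"
).fetchone()[0]

# Non-key join: customers paired with stores that share a zip code.
# Neither c_zip nor s_zip is a key, so the output size depends on how
# the values are distributed in both tables.
non_key_rows = conn.execute(
    "SELECT COUNT(*) FROM customer JOIN store ON customer.c_zip = store.s_zip"
).fetchone()[0]

print(pk_fk_rows, non_key_rows)
```

With a key-based join the result size is bounded by the foreign-key side, whereas the size of the non-key join depends on the value distributions, which is one reason such joins tend to be harder to estimate.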
Needs
- distinct query instances
- the distribution of query templates
- more evaluation metrics
The paper preserves the realistic database schema from TPC-DS and enriches the benchmark in a number of ways:
- more skew and correlation in individual columns, across multiple columns within a table, and across tables
- new and semantically meaningful query templates (including non-primary-key-foreign-key joins)
- fine-grained data slicing, which also increases the total number of distinct query instances
Thus, the DSB benchmark enhances the data generation of the TPC-DS benchmark with additional skews and correlations.
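As a rough illustration of what skew and correlation mean for generated data, the following sketch draws one column from a Zipf-like distribution and makes a second column depend on the first. This is not DSB's actual generator: the function names, the column meanings (`category`, `brand`), the Zipf-like weights, and the `correlation` parameter are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def zipf_weights(n, s=1.0):
    """Unnormalized Zipf-like weights: rank r gets weight 1 / r**s."""
    return [1.0 / (rank ** s) for rank in range(1, n + 1)]


def generate_rows(num_rows, num_categories=10, num_brands=10,
                  correlation=0.8, seed=0):
    """Generate (category, brand) pairs with skew and correlation.

    category is skewed: low-numbered categories are drawn more often.
    brand is correlated: with probability `correlation` it is derived
    from category, otherwise it is drawn uniformly at random.
    """
    rng = random.Random(seed)
    categories = list(range(num_categories))
    weights = zipf_weights(num_categories)
    rows = []
    for _ in range(num_rows):
        category = rng.choices(categories, weights=weights, k=1)[0]
        if rng.random() < correlation:
            brand = category % num_brands  # determined by category
        else:
            brand = rng.randrange(num_brands)  # independent draw
        rows.append((category, brand))
    return rows


rows = generate_rows(10_000)
category_counts = Counter(category for category, _ in rows)
matching = sum(1 for category, brand in rows if brand == category % 10)

print(category_counts.most_common(3))
print(f"fraction of rows where brand follows category: {matching / len(rows):.2f}")
```

An optimizer that assumes uniform, independent columns would likely misestimate a predicate on both `category` and `brand` here, which is the kind of difficulty that added skew and correlation are meant to expose.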