Cohort Query Processing
1 minute read ∼ Filed in: A paper note
Introduction
Motivation
Cohort analysis is a data analysis technique for assessing the effects of aging on human behavior in a changing society. However, it is hard to implement in a traditional DBMS.
Contributions
- Define the cohort analytics problem.
- Introduce the extended relation to model user data for cohort analytics, and introduce three new operators for it.
  - Two of them extract a subset of activities.
  - The last one aggregates over arbitrary attribute combinations.
- Build a columnar cohort query engine with many optimizations.
- Design a benchmark study.
SQL-based cohort analysis
The activity table has a user attribute (Au), a timestamp attribute (At), and an action attribute (Ae), plus many other attributes. It has a primary key constraint on (Au, At, Ae).
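A minimal sketch of such an activity table, using Python's built-in sqlite3 module. The table name `activity`, the column names `user_id`, `ts`, `action_name`, and the extra `country` attribute are hypothetical and only stand in for Au, At, Ae, and "many attributes"; the helper names are also made up for illustration.

```python
import sqlite3


def create_activity_table(conn):
    """Create an activity table keyed on (user, timestamp, action)."""
    conn.execute(
        """
        CREATE TABLE activity (
            user_id     TEXT NOT NULL,  -- Au (hypothetical column name)
            ts          TEXT NOT NULL,  -- At (hypothetical column name)
            action_name TEXT NOT NULL,  -- Ae (hypothetical column name)
            country     TEXT,           -- one example of the other attributes
            PRIMARY KEY (user_id, ts, action_name)
        )
        """
    )


def try_insert(conn, row):
    """Insert one activity tuple; return False if it violates the primary key."""
    try:
        conn.execute("INSERT INTO activity VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_activity_table(conn)
    print(try_insert(conn, ("u1", "2013-05-19 10:00", "launch", "AU")))
    # Same user and time, different action: allowed by the composite key.
    print(try_insert(conn, ("u1", "2013-05-19 10:00", "shop", "AU")))
    # Exact repeat of the first tuple: rejected.
    print(try_insert(conn, ("u1", "2013-05-19 10:00", "launch", "AU")))
```

The composite key means a user can perform several different actions at the same timestamp, but cannot record the same action twice at the same time.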
Basic concepts
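birth action: The action e chosen to define when a user is "born".
birth time: The first time a user performs the birth action e.
age: The time elapsed since a user's birth time, measured in a time unit such as a day, week, or month.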
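The three concepts can be illustrated with a small Python sketch. It assumes activities are (user, timestamp, action) tuples, and that age is the whole number of time units elapsed since the user's birth time; the function names, the `unit_days` parameter, and the sample data are hypothetical.

```python
from datetime import datetime


def birth_times(activities, birth_action):
    """Map each user to the first time they performed the birth action."""
    births = {}
    for user, ts, action in activities:
        if action != birth_action:
            continue
        if user not in births or ts < births[user]:
            births[user] = ts
    return births


def age(ts, birth, unit_days=1):
    """Whole time units elapsed since birth; None for activity before birth.

    unit_days=1 gives age in days, unit_days=7 gives age in weeks.
    """
    if ts < birth:
        return None
    return (ts - birth).days // unit_days


if __name__ == "__main__":
    activities = [
        ("u1", datetime(2013, 5, 19), "launch"),
        ("u1", datetime(2013, 5, 20), "shop"),
        ("u1", datetime(2013, 5, 21), "shop"),
        ("u2", datetime(2013, 5, 20), "launch"),
        ("u2", datetime(2013, 5, 27), "shop"),
    ]
    births = birth_times(activities, birth_action="shop")
    for user, ts, action in activities:
        print(user, action, age(ts, births[user]))
```

Choosing a different birth action (for example "launch" instead of "shop") changes every user's birth time, and therefore the age attached to each activity.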