ColumnML: Column-Store Machine Learning with On-The-Fly Data Transformation

Posted on May 24, 2023   1 minute read

Takeaways

  1. Key terms: linear throughput scalability and on-the-fly data transformation.

  2. How can we integrate ML into a column-store DBMS without disrupting either DBMS efficiency or ML quality and performance?

System

ColumnML integrates ML into a column-store DBMS without disrupting either DBMS efficiency or ML quality and performance.

  1. The paper proposes partitioned stochastic coordinate descent (pSCD) to achieve cache-efficient training on a column store.
    1. Training on a batch of rows by reading each column first requires a cache working set of roughly Batch × Feature values.
    2. pSCD avoids this: coordinate-descent-based algorithms access the samples one feature at a time, which maps naturally onto column-wise access.
  2. The paper uses an FPGA for both the data preprocessing and the ML computations.
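
To make the column-wise access pattern concrete, here is a minimal Python sketch of one epoch of partitioned coordinate descent for a least-squares objective. It is not the paper's implementation. The function name `pscd_epoch`, the `partition_size` parameter, the squared loss, and the plain averaging of per-partition models at the end of the epoch are all assumptions made for illustration. The point is that each update touches one column slice plus a residual vector that is only as long as one partition, so both can stay in cache.

```python
import random


def pscd_epoch(columns, labels, model, partition_size, rng):
    """One epoch of a partitioned coordinate-descent sketch (least squares).

    columns[j][i] holds feature j of sample i, i.e. a column-store layout.
    Returns a new model; the input model is not modified.
    """
    n_samples = len(labels)
    n_features = len(columns)
    local_models = []
    for start in range(0, n_samples, partition_size):
        stop = min(start + partition_size, n_samples)
        w = list(model)  # partition-local copy of the model

        # Residual y - Xw for this partition only, built column by column.
        residual = list(labels[start:stop])
        for j in range(n_features):
            col = columns[j][start:stop]
            residual = [r - w[j] * x for x, r in zip(col, residual)]

        order = list(range(n_features))
        rng.shuffle(order)  # stochastic order of coordinate updates
        for j in order:
            col = columns[j][start:stop]  # contiguous column-wise read
            norm = sum(x * x for x in col)
            if norm == 0.0:
                continue  # feature is all zeros in this partition
            # Exact minimizer of the partition's squared loss along w[j].
            delta = sum(x * r for x, r in zip(col, residual)) / norm
            w[j] += delta
            residual = [r - delta * x for x, r in zip(col, residual)]
        local_models.append(w)

    # Assumption of this sketch: merge partitions by plain averaging.
    return [sum(ws) / len(local_models) for ws in zip(*local_models)]
```

For example, calling `pscd_epoch(columns, labels, [0.0] * len(columns), 1024, random.Random(0))` repeatedly, feeding each result back in as `model`, runs several epochs. A smaller `partition_size` shrinks the residual that must stay cache-resident, at the cost of merging more partition-local models.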



