Overview of SciDB

Posted on May 28, 2023   1 minute read ∼ Filed in  : 

SciDB is built to support an array data model and query language with facilities that allow users to extend our system with new scalar data types and array operators.

Traditional RDBMS may not suitable for array data model:

  • There is not need to store order information
  • It requries more flexible/extensible data management ability.

Array data model

  • n-dimensional arrays with each cell being either tuple or value (numerical or (currently) fixed length string data types)
  • Can be nexted, a cell can contains anohter array data.

Features

SciDB is Highly extensible with UDT and UDF.

Architecture

Shard-nothing architecture with a centralized node storing nodes information, data distribution, and user-defined extensions.

Storage Manager

logiclal metrics => physical storage: data -> partitions -> chunks. Chunk (64MB) are the physical unit of I/O, processing, and inter-node communications.

Chunks may have overlap, which is to facilicate the operations requiring surrounding regions.

Chunks containing data values for a particular cell of the logical array are co-located on the same node. It uses R-Trees to quickly identify which chunks contain data relevant to a particular subsample of the query array, , thereby facilitating fragment elimination and improving query planning.

Each chunk is compressed individually when writting to disk.

image-20230614162431747





END OF POST




Tags Cloud


Categories Cloud




It's the niceties that make the difference fate gives us the hand, and we play the cards.