[reading review] The FastLanes Compression Layout: Decoding >100 Billion Integers per Second with Scalar Code

In this paper, the authors propose FastLanes, a new data layout that speeds up decoding. FastLanes is designed to be portable across heterogeneous ISAs and is built around a virtual 1024-bit register. Because relational algebra operates on sets, where tuple order does not matter, the authors can reorder the data so that decoding vectorizes well without hurting performance. Another notable result is that modern compilers can auto-vectorize the scalar FastLanes code, so no explicit SIMD intrinsics are needed.
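To make the idea concrete, here is a minimal sketch of an interleaved bit-packing scheme in plain scalar C++. It is my own simplified illustration, not the paper's exact layout or code: the names `pack`, `unpack`, `round_trips`, `kLanes` and `kRows` are hypothetical, and I assume 32-bit lanes, so the virtual 1024-bit register holds 32 lanes and one block holds 1024 values. Value `i` goes to lane `i % 32`, which means the inner loop touches 32 independent lanes with the same shift and mask. That is the kind of loop a compiler can usually auto-vectorize.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: 32-bit lanes, so a virtual 1024-bit register has 32 lanes.
constexpr std::size_t kLanes = 32;
constexpr std::size_t kRows = 32;  // values per lane; 32 * 32 = 1024 values

// Low `width` bits set, for width in 1..32.
inline std::uint64_t low_mask(unsigned width) {
    return width == 32 ? 0xFFFFFFFFull : ((1ull << width) - 1);
}

// Pack 1024 values of `width` bits each into width * kLanes words.
// Word w of lane l is stored at out[w * kLanes + l] (interleaved).
void pack(const std::uint32_t* in, std::uint32_t* out, unsigned width) {
    for (std::size_t i = 0; i < width * kLanes; ++i) out[i] = 0;
    const std::uint64_t mask = low_mask(width);
    for (std::size_t row = 0; row < kRows; ++row) {
        const std::size_t bit = row * width;  // bit offset inside each lane
        const std::size_t word = bit / 32;
        const unsigned shift = static_cast<unsigned>(bit % 32);
        const bool spills = shift + width > 32;  // value crosses a word boundary
        for (std::size_t lane = 0; lane < kLanes; ++lane) {
            const std::uint64_t v = (in[row * kLanes + lane] & mask) << shift;
            out[word * kLanes + lane] |= static_cast<std::uint32_t>(v);
            if (spills) {
                out[(word + 1) * kLanes + lane] |= static_cast<std::uint32_t>(v >> 32);
            }
        }
    }
}

// Inverse of pack. The inner loop has no cross-lane dependencies and uses
// the same shift and mask for every lane.
void unpack(const std::uint32_t* in, std::uint32_t* out, unsigned width) {
    const std::uint64_t mask = low_mask(width);
    for (std::size_t row = 0; row < kRows; ++row) {
        const std::size_t bit = row * width;
        const std::size_t word = bit / 32;
        const unsigned shift = static_cast<unsigned>(bit % 32);
        const bool spills = shift + width > 32;
        for (std::size_t lane = 0; lane < kLanes; ++lane) {
            const std::uint64_t lo = in[word * kLanes + lane];
            const std::uint64_t hi = spills ? in[(word + 1) * kLanes + lane] : 0;
            out[row * kLanes + lane] =
                static_cast<std::uint32_t>(((lo | (hi << 32)) >> shift) & mask);
        }
    }
}

// Helper for checking that unpack(pack(x)) == x for a given bit width.
bool round_trips(unsigned width) {
    std::vector<std::uint32_t> input(kLanes * kRows);
    std::vector<std::uint32_t> packed(width * kLanes);
    std::vector<std::uint32_t> output(kLanes * kRows);
    for (std::size_t i = 0; i < input.size(); ++i) {
        // Arbitrary test data, truncated to `width` bits.
        input[i] = static_cast<std::uint32_t>(
            (static_cast<std::uint32_t>(i) * 2654435761u) & low_mask(width));
    }
    pack(input.data(), packed.data(), width);
    unpack(packed.data(), output.data(), width);
    return input == output;
}
```

Calling `round_trips(3)` packs 1024 three-bit values into 96 words and checks that unpacking restores them. Whether the inner loops actually get vectorized depends on the compiler and flags (for example `-O3`), so this is worth confirming in the generated assembly rather than taking for granted.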

  • Strengths: By reordering data and designing around a virtual 1024-bit register, FastLanes can leverage SIMD to accelerate decoding while remaining portable across many existing ISAs.

  • Future work: FastLanes could be integrated into a complete system to evaluate end-to-end performance, and more experiments could be run on GPUs and TPUs.



