Accelerating Lattice Boltzmann Method by Fully Exposing Vectorizable Loops

EasyChair Preprint 2197

15 pages · Date: December 18, 2019

Abstract

The Lattice Boltzmann Method (LBM) plays an important role in CFD applications, so accelerating LBM computation reduces simulation costs for many industries. However, loop-carried dependencies in LBM kernels prevent loop vectorization, and general-purpose compilers therefore miss many vectorization opportunities. This paper proposes a SIMD-aware loop transformation algorithm that fully exposes the vectorizable loops in LBM kernels. The proposed algorithm identifies the most promising vectorizable loops according to a defined dependence table. It then applies appropriate loop transformations and array copying techniques to legalize loop-carried dependencies, so that the compiler automatically vectorizes the identified loops. Experiments carried out on an Intel Xeon Gold 6140 server show that the proposed algorithm significantly raises the ratio of vectorized loops to all loops in LBM kernels. Our algorithm also outperforms an Intel C++ compiler and a polyhedral optimizer, accelerating LBM computation by 147% and 120%, respectively, in average lattice update speed.
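
To make the idea of legalizing a loop-carried dependence by array copying concrete, the following is a minimal sketch in C. It is an illustration only and is not taken from the paper: the 1D update rule, the function names `update_in_place` and `update_with_copy`, and the weight `w` are all hypothetical. The first function updates an array in place and carries an anti-dependence between iterations. The second reads from a private copy, so no iteration depends on another. Whether a given compiler actually vectorizes either loop depends on the compiler and its flags.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-place update (not from the paper).
 * Iteration i reads f[i-1], which the later iteration i-1 overwrites:
 * a loop-carried anti-dependence. The descending order keeps the
 * result correct, but the dependence can make a compiler conservative. */
void update_in_place(double *f, size_t n, double w)
{
    if (n < 2)
        return;
    for (size_t i = n - 1; i >= 1; --i)
        f[i] = f[i - 1] + w * f[i];
}

/* Same result, using array copying: all reads come from a snapshot
 * `old`, all writes go to `f`, and the two never alias, so the loop
 * has no loop-carried dependence. The price is an extra O(n) copy.
 * Returns 0 on success, -1 if the temporary buffer cannot be allocated. */
int update_with_copy(double *restrict f, size_t n, double w)
{
    if (n < 2)
        return 0;

    double *restrict old = malloc(n * sizeof *old);
    if (old == NULL)
        return -1;
    memcpy(old, f, n * sizeof *old);

    for (size_t i = 1; i < n; ++i)
        f[i] = old[i - 1] + w * old[i];

    free(old);
    return 0;
}
```

Calling both functions on identical input arrays should yield identical results; the copy-based version trades extra memory traffic for a dependence-free loop body.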

Keyphrases: Lattice Boltzmann Method, Performance, SIMD, auto-vectorization, loop transformation algorithm

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:2197,
  author    = {Bin Qu and Song Liu and Hailong Huang and Jiajun Yuan and Qian Wang and Weiguo Wu},
  title     = {Accelerating Lattice Boltzmann Method by Fully Exposing Vectorizable Loops},
  howpublished = {EasyChair Preprint 2197},
  year      = {EasyChair, 2019}}