Classification based on Associations (CBA) - a performance analysis

EasyChair Preprint 501
9 pages • Date: September 12, 2018

Abstract

Classification Based on Associations (CBA) has for two decades been the algorithm of choice for researchers as well as practitioners, owing to the simplicity of the rules it produces, the accuracy of its models, and fast model building. Two versions of CBA differing in speed, M1 and M2, were originally proposed by Liu et al. in 1998. While the more complex M2 version was originally reported to be about 50% faster on average, in this article we present benchmarks performed with multiple CBA implementations on the UCI lymph dataset that contest the supremacy of M2: the results show that M1 was faster in most evaluated setups. M2 was faster only when the number of input rules was very small and the number of input instances was large. We hypothesize that the better performance of M1 can be attributed to recent advances in the optimization of vectorized operations and memory structures in scikit-learn and R, which M1 can exploit more effectively because it is better suited to vectorization.

Keyphrases: CBA, Classification, Classification by Associations, association rule, benchmark
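
To make the vectorization hypothesis more concrete, the following is a minimal sketch of an M1-style classifier builder in Python with NumPy. It is not the code benchmarked in the preprint. It follows the general description of M1 by Liu et al. (sort rules by precedence, keep each rule that correctly classifies at least one remaining instance, remove the instances it covers, then cut the rule list at the first point of lowest total error). The function name `m1_build`, the rule tuple layout `(antecedent_items, class, confidence, support)`, and the binary item matrix `X` are assumptions made for illustration. The point of the sketch is that each coverage test is a single boolean-mask operation over all remaining instances.

```python
import numpy as np


def m1_build(X, y, rules):
    """Sketch of an M1-style database-coverage pass (hypothetical helper).

    X     : bool array (n_instances, n_items), True if the instance has the item
    y     : int array (n_instances,), class labels 0..k-1
    rules : list of (antecedent_items, cls, confidence, support) tuples
    Returns (kept_rules, default_class).
    """
    n_classes = int(y.max()) + 1
    # Precedence: higher confidence, then higher support, then earlier rule.
    order = sorted(range(len(rules)),
                   key=lambda i: (-rules[i][2], -rules[i][3], i))

    remaining = np.ones(len(y), dtype=bool)  # instances not yet covered
    rule_errors = 0                          # errors made by selected rules
    selected = []                            # (rule index, default class, total errors)

    for i in order:
        items, cls, _, _ = rules[i]
        # Vectorized coverage test: all antecedent items present, still uncovered.
        covered = remaining & X[:, list(items)].all(axis=1)
        correct = covered & (y == cls)
        if not correct.any():
            continue  # the rule classifies no remaining instance correctly
        remaining &= ~covered
        rule_errors += int(covered.sum() - correct.sum())
        # Default class: majority class among the still-uncovered instances.
        counts = np.bincount(y[remaining], minlength=n_classes)
        default = int(counts.argmax())
        default_errors = int(remaining.sum() - counts[default])
        selected.append((i, default, rule_errors + default_errors))

    if not selected:
        return [], int(np.bincount(y, minlength=n_classes).argmax())

    # Cut the list at the first rule with the lowest total error.
    best = min(range(len(selected)), key=lambda k: selected[k][2])
    kept = [rules[i] for i, _, _ in selected[:best + 1]]
    return kept, selected[best][1]


# Toy data (made up for illustration): 6 instances, 3 items, 2 classes.
X = np.array([[1, 0, 0],
              [1, 0, 0],
              [1, 1, 0],
              [0, 1, 0],
              [0, 1, 1],
              [0, 0, 1]], dtype=bool)
y = np.array([0, 0, 1, 1, 1, 0])
rules = [((1,), 1, 1.0, 0.5),      # item 1 -> class 1
         ((0,), 0, 2 / 3, 1 / 3),  # item 0 -> class 0
         ((2,), 0, 0.5, 1 / 6)]    # item 2 -> class 0

kept, default_class = m1_build(X, y, rules)
```

On this toy input the first rule together with default class 0 already gives zero training errors, so the sketch keeps only that rule. An M2-style builder avoids repeated passes over the data by tracking, per instance, the highest-precedence correct and wrong rules, which involves more per-instance bookkeeping and is harder to express as whole-array operations.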