Optimized Bootstrap Sampling for σ-AQP Error Estimation: A Pilot Study

10 pages • Published: October 4, 2021

Abstract

Approximate query processing (AQP) aims to efficiently provide an approximate answer, close to the exact answer, for a complex query on large datasets, especially big data. It brings substantial benefits to many data science fields where the efficiency of query execution matters more than accuracy. However, assessing the accuracy of an approximate answer from AQP deserves more study. Existing work usually relies on strong dataset assumptions that may not hold for real-world datasets. In this work, we employ bootstrap sampling to assess the estimation errors of AQP for selection queries (called σ-AQP). We implement a prototype system that calculates confidence intervals for the estimated query results. Experimental results demonstrate that the confidence intervals generated by the prototype system cover the ground truth of the query results with high accuracy and low computing cost. In addition, we implement optimization strategies for the bootstrap sampling that significantly improve the overall computing efficiency.

Keyphrases: approximate query processing, bootstrap sampling, error assessment, query estimation

In: Frederick Harris, Rui Wu and Alex Redei (editors). Proceedings of ISCA 30th International Conference on Software Engineering and Data Engineering, vol 77, pages 144-153.
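
To make the idea of a bootstrap confidence interval for a selection query concrete, the following is a minimal sketch, not the prototype system described in the abstract. It assumes a uniform random sample of the table, estimates the number of matching rows by scaling the sample count, and uses the plain percentile bootstrap. The function name `bootstrap_count_ci`, its parameters, and the synthetic data are all hypothetical. Evaluating the predicate once before resampling is a convenience of this sketch and is not claimed to be one of the paper's optimization strategies, which the abstract does not detail.

```python
import random


def bootstrap_count_ci(sample, predicate, population_size,
                       n_resamples=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for a selection count.

    Hypothetical helper: estimates how many rows of the full table satisfy
    `predicate`, given a uniform random `sample` of that table.
    Returns (point_estimate, lower_bound, upper_bound).
    """
    rng = random.Random(seed)
    n = len(sample)
    scale = population_size / n  # each sampled row stands for N/n rows

    # Evaluate the predicate once per sampled row; resamples reuse the flags.
    flags = [1 if predicate(row) else 0 for row in sample]
    point = sum(flags) * scale

    # Resample with replacement and recompute the scaled count each time.
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(flags, k=n)
        estimates.append(sum(resample) * scale)
    estimates.sort()

    # Take the empirical alpha/2 and 1 - alpha/2 quantiles as the bounds.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return point, estimates[lo_idx], estimates[hi_idx]


if __name__ == "__main__":
    # Synthetic table (an assumption for illustration only).
    data_rng = random.Random(42)
    table = [data_rng.randint(0, 99) for _ in range(100_000)]
    sample = data_rng.sample(table, 1_000)

    def is_selected(value):
        return value < 30

    point, lower, upper = bootstrap_count_ci(sample, is_selected, len(table))
    truth = sum(1 for value in table if is_selected(value))
    print(f"estimate={point:.0f}  95% CI=[{lower:.0f}, {upper:.0f}]  truth={truth}")
```

The interval width shrinks as the sample grows, and the cost is dominated by the `n_resamples` passes over the sample, which is the part an optimized implementation would presumably target.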