Comparing Different Pre-processing Techniques and Machine Learning Models to Predict PM10 and PM2.5 Concentration in Malaysia

Lookup NU author(s): Dr Jie Zhang


Full text for this publication is not currently held within this repository. Alternative links are provided below where available.


A recent report from the World Health Organization (WHO) attributed a total of 3.7 million deaths to outdoor ambient air pollution. Particulate matter (PM) is widely considered the major contributor to air pollution; it poses serious health risks to humans, mostly associated with cardiovascular and respiratory disease. Although various studies have been carried out to predict PM10 and PM2.5 concentrations, only a handful of papers have focused on the pre-processing aspect of the prediction. In this study, the importance of data pre-processing steps was assessed for PM10 and PM2.5 prediction in Malaysia. These steps cover data cleaning (handling missing values), data transformation (standardization, normalization, robust scaling, and min–max scaling), feature selection (univariate linear regression and mutual information regression), and dimensionality reduction (principal component analysis, PCA). Four machine learning models were used for the predictions: multiple linear regression (MLR), random forest regression (RFR), extra tree regression (ETR), and decision tree regression with AdaBoost (BTR). The results showed that the best PM10 and PM2.5 prediction accuracy (R2 = 0.97 and 0.92, respectively) was achieved by the BTR model coupled with normalization and PCA, with missing data imputed using median values.
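As a rough illustration of the data cleaning and transformation steps named in the abstract, the sketch below applies median imputation followed by standardization, min–max scaling, and robust scaling to a single column of readings. It uses only the Python standard library and the textbook definitions of each transform; the chapter's actual implementation is not described on this page, so the function names, the use of the interquartile range for robust scaling, and the sample values are all assumptions made for illustration.

```python
import math
import statistics


def is_missing(value):
    """Treat None and NaN as missing readings."""
    return value is None or math.isnan(value)


def impute_median(column):
    """Replace missing entries with the median of the observed values."""
    observed = [v for v in column if not is_missing(v)]
    fill = statistics.median(observed)
    return [fill if is_missing(v) else v for v in column]


def standardize(column):
    """Rescale to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return [(v - mean) / std for v in column]


def min_max_scale(column):
    """Rescale linearly so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]


def robust_scale(column):
    """Centre on the median and divide by the interquartile range (assumed definition)."""
    q1, q2, q3 = statistics.quantiles(column, n=4, method="inclusive")
    return [(v - q2) / (q3 - q1) for v in column]


# Hypothetical PM10 readings with two gaps; these are NOT data from the study.
pm10 = [42.0, None, 55.0, 61.0, float("nan"), 48.0, 120.0]

filled = impute_median(pm10)          # gaps replaced by the median of observed values
standardized = standardize(filled)
min_maxed = min_max_scale(filled)
robust = robust_scale(filled)
```

The single outlying reading (120.0) dominates the range used by min–max scaling, while robust scaling depends only on the median and quartiles, which is one reason a study might compare several transforms before choosing one.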

Publication metadata

Author(s): Djarum DH, Ahmad Z, Zhang J

Editor(s): Zaini MAA; Jusoh M; Othman N

Publication type: Book Chapter

Publication status: Published

Book Title: Proceedings of the 3rd International Conference on Separation Technology (ICoST 2020)

Year: 2021

Pages: 353-374

Print publication date: 25/05/2021

Online publication date: 25/05/2021

Acceptance date: 01/05/2020

Series Title: Lecture Notes in Mechanical Engineering

Publisher: Springer

Place Published: Berlin


DOI: 10.1007/978-981-16-0742-4_25

ISBN: 9789811607417