Please use this identifier to cite or link to this item:
http://bura.brunel.ac.uk/handle/2438/31319
Title: | Identifying Suitability for Data Reduction in Imbalanced Time-Series Datasets |
Authors: | Sanderson, D; Kalganova, T |
Keywords: | occupancy detection;data reduction;dynamic data application;time-series data;useful data;class balance;class density;dataset fusion;green AI |
Issue Date: | 8-May-2025 |
Publisher: | MDPI |
Citation: | Sanderson, D. and Kalganova, T. (2025) 'Identifying Suitability for Data Reduction in Imbalanced Time-Series Datasets', AI, 6 (5), 98, pp. 1–26. doi: 10.3390/ai6050098. |
Abstract: | Occupancy detection for large buildings enables optimised control of indoor systems based on occupant presence, reducing the energy costs of heating and cooling. Through machine learning models, occupancy detection is achieved with an accuracy of over 95%. However, to achieve this, large amounts of data are collected with little consideration of which of the collected data are most useful to the task. This paper demonstrates methods to identify if data may be removed from the imbalanced time-series training datasets to optimise the training process and model performance. It also describes how the calculation of the class density of a dataset may be used to identify if a dataset is applicable for data reduction, and how dataset fusion may be used to combine occupancy datasets. The results show that over 50% of a training dataset may be removed from imbalanced datasets while maintaining performance, reducing training time and energy cost by over 40%. This indicates that a data-centric approach to developing artificial intelligence applications is as important as selecting the best model. |
Description: | Data Availability Statement: The open-source dataset used in this study may be found here: https://springernature.figshare.com/collections/A_High-Fidelity_Residential_Building_Occupancy_Detection_Dataset/5364449 (accessed on 16 April 2024). |
URI: | https://bura.brunel.ac.uk/handle/2438/31319 |
DOI: | https://doi.org/10.3390/ai6050098 |
Other Identifiers: | ORCiD: Dominic Sanderson https://orcid.org/0000-0002-1339-143X; ORCiD: Tatiana Kalganova https://orcid.org/0000-0003-4859-7152; Article number: 98 |
Appears in Collections: | Dept of Electronic and Electrical Engineering Research Papers |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
FullText.pdf | Copyright © 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). | 897.15 kB | Adobe PDF |
This item is licensed under a Creative Commons License