Please use this identifier to cite or link to this item:
http://bura.brunel.ac.uk/handle/2438/20698
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Ghorbani, M | - |
dc.contributor.author | Swift, S | - |
dc.contributor.author | Taylor, SJE | - |
dc.contributor.author | Payne, AM | - |
dc.date.accessioned | 2020-04-21T11:42:54Z | - |
dc.date.available | 2020-04-21T11:42:54Z | - |
dc.date.issued | 2020-08 | - |
dc.identifier.citation | Ghorbani, M., Swift, S., Taylor, S.J.E. and Payne, A.M. (2020) 'Design of a Flexible, User Friendly Feature Matrix Generation System and its Application on Biomedical Datasets', Journal of Grid Computing, 18, pp. 507–527. doi:10.1007/s10723-020-09518-y. | en_US |
dc.identifier.issn | 1570-7873 | - |
dc.identifier.uri | https://bura.brunel.ac.uk/handle/2438/20698 | - |
dc.description.abstract | © The Author(s) 2020. The generation of a feature matrix is the first step in conducting machine learning analyses on complex datasets such as those containing DNA, RNA or protein sequences. These matrices contain information for each object, which has to be identified using complex algorithms to interrogate the data. They are normally generated by combining the results of running such algorithms across various datasets from different and distributed data sources. Thus, for non-computing experts, the generation of such matrices proves a barrier to employing machine learning techniques. Further, since datasets are becoming larger, this barrier is augmented by the limitations of the single personal computer most often used by investigators to carry out such analyses. Here we propose a user friendly system to generate feature matrices in a way that is flexible, scalable and extendable. Additionally, by making use of the Berkeley Open Infrastructure for Network Computing (BOINC) software, the process can be sped up using the distributed volunteer computing possible in most institutions. The system makes use of the Grid and Cloud User Support Environment (gUSE), combined with the Web Services Parallel Grid Runtime and Developer Environment Portal (WS-PGRADE), to create workflow-based science gateways that allow users to submit work to distributed computing resources. This report demonstrates the use of our proposed WS-PGRADE/gUSE BOINC system to identify features to populate matrices from very large DNA sequence data repositories; however, we propose that this system could be used to analyse a wide variety of feature sets including image, numerical and text data. | - |
dc.format.medium | Print-Electronic | - |
dc.language.iso | en | en_US |
dc.publisher | Springer | en_US |
dc.rights | Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. | - |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | - |
dc.subject | BOINC | en_US |
dc.subject | desktop grid | en_US |
dc.subject | DNA sequence | en_US |
dc.subject | feature subset selection | en_US |
dc.subject | machine learning | en_US |
dc.subject | high performance computing | - |
dc.subject | WS-PGRADE | - |
dc.subject | gUSE | - |
dc.subject | DNA feature identification | - |
dc.subject | speedup | - |
dc.title | Design of a flexible, user friendly feature matrix generation system and its application on biomedical datasets | en_US |
dc.type | Article | en_US |
dc.identifier.doi | https://doi.org/10.1007/s10723-020-09518-y | - |
dc.relation.isPartOf | Journal of Grid Computing | - |
pubs.publication-status | Published | - |
dc.identifier.eissn | 1572-9184 | - |
Appears in Collections: | Dept of Computer Science Research Papers |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
FullText.pdf | | 6.78 MB | Adobe PDF | View/Open |
This item is licensed under a Creative Commons License