Please use this identifier to cite or link to this item:
http://bura.brunel.ac.uk/handle/2438/17674

| Title: | An Efficient Mixed-Model for Screening Differentially Expressed Genes of Breast Cancer Based on LR-RF |
| Authors: | Sun, M; Ding, T; Tang, XQ; Yu, K |
| Keywords: | breast cancer;differentially expressed genes;logistic regression-random forest;Bonferroni test;gene interaction networks |
| Issue Date: | 23-Apr-2018 |
| Publisher: | Institute of Electrical and Electronics Engineers (IEEE) on behalf of Association for Computing Machinery (ACM); Computational Intelligence Society; Control Systems Society; Engineering in Medicine and Biology Society |
| Citation: | Sun, M. et al. (2019) 'An Efficient Mixed-Model for Screening Differentially Expressed Genes of Breast Cancer Based on LR-RF', IEEE/ACM Transactions on Computational Biology and Bioinformatics, 16 (1), pp. 124 - 130. doi: 10.1109/TCBB.2018.2829519. |
| Abstract: | To screen differentially expressed genes quickly and efficiently in breast cancer, two gene microarray datasets of breast cancer, GSE15852 and GSE45255, were downloaded from GEO. By combining the Logistic Regression and Random Forest algorithms, this paper proposes a novel method named LR-RF to select differentially expressed genes of breast cancer on microarray data by the Bonferroni test of the FWER error measure. Compared with Logistic Regression and Random Forest, our study shows that LR-RF has a great facility in selecting differentially expressed genes. The average prediction accuracy of the proposed LR-RF over 10 replicated random tests surprisingly reaches 93.11 percent with variance as low as 0.00045. The prediction accuracy rate reaches a maximum of 95.57 percent when the threshold value α=0.2 in the random forest algorithm process of ranking genes' importance scores, and the differentially expressed genes are relatively few in number. In addition, through analyzing the gene interaction networks, most of the top 20 genes we selected were found to be involved in the development of breast cancer. All of these results demonstrate the reliability and efficiency of LR-RF. It is anticipated that LR-RF would provide new knowledge and methods for biologists, medical scientists, and cognitive computing researchers to identify disease-related genes of breast cancer. |
| URI: | https://bura.brunel.ac.uk/handle/2438/17674 |
| DOI: | https://doi.org/10.1109/TCBB.2018.2829519 |
| ISSN: | 1545-5963 |
| Appears in Collections: | Dept of Mathematics Research Papers |
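
The abstract mentions a Bonferroni test for controlling the family-wise error rate (FWER). As a rough illustration only, the sketch below applies the standard Bonferroni rule (keep a gene when its p-value is below α divided by the number of genes tested). The function name `bonferroni_select`, the gene labels, and the p-values are all hypothetical, and this record does not say how the paper combines this step with Logistic Regression and Random Forest, so this is not the authors' LR-RF procedure.

```python
def bonferroni_select(p_values, alpha=0.05):
    """Return the names of genes whose p-value is below alpha / m,
    where m is the number of genes tested (standard Bonferroni rule).

    p_values: dict mapping a gene name to its raw p-value.
    """
    m = len(p_values)
    if m == 0:
        return []
    cutoff = alpha / m  # Bonferroni-adjusted per-test significance level
    return sorted(gene for gene, p in p_values.items() if p < cutoff)


# Hypothetical p-values, made up for illustration (not taken from the paper).
example = {"GENE_A": 0.0001, "GENE_B": 0.03, "GENE_C": 0.004, "GENE_D": 0.5}
selected = bonferroni_select(example, alpha=0.05)
print(selected)
```

With four genes and α = 0.05 the per-gene cutoff is 0.0125, so only the genes with p-values below that value are kept.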
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| FullText.pdf | Copyright © 2018 Institute of Electrical and Electronics Engineers (IEEE). Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works (see: https://journals.ieeeauthorcenter.ieee.org/become-an-ieee-journal-author/publishing-ethics/guidelines-and-policies/post-publication-policies/). | 692.69 kB | Adobe PDF | View/Open |
Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.