Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/30911
Full metadata record
DC Field | Value | Language
dc.contributor.author | Li, JA | -
dc.contributor.author | Li, Y | -
dc.contributor.author | Li, G | -
dc.contributor.author | Hu, X | -
dc.contributor.author | Xia, X | -
dc.contributor.author | Jin, Z | -
dc.coverage.spatial | Melbourne, Australia | -
dc.date.accessioned | 2025-03-15T08:28:47Z | -
dc.date.available | 2021-11-15 | -
dc.date.available | 2025-03-15T08:28:47Z | -
dc.date.issued | 2021-11-15 | -
dc.identifier | ORCiD: Yongmin Li https://orcid.org/0000-0003-1668-2440 | -
dc.identifier.citation | Li, J.A. et al. (2021) 'EditSum: A Retrieve-and-Edit Framework for Source Code Summarization', 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE), Melbourne, Australia / Virtual Event, 15-19 November, pp. 155 - 166. doi: 10.1109/ase51524.2021.9678724. | en_US
dc.identifier.isbn | 978-1-6654-0337-5 (ebk) | -
dc.identifier.isbn | 978-1-6654-4784-3 (PoD) | -
dc.identifier.issn | 1938-4300 | -
dc.identifier.uri | https://bura.brunel.ac.uk/handle/2438/30911 | -
dc.description | The accepted manuscript is available at arXiv, arXiv:2308.13775v2 [cs.SE] (https://doi.org/10.48550/arXiv.2308.13775). Comments: Accepted by the 36th IEEE/ACM International Conference on Automated Software Engineering (ASE 2021). | en_US
dc.description.abstract | Existing studies show that code summaries help developers understand and maintain source code. Unfortunately, these summaries are often missing or outdated in software projects. Code summarization aims to generate natural language descriptions automatically for source code. According to Gros et al., code summaries are highly structured and have repetitive patterns (e.g. "return true if..."). Besides the patternized words, a code summary also contains important keywords, which are the key to reflecting the functionality of the code. However, the state-of-the-art approaches perform poorly on predicting the keywords, which causes the generated summaries to suffer a loss in informativeness. To alleviate this problem, this paper proposes a novel retrieve-and-edit approach named EditSum for code summarization. Specifically, EditSum first retrieves a similar code snippet from a pre-defined corpus and treats its summary as a prototype summary to learn the pattern. Then, EditSum edits the prototype automatically to combine the pattern in the prototype with the semantic information of the input code. Our motivation is that the retrieved prototype provides a good starting point for post-generation because the summaries of similar code snippets often have the same pattern. The post-editing process further reuses the patternized words in the prototype and generates keywords based on the semantic information of the input code. We conduct experiments on a large-scale Java corpus (2M), and the experimental results demonstrate that EditSum outperforms the state-of-the-art approaches by a substantial margin. The human evaluation also proves that the summaries generated by EditSum are more informative and useful. We also verify that EditSum performs well on predicting the patternized words and keywords. | en_US
dc.description.sponsorship | 10.13039/501100001809-National Natural Science Foundation of China. | en_US
dc.format.extent | 155 - 166 | -
dc.format.medium | Print-Electronic | -
dc.language | English | -
dc.language.iso | en_US | en_US
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en_US
dc.rights | Copyright © 2021 The Author(s) / Institute of Electrical and Electronics Engineers (IEEE). arXiv.org - Non-exclusive license to distribute. The URI https://arxiv.org/licenses/nonexclusive-distrib/1.0/ is used to record the fact that the submitter granted the following license to arXiv.org on submission of an article: I grant arXiv.org a perpetual, non-exclusive license to distribute this article. I certify that I have the right to grant this license. I understand that submissions cannot be completely removed once accepted. I understand that arXiv.org reserves the right to reclassify or reject any submission. | -
dc.rights.uri | https://arxiv.org/licenses/nonexclusive-distrib/1.0/license.html | -
dc.source | 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE) | -
dc.subject | code summarization | en_US
dc.subject | information retrieval | en_US
dc.subject | deep learning | en_US
dc.title | EditSum: A Retrieve-and-Edit Framework for Source Code Summarization | en_US
dc.type | Conference Paper | en_US
dc.identifier.doi | https://doi.org/10.1109/ase51524.2021.9678724 | -
dc.relation.isPartOf | 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE) | -
pubs.finish-date | 2021-11-19 | -
pubs.publication-status | Published | -
pubs.start-date | 2021-11-15 | -
dc.identifier.eissn | 2643-1572 | -
dc.rights.holder | The Author(s) / Institute of Electrical and Electronics Engineers (IEEE) | -
Appears in Collections: Dept of Computer Science Research Papers

Files in This Item:
File | Description | Size | Format
FullText.pdf | Copyright © 2021 The Author(s) / Institute of Electrical and Electronics Engineers (IEEE). arXiv.org - Non-exclusive license to distribute. The URI https://arxiv.org/licenses/nonexclusive-distrib/1.0/ is used to record the fact that the submitter granted the following license to arXiv.org on submission of an article: I grant arXiv.org a perpetual, non-exclusive license to distribute this article. I certify that I have the right to grant this license. I understand that submissions cannot be completely removed once accepted. I understand that arXiv.org reserves the right to reclassify or reject any submission. | 819.8 kB | Adobe PDF


Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.