Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/30313
Title: Minimizing the Minimizers via Alphabet Reordering
Authors: Verbeek, H
Ayad, LAK
Loukides, G
Pissis, SP
Keywords: sequence analysis;minimizers;alphabet reordering;feedback arc set
Issue Date: 18-Jun-2024
Publisher: Schloss Dagstuhl – LZI GmbH
Citation: Verbeek, H. et al. (2024) 'Minimizing the Minimizers via Alphabet Reordering', Leibniz International Proceedings in Informatics, LIPIcs, 296, pp. 28:1 - 28:13 (13). doi: 10.4230/LIPIcs.CPM.2024.28.
Abstract: Minimizers sampling is one of the most widely-used mechanisms for sampling strings [Roberts et al., Bioinformatics 2004]. Let S = S[1]… S[n] be a string over a totally ordered alphabet Σ. Further let w ≥ 2 and k ≥ 1 be two integers. The minimizer of S[i..i+w+k-2] is the smallest position in [i,i+w-1] where the lexicographically smallest length-k substring of S[i..i+w+k-2] starts. The set of minimizers over all i ∈ [1,n-w-k+2] is the set ℳ_{w,k}(S) of the minimizers of S. We consider the following basic problem: Given S, w, and k, can we efficiently compute a total order on Σ that minimizes |ℳ_{w,k}(S)|? We show that this is unlikely by proving that the problem is NP-hard for any w ≥ 3 and k ≥ 1. Our result provides theoretical justification as to why there exist no exact algorithms for minimizing the minimizers samples, while there exists a plethora of heuristics for the same purpose.
Description: Conference paper presented at the 35th Annual Symposium on Combinatorial Pattern Matching (CPM 2024), Fukuoka, Japan, 25-27 Jun 2024.
URI: https://bura.brunel.ac.uk/handle/2438/30313
DOI: https://doi.org/10.4230/LIPIcs.CPM.2024.28
ISBN: 978-3-95977-326-3
Other Identifiers: ORCiD: Hilde Verbeek https://orcid.org/0000-0002-2399-3098
ORCiD: Lorraine A.K. Ayad https://orcid.org/0000-0003-0846-2616
ORCiD: Grigorios Loukides https://orcid.org/0000-0003-0888-5061
ORCiD: Solon P. Pissis https://orcid.org/0000-0002-1445-1932
Appears in Collections:Dept of Computer Science Research Papers

Files in This Item:
File Description SizeFormat 
FullText.pdfCopyright © 2024 The Author(s). This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/).1.1 MBAdobe PDFView/Open


This item is licensed under a Creative Commons License Creative Commons