My research interests are categorized in the table below according to the ACM-2012 classification.
Information Systems | Data Management Systems -> Data Structures: Data Compression; Data Layout. Information Retrieval -> Search Engine Architectures and Scalability: Search Engine Indexing; Search Index Compression. Information Retrieval -> Specialized Information Retrieval: Structure and Multilingual Text Search. |
Theory of Computation | Design and Analysis of Algorithms -> Data Structures Design and Analysis: Data Compression; Pattern Matching; Sorting and Searching; Predecessor Queries. Theory and Algorithms for Application Domains -> Database Theory: Data Structures and Algorithms for Data Management. |
Mathematics of Computing | Discrete Mathematics -> Combinatorics: Permutations and Combinations; Combinatorial Optimization; Combinatorics on Words; Combinatorial Algorithms; Enumeration. Information Theory -> Coding Theory. |
Security & Privacy | Cryptography -> Mathematical Foundations of Cryptography. Cryptography -> Cryptanalysis and Other Attacks. Cryptography -> Database and Storage Security: Management and Querying of Encrypted Data. |
Applied Computing | Document Management and Text Processing -> Document searching/management/capture/preparation. Life and Medical Sciences -> Computational Biology. Life and Medical Sciences -> Bioinformatics. |
Social and Professional Topics | Technologies -> SIMD (single-instruction-multiple-data) architectures. |
I am engaged in the design, implementation, and analysis of algorithms and computing systems in a broad sense, following an algorithm engineering approach. My focus encompasses various areas, including text algorithms, data compression/coding, computational biology/bioinformatics, ML/AI systems, information retrieval, cryptography, and information/data security.
I commenced my professional journey at the National Research Institute of Electronics & Cryptology of Turkey, dedicating more than a decade to the study of cryptography. My primary focus was on the efficient implementation and analysis of security and privacy algorithms and protocols, encompassing both hardware and software domains. Throughout those years, my endeavors not only allowed me to acquire expertise in crafting high-performance programs with minimal memory usage, but they also provided me with the invaluable opportunity to collaborate with accomplished mathematicians and hardware engineers. These collaborations expanded my understanding across various dimensions, including discrete mathematics, reverse engineering, and FPGA/ASIC design.
I earned my Ph.D. in natural language processing. However, I later shifted my research trajectory and became captivated by text algorithms. This transition proved rewarding: I received the Best Paper Award in the data management track at the 22nd International Symposium on Computer and Information Sciences (IEEE ISCIS 2007). Subsequently, I published a single-author paper at the 19th International Symposium on Algorithms and Computation (ISAAC 2008), addressing a long-standing challenge in bit-parallel pattern matching. Following this accomplishment, I was offered the opportunity to work with Jeff Vitter in the Department of Computer Science at Texas A&M University as a postdoctoral researcher. This role facilitated my exploration of algorithms and data structures, with a pronounced focus on data compression and the processing of compressed data. Upon returning to my home country, I was privileged to co-lead the establishment of Turkey’s first high-throughput DNA sequencing facility, made possible by a 2M USD grant from the Ministry of Development. After the successful completion of this project, I transitioned to academia, where my roles have included teaching, research, proposal writing, grant acquisition, supervision of graduate students, industry consultation, and administrative service. My list of papers is given at the end of this page.
My focus on security aspects of data management has driven some of my work toward privacy-preserving textual data processing, an area that continues to hold my attention. These efforts have resulted in the patents US10554389B2 and US10740554B2. Notably, the patent titled “Efficient Encryption Method to Secure Data with a Reduced Number of Encryption Operations” has been licensed to three companies, as detailed in my CV. Recognition for my work includes innovation awards from Istanbul Technical University and IFIA (International Federation of Inventors’ Associations). My research and development collaborations with industry have led me to provide consultancy services to numerous companies, some of which are listed in my CV.
Throughout my career, I have come to firmly believe that theoretical knowledge is essential but insufficient for constructing efficient systems in practice. Relying on theory or practice alone resembles attempting flight with only one wing. Consequently, in my research, I adhere to the algorithm engineering methodology. I strive to combine theory and practice by leveraging the opportunities presented by the computing environment and the inherent properties of the target data. For instance, by harnessing the vectorization support available in contemporary processors, we developed one of the fastest pattern matching algorithms (EPSM, ALENEX’13). Another noteworthy accomplishment is my recent data coding patent (US10554389B2), licensed to three companies to reduce encryption costs in their products.
I delineate my research interests in terms of foundational themes that I delve into, along with their pragmatic applications. On the foundational facet, my focus has been directed toward the design and engineering of algorithms and data structures, geared to expeditiously tackle diverse tasks involving vast datasets with minimal resource utilization. Such tasks encompass a spectrum including, but not confined to, search, retrieval, coding, compression, cryptography, and pattern discovery.
In applied research and systems, these foundational aspects find a myriad of practical applications. Pattern-matching algorithms, for example, underpin a multitude of solutions in computational biology and bioinformatics. Similarly, advances in indexing, data coding, and compressed data processing carry over directly to massive data management, data mining, and knowledge discovery. In the wake of recent strides in deep learning and large language models, processing massive volumes of data efficiently has become critical to the success of everyday AI/ML applications. Currently, I am collaborating with AI/ML researchers to confront the challenges faced by AI/ML algorithms under time and memory constraints. We are devising strategies to reduce the computational load of running such algorithms on extensive datasets. My research trajectory remains committed to addressing both foundational principles and practical applications at the system level.
JOURNAL ARTICLES:
- Negar Golzar, Amir Aghabaioglou, Elif Altunok, and M. Oguzhan Kulekci. A framework for assessing a country’s scientific productivity based on published articles by scientists affiliated with that country. Information Discovery and Delivery, accepted, 2023
- Sahar Hooshmand, Paniz Abedin, M. Oguzhan Külekci, and Sharma V. Thankachan. I/O-efficient data structures for non-overlapping indexing. Theoretical Computer Science, 857:1–7, 2021
- Can Kockan, Kaiyuan Zhu, Natnatee Dokmai, Nikolai Karpov, M Oguzhan Kulekci, David P Woodruff, and S Cenk Sahinalp. Sketching algorithms for genomic data analysis and querying in a secure enclave. Nature Methods, 17(3):295–301, 2020
- Domenico Cantone, Simone Faro, and M. Oguzhan Külekci. The order-preserving pattern matching problem in practice. Discrete Applied Mathematics, 274:11–25, 2020
- Paniz Abedin, M. Oguzhan Külekci, and Sharma V. Thankachan. A survey on shortest unique substring queries. Algorithms, 13(9):224, 2020
- Muhammed Oguzhan Külekci and Yasin Öztürk. Applications of non-uniquely decodable codes to privacy-preserving high-entropy data representation. Algorithms, 12(4):78, 2019
- Jan Voges, Ali Fotouhi, Jörn Ostermann, and M. Oğuzhan Külekci. A two-level scheme for quality score compression. Journal of Computational Biology, 25(10):1141–1151, 2018
- M. Oğuzhan Külekci and Sharma V. Thankachan. Range selection and predecessor queries in data aware space and time. Journal of Discrete Algorithms, 43:18–25, 2017
- Mahmut Samil Sağiroğlu and M. Oğuzhan Külekci. A system architecture for efficient transmission of massive DNA sequencing data. Journal of Computational Biology, 24(11):1081–1088, 2017. PMID: 28414531
- Tamanna Chhabra, Simone Faro, M. Oğuzhan Külekci, and Jorma Tarhio. Engineering order-preserving pattern matching with SIMD parallelism. Software: Practice and Experience, 47(5):731–739, 2017
- Pınar Kavak, Bayram Yüksel, Soner Aksu, M. Oguzhan Kulekci, Tunga Güngör, Faraz Hach, S. Cenk Şahinalp, Turkish Human Genome Project, Can Alkan, and Mahmut Şamil Sağıroğlu. Robustness of massively parallel sequencing platforms. PLOS ONE, 10(9):1–11, 2015
- Atalay Mert Ileri, M. Oguzhan Külekci, and Bojian Xu. A simple yet time-optimal and linear-space algorithm for shortest unique substring queries. Theoretical Computer Science, 562:621–633, 2015
- Simone Faro and M. Oğuzhan Külekci. Fast and flexible packed string matching. Journal of Discrete Algorithms, 28:61–72, 2014
- M. Oğuzhan Külekci, Jeffrey Scott Vitter, and Bojian Xu. Efficient maximal repeat finding using the Burrows-Wheeler transform and wavelet tree. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 9(2):421–429, 2012
- M. Oğuzhan Külekci, Jeffrey Scott Vitter, and Bojian Xu. Fast pattern matching via k-bit filtering based text decomposition. The Computer Journal, 55(1):62–68, 2012
- M. Oğuzhan Külekci. On scrambling the Burrows–Wheeler transform to provide privacy in lossless compression. Computers & Security, 31(1):26–32, 2012
- M. Oğuzhan Külekci. BLIM: A new bit-parallel pattern matching algorithm overcoming computer word size limitation. Mathematics in Computer Science, 3(4):407–420, 2010
CONFERENCE PUBLICATIONS:
- M Oguzhan Kulekci. Randomized data partitioning with efficient search, retrieval and privacy-preservation. In The 29th International Computing and Combinatorics Conference (COCOON), 2023
- M. Oguzhan Kulekci. Counting with prediction: Rank and select queries with adjusted anchoring. In 2022 Data Compression Conference (DCC), pages 409–418, 2022
- Arghya Kusum Das, M. Oguzhan Kulekci, and Sharma Thankachan. Memory efficient FM-index construction for reference genomes. In 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2022
- Burak Cardak, Levent Carkacioglu, and M Oguzhan Kulekci. Tile selection-based H265/HEVC coding. In Proceedings of IEEE 16th International Conference on Signal Image Technology & Internet Based Systems (SITIS), France, 2022. to appear
- Arnab Ganguly, Daniel Gibney, Sahar Hooshmand, M. Oguzhan Külekci, and Sharma V. Thankachan. FM-index reveals the reverse suffix array. In 31st Annual Symposium on Combinatorial Pattern Matching, CPM 2020, June 17-19, 2020, Copenhagen, Denmark, pages 13:1–13:14, 2020
- M. Oguzhan Külekci, Yasin Öztürk, Elif Altunok, and Can Altinigne. Enumerative data compression with non-uniquely decodable codes. In Proceedings of Prague Stringology Conference, 2020
- Simone Faro, Domenico Cantone, and M. Oguzhan Külekci. Shape-preserving pattern-matching. In 21st Italian Conference on Theoretical Computer Science, ICTCS 2020, September 14-16, 2020, Ischia, Italy, pages 13:1–13:14, 2020
- M. Oğuzhan Külekci, Ismail Habib, and Amir Aghabaiglou. Privacy-preserving text similarity via non-prefix-free codes. In Giuseppe Amato, Claudio Gennaro, Vincent Oria, and Milos Radovanovic, editors, Similarity Search and Applications – 12th International Conference, SISAP 2019, Newark, NJ, USA, October 2-4, 2019, Proceedings, volume 11807 of Lecture Notes in Computer Science, pages 94–102. Springer, 2019
- Can Kockan, Kaiyuan Zhu, Natnatee Dokmai, Nikolai Karpov, M. Oguzhan Külekci, David P. Woodruff, and Süleyman Cenk Sahinalp. Sketching algorithms for genomic data analysis and querying in a secure enclave. In Lenore J. Cowen, editor, Research in Computational Molecular Biology – 23rd Annual International Conference, RECOMB 2019, Washington, DC, USA, May 5-8, 2019, Proceedings, volume 11467 of Lecture Notes in Computer Science, pages 302–304. Springer, 2019
- Sahar Hooshmand, Paniz Abedin, M. Oguzhan Külekci, and Sharma V. Thankachan. Non-Overlapping Indexing – Cache Obliviously. In Gonzalo Navarro, David Sankoff, and Binhai Zhu, editors, Annual Symposium on Combinatorial Pattern Matching (CPM 2018), volume 105 of Leibniz International Proceedings in Informatics (LIPIcs), pages 8:1–8:9, Dagstuhl, Germany, 2018. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik
- M. Oguzhan Külekci. An Ambiguous Coding Scheme for Selective Encryption of High Entropy Volumes. In Gianlorenzo D’Angelo, editor, 17th International Symposium on Experimental Algorithms (SEA 2018), volume 103 of Leibniz International Proceedings in Informatics (LIPIcs), pages 7:1–7:13, Dagstuhl, Germany, 2018. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik
- Mehmet Akif Aydogmus and M. Oguzhan Külekci. Optimizing packed string matching on AVX2 platform. In Hermes Senger, Osni Marques, Rogério Eduardo Garcia, Tatiana Pinheiro de Brito, Rogério Iope, Silvio Luiz Stanzani, and Veronica Gil Costa, editors, High Performance Computing for Computational Science – VECPAR 2018 – 13th International Conference, São Pedro, Brazil, September 17-19, 2018, Revised Selected Papers, volume 11333 of Lecture Notes in Computer Science, pages 45–61. Springer, 2018
- Jan Voges, Ali Fotouhi, Jörn Ostermann, and M. Oğuzhan Külekci. A two-level scheme for quality score compression. In Proceedings of the 2018 International Conference on Bioinformatics and Computational Biology, pages 161–167. IEEE, 2018
- Ali Fotouhi, Mina Majidi, and M. Oguzhan Külekci. Quality assessment of high-throughput DNA sequencing data via range analysis. In Ignacio Rojas and Francisco M. Ortuño Guzman, editors, Bioinformatics and Biomedical Engineering – 6th International Work-Conference, IWBBIO 2018, Granada, Spain, April 25-27, 2018, Proceedings, Part I, volume 10813 of Lecture Notes in Computer Science, pages 429–438. Springer, 2018
- Eren Kocaaga and M. Oguzhan Külekci. Security analysis on the ADS-B technology. In 25th Signal Processing and Communications Applications Conference, SIU 2017, Antalya, Turkey, May 15-18, 2017, pages 1–4, 2017
- M. Oguzhan Külekci. Inverse range selection queries. In Shunsuke Inenaga, Kunihiko Sadakane, and Tetsuya Sakai, editors, String Processing and Information Retrieval – 23rd International Symposium, SPIRE 2016, Beppu, Japan, October 18-20, 2016, Proceedings, volume 9954 of Lecture Notes in Computer Science, pages 166–177, 2016
- Simone Faro and M. Oguzhan Külekci. Efficient algorithms for the order preserving pattern matching problem. In Riccardo Dondi, Guillaume Fertin, and Giancarlo Mauri, editors, Algorithmic Aspects in Information and Management – 11th International Conference, AAIM 2016, Bergamo, Italy, July 18-20, 2016, Proceedings, volume 9778 of Lecture Notes in Computer Science, pages 185–196. Springer, 2016
- M. Oguzhan Külekci and M. Samil Sagiroglu. GENCROBAT: efficient transmission and processing of the massive genomic data. In 2016 IEEE/IFIP Network Operations and Management Symposium, NOMS 2016, Istanbul, Turkey, April 25-29, 2016, pages 999–1000, 2016
- M. Oguzhan Külekci and Sharma V. Thankachan. Range selection queries in data aware space and time. In Ali Bilgin, Michael W. Marcellin, Joan Serra-Sagristà, and James A. Storer, editors, 2015 Data Compression Conference, DCC 2015, Snowbird, UT, USA, April 7-9, 2015, pages 73–82. IEEE, 2015
- Boran Adas, Ersin Bayraktar, and M. Oguzhan Külekci. Huffman codes versus augmented non-prefix-free codes. In Evripidis Bampis, editor, Experimental Algorithms – 14th International Symposium, SEA 2015, Paris, France, June 29 – July 1, 2015, Proceedings, volume 9125 of Lecture Notes in Computer Science, pages 315–326. Springer, 2015
- Domenico Cantone, Simone Faro, and M. Oğuzhan Külekci. An efficient skip-search approach to the order-preserving pattern matching problem. In Jan Holub and Jan Žďárek, editors, Proceedings of the Prague Stringology Conference, pages 22–35, Prague, Czech Republic, 2015
- Tamanna Chhabra, M. Oğuzhan Külekci, and Jorma Tarhio. Alternative algorithms for order-preserving matching. In Jan Holub and Jan Žďárek, editors, Proceedings of the Prague Stringology Conference, pages 36–46, Prague, Czech Republic, 2015
- Boran Adas, Ersin Bayraktar, Simone Faro, Ibraheem Elsayed Moustafa, and M. Oguzhan Külekci. Nucleotide sequence alignment and compression via shortest unique substring. In Francisco M. Ortuño Guzman and Ignacio Rojas, editors, Bioinformatics and Biomedical Engineering – Third International Conference, IWBBIO 2015, Granada, Spain, April 15-17, 2015. Proceedings, Part II, volume 9044 of Lecture Notes in Computer Science, pages 363–374. Springer, 2015
- Muhammed Oğuzhan Külekci. Enhanced variable-length codes: Improved compression with efficient random access. In Proceedings of IEEE Data Compression Conference (DCC), pages 362–371, Snowbird, UT, USA, 2014. IEEE
- Atalay Mert İleri, M. Oğuzhan Külekci, and Bojian Xu. Shortest unique substring query revisited. In Alexander S. Kulikov, Sergei O. Kuznetsov, and Pavel Pevzner, editors, Combinatorial Pattern Matching, volume 8486 of Lecture Notes in Computer Science, pages 172–181. Springer International Publishing, 2014
- M. Oğuzhan Külekci. Uniquely decodable and directly accessible non-prefix-free codes via wavelet trees. In Proceedings of IEEE International Symposium on Information Theory (ISIT), pages 1969–1973. IEEE, 2013
- Simone Faro and M. Oğuzhan Külekci. Fast packed string matching for short patterns. In Proceedings of ACM-SIAM Symposium on Discrete Algorithms – Meeting on Algorithm Engineering and Experiments (ALENEX), pages 113–121, New Orleans, Louisiana, USA, 2013. SIAM
- Simone Faro and M. Oğuzhan Külekci. Towards a very fast multiple string matching algorithm for short patterns. In Jan Holub and Jan Žďárek, editors, Proceedings of the Prague Stringology Conference, pages 78–91, Prague, Czech Republic, 2013
- M. Oğuzhan Külekci. On enumerating the DNA sequences. In Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine, pages 442–449, Orlando, FL, USA, 2012. ACM
- Simone Faro and M. Oğuzhan Külekci. Fast Multiple String Matching Using Streaming SIMD Extensions Technology. In Liliana Calderón-Benavides, Cristina González-Caro, Edgar Chávez, and Nivio Ziviani, editors, String Processing and Information Retrieval, volume 7608 of Lecture Notes in Computer Science, pages 217–228. Springer Berlin Heidelberg, 2012
- M Oğuzhan Külekci. Compressed context modeling for text compression. In Proceedings of IEEE Data Compression Conference (DCC), pages 373–382, Snowbird, UT, USA, 2011. IEEE
- M Oguzhan Kulekci. A method to ensure the confidentiality of the compressed data. In 2011 First International Conference on Data Compression, Communications and Processing, pages 203–209. IEEE, 2011
- M Oğuzhan Külekci, Jeffrey Scott Vitter, and Bojian Xu. Boosting pattern matching performance via k-bit filtering. In Erol Gelenbe, Ricardo Lent, Georgia Sakellari, Ahmet Sacan, Hakki Toroslu, and Adnan Yazici, editors, Computer and Information Sciences, volume 62 of Lecture Notes in Electrical Engineering, pages 27–32. Springer Netherlands, 2010
- M. Oğuzhan Külekci, Jeffrey Scott Vitter, and Bojian Xu. Time-and space-efficient maximal repeat finding using the Burrows-Wheeler transform and wavelet trees. In Proceedings of 2010 IEEE International Conference on Bioinformatics and Biomedicine, pages 622–625, Hong Kong, 2010. IEEE
- M Oğuzhan Külekci, Wing-Kai Hon, Rahul Shah, Jeffrey Scott Vitter, and Bojian Xu. Psi-ra: A parallel sparse index for read alignment on genomes. In Proceedings of 2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pages 663–668, Hong Kong, 2010. IEEE
- M. Oguzhan Külekci. Filter based fast matching of long patterns by using SIMD instructions. In Proceedings of Prague Stringology Conference, pages 118–128, Prague, Czech Republic, 2009
- M. Oguzhan Külekci. A method to overcome computer word size limitation in bit-parallel pattern matching. In Seok-Hee Hong, Hiroshi Nagamochi, and Takuro Fukunaga, editors, Algorithms and Computation, 19th International Symposium, ISAAC 2008, Gold Coast, Australia, December 15-17, 2008. Proceedings, volume 5369 of Lecture Notes in Computer Science, pages 496–506. Springer, 2008
- M. Oğuzhan Külekci. Tara: An algorithm for fast searching of multiple patterns on text files. In Proceedings of 22nd International Symposium on Computer and Information Sciences, pages 1–6, Ankara, Turkey, 2007. IEEE. Best paper award
- M. Oğuzhan Külekci. An empirical analysis of pattern scan order in pattern matching. In Proceedings of World Congress on Engineering, International Conference on Data Mining and Knowledge Engineering (ICDMKE), pages 337–341, London, UK, 2007
- M Oğuzhan Külekci and Kemal Oflazer. An infrastructure for Turkish prosody generation in text-to-speech synthesis. In Proceedings of the Turkish Symposium on Artificial Intelligence and Neural Network (TAINN), Akyaka, Muğla, Turkey, 2006
- M. Oğuzhan Külekci and Kemal Oflazer. Pronunciation disambiguation in Turkish. In Pınar Yolum, Tunga Güngör, Fikret Gürgen, and Can Özturan, editors, Computer and Information Sciences, volume 3733 of Lecture Notes in Computer Science, pages 636–645. Springer Berlin Heidelberg, 2005
- M. Oğuzhan Külekci and Kemal Oflazer. Yazıdan konuşma üretmede kullanılan doğal dil işleme tekniklerine genel bir bakış (an overview of natural language processing techniques in text-to-speech systems). In Proceedings of the IEEE 12th Signal Processing and Communications Applications Conference, pages 454–457. IEEE, 2004
- M. Oğuzhan Külekci and Mehmed Özkan. Turkish word segmentation using morphological analyzer. In Proceedings of 7th European Conference on Speech Communication and Technology (2nd INTERSPEECH Event), pages 1053–1056, Aalborg, Denmark, September 2001. International Speech Communication Association