Research

My research interests are categorized in the table below according to the ACM-2012 classification.

Information Systems
  Data Management Systems -> Data Structures: Data Compression; Data Layout
  Information Retrieval -> Search Engine Architectures and Scalability: Search Engine Indexing; Search Index Compression
  Information Retrieval -> Specialized Information Retrieval: Structure and Multilingual Text Search
Theory of Computation
  Design and Analysis of Algorithms -> Data Structures Design and Analysis: Data Compression; Pattern Matching; Sorting and Searching; Predecessor Queries
  Theory and Algorithms for Application Domains -> Database Theory: Data Structures and Algorithms for Data Management
Mathematics of Computing
  Discrete Mathematics -> Combinatorics: Permutations and Combinations; Combinatorial Optimization; Combinatorics on Words; Combinatorial Algorithms, Enumeration
  Information Theory -> Coding Theory
Security & Privacy
  Cryptography -> Mathematical Foundations of Cryptography
  Cryptography -> Cryptanalysis and Other Attacks
  Cryptography -> Database and Storage Security: Management and Querying of Encrypted Data
Applied Computing
  Document Management and Text Processing -> Document searching/management/capture/preparation
  Life and Medical Sciences -> Computational Biology
  Life and Medical Sciences -> Bioinformatics
Social and Professional Topics
  Technologies -> SIMD (single-instruction-multiple-data) architectures

I am engaged in the design, implementation, and analysis of algorithms and computing systems in a broad sense, following an algorithm engineering approach. My focus encompasses various areas, including text algorithms, data compression/coding, computational biology/bioinformatics, ML/AI systems, information retrieval, cryptography, and information/data security.

I began my career at the National Research Institute of Electronics & Cryptology of Turkey, where I worked on cryptography for more than a decade. My primary focus was the efficient implementation and analysis of security and privacy algorithms and protocols in both hardware and software. Those years taught me to write high-performance programs with minimal memory usage and gave me the opportunity to collaborate with accomplished mathematicians and hardware engineers. These collaborations broadened my knowledge of discrete mathematics, reverse engineering, and FPGA/ASIC design.

I earned my Ph.D. in natural language processing and then shifted my research toward text algorithms. The transition proved rewarding: I received the Best Paper Award in the data management track at the 22nd International Symposium on Computer and Information Sciences (IEEE ISCIS 2007). I then published a single-author paper at the 19th International Symposium on Algorithms and Computation (ISAAC 2008), addressing a long-standing challenge in bit-parallel pattern matching. Following this, I was invited to work with Jeff Vitter in the Computer Science department at Texas A&M University as a postdoctoral researcher, where I studied algorithms and data structures with a focus on data compression and the processing of compressed data. After returning to Turkey, I co-led the establishment of the country’s first high-throughput DNA sequencing facility, funded by a 2M USD grant from the Ministry of Development. Once that project was completed, I moved to academia, where my roles have included teaching, research, proposal writing, grant acquisition, graduate student supervision, industry consulting, and administrative service. My list of papers is given at the end of this page.

My focus on the security aspects of data management has steered part of my work toward privacy-preserving textual data processing, an area that continues to hold my attention. This work has resulted in the patents US10554389B2 and US10740554B2. Notably, the patent titled “Efficient Encryption Method to Secure Data with a Reduced Number of Encryption Operations” has been licensed to three companies, as detailed in my CV. My work has received innovation awards from Istanbul Technical University and IFIA (International Federation of Inventors’ Associations). My research and development collaborations with industry have also led me to provide consultancy services to numerous companies, some of which are listed in my CV.

Throughout my career, I have come to firmly believe that theoretical knowledge is essential but insufficient for building efficient systems in practice. Relying on theory or practice alone is like attempting to fly with only one wing. Consequently, I follow the algorithm engineering methodology in my research. I strive to combine theory and practice by exploiting the opportunities offered by the computing environment and the inherent properties of the target data. For instance, by harnessing the vectorization support available in contemporary processors, we developed one of the fastest pattern matching algorithms (EPSM, ALENEX’13). Another example is my recent data coding patent (US10554389B2), licensed to three companies to reduce encryption costs in their products.

I describe my research interests in terms of foundational themes and their practical applications. On the foundational side, I focus on the design and engineering of algorithms and data structures that handle diverse tasks on massive datasets quickly and with minimal resource usage. Such tasks include, but are not limited to, search, retrieval, coding, compression, cryptography, and pattern discovery.

On the applied or systems side, these foundations have many practical applications. Pattern-matching algorithms, for example, underpin many solutions in computational biology and bioinformatics. Similarly, advances in indexing, data coding, and compressed data processing directly benefit massive data management, data mining, and knowledge discovery. With the recent progress in deep learning and large language models, processing massive volumes of data efficiently has become critical to the success of everyday AI/ML applications. Currently, I am collaborating with AI/ML researchers on challenges faced by AI/ML algorithms under time and memory constraints. We are devising strategies to reduce the computational load of running such algorithms on large datasets. My research remains committed to addressing both foundational principles and practical applications at the system level.

JOURNAL ARTICLES:

  1. Negar Golzar, Amir Aghabaioglou, Elif Altunok, and M. Oguzhan Kulekci. A framework for assessing a country’s scientific productivity based on published articles by scientists affiliated with that country. Information Discovery and Delivery, accepted, 2023
  2. Sahar Hooshmand, Paniz Abedin, M. Oguzhan Külekci, and Sharma V. Thankachan. I/O-efficient data structures for non-overlapping indexing. Theoretical Computer Science, 857:1–7, 2021
  3. Can Kockan, Kaiyuan Zhu, Natnatee Dokmai, Nikolai Karpov, M Oguzhan Kulekci, David P Woodruff, and S Cenk Sahinalp. Sketching algorithms for genomic data analysis and querying in a secure enclave. Nature Methods, 17(3):295–301, 2020
  4. Domenico Cantone, Simone Faro, and M. Oguzhan Külekci. The order-preserving pattern matching problem in practice. Discrete Applied Mathematics, 274:11–25, 2020
  5. Paniz Abedin, M. Oguzhan Külekci, and Sharma V. Thankachan. A survey on shortest unique substring queries. Algorithms, 13(9):224, 2020
  6. Muhammed Oguzhan Külekci and Yasin Öztürk. Applications of non-uniquely decodable codes to privacy-preserving high-entropy data representation. Algorithms, 12(4):78, 2019
  7. Jan Voges, Ali Fotouhi, Jörn Ostermann, and M. Oğuzhan Külekci. A two-level scheme for quality score compression. Journal of Computational Biology, 25(10):1141–1151, 2018
  8. M. Oğuzhan Külekci and Sharma V. Thankachan. Range selection and predecessor queries in data aware space and time. Journal of Discrete Algorithms, 43:18–25, 2017
  9. Mahmut Samil Sağiroğlu and M. Oğuzhan Külekci. A system architecture for efficient transmission of massive DNA sequencing data. Journal of Computational Biology, 24(11):1081–1088, 2017. PMID: 28414531
  10. Tamanna Chhabra, Simone Faro, M. Oğuzhan Külekci, and Jorma Tarhio. Engineering order-preserving pattern matching with SIMD parallelism. Software: Practice and Experience, 47(5):731–739, 2017
  11. Pınar Kavak, Bayram Yüksel, Soner Aksu, M. Oguzhan Kulekci, Tunga Güngör, Faraz Hach, S. Cenk Şahinalp, Turkish Human Genome Project, Can Alkan, and Mahmut Şamil Sağıroğlu. Robustness of massively parallel sequencing platforms. PLOS ONE, 10(9):1–11, 09 2015
  12. Atalay Mert Ileri, M. Oguzhan Külekci, and Bojian Xu. A simple yet time-optimal and linear-space algorithm for shortest unique substring queries. Theoretical Computer Science, 562:621–633, 2015
  13. Simone Faro and M. Oğuzhan Külekci. Fast and flexible packed string matching. Journal of Discrete Algorithms, 28:61–72, 2014
  14. M. Oğuzhan Külekci, Jeffrey Scott Vitter, and Bojian Xu. Efficient maximal repeat finding using the Burrows-Wheeler transform and wavelet tree. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 9(2):421–429, 2012
  15. M. Oğuzhan Külekci, Jeffrey Scott Vitter, and Bojian Xu. Fast pattern matching via k-bit filtering based text decomposition. The Computer Journal, 55(1):62–68, 2012
  16. M. Oğuzhan Külekci. On scrambling the Burrows–Wheeler transform to provide privacy in lossless compression. Computers & Security, 31(1):26–32, 2012
  17. M. Oğuzhan Külekci. BLIM: A new bit-parallel pattern matching algorithm overcoming computer word size limitation. Mathematics in Computer Science, 3(4):407–420, 2010

CONFERENCE PUBLICATIONS:

  1. M Oguzhan Kulekci. Randomized data partitioning with efficient search, retrieval and privacy-preservation. In The 29th International Computing and Combinatorics Conference (COCOON), 2023
  2. M. Oguzhan Kulekci. Counting with prediction: Rank and select queries with adjusted anchoring. In 2022 Data Compression Conference (DCC), pages 409–418, 2022
  3. Arghya Kusum Das, M. Oguzhan Kulekci, and Sharma Thankachan. Memory efficient FM-index construction for reference genomes. In 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2022
  4. Burak Cardak, Levent Carkacioglu, and M Oguzhan Kulekci. Tile selection-based H265/HEVC coding. In Proceedings of IEEE 16th International Conference on Signal Image Technology & Internet Based Systems (SITIS), France, 2022. to appear
  5. Arnab Ganguly, Daniel Gibney, Sahar Hooshmand, M. Oguzhan Külekci, and Sharma V. Thankachan. FM-index reveals the reverse suffix array. In 31st Annual Symposium on Combinatorial Pattern Matching, CPM 2020, June 17-19, 2020, Copenhagen, Denmark, pages 13:1–13:14, 2020
  6. M. Oguzhan Külekci, Yasin Öztürk, Elif Altunok, and Can Altinigne. Enumerative data compression with non-uniquely decodable codes. In Proceedings of Prague Stringology Conference, 2020
  7. Simone Faro, Domenico Cantone, and M. Oguzhan Külekci. Shape-preserving pattern matching. In 21st Italian Conference on Theoretical Computer Science, ICTCS 2020, September 14-16, 2020, Ischia, Italy, pages 13:1–13:14, 2020
  8. M. Oğuzhan Külekci, Ismail Habib, and Amir Aghabaiglou. Privacy-preserving text similarity via non-prefix-free codes. In Giuseppe Amato, Claudio Gennaro, Vincent Oria, and Milos Radovanovic, editors, Similarity Search and Applications – 12th International Conference, SISAP 2019, Newark, NJ, USA, October 2-4, 2019, Proceedings, volume 11807 of Lecture Notes in Computer Science, pages 94–102. Springer, 2019
  9. Can Kockan, Kaiyuan Zhu, Natnatee Dokmai, Nikolai Karpov, M. Oguzhan Külekci, David P. Woodruff, and Süleyman Cenk Sahinalp. Sketching algorithms for genomic data analysis and querying in a secure enclave. In Lenore J. Cowen, editor, Research in Computational Molecular Biology – 23rd Annual International Conference, RECOMB 2019, Washington, DC, USA, May 5-8, 2019, Proceedings, volume 11467 of Lecture Notes in Computer Science, pages 302–304. Springer, 2019
  10. Sahar Hooshmand, Paniz Abedin, M. Oguzhan Külekci, and Sharma V. Thankachan. Non-Overlapping Indexing – Cache Obliviously. In Gonzalo Navarro, David Sankoff, and Binhai Zhu, editors, Annual Symposium on Combinatorial Pattern Matching (CPM 2018), volume 105 of Leibniz International Proceedings in Informatics (LIPIcs), pages 8:1–8:9, Dagstuhl, Germany, 2018. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik
  11. M. Oguzhan Külekci. An Ambiguous Coding Scheme for Selective Encryption of High Entropy Volumes. In Gianlorenzo D’Angelo, editor, 17th International Symposium on Experimental Algorithms (SEA 2018), volume 103 of Leibniz International Proceedings in Informatics (LIPIcs), pages 7:1–7:13, Dagstuhl, Germany, 2018. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik
  12. Mehmet Akif Aydogmus and M. Oguzhan Külekci. Optimizing packed string matching on AVX2 platform. In Hermes Senger, Osni Marques, Rogério Eduardo Garcia, Tatiana Pinheiro de Brito, Rogério Iope, Silvio Luiz Stanzani, and Veronica Gil Costa, editors, High Performance Computing for Computational Science – VECPAR 2018 – 13th International Conference, São Pedro, Brazil, September 17-19, 2018, Revised Selected Papers, volume 11333 of Lecture Notes in Computer Science, pages 45–61. Springer, 2018
  13. Jan Voges, Ali Fotouhi, Jörn Ostermann, and M. Oğuzhan Külekci. A two-level scheme for quality score compression. In Proceedings of the 2018 International Conference on Bioinformatics and Computational Biology, pages 161–167. IEEE, 2018
  14. Ali Fotouhi, Mina Majidi, and M. Oguzhan Külekci. Quality assessment of high-throughput DNA sequencing data via range analysis. In Ignacio Rojas and Francisco M. Ortuño Guzman, editors, Bioinformatics and Biomedical Engineering – 6th International Work-Conference, IWBBIO 2018, Granada, Spain, April 25-27, 2018, Proceedings, Part I, volume 10813 of Lecture Notes in Computer Science, pages 429–438. Springer, 2018
  15. Eren Kocaaga and M. Oguzhan Külekci. Security analysis on the ADS-B technology. In 25th Signal Processing and Communications Applications Conference, SIU 2017, Antalya, Turkey, May 15-18, 2017, pages 1–4, 2017
  16. M. Oguzhan Külekci. Inverse range selection queries. In Shunsuke Inenaga, Kunihiko Sadakane, and Tetsuya Sakai, editors, String Processing and Information Retrieval – 23rd International Symposium, SPIRE 2016, Beppu, Japan, October 18-20, 2016, Proceedings, volume 9954 of Lecture Notes in Computer Science, pages 166–177, 2016
  17. Simone Faro and M. Oguzhan Külekci. Efficient algorithms for the order preserving pattern matching problem. In Riccardo Dondi, Guillaume Fertin, and Giancarlo Mauri, editors, Algorithmic Aspects in Information and Management – 11th International Conference, AAIM 2016, Bergamo, Italy, July 18-20, 2016, Proceedings, volume 9778 of Lecture Notes in Computer Science, pages 185–196. Springer, 2016
  18. M. Oguzhan Külekci and M. Samil Sagiroglu. GENCROBAT: efficient transmission and processing of the massive genomic data. In 2016 IEEE/IFIP Network Operations and Management Symposium, NOMS 2016, Istanbul, Turkey, April 25-29, 2016, pages 999–1000, 2016
  19. M. Oguzhan Külekci and Sharma V. Thankachan. Range selection queries in data aware space and time. In Ali Bilgin, Michael W. Marcellin, Joan Serra-Sagristà, and James A. Storer, editors, 2015 Data Compression Conference, DCC 2015, Snowbird, UT, USA, April 7-9, 2015, pages 73–82. IEEE, 2015
  20. Boran Adas, Ersin Bayraktar, and M. Oguzhan Külekci. Huffman codes versus augmented non-prefix-free codes. In Evripidis Bampis, editor, Experimental Algorithms – 14th International Symposium, SEA 2015, Paris, France, June 29 – July 1, 2015, Proceedings, volume 9125 of Lecture Notes in Computer Science, pages 315–326. Springer, 2015
  21. Domenico Cantone, Simone Faro, and M. Oğuzhan Külekci. An efficient skip-search approach to the order-preserving pattern matching problem. In Jan Holub and Jan Žďárek, editors, Proceedings of the Prague Stringology Conference, pages 22–35, Prague, Czech Republic, 2015
  22. Tamanna Chhabra, M. Oğuzhan Külekci, and Jorma Tarhio. Alternative algorithms for order-preserving matching. In Jan Holub and Jan Žďárek, editors, Proceedings of the Prague Stringology Conference, pages 36–46, Prague, Czech Republic, 2015
  23. Boran Adas, Ersin Bayraktar, Simone Faro, Ibraheem Elsayed Moustafa, and M. Oguzhan Külekci. Nucleotide sequence alignment and compression via shortest unique substring. In Francisco M. Ortuño Guzman and Ignacio Rojas, editors, Bioinformatics and Biomedical Engineering – Third International Conference, IWBBIO 2015, Granada, Spain, April 15-17, 2015. Proceedings, Part II, volume 9044 of Lecture Notes in Computer Science, pages 363–374. Springer, 2015
  24. Muhammed Oğuzhan Külekci. Enhanced variable-length codes: Improved compression with efficient random access. In Proceedings of IEEE Data Compression Conference (DCC), pages 362–371, Snowbird, UT, USA, 2014. IEEE
  25. Atalay Mert İleri, M. Oğuzhan Külekci, and Bojian Xu. Shortest unique substring query revisited. In Alexander S. Kulikov, Sergei O. Kuznetsov, and Pavel Pevzner, editors, Combinatorial Pattern Matching, volume 8486 of Lecture Notes in Computer Science, pages 172–181. Springer International Publishing, 2014
  26. M. Oğuzhan Külekci. Uniquely decodable and directly accessible non-prefix-free codes via wavelet trees. In Proceedings of IEEE International Symposium on Information Theory (ISIT), pages 1969–1973. IEEE, 2013
  27. Simone Faro and M. Oğuzhan Külekci. Fast packed string matching for short patterns. In Proceedings of ACM-SIAM Symposium on Discrete Algorithms – Meeting on Algorithm Engineering and Experiments (ALENEX), pages 113–121, New Orleans, Louisiana, USA, 2013. SIAM
  28. Simone Faro and M. Oğuzhan Külekci. Towards a very fast multiple string matching algorithm for short patterns. In Jan Holub and Jan Žďárek, editors, Proceedings of the Prague Stringology Conference, pages 78–91, Prague, Czech Republic, 2013
  29. M. Oğuzhan Külekci. On enumerating the DNA sequences. In Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine, pages 442–449, Orlando, FL, USA, 2012. ACM
  30. Simone Faro and M. Oğuzhan Külekci. Fast Multiple String Matching Using Streaming SIMD Extensions Technology. In Liliana Calderón-Benavides, Cristina González-Caro, Edgar Chávez, and Nivio Ziviani, editors, String Processing and Information Retrieval, volume 7608 of Lecture Notes in Computer Science, pages 217–228. Springer Berlin Heidelberg, 2012
  31. M Oğuzhan Külekci. Compressed context modeling for text compression. In Proceedings of IEEE Data Compression Conference (DCC), pages 373–382, Snowbird, UT, USA, 2011. IEEE
  32. M Oguzhan Kulekci. A method to ensure the confidentiality of the compressed data. In 2011 First International Conference on Data Compression, Communications and Processing, pages 203–209. IEEE, 2011
  33. M Oğuzhan Külekci, Jeffrey Scott Vitter, and Bojian Xu. Boosting pattern matching performance via k-bit filtering. In Erol Gelenbe, Ricardo Lent, Georgia Sakellari, Ahmet Sacan, Hakki Toroslu, and Adnan Yazici, editors, Computer and Information Sciences, volume 62 of Lecture Notes in Electrical Engineering, pages 27–32. Springer Netherlands, 2010
  34. M. Oğuzhan Külekci, Jeffrey Scott Vitter, and Bojian Xu. Time- and space-efficient maximal repeat finding using the Burrows-Wheeler transform and wavelet trees. In Proceedings of 2010 IEEE International Conference on Bioinformatics and Biomedicine, pages 622–625, Hong Kong, 2010. IEEE
  35. M Oğuzhan Külekci, Wing-Kai Hon, Rahul Shah, Jeffrey Scott Vitter, and Bojian Xu. PSI-RA: A parallel sparse index for read alignment on genomes. In Proceedings of 2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pages 663–668, Hong Kong, 2010. IEEE
  36. M. Oguzhan Külekci. Filter based fast matching of long patterns by using SIMD instructions. In Proceedings of Prague Stringology Conference, pages 118–128, Prague, Czech Republic, 2009
  37. M. Oguzhan Külekci. A method to overcome computer word size limitation in bit-parallel pattern matching. In Seok-Hee Hong, Hiroshi Nagamochi, and Takuro Fukunaga, editors, Algorithms and Computation, 19th International Symposium, ISAAC 2008, Gold Coast, Australia, December 15-17, 2008. Proceedings, volume 5369 of Lecture Notes in Computer Science, pages 496–506. Springer, 2008
  38. M. Oğuzhan Külekci. Tara: An algorithm for fast searching of multiple patterns on text files. In Proceedings of 22nd International Symposium on Computer and Information Sciences, pages 1–6, Ankara, Turkey, 2007. IEEE. Best paper award
  39. M. Oğuzhan Külekci. An empirical analysis of pattern scan order in pattern matching. In Proceedings of World Congress on Engineering, International Conference on Data Mining and Knowledge Engineering (ICDMKE), pages 337–341, London, UK, 2007
  40. M Oğuzhan Külekci and Kemal Oflazer. An infrastructure for Turkish prosody generation in text-to-speech synthesis. In Proceedings of the Turkish Symposium on Artificial Intelligence and Neural Network (TAINN), Akyaka, Muğla, Turkey, 2006
  41. M. Oğuzhan Külekci and Kemal Oflazer. Pronunciation disambiguation in Turkish. In Pınar Yolum, Tunga Güngör, Fikret Gürgen, and Can Özturan, editors, Computer and Information Sciences, volume 3733 of Lecture Notes in Computer Science, pages 636–645. Springer Berlin Heidelberg, 2005
  42. M. Oğuzhan Külekci and Kemal Oflazer. Yazıdan konuşma üretmede kullanılan doğal dil işleme tekniklerine genel bir bakış (an overview of natural language processing techniques in text-to-speech systems). In Proceedings of the IEEE 12th Signal Processing and Communications Applications Conference, pages 454–457. IEEE, 2004
  43. M. Oğuzhan Külekci and Mehmed Özkan. Turkish word segmentation using morphological analyzer. In Proceedings of 7th European Conference on Speech Communication and Technology (2nd INTERSPEECH Event), pages 1053–1056, Aalborg, Denmark, September 2001. International Speech Communication Association