Invited Speakers

Invited Talks

Jarkko Kari, University of Turku, Finland

Title: Arto Salomaa: Pioneer of Formal Languages and Automata Theory

Abstract: This presentation explores the life and scholarly contributions of Arto Salomaa (1934-2025), a Finnish mathematician whose work has shaped the field of theoretical computer science. Renowned for his research on formal languages and automata theory, Salomaa authored hundreds of influential papers over seven decades. His classic books on Automata theory, Formal Languages, L-systems, etc. have become standard references worldwide. Beyond his scientific achievements, the presentation will highlight Salomaa’s role as an educator and organizer, notably at the University of Turku, where he fostered a vibrant community of theoretical computer science research. This talk recalls milestones of Salomaa’s career, expressing appreciation of his lasting impact on the development of language theory.

Georg Zetzsche, Max Planck Institute for Software Systems, Germany

Title: Unboundedness problems for formal languages

Abstract: Informally, unboundedness problems are decision problems that ask about the existence of infinitely many words (satisfying certain properties) in a formal language. One example is: Is a given language infinite? Or: Does a given language have super-polynomial growth? These came into focus in recent years because of their connections to downward closure computation and separability problems. Although unboundedness problems may seem difficult at first, it turns out that there are techniques that are at the same time conceptually very simple, but also apply to a surprisingly wide variety of language classes. The talk will survey recent results (and techniques) concerning unboundedness problems.

Kyungmin Bae, POSTECH, South Korea

Title: Robust Model Checking for Signal Temporal Logic

Abstract: Signal Temporal Logic (STL) is a temporal logic for specifying linear-time properties of continuous signals. It is widely used in monitoring and testing hybrid systems that exhibit both discrete and continuous behavior. However, model checking for such systems has mostly focused on invariants and reachability, due to the difficulty of handling uncountably many continuously evolving signals. In this talk, I introduce a bounded model checking approach for general STL properties of hybrid systems. Our method is built on a novel logical foundation for STL, including: (i) syntactic separation, which decomposes STL formulas into parts that depend exclusively on separate signal segments; (ii) signal discretization, which abstracts continuous signals into discrete elements; and (iii) epsilon-strengthening, which reduces robust STL checking to Boolean STL checking. This foundation allows STL model checking—up to given bounds and robustness thresholds—to be reduced to the satisfiability of a first-order formula, which is decidable if the reachability of the underlying hybrid system is decidable.

Christoph Haase, University of Oxford, UK

Title: Formal Languages and Arithmetic Theories: Recent Results and Open Problems

Abstract: In this talk, I am going to survey recent developments at the intersection of formal language theory and logical theories of arithmetic, focusing particularly on Büchi and Semënov arithmetic. I will explore the connections between sets of integers defined by logical formulas and their representation as formal languages, and present some key results concerning the expressive power and algorithmic properties of these arithmetic theories. I will also outline several open problems that arise from the current body of literature.

Shunsuke Inenaga, Kyushu University, Japan

Title: Recent Developments on the CDAWGs: Algorithms and Combinatorics

Abstract: The compact directed acyclic word graph (CDAWG) of a string \(T\), denoted \(CDAWG(T)\), is the smallest edge-labeled DAG that is obtained by merging isomorphic subtrees of the suffix tree of \(T\). CDAWGs are a fundamental string data structure with applications in text pattern searching, data compression, and pattern discovery. For instance, \(CDAWG(T)\) allows for matching patterns and computing maximal exact matches (MEMs) in optimal \(O(m)\) time with \(O(e(T))\) space, where \(m\) is the pattern length and \(e(T)\) is the number of edges in \(CDAWG(T)\). It is known that there is a one-to-one correspondence between the internal nodes of \(CDAWG(T)\) and the maximal repeats in \(T\), and that there is a one-to-one correspondence between the edges of \(CDAWG(T)\) and the right-extensions of the maximal repeats in \(T\). Hence, \(e\) tends to be small when the string contains a small number of distinct maximal repeats (and their right extensions). In the best case, \(e\) can be \(O(\log n)\) for some highly repetitive strings of length \(n\), including the Fibonacci words and Thue-Morse words. In this talk, after revisiting the basic notions and properties, we review some of the recent developments on CDAWGs in both algorithms and combinatorics perspectives.

Dora Giammarresi, University of Rome Tor Vergata, Italy

Title: Partial Cubes and Fibonacci Dimension: Insights and Perspectives

Abstract: A partial cube is a graph \(G \) that embeds isometrically into a hypercube \( Q_k \), with minimum such \( k \) called the isometric dimension \( \operatorname{idim}(G) \) of \(G \). A Fibonacci cube \(\Gamma_k \) excludes from the vertices strings containing “11”, and any partial cube \(G \) embeds into some \( \Gamma_d \), defining the Fibonacci dimension \( \operatorname{fdim}(G) \) as the minimum such \( d \). It holds that \( \operatorname{idim}(G) \leq \operatorname{fdim}(G) \leq 2 \cdot \operatorname{idim}(G) - 1.\) While \(\operatorname{idim}(G) \) is computable in polynomial time, checking whether \( \operatorname{idim}(G) = \operatorname{fdim}(G) \) is NP-complete. We survey properties of partial cubes and generalized Fibonacci cubes, and present a new family of graphs \( G \) for which \( \operatorname{idim}(G) = \operatorname{fdim}(G). \) We conclude with some open problems.

Invited Tutorial

Marco Cognetta, Tokyo Institute of Technology, Japan

Title: Subword Tokenization Meets Formal Language Theory

Abstract: Tokenization is the first stage of the modern neural language modeling pipeline where raw text strings are converted to subword tokens that can be consumed by a language model. Popular tokenization algorithms like Byte-Pair Encoding or MaxMatch are rooted in compression algorithms and have ad-hoc implementations with many subtle differences in the details. Recent works have discovered connections between formal language theory (in particular, the use of finite-state transducers) and tokenization. We will discuss subword tokenization, a unifying finite-state transduction framework for popular tokenization algorithms, the use of subword-based automata in important downstream tasks such as constrained generation, and some open problems and challenges that still exist in the field.