Wojciech Chachólski (KTH)

Geometry, Homology and Data

One of the key steps towards a successful analysis is to provide suitable representations of data by objects amenable to statistical and ML methods.

Since geometrical properties are often not amenable to such tools, one of the most important contributions of topological data analysis has been to provide strategies and algorithms for transforming geometrical information into objects to which statistical and ML methods can be applied. During the last decade there has been an explosion of applications in which such representations of data have played a significant role. In my talks I will present one such strategy, based on a hierarchical stabilisation process leading to invariants called stable ranks.

I will use the classical Wisconsin Breast Cancer data set as one example where homological invariants can give interesting information.

Anne Estrade (Université Paris Cité)

The geometry of Gaussian fields

The short title of the lecture has to be understood as "Some geometric properties of Gaussian random fields".

The first part will be dedicated to general definitions and properties of Gaussian fields indexed by the Euclidean space $\mathbb{R}^d$ (with $d \ge 2$) with a focus on the special case of stationary Gaussian fields. We will deal with a geometric feature that is really specific to the multivariate context: anisotropy. We will present various models and will try to understand which characteristics of the field are impacted by the anisotropy property.

The second part will be dedicated to Rice formulas and their consequences. These formulas express the moments of certain geometric functionals that depend on the level sets of the Gaussian field. We will visit recent works on related topics that open new perspectives in connection with spatial statistics, image analysis or TDA.

Érika Roldán (MPI Leipzig)

Topology and Geometry of Random Cubical Complexes 

In this mini-course, we will explore the topology and local geometry of different random cubical complex models. In the first part, we explore two models of random subcomplexes of the regular cubical grid: percolation clusters (joint work with David Aristoff and Sayan Mukherjee), and the Eden Cell Growth model (joint work with Fedor Manin and Benjamin Schweinhart). In the second part, we study the fundamental group of random 2-dimensional subcomplexes of an n-dimensional cube; this model is analogous to the Linial-Meshulam model for simplicial complexes (joint work with Matt Kahle and Elliot Paquette).

Rasmus Waagepetersen (Aalborg University) 

Cox processes – mixed models for point processes

A Cox process is the point process counterpart of a latent variable model. It arises by adding latent random effects to a Poisson point process model, where these random effects constitute a random field and are used to model unobserved sources of variation influencing the occurrence of points. We will review a range of Cox process models including log Gaussian Cox processes and shot-noise Cox processes. We consider moment properties and methodology for statistical inference including estimating functions. We also consider extensions to multivariate point patterns.
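
As a rough illustration of the construction described above, the following sketch simulates cell counts of a discretised log Gaussian Cox process on the unit interval. It is not taken from the lectures: the function name `simulate_lgcp_counts`, the exponential covariance, the grid discretisation and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np

def simulate_lgcp_counts(n_cells=20, beta=3.0, sigma=1.0, corr_length=0.2, rng=None):
    """Cell counts of a discretised log Gaussian Cox process on [0, 1].

    Hypothetical helper: the covariance model and the grid approximation
    are illustrative assumptions, not a method from the abstract.
    """
    rng = np.random.default_rng(rng)
    centres = (np.arange(n_cells) + 0.5) / n_cells
    # Latent Gaussian field Z with an (assumed) exponential covariance.
    dist = np.abs(centres[:, None] - centres[None, :])
    cov = sigma**2 * np.exp(-dist / corr_length)
    z = rng.multivariate_normal(np.zeros(n_cells), cov)
    # Random intensity exp(beta + Z), treated as constant within each cell.
    intensity = np.exp(beta + z)
    # Conditionally on Z, the cell counts are independent Poisson variables
    # with mean intensity * cell length.
    counts = rng.poisson(intensity / n_cells)
    return centres, intensity, counts

centres, intensity, counts = simulate_lgcp_counts(rng=0)
```

Because the intensity is random, counts from such a model are overdispersed relative to a Poisson process with the same mean; for a zero-mean Gaussian field with variance `sigma**2`, the mean intensity is `exp(beta + sigma**2 / 2)`.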

Poster session

Péter Juhász (Aarhus University)

Topological data analysis of higher-order networks

Preferential attachment is a popular mechanism for generating scale-free networks. While it offers a compelling narrative, the underlying reinforced processes make it difficult to rigorously establish subtle properties. Recently, age-dependent random connection models were proposed as an alternative capable of generating similar networks with a mechanism that is amenable to a more refined analysis. In this poster, we analyze the asymptotic behavior of higher-order topological characteristics such as higher-order degree distributions and Betti numbers in large domains.

Nikolaj Lundbye (Aarhus University) 

Extremal lifetimes in the sparse regime

The persistence diagram is one of the main objects of study in topological data analysis. Given a point cloud, the persistence diagram represents the collection of birth and death times of topological features such as loops or cavities. In particular, the points in the persistence diagram with a large lifetime are of special interest. We study the extremal lifetimes when the point cloud is given by a Poisson point process in a sparse regime, i.e., where the expected number of neighbors of a typical vertex tends to 0. We derive a Poisson approximation result for the maximal lifetimes.

Karthik Viswanathan (University of Amsterdam) 

Information maximizing persistent homology for inference

One way to summarize a complex data set is to represent it via a filtered simplicial complex and compute the corresponding persistent homology. Information about the data set may be lost when the persistence diagram is constructed. In this poster, I'll quantify the information content of the persistence diagram using Fisher information. This approach may be useful for integrating topological information into statistical inference. We illustrate the pipeline using examples such as Gaussian random fields with a defined power spectrum. In certain cases, we show that persistence diagrams can provide optimal summaries for inference, in the sense that the Cramér-Rao bound is saturated. We also explore the possibility of using Fisher information as a loss function for unsupervised learning tasks to learn an "optimal" filtration.

Jacky Yip (University of Wisconsin-Madison)

TDA for cosmology: parameter constraints and estimations from the large-scale structure

Persistent homology naturally describes the multi-scale characteristics of the large-scale structure in our universe. We apply the tool to mock galaxy catalogues to construct a simple and interpretable summary statistic. With the Fisher matrix formalism, constraints on cosmological parameters are found to be generally tighter than those from state-of-the-art momentum-space statistics conventionally used by the community. For parameter estimation, we explore the use of a neural network model trained on persistence images and find that it outperforms a traditional Bayesian inference method, demonstrating that TDA and machine learning can be combined for cosmology.

Navyanth Kusampudi (Max-Planck-Institut für Eisenforschung)

Application of topological data analysis and sliding window embedding to analyze time series data from bird songs

Bird songs are a result of a complex interaction involving the brain, vocal muscles, and a vocal organ called the syrinx. This process, driven by the brain’s internal dynamics, induces changes in air pressure which create distinctive bird songs. These changes can be recorded, resulting in time series data that serves as a digital imprint of the song. This data has significant value for a broad spectrum of scientific research and provides insights into identifying bird species, analyzing their behavior, driving bioacoustics research, and understanding how young birds learn songs from their parents. Our study explores the application of topological data analysis to this time series data from bird songs. The approach comprises using a sliding window embedding technique, transforming the time series data into point cloud data within a reconstructed phase space. Subsequently, we implement persistent homology to analyze the topological features present in the reconstructed point cloud data. Our study underscores the potential of topological data analysis as a tool to unravel and visualize the temporal variations within the attractor dynamics intrinsic to bird song.
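
The sliding window embedding step mentioned above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function name `sliding_window_embedding`, the parameters `dim` and `tau`, and the synthetic sine signal are all assumptions. The persistent homology step is not shown.

```python
import numpy as np

def sliding_window_embedding(x, dim, tau=1):
    """Map a 1-D time series to points (x[t], x[t+tau], ..., x[t+(dim-1)*tau]).

    Hypothetical helper for illustration; dim is the embedding dimension
    and tau is the delay between consecutive coordinates, in samples.
    """
    x = np.asarray(x, dtype=float)
    n_points = len(x) - (dim - 1) * tau
    if n_points <= 0:
        raise ValueError("time series too short for this dim and tau")
    # Row t holds the indices t, t + tau, ..., t + (dim - 1) * tau.
    idx = np.arange(n_points)[:, None] + tau * np.arange(dim)[None, :]
    return x[idx]

# Synthetic stand-in for a recorded signal (assumption: a pure sine wave).
t = np.linspace(0.0, 4.0 * np.pi, 200)
cloud = sliding_window_embedding(np.sin(t), dim=2, tau=25)
```

For a periodic signal such as this one, the resulting point cloud traces a closed curve in the plane, which is the kind of loop that one-dimensional persistent homology is designed to detect.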

Jose Licon-Salaiz (University of Hamburg)

A topological analysis of pattern formation in shallow cumulus convection

Turbulent convection in the planetary boundary layer is an archetypal case of pattern emergence and self-organization in nonlinear dynamics. Given the spatiotemporal complexity inherent to these patterns, it is difficult to give a precise quantitative characterization of their structure and interactions with the environment. This is even more difficult in a moist atmosphere, as the appearance of a cloud layer introduces nonlinear feedbacks to the system. Yet having a quantitative understanding of these patterns could greatly improve their representation in large-scale climate models, for which clouds constitute the largest source of uncertainty. Previous work has shown that topological invariants computed from two-dimensional cross sections of vertical wind velocity in a dry boundary layer carry important information about the underlying dynamics. Here we extend those methods to a moist boundary layer, and explore relationships between topological invariants obtained from model variables and the processes of cloud formation and evolution.