Tutorials - Digital Audio Effects Conference DAFx26

Reflecting on the Topology of Spheres in Digital Audio Signal Processing

Presenter: Georg Essl

The workshop will explain the topological notion of folding, specifically in the contexts of circles and spheres. We will explain historical notions of "folding diagrams" topologically, as well as convolution, which is called "Faltung" (folding) in German. All have pictorial interpretations, but also connect to modern notions such as orbifolds and infinite dihedral groups. This will give the participants a fresh perspective on why the Nyquist frequency is half the sampling frequency. We will also explore ideas about wave fronts on a circular drumhead, which turn out to be spheres under folding. Finally, we will discuss open research questions suitable for graduate-level research. Prerequisites are a typical background in DAFx including linear algebra, Fourier analysis, basic knowledge of physical modeling and wave propagation, and the ability to fold some paper.
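
As a rough illustration of the folding idea (a sketch written for this page, not material taken from the tutorial), the Python snippet below folds a frequency into the range from 0 to half the sampling rate. Sampling makes frequency periodic in the sampling rate, and a real sinusoid cannot tell f from -f; together these two symmetries generate an infinite dihedral group acting on the frequency axis, and the interval from 0 to the Nyquist frequency is what remains after folding. The function names are hypothetical.

```python
import math


def fold_frequency(f_hz, fs_hz):
    """Apparent frequency, in [0, fs/2], of a real sinusoid at f_hz sampled at fs_hz."""
    # Translation symmetry: frequencies that differ by fs give identical samples,
    # so wrap the frequency onto one period [0, fs).
    wrapped = f_hz % fs_hz
    # Reflection symmetry: cos is even, so f and fs - f give identical samples.
    # Fold the upper half of the period back onto the lower half.
    return min(wrapped, fs_hz - wrapped)


def samples(f_hz, fs_hz, count):
    """First `count` samples of a unit-amplitude cosine at f_hz, sampled at fs_hz."""
    return [math.cos(2.0 * math.pi * f_hz * n / fs_hz) for n in range(count)]


fs = 48000
for f in (1000, 30000, 47000, 49000):
    print(f, "Hz is indistinguishable from", fold_frequency(f, fs), "Hz")
```

With a 48 kHz sampling rate, a 30 kHz cosine produces the same samples as an 18 kHz cosine, which can be checked by comparing `samples(30000, fs, 64)` with `samples(18000, fs, 64)`.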

The workshop can be viewed as a continuation of the two prior tutorials held at DAFx2022 and DAFx2023, but neither of these previous workshops will be expected as background. The tutorial will be self-contained and should be accessible to all DAFx attendees.

Georg Essl is Research Professor at the University of Wisconsin-Milwaukee. He has served on the faculties of the University of Michigan and the University of Florida and held research positions at MIT Media Lab Europe and Deutsche Telekom Labs at TU Berlin. He is a Guggenheim fellow and serves on the advisory board of NIME. His research interests include physical modeling sound synthesis, topological methods in audio signal processing, mobile music, and musical interface design. He received his Ph.D. from Princeton University, where he studied with Perry Cook on the topic of physical modeling of idiophones.

Feedback Delay Networks in Python: From Theory to Differentiable Design with pyFDN

Presenters: Sebastian J. Schlecht and Facundo Franchino

Feedback Delay Networks (FDNs) are a fundamental and computationally efficient paradigm for artificial reverberation, widely used in audio effects, game audio engines, and music production tools. Despite their practical relevance, the underlying theory—including stability, modal density, coloration, and parameter optimization—can be difficult to approach through the literature alone. This tutorial introduces pyFDN, a new open-source Python toolbox for designing, analyzing, and optimizing FDNs, to be officially released at DAFx 2026. Participants will construct FDN reverberators from first principles, analyze their acoustic properties, and explore extensions based on differentiable optimization. The tutorial combines concise theoretical introductions with interactive Python notebooks and live coding exercises.
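
For readers new to the structure, the following is a minimal sketch of a four-line FDN in plain Python. It does not use pyFDN, whose API is not described here; the function name, the choice of a Hadamard feedback matrix, and the very short delay lengths are all assumptions made for illustration. The feedback matrix is an orthogonal matrix scaled by a gain below one, which is a standard sufficient condition for a stable, decaying response.

```python
def fdn_impulse_response(delays, gain, num_samples):
    """Impulse response of a 4-line FDN with a scaled Hadamard feedback matrix.

    delays: four delay lengths in samples (hypothetical values chosen by the caller)
    gain:   scalar in (0, 1) applied to the orthogonal feedback matrix
    """
    assert len(delays) == 4 and 0.0 < gain < 1.0
    hadamard = [[1, 1, 1, 1],
                [1, -1, 1, -1],
                [1, 1, -1, -1],
                [1, -1, -1, 1]]
    # Multiplying by 0.5 makes the 4x4 Hadamard matrix orthogonal;
    # the extra gain < 1 makes the network lossy, hence stable.
    feedback = [[gain * 0.5 * v for v in row] for row in hadamard]

    buffers = [[0.0] * d for d in delays]  # one circular buffer per delay line
    pos = [0, 0, 0, 0]                     # current read/write position per line
    out = []
    for n in range(num_samples):
        x = 1.0 if n == 0 else 0.0         # unit impulse input
        # Read the delayed samples: each value was written delays[i] steps ago.
        s = [buffers[i][pos[i]] for i in range(4)]
        # Output is the plain sum of the delay-line outputs (unit output gains).
        out.append(sum(s))
        # Write the input plus the mixed feedback into each delay line.
        for i in range(4):
            mixed = sum(feedback[i][j] * s[j] for j in range(4))
            buffers[i][pos[i]] = x + mixed
            pos[i] = (pos[i] + 1) % delays[i]
    return out


ir = fdn_impulse_response([3, 5, 7, 11], 0.9, 2000)
```

Practical reverberators typically use much longer delay lines than the toy values above, but the recursion is the same: read the delay outputs, mix them through the feedback matrix, and write the result back together with the input.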

Sebastian J. Schlecht is an Associate Professor of Signal Processing at Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Germany. His research focuses on artificial reverberation, differentiable audio processing, and spatial audio. He is the lead developer of pyFDN and has authored numerous papers on feedback delay network stability, colorless reverberation design, and gradient-based optimization of reverberator parameters. He has received multiple best paper awards, including at DAFx, WASPAA, and JAES, and has served as a member of the DAFx Program Committee since 2020.

Facundo Franchino is an incoming graduate student at the Massachusetts Institute of Technology (MIT). He recently completed a BEng in Electronic Engineering with Music Technology Systems at the University of York, where his final-year work investigated structured pruning for feedback delay networks. His research interests include artificial reverberation, real-time audio signal processing, and machine listening.

Accelerated Real-Time C++ Audio: DSP Offloading and GPU Neural Inference

Presenter: Victor Zappi

Modern heterogeneous SoCs integrate a CPU, a digital signal processor and a GPU on a single chip, yet most real-time audio applications only use the CPU. This tutorial explores how to take advantage of all three compute targets for real-time C++ audio, using a Qualcomm SoC as the platform. Through live demonstrations on an RB3 Gen 2 board, the session walks through three stages. The first stage covers writing and running real-time audio applications on the CPU using an open-source C++ audio engine. The second covers offloading audio processing to the Hexagon DSP via AudioReach, Qualcomm's open-source graph-based audio framework, and comparing CPU load before and after offloading. The third covers deploying neural audio models and running inference via the QNN SDK (from Qualcomm AI Runtime) on both CPU and GPU backends, comparing performance in real time. The tutorial draws on ongoing research on real-time audio deployment on hardware accelerators. All code and examples will be available in a public repository. A limited number of boards will also be available for attendees who wish to follow along hands-on; participants interested in this option should sign up in advance and be comfortable with basic embedded Linux development workflows (cross-compilation toolchains are not required, as all build tools are pre-installed on the boards).

Victor Zappi is an Associate Professor of Music Technology at Northeastern University, where he leads the LDSP Lab. His research centers on the design of digital musical instruments, with a focus on real-time audio and responsive systems. This drives his work on embedded audio platforms, low-latency C++ audio engines and the deployment of neural audio models on resource-constrained hardware. Victor is both an engineer and a musician, and his research spans embedded and mobile audio, neural synthesis, physical modeling, and human-computer interaction.

From Neural Fields to Personalized Spatial Audio: A Hands-on Tutorial on HRTF Modeling

Presenters: You (Neil) Zhang and Yoshiki Masuyama

Head-related transfer functions (HRTFs) are fundamental to binaural rendering and immersive audio, but dense individualized measurements remain costly and time-consuming. Recent work has shown that neural fields provide a flexible framework for representing HRTFs as continuous functions over direction, enabling interpolation from sparse measurements, scalable learning across subjects and databases, and new pathways toward practical personalization. This tutorial presents a unified and hands-on introduction to this emerging research area for the DAFx community.

We begin with the core neural-field formulation for HRTF modeling and explain why continuous representations are well-suited to high-dimensional, irregularly sampled HRTF data. We then show how this line of work has evolved toward more practical systems, including subject- and dataset-aware modeling for cross-dataset harmonization, integration with differentiable digital signal processing (DDSP), retrieval-augmented upsampling from sparse measurements, and latent-space perspectives that naturally support personalization.

The tutorial includes interactive Jupyter Notebook demos in Python/PyTorch. Participants will train and inspect a minimal neural-field HRTF model, explore extensions for harmonization and upsampling, and see how the same framework can support future personalization research, including mapping anthropometric features into a learned latent HRTF space. Attendees will leave with both conceptual understanding and a compact working implementation adaptable to their own research.

You (Neil) Zhang received his Ph.D. from the University of Rochester in 2026, where he conducted research in the Audio Information Research (AIR) Lab. He is currently a Senior Researcher at Dolby Laboratories. His work explores human-centric audio intelligence, spanning spatial audio and HRTF personalization, audio deepfake detection, and audio-visual understanding and generation. He is a recipient of the ICASSP 2023 Rising Star award, the National Institute of Justice Graduate Research Fellowship, the IEEE Signal Processing Society Scholarship, and the WASPAA Best Student Paper Award.

Yoshiki Masuyama is a Visiting Research Scientist at Mitsubishi Electric Research Laboratories (MERL) in Cambridge, Massachusetts. He received his B.E. and M.E. degrees from Waseda University and his Ph.D. from Tokyo Metropolitan University. His research interest is in integrating signal processing and machine learning technologies for efficient and robust audio processing. He is a recipient of the Best Student Paper Award at the IEEE Spoken Language Technology Workshop 2022. He has worked on subject- and dataset-aware HRTF neural fields, retrieval-augmented HRTF upsampling, and neural IIR filter fields, with an emphasis on scalable and practical personalization. He led the MERL team that ranked first in Task 2 of the Listener Acoustic Personalisation (LAP) 2024 Challenge.