Interpretable Bioacoustic Monitoring for Early Detection of Anomalies in Complex Wildlife Soundscapes
March 2026 - May 2026
I did not grow up near a forest. I grew up in Carmel, Indiana, which has a lot of things — excellent schools, good violin teachers, a suburb's worth of identical manicured lawns — but not much in the way of wild acoustic complexity. The most biodiverse soundscape I had regular access to as a child was probably the creek behind my neighbourhood, and even that ran through a storm drain for most of its length.
Which is maybe why I find the idea of a soundscape so striking. The concept that a place has a sound — not just background noise, but a structured, layered, ecologically meaningful acoustic signature — feels like the kind of thing you might learn as a child and then forget is remarkable. I did not learn it as a child. I learned it at seventeen, reading a paper by Bernie Krause about why he had been recording natural soundscapes since the 1960s, and how recordings from the same locations across decades showed that the acoustic complexity of ecosystems around the world was quietly collapsing. That is where this project starts.
Cheers,
Angie X.
The monitoring infrastructure for ecosystems is, in many ways, genuinely impressive. There are camera trap networks across national parks. There are satellite systems tracking deforestation in near-real-time. There are acoustic monitoring networks — ARBIMON, AudioMoth arrays, field recorders deployed by the thousands across conservation research sites — that collectively generate more audio data than any team of field biologists could ever manually review.
The bottleneck is not data collection. The bottleneck is what happens to the data after collection.
When I started digging into the existing tools, I found two main categories. The first: species detectors. BirdNET, PAMGuard, Raven Pro. These are good at identifying specific calls from specific species in short audio clips. They are trained on labelled examples and require knowing what you are looking for. The second: acoustic indices. Mathematical summaries like the Acoustic Complexity Index (ACI), the Bioacoustic Index (BI), the Acoustic Diversity Index (ADI). These summarise individual recordings as single numbers, useful for cross-site comparison. But they have no memory. They cannot tell you whether the site sounded different last month.
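To make the "single number" point concrete, here is roughly what one of these indices computes. This is a minimal sketch of the ACI following Pieretti et al.'s definition, assuming you already have a magnitude spectrogram; it is not the implementation any of the tools above actually use.

```python
import numpy as np

def acoustic_complexity_index(spectrogram):
    """Acoustic Complexity Index (after Pieretti et al. 2011):
    for each frequency bin, sum the absolute intensity differences
    between adjacent time frames, normalise by that bin's total
    intensity, then sum the ratios across all bins."""
    spec = np.asarray(spectrogram, dtype=float)      # shape: (freq_bins, time_frames)
    diffs = np.abs(np.diff(spec, axis=1)).sum(axis=1)
    totals = spec.sum(axis=1)
    totals[totals == 0] = np.finfo(float).eps        # avoid division by zero
    return float((diffs / totals).sum())

# A perfectly flat spectrogram has zero complexity; a fluctuating one does not.
flat = np.ones((4, 10))
noisy = np.abs(np.random.default_rng(0).normal(size=(4, 10)))
print(acoustic_complexity_index(flat))   # → 0.0
print(acoustic_complexity_index(noisy))  # some positive value
```

The whole recording collapses to one scalar, which is exactly why the index can stay constant while the composition of the soundscape drifts.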
What both categories miss is temporal dynamics. A forest might produce identical acoustic index scores every day for three months while the species composition slowly shifts underneath — as one acoustic niche is vacated and the silence is filled by generalist species or by nothing. The static indices miss this. The species classifiers miss it too, unless someone already knows which species to look for and has labelled examples ready.
The gap I wanted to fill: can a system learn what normal sounds like for a specific site, at a specific hour of the day, and then flag when that pattern changes — without any labels, without any prior knowledge of the site's biology, and without any assumption about what abnormal might look like?
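One way to make that question concrete: keep a per-hour history of acoustic feature vectors for the site, model what is typical for each hour, and score new recordings by how far they fall from that baseline. The sketch below is not the system described in this project — the feature vectors, the max-z-score statistic, and the threshold of 3.0 are all placeholder assumptions — but it shows the unsupervised, label-free shape of the idea.

```python
import numpy as np

class HourlyBaseline:
    """Toy per-site, per-hour 'normal' model. Stores feature
    vectors (placeholders for e.g. per-band energies) seen at
    each hour of day, then scores a new vector by its maximum
    per-dimension z-score against that hour's history.
    The 3.0 threshold is an illustrative assumption."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold
        self.history = {h: [] for h in range(24)}

    def fit(self, hour, features):
        self.history[hour].append(np.asarray(features, dtype=float))

    def score(self, hour, features):
        past = np.stack(self.history[hour])
        mu = past.mean(axis=0)
        sigma = past.std(axis=0) + 1e-9          # avoid division by zero
        return float(np.max(np.abs((features - mu) / sigma)))

    def is_anomalous(self, hour, features):
        return self.score(hour, features) > self.threshold

rng = np.random.default_rng(1)
model = HourlyBaseline()
for _ in range(200):                             # two hundred "normal" dawn recordings
    model.fit(6, rng.normal(0.0, 1.0, size=8))

print(model.score(6, np.full(8, 10.0)))          # a shifted soundscape scores far above 3.0
```

No labels, no species list, no model of what "abnormal" is — only a site-specific, hour-specific notion of "normal" and a distance from it. The caveats the next sections deal with (seasonal drift, weather, sensor degradation) all live inside that deceptively simple word "normal".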
The answer is yes, with significant caveats. Those caveats are the point.