I. INTRODUCTION

from Audio Onset Detection: A Brief History and Current Techniques

At first glance, onset detection appears to be a straightforward task: identify the beginning of notes or new musical events, specifically the earliest point at which a transient can be detected (fig 1). In reality, however, there are several types of onset, and the literature shows that a persistent challenge in this research area has been developing an algorithm that produces satisfactory results for all of them.

This paper traces the development of onset detection algorithms, focusing on the dominant techniques of the past fifteen years. Onset detection is a sub-category within the wider active research area of Music Information Retrieval (MIR) and is at the heart of our understanding of beat tracking, tempo estimation, and automatic music transcription (Chen, Jang, Liu, & Weng, 2016).

This time scale is significant to the research as it follows the top-performing submissions of the Music Information Retrieval Evaluation eXchange (MIREX) in the Audio Onset Detection category from the time when it was founded in 2005 (Appendix I). In the same year, Bello et al. (2005) published a seminal text, ‘A Tutorial on Onset Detection in Musical Signals’, which is referenced to this day in the majority of literature pertaining to onset detection.

Over the past decade, deep learning techniques have become the standard for many MIR workflows, and a considerable literature has grown around the use of neural networks for onset detection (Schindler, Lidy, & Böck, 2020).

Figure 1. Onset on the Envelope of an Audio Signal (Mottaghi, Behdin, Esmaeili, Heydari, & Marvasti, 2017)

There are three stages to a generalised onset detection algorithm. These are:

1. Pre-processing

An optional first step that transforms the original signal into a format more appropriate for onset detection (Lindqvist, 2019). A conventional choice is to apply the Short-Time Fourier Transform (STFT) to the original signal in order to access its spectral properties.

2. Reduction

Reduction describes the process of converting the signal into a downsampled ‘detection function’, referred to in some literature as a ‘novelty function’ (Eyben, Böck, Schuller, & Graves, 2010), in which transient events are more easily expressed and detected (Bello et al., 2005). As is common in the literature, this paper focuses mainly on the reduction stage as it is the core of all onset detection algorithms.

3. Peak Picking

Peak picking is the final step in the onset detection process: identifying the peaks in the detection function in order to locate the onsets (Bello et al., 2005). If a suitable method has been used to obtain the detection function, these maxima (i.e., peaks) should be easily recognised (fig 2).
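
The three stages above can be illustrated with a minimal Python sketch. It is not a method taken from the cited literature. It assumes NumPy is available, uses spectral flux as one common choice of detection function, and picks peaks with a simple fixed threshold. The function names, frame size, hop size, and threshold ratio are all hypothetical values chosen for illustration only.

```python
import numpy as np

def stft_magnitude(signal, frame_size=1024, hop=512):
    """Stage 1 (pre-processing): magnitude spectrogram from a Hann-windowed STFT."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_size] * window
                       for i in range(n_frames)])
    # One row per frame, one column per frequency bin
    return np.abs(np.fft.rfft(frames, axis=1))

def spectral_flux(mag):
    """Stage 2 (reduction): summed positive magnitude change between frames."""
    diff = np.diff(mag, axis=0)
    flux = np.maximum(diff, 0.0).sum(axis=1)
    # Prepend a zero so that flux[i] lines up with frame i
    return np.concatenate(([0.0], flux))

def pick_peaks(odf, threshold_ratio=0.3):
    """Stage 3 (peak picking): local maxima above a fraction of the global maximum."""
    threshold = threshold_ratio * odf.max()
    return [i for i in range(1, len(odf) - 1)
            if odf[i] > threshold and odf[i] > odf[i - 1] and odf[i] >= odf[i + 1]]

def detect_onsets(signal, sr, frame_size=1024, hop=512):
    """Run all three stages and return onset times in seconds (frame centres)."""
    mag = stft_magnitude(signal, frame_size, hop)
    odf = spectral_flux(mag)
    return [(i * hop + frame_size / 2) / sr for i in pick_peaks(odf)]

# Synthetic example: half a second of silence, then a 440 Hz tone
sr = 8000
t = np.arange(sr) / sr
signal = np.where(t >= 0.5, np.sin(2 * np.pi * 440 * t), 0.0)
onsets = detect_onsets(signal, sr)
print(onsets)
```

On this synthetic signal the sketch should report a single onset close to the 0.5 s mark, with a resolution limited by the hop size. A fixed global threshold is the simplest possible choice and would likely struggle on real recordings with varying dynamics.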
