ODSC East 2020: Audio Event Detection via Deep Learning in Python
In this presentation, phData’s Director of Machine Learning Robert Coop will walk the audience through the process of detecting and classifying audio events using deep learning. The objective is to enable the listener to use open source tools, pre-trained networks, and data augmentation for audio analytics projects. The target audience is someone with intermediate or advanced experience programming with Python and any level of experience using deep learning.
The first part of the talk will introduce the background techniques, available data, and theory. We will cover basic audio processing techniques such as time-frequency transformations, perceptual frequency scales (such as the mel scale) that approximate how humans hear, and spectrogram representations of audio. The talk will focus on Google’s publicly available AudioSet data and how it can be processed using deep learning. We will demonstrate the similarity between image recognition and audio processing, and cover a VGG-inspired network that has been used successfully in audio processing work. There will also be some discussion of recently developed state-of-the-art techniques.
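As a rough illustration of the time-frequency and spectrogram steps described above, the following sketch computes a log-mel spectrogram with NumPy alone. It is not taken from the talk: the function names are hypothetical, the input is a synthetic 1 kHz tone rather than AudioSet audio, and the parameters (16 kHz sampling, 25 ms windows, 10 ms hops, 64 mel bands between 125 Hz and 7500 Hz) are assumptions chosen to resemble the settings commonly used with VGGish.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale formula; an assumption, other variants exist.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate, f_min=125.0, f_max=7500.0):
    # Triangular filters spaced evenly on the mel scale.
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    hz_edges = mel_to_hz(np.linspace(hz_to_mel(f_min), hz_to_mel(f_max), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, center, hi = hz_edges[i:i + 3]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sample_rate=16000, win=400, hop=160,
                        n_fft=512, n_mels=64):
    # Slice the signal into overlapping frames (short-time Fourier transform).
    n_frames = 1 + (len(signal) - win) // hop
    idx = np.arange(win)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(win)
    # Magnitude spectrum of each frame, zero-padded to n_fft samples.
    magnitude = np.abs(np.fft.rfft(frames, n=n_fft, axis=1))
    # Project onto mel bands, then compress with a log (offset avoids log(0)).
    mel = magnitude @ mel_filterbank(n_mels, n_fft, sample_rate).T
    return np.log(mel + 0.01)

# Synthetic input: one second of a 1 kHz sine tone (stand-in for real audio).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000.0 * t)

spec = log_mel_spectrogram(tone, sample_rate)
peak_band = int(spec.mean(axis=0).argmax())
print(spec.shape)  # (98, 64): 98 time frames by 64 mel bands
```

The result is a two-dimensional array (time by frequency band) that can be treated much like a single-channel image, which is what makes image-style convolutional networks applicable to audio.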
The second part of the talk will focus on the hands-on application of these techniques in Python, covering the end-to-end process of classifying AudioSet data using the VGGish deep learning network. We will cover loading the data and applying common preprocessing techniques in Python. TensorFlow will be used to load the VGGish network and to process the transformed audio data. We will demonstrate using the network to create embedding vectors for classification, as well as the process of transfer learning using the pretrained weights as a foundation.
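The idea of classifying from embedding vectors can be sketched without the network itself. The example below is a minimal illustration, not the talk's code: it assumes the pretrained network has already been run and frozen, uses randomly generated 128-dimensional vectors as stand-ins for VGGish embeddings, and trains a small softmax classifier on top of them in NumPy. All names, class counts, and hyperparameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM, N_CLASSES, N_PER_CLASS = 128, 3, 100

# Synthetic stand-ins for embeddings: one cluster per audio class.
# Real embeddings would come from the pretrained network instead.
centers = rng.normal(0.0, 1.0, size=(N_CLASSES, EMBED_DIM))
labels = np.repeat(np.arange(N_CLASSES), N_PER_CLASS)
embeddings = centers[labels] + rng.normal(0.0, 0.5, size=(labels.size, EMBED_DIM))

# Hold out 20% of the examples for evaluation.
perm = rng.permutation(labels.size)
split = int(0.8 * labels.size)
train_idx, test_idx = perm[:split], perm[split:]

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_softmax_head(x, y, n_classes, lr=0.01, epochs=200):
    # Multinomial logistic regression trained with batch gradient descent.
    w = np.zeros((x.shape[1], n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[y]
    for _ in range(epochs):
        probs = softmax(x @ w + b)
        grad = (probs - onehot) / len(x)  # gradient of mean cross-entropy
        w -= lr * (x.T @ grad)
        b -= lr * grad.sum(axis=0)
    return w, b

w, b = train_softmax_head(embeddings[train_idx], labels[train_idx], N_CLASSES)
predictions = softmax(embeddings[test_idx] @ w + b).argmax(axis=1)
accuracy = float((predictions == labels[test_idx]).mean())
print(f"held-out accuracy: {accuracy:.2f}")
```

This corresponds to the simplest form of transfer learning, where the pretrained weights are kept fixed and only a new classification head is trained. Fine-tuning the pretrained layers as well is the other common option, at the cost of more data and compute.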
After this talk, audience members will understand the motivation behind using deep learning for audio processing, and will be able to locate and use publicly available audio data for experimentation and to classify audio samples using Python with TensorFlow.
Robert Coop, PhD