Audio Analysis Scratch

A collection of scripts and experiments for audio feature extraction with Python, Librosa and related libraries, aimed at music genre classification and recommendation.

Data Set

Training on the GTZAN dataset

https://www.kaggle.com/datasets/andradaolteanu/gtzan-dataset-music-genre-classification

To Explore

Spectral contrast: Spectral contrast measures the difference in amplitude between the peaks and valleys of an audio signal's frequency spectrum, usually computed separately for several frequency sub-bands. It carries information about the texture and timbre of sounds, since different instruments and voices produce different spectral contrasts.
Chroma features: Chroma features represent how an audio signal's energy is distributed across the twelve pitch classes, regardless of octave. They capture harmonic and melodic content and are often used for tasks like chord recognition, key detection, and genre classification.
Mel-frequency cepstral coefficients (MFCCs): MFCCs are widely used features in speech recognition systems as well as music analysis applications. They give a compact description of the shape of the power spectrum on a mel-frequency scale, which mimics human pitch perception more closely than a linear frequency scale does.
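Constant-Q Transform (CQT): The CQT is a time-frequency representation whose frequency bins are geometrically spaced, so the ratio of each bin's centre frequency to its bandwidth (the Q factor) is the same for every bin. Frequency resolution is therefore finer at low frequencies and time resolution is finer at high frequencies. Because the bins can be aligned with musical pitches, the CQT suits music analysis better than the linearly spaced bins of the short-time Fourier transform.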
Beat tracking/rhythm pattern analysis: Beat tracking identifies the underlying pulse and tempo of a piece by analysing regularities in the audio signal over time. Rhythm pattern analysis focuses on the rhythmic elements built on that pulse, such as accents and syncopations, that give the music its rhythmic character.
Pitch class profile/Tonal centroid features: A pitch class profile is a histogram of the pitch classes present in a music segment, with each pitch class weighted by its relative importance across octaves. Tonal centroid features project this information into a six-dimensional space in which perceptually similar chords occupy nearby regions. They give insight into tonality and support interpretation, evaluation and synthesis tasks in Music Information Retrieval (MIR).
Statistical summary of spectral data (mean, variance, skewness, etc.): Statistical summaries compute measures such as the mean (central tendency), variance (how much the values vary) and skewness (the degree of asymmetry in the distribution) over an audio signal's spectral data. These features help characterise the signal's timbral, harmonic or dynamic attributes and are often used as input for machine learning algorithms.
Onset detection and strength signal calculation: Onset detection identifies the moments when new sounds or musical events begin by detecting abrupt changes in energy or spectral content. The onset strength signal quantifies how strong these changes are (typically using some transformation of amplitude information) and can be used to assess the importance of detected onsets, which contributes information about a track's rhythmic structure and timbre.
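The last two ideas above can be sketched together in plain Python. This is only an illustration under stated assumptions, not the approach used in this repository: it swaps the spectral methods a library such as Librosa would use for a much simpler energy-based onset strength, and every function name, the synthetic test signal and the fixed threshold are hypothetical choices made for the example.

```python
import math


def tone_burst_signal(sr=8000, n_samples=8000, starts=(2000, 6000),
                      burst_samples=800, freq=440.0, amp=0.5):
    """Silence with short sine bursts starting at the given sample indices."""
    signal = [0.0] * n_samples
    for first in starts:
        for j in range(burst_samples):
            signal[first + j] = amp * math.sin(2 * math.pi * freq * j / sr)
    return signal


def frame_energies(signal, frame_len=256, hop=128):
    """Mean squared amplitude of each frame (a crude loudness curve)."""
    return [
        sum(x * x for x in signal[i:i + frame_len]) / frame_len
        for i in range(0, len(signal) - frame_len + 1, hop)
    ]


def onset_strength(energies, eps=1e-10):
    """Half-wave rectified rise in log-energy between consecutive frames."""
    log_e = [math.log10(e + eps) for e in energies]
    return [0.0] + [max(0.0, b - a) for a, b in zip(log_e, log_e[1:])]


def pick_onsets(strength, threshold=1.0):
    """Frame indices of local maxima that reach a fixed threshold."""
    return [
        i for i in range(1, len(strength) - 1)
        if strength[i] >= threshold
        and strength[i] > strength[i - 1]
        and strength[i] >= strength[i + 1]
    ]


def summarise(values):
    """Mean, population variance and skewness of a sequence of numbers."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(var)
    skew = sum(((v - mean) / std) ** 3 for v in values) / n if std > 0 else 0.0
    return {"mean": mean, "variance": var, "skewness": skew}


sr, hop = 8000, 128
signal = tone_burst_signal(sr=sr)
strength = onset_strength(frame_energies(signal, hop=hop))
onset_frames = pick_onsets(strength)
onset_times = [i * hop / sr for i in onset_frames]  # frame start times in seconds
stats = summarise(strength)

print(onset_frames, onset_times)
print(stats)
```

Each reported time is the start of the frame in which the burst begins, so it falls slightly before the true onset. On real recordings a spectral onset strength and an adaptive threshold are usually more robust than the fixed threshold used here.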

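The mel scale and the constant-Q spacing mentioned under To Explore both come down to short formulas, sketched here in plain Python. The function names are hypothetical, the mel conversion is one common (HTK-style) formula and libraries may default to a different variant, and the Q factor is taken as the ratio of a bin's centre frequency to the distance to the next bin.

```python
import math


def hz_to_mel(f_hz):
    """HTK-style mel conversion: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def cqt_frequencies(f_min, n_bins, bins_per_octave=12):
    """Geometrically spaced centre frequencies: f_k = f_min * 2^(k / B)."""
    return [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


def q_factor(bins_per_octave=12):
    """Centre frequency divided by bandwidth, identical for every bin."""
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)


# Equal steps in Hz cover fewer mels as frequency rises.
low_step = hz_to_mel(1100.0) - hz_to_mel(1000.0)
high_step = hz_to_mel(5100.0) - hz_to_mel(5000.0)
print(low_step, high_step)

# Four octaves of semitone-spaced bins starting at 55 Hz (the note A1).
freqs = cqt_frequencies(55.0, n_bins=48)
q = q_factor()
bandwidths = [f / q for f in freqs]  # grows in proportion to frequency
print(freqs[0], freqs[12], freqs[24])
print(q, bandwidths[0], bandwidths[-1])
```

With twelve bins per octave each bin corresponds to one semitone, which is why a constant-Q spectrum lines up naturally with chroma and pitch class features.
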
Resources:

GTZAN Dataset - Music Genre Classification
Audio Files | Mel Spectrograms | CSV with extracted features

Music Genre Classification with Python
A Guide to analyzing Audio/Music signals in Python

Music genre classification using Librosa and Tensorflow/Keras
How to implement a music genre classifier from scratch in TensorFlow/Keras using features calculated by the Librosa library.

MaSC Compendium Visualization
A collection of scripts for visualizing the Arab Mashriq collection of the NYU Abu Dhabi Library and the Eisenberg collection

Getting to Know the Mel Spectrogram
Read this short post if you want to be like Neo and know all about the Mel Spectrogram!

music2vec: Generating Vector Embeddings for Genre-Classification Task
The aim of our project was to obtain similar vector representations for similar music segments. We hope to capture the structural and stylistic information of the music in this low-dimensional space, using genre classification as the end task.

FMA: A Dataset For Music Analysis
We introduce the Free Music Archive (FMA), an open and easily accessible dataset suitable for evaluating several tasks in MIR, a field concerned with browsing, searching, and organizing large music collections.

Recommending music on Spotify with deep learning
Content-based music recommendation using convolutional neural networks.

Music Recommendation System Using Machine Learning
In this article, we will try to build a very basic recommender system that can recommend songs based on which songs you hear.

Music Genre Classification using LSTM
Learn to build your own model which will take in a song as an input and predict or classify that particular song in one of the basic genres. We’ll be classifying among the following groups: blues, classical, country, disco, hiphop, jazz, metal, pop, reggae, and rock.

t-SNE clearly explained
An intuitive explanation of t-SNE algorithm and why it’s so useful in practice.

Understanding K-means Clustering in Machine Learning
K-means clustering is one of the simplest and most popular unsupervised machine learning algorithms.

Data Visualization using Python for Machine Learning and Data science
Python has several good packages for plotting data, and Matplotlib is the most prominent among them. Seaborn is also a great package that offers more appealing plots and uses Matplotlib as its base layer.

How I Understood: What features to consider while training audio files?
This post briefly covers some of the most important features that may be needed to build a model for an audio classification task.

Folders and files

model_one_mean_min_max_features_add_tempo
model_one_mean_min_max_features_add_zero_crossing_rate
model_one_mean_min_max_features_one
model_three_random_forest
model_three_random_forest_more_features_and_statistics
model_two_mfcc_mean_min_max
model_two_mfcc_only
model_two_mfcc_only_lstm
recommendations
saved_models
utilities
.DS_Store
.gitignore
README.md
analyse_audio_features.ipynb
analyse_audio_first_steps.ipynb
analyse_audio_tempo_bpm.ipynb
extract-librosa-data.py
main.py
poetry.lock
pyproject.toml
visualise-data-scratch.ipynb

OrderAndCh4oS/analyse-audio-python
