Project Overview
The Music Translator is an application designed to classify songs into genres using audio spectrograms and a deep learning model. The project will process audio files, convert them into spectrogram images, and use a pre-trained or custom-trained Convolutional Neural Network (CNN) to predict the genre. The app will be deployed as an interactive web interface using Streamlit, allowing users to upload audio files and get instant genre predictions.
Project Objectives
Audio Preprocessing:
Process audio files to generate spectrograms.
Normalize and prepare spectrograms for input into the CNN model.
Genre Classification:
Use a trained CNN model to classify audio spectrograms into genres.
User Interaction:
Build an intuitive Streamlit interface for audio upload and genre classification.
Display results along with spectrogram visualization.
Workflow
Data Preprocessing:
Convert uploaded audio files into spectrograms using Librosa.
Resize spectrograms to match the input size of the CNN model.
Extract features from the audio files for use in training the CNN.
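The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the mel-spectrogram settings (22,050 Hz sample rate, 128 mel bands, 30-second clips), and the 224x224x3 target shape (the default VGG16 input size) are all assumptions. The resize uses plain NumPy nearest-neighbour sampling to keep the example dependency-light; an image library could be swapped in.

```python
import numpy as np


def audio_to_mel_db(path, sr=22050, n_mels=128, duration=30.0):
    """Load an audio file and return a log-scaled mel spectrogram in dB."""
    import librosa  # imported lazily so the NumPy helpers work without it

    y, sr = librosa.load(path, sr=sr, mono=True, duration=duration)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)


def normalize(spec):
    """Min-max scale a spectrogram to the range [0, 1]."""
    spec = np.asarray(spec, dtype=np.float32)
    span = spec.max() - spec.min()
    if span == 0:
        return np.zeros_like(spec)  # constant input: avoid division by zero
    return (spec - spec.min()) / span


def resize_nearest(spec, out_shape=(224, 224)):
    """Resize a 2-D array with nearest-neighbour sampling."""
    rows = np.arange(out_shape[0]) * spec.shape[0] // out_shape[0]
    cols = np.arange(out_shape[1]) * spec.shape[1] // out_shape[1]
    return spec[rows][:, cols]


def to_model_input(spec, out_shape=(224, 224)):
    """Normalize, resize, copy to 3 channels, and add a batch dimension."""
    img = resize_nearest(normalize(spec), out_shape)
    return np.repeat(img[..., np.newaxis], 3, axis=-1)[np.newaxis, ...]
```

With these helpers, a call such as to_model_input(audio_to_mel_db("song.wav")) would yield an array of shape (1, 224, 224, 3); the file name is only a placeholder.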
Model Integration: Weeks 2-3
Use a pre-trained CNN (e.g., VGG16 fine-tuned for spectrograms) or train a custom model on a dataset such as the GTZAN Music Genre Dataset.
Load the trained model into the app for inference.
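One way the inference step might look, assuming the model is saved in a Keras-loadable format and outputs one probability per GTZAN genre. The function names and the label order are assumptions; the order must match whatever encoding was used during training.

```python
import numpy as np

# The ten GTZAN genres in alphabetical order. Assumption: the model's
# output units follow this same order.
GTZAN_GENRES = ["blues", "classical", "country", "disco", "hiphop",
                "jazz", "metal", "pop", "reggae", "rock"]


def load_genre_model(model_path):
    """Load a saved Keras model from disk (requires TensorFlow)."""
    import tensorflow as tf  # imported lazily; only needed for real models

    return tf.keras.models.load_model(model_path)


def decode_prediction(probs, labels=GTZAN_GENRES, top_k=3):
    """Return the top_k (genre, probability) pairs, most likely first."""
    probs = np.asarray(probs, dtype=np.float64).ravel()
    order = np.argsort(probs)[::-1][:top_k]
    return [(labels[i], float(probs[i])) for i in order]


def predict_genre(model, batch):
    """Run the model on a single-item batch and decode the first result."""
    probs = model.predict(batch)[0]
    return decode_prediction(probs)
```

Loading the model once at app start-up and reusing it for every upload avoids paying the load cost on each prediction.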
Web App Development: Weeks 3-4
Create an interactive Streamlit interface with:
Audio file upload functionality.
Visualization of the spectrogram.
Display of the predicted genre.
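The three interface elements above could be laid out along these lines. This is a sketch under assumptions: render_app and the classify callable are hypothetical names, and the streamlit module is passed in as an argument only so the layout can be exercised with a stub, without a running server.

```python
def render_app(st, classify):
    """Draw the upload, spectrogram, and prediction page.

    st       -- the imported streamlit module
    classify -- hypothetical callable: raw audio bytes -> (genre, figure),
                where figure is a Matplotlib figure of the spectrogram
    """
    st.title("Music Translator")
    uploaded = st.file_uploader("Upload an audio file", type=["wav", "mp3"])
    if uploaded is None:
        st.info("Upload a file to get a genre prediction.")
        return None

    audio_bytes = uploaded.getvalue()
    st.audio(audio_bytes)                 # let the user play back the upload
    genre, figure = classify(audio_bytes)
    st.pyplot(figure)                     # spectrogram visualization
    st.success(f"Predicted genre: {genre}")
    return genre
```

In the app's entry script this would be called as render_app(st, classify) after import streamlit as st, and the script started with streamlit run.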
Deployment:
Deploy the app on Streamlit Community Cloud.
Link to GitHub: https://github.com/SuperDataScience-Community-Projects/SDS-CP018-music-classifier