2025-02-05 23:00 (KST): Challenge Launch
The organizing committee of the ICSV31 conference hosts an AI Challenge focusing on diagnosing machine anomalies using acoustic signals.
Participants will train AI models on drone sound signals and identify anomalies from a provided dataset. A grand prize of 1,500 USD will be awarded to the winning team.
Participation is team-based, and at least one team member must be a registered attendee of the ICSV31 conference at the time of model submission. The conference will also hold a dedicated workshop session for participating teams.
| Event | Date |
| --- | --- |
| Challenge launch (distribution of task descriptions, train & eval datasets) | 5 Feb. 2025 |
| Distribution of test dataset | 1 April 2025 |
| Final submission (anomaly score, trained model, technical report) | 21 May 2025 |
| Competition result announcement | 31 May 2025 |
The ICSV31 AI Challenge (2nd KSNVE AI Challenge) aims to diagnose the state of drones by detecting anomalous sounds caused by mechanical failures in propellers and motors in noisy environments. Drone sounds vary with operational conditions such as flight direction, and together with background noise this makes the task highly challenging.
The ultimate goal of the competition is to develop anomaly detection models capable of identifying anomalies in drones using data collected under various conditions.
Dataset for ICSV31 AI Challenge: download
This ICSV31 AI Challenge dataset is based on drone noise data originally constructed by Wonjun Yi, Jung-Woo Choi, and Jae-Woo Lee for a drone fault classification task.
(W. Yi, J.-W. Choi, and J.-W. Lee, "Sound-based drone fault classification using multi-task learning," Proceedings of the 29th International Congress on Sound and Vibration (ICSV29), Prague, Czech Republic, July 2023.)
The original dataset was significantly modified for this challenge.
The drones used in this study include the Holy Stone HS720 (Type A), MJX Bugs 12 EIS (Type B), and ZLRC SG906 Pro2 (Type C).
Figure 1: Three drone types used for the experiment. (a) Type A (Holy Stone HS720), (b) Type B (MJX Bugs 12 EIS), (c) Type C (ZLRC SG906 Pro2)
Drone sounds were recorded using a RØDE Wireless Go2 wireless microphone mounted on the top of the drone body. The sensitivity of the microphone was adjusted to prevent clipping even at high sound pressure levels. The recordings were conducted in an anechoic chamber to eliminate wall reflections.
Figure 2: (a) RØDE Wireless Go2 microphones (transmitter and receiver), (b) recording the sounds of drone type B
The recorded drone sounds, originally sampled at 48 kHz, were downsampled to 16 kHz and split into 2-second segments.
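As a rough illustration of this preprocessing step (not the organizers' script; the file names and the use of librosa/soundfile are assumptions):

```python
import librosa
import soundfile as sf

# Load a 48 kHz recording and resample it to 16 kHz on load.
# The input path is illustrative.
y, sr = librosa.load("drone_recording.wav", sr=16000)

# Split into non-overlapping 2-second segments and save each one.
seg_len = 2 * sr  # 32,000 samples per segment at 16 kHz
for i in range(len(y) // seg_len):
    segment = y[i * seg_len : (i + 1) * seg_len]
    sf.write(f"segment_{i:04d}.wav", segment, sr)
```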
Figure 3: Faults of drone type B. (a) propeller cut, (b) dented motor cap (red circle indicates dented part)
To simulate real-flight conditions, drone sounds were mixed with background noise at signal-to-noise ratios (SNRs) from -5 to 5 dB. The background noise consists of recordings from three distinct outdoor locations (a pond, a hill, and a gate), as well as of the DEMAND noise dataset, covering park, town square, and traffic intersection environments.
Figure 4: Three different spots on the university campus chosen for background noise recording: (a) pond, (b) hill, and (c) gate.
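A minimal sketch of mixing a clip with noise at a target SNR, as described above (the function name and the placeholder arrays are illustrative, not part of the challenge code):

```python
import numpy as np

def mix_at_snr(signal: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Mix `noise` into `signal` at the requested signal-to-noise ratio (dB)."""
    noise = noise[: len(signal)]        # trim noise to the clip length
    p_signal = np.mean(signal ** 2)     # signal power
    p_noise = np.mean(noise ** 2)       # noise power before scaling
    # Choose a gain so that p_signal / (gain**2 * p_noise) equals the target SNR.
    gain = np.sqrt(p_signal / (p_noise * 10 ** (snr_db / 10)))
    return signal + gain * noise

rng = np.random.default_rng(0)
drone = rng.standard_normal(32000)  # placeholder for a 2-second, 16 kHz drone clip
noise = rng.standard_normal(32000)  # placeholder for background noise
mixed = mix_at_snr(drone, noise, rng.uniform(-5.0, 5.0))  # SNR drawn from -5..5 dB
```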
Data is provided in three categories: train and eval for development, and test for competition submission.
The eval and test datasets must not be used as training data.
Train and Evaluation Data (5,400 train files, 1,080 eval files)
The filenames for the train dataset follow the format:
[dataset]_[drone_type]_[moving_direction]_[anomaly_flag]_[data_index].wav
where:
- [dataset]: the data split (train or eval);
- [drone_type]: the drone model (A, B, or C);
- [moving_direction]: the flight direction during recording;
- [anomaly_flag]: whether the clip is normal or anomalous;
- [data_index]: the index of the file within its group.
Test Data (Total 1,440 files)
The filenames for the test dataset follow the format:
[dataset]_[drone_type]_[moving_direction]_[data_index].wav
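For convenience, a small sketch of how these two filename formats could be parsed (assuming no field contains an underscore; the example field values are hypothetical):

```python
from pathlib import Path

def parse_filename(path: str) -> dict:
    """Split a challenge filename into its underscore-separated fields."""
    fields = Path(path).stem.split("_")
    if len(fields) == 5:    # train/eval files carry an anomaly flag
        keys = ["dataset", "drone_type", "moving_direction", "anomaly_flag", "data_index"]
    elif len(fields) == 4:  # test files omit the anomaly flag
        keys = ["dataset", "drone_type", "moving_direction", "data_index"]
    else:
        raise ValueError(f"unexpected filename format: {path}")
    return dict(zip(keys, fields))

# Field values below are hypothetical; only the field order comes from the formats above.
print(parse_filename("train_A_front_normal_0001.wav"))
```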
The evaluation metric for this challenge is the area under the receiver operating characteristic curve (ROC-AUC). ROC-AUC measures how well the distributions of normal and anomalous scores are separated, independent of any specific decision threshold.
ROC-AUC will be calculated by the organizing committee using the .csv files submitted by participants. (Participants are not required to calculate or submit the ROC-AUC score themselves.)
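For reference, this is how such a score can be computed with scikit-learn (the labels and scores below are toy values, purely illustrative):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# y_true: 1 for anomalous, 0 for normal; y_score: the submitted anomaly scores.
y_true = np.array([0, 0, 1, 1, 0, 1])
y_score = np.array([0.12, 0.30, 0.85, 0.64, 0.25, 0.91])
print(roc_auc_score(y_true, y_score))  # 1.0 here: the scores separate the classes perfectly
```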
This challenge provides baseline code.
However, using or improving the baseline model is not mandatory for participation; teams are free to develop their own models.
The code consists of three files:
The baseline model utilizes WaveNet to perform a prediction task on spectrograms.
WaveNet was originally proposed as an autoregressive audio generation model and is applied here to anomalous sound detection. Because the dilation rate increases exponentially in each residual block, WaveNet has a wide receptive field.
In the baseline model, WaveNet performs causal convolution to predict future spectral frames in the spectrogram.
The model trained to minimize the prediction error of normal data tends to produce significantly higher prediction errors on anomalous data. This characteristic is leveraged for anomaly detection.
The anomaly score is computed using the mean squared error (MSE) between the target and predicted spectrograms.
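The following is a simplified PyTorch sketch of this idea, not the official baseline code: frequency bins are treated as channels, and a stack of dilated causal convolutions predicts the next spectral frame. The class name, layer sizes, and depth are illustrative.

```python
import torch
import torch.nn as nn

class CausalPredictor(nn.Module):
    """Simplified WaveNet-style next-frame predictor over spectrogram frames.

    Frequency bins act as channels; dilation doubling in each layer gives a
    receptive field that grows exponentially with depth.
    """
    def __init__(self, n_freq: int = 257, channels: int = 64, n_layers: int = 5):
        super().__init__()
        blocks = []
        in_ch = n_freq
        for i in range(n_layers):
            dilation = 2 ** i
            blocks += [
                nn.ConstantPad1d((dilation, 0), 0.0),  # pad on the left only -> causal
                nn.Conv1d(in_ch, channels, kernel_size=2, dilation=dilation),
                nn.ReLU(),
            ]
            in_ch = channels
        self.body = nn.Sequential(*blocks)
        self.head = nn.Conv1d(channels, n_freq, kernel_size=1)  # map back to F bins

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, F, T); output has the same shape, and output frame t
        # depends only on input frames <= t, so it can serve as a prediction
        # of frame t + 1.
        return self.head(self.body(x))

x = torch.randn(8, 257, 128)      # batch of 8 spectrograms, F=257, T=128
x_hat = CausalPredictor()(x)      # x_hat[..., t] predicts x[..., t + 1]
```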
Let the input spectrogram be \(X = [\mathbf{x}_1, \dots, \mathbf{x}_T] \in \mathbb{R}^{F \times T}\), where \(F\) and \(T\) represent the frequency and time dimensions of the spectrogram, respectively.
The model takes input data of length equal to the receptive field, \([\mathbf{x}_1, \dots, \mathbf{x}_l]\), and predicts the next spectral frame \(\hat{\mathbf{x}}_{l+1}\). This is formulated as:
\[ \hat{\mathbf{x}}_{t+1} = \psi_\phi(\mathbf{x}_{t-l+1}, \dots, \mathbf{x}_t). \]
Given the input data \(X = [\mathbf{x}_1, \dots, \mathbf{x}_T]\), the model predicts \(\hat{X} = [\hat{\mathbf{x}}_{l+1}, \dots, \hat{\mathbf{x}}_{T+1}]\).
The anomaly score for each data point is computed as:
\[ A(X) = \frac{1}{(T - l)\,F} \sum_{t=l+1}^{T} \lVert \mathbf{x}_t - \hat{\mathbf{x}}_t \rVert_2^2, \]
where the mean squared error (MSE) measures the deviation between the actual and predicted spectrogram frames.
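A small sketch of this score under the same conventions as the model sketch above (the index alignment follows that sketch's convention that frame t predicts frame t + 1, which is an assumption):

```python
import torch

def anomaly_score(x: torch.Tensor, x_hat: torch.Tensor, l: int) -> torch.Tensor:
    """Per-clip MSE between observed frames and their predictions.

    x, x_hat: (batch, F, T). Frames inside the first receptive field
    (the first l frames) have no valid prediction and are skipped.
    """
    pred = x_hat[..., l - 1 : -1]  # predictions for frames l .. T-1 (0-indexed)
    target = x[..., l:]            # the frames actually observed
    return ((target - pred) ** 2).mean(dim=(1, 2))  # one score per clip
```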
The baseline model was trained for 100 epochs with a learning rate of 1e-3 and a batch size of 64.
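A minimal training-loop sketch with these hyperparameters (the dummy dataset and the reuse of the CausalPredictor sketch above are assumptions, not the baseline code):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy spectrogram tensors stand in for the real training set.
train_set = TensorDataset(torch.randn(256, 257, 128))
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)  # batch size 64

model = CausalPredictor()                             # sketch from above
opt = torch.optim.Adam(model.parameters(), lr=1e-3)   # learning rate 1e-3
loss_fn = torch.nn.MSELoss()

for epoch in range(100):                              # 100 epochs
    for (x,) in train_loader:
        x_hat = model(x)
        loss = loss_fn(x_hat[..., :-1], x[..., 1:])   # predict the next frame
        opt.zero_grad()
        loss.backward()
        opt.step()
```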
The anomaly detection performance of the baseline model on the evaluation dataset is as follows:
| Drone | A | B | C |
| --- | --- | --- | --- |
| AUC (%) | 63.71 | 54.05 | 61.77 |
Please submit your files via the Submission link.
Participants must submit four files for challenge participation:
- eval_score.csv: anomaly scores for the evaluation dataset;
- test_score.csv: anomaly scores for the test dataset;
- a compressed zip file containing the trained model and code;
- a technical report.
The eval.py and test.py files generate and save the anomaly scores for the evaluation and test datasets as eval_score.csv and test_score.csv, respectively.
The organizers will use these files to compare the performance of participants' models.
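A toy sketch of writing such a score file (the exact CSV layout is not specified in this section, so the one-row-per-file format and the filenames below are assumptions):

```python
import csv

# scores: mapping from test filename to anomaly score (illustrative values).
scores = {"test_A_front_0001.wav": 0.42, "test_A_front_0002.wav": 1.37}

with open("test_score.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for name, score in sorted(scores.items()):
        writer.writerow([name, score])
```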
Participants must submit code that can reproduce the model's performance. This is intended only to verify the reproducibility of the submitted code and will not be used for any other purpose. The copyright of the code remains fully with its authors.
The trained model file should be included in a compressed zip file, which must contain all necessary components to generate eval_score.csv and test_score.csv.
The technical report should describe the submitted system. It must be formatted according to the ICSV paper guidelines. All reports submitted to the challenge will be included in the conference proceedings as non-refereed papers. Participants do not need to submit the report separately through the ICSV website.
The final performance score \(S\) is calculated from the AUC scores of the three drones (A, B, and C). First, the average AUC scores for the evaluation and test datasets are computed as follows:
\[ S_{\text{eval}} = \frac{1}{3}\left( \text{AUC}_A^{\text{eval}} + \text{AUC}_B^{\text{eval}} + \text{AUC}_C^{\text{eval}} \right), \qquad S_{\text{test}} = \frac{1}{3}\left( \text{AUC}_A^{\text{test}} + \text{AUC}_B^{\text{test}} + \text{AUC}_C^{\text{test}} \right). \]
The final performance score \(S\) is then calculated as a weighted sum of the evaluation and test scores, with a 30% weight on the evaluation dataset and a 70% weight on the test dataset:
\[ S = 0.3\, S_{\text{eval}} + 0.7\, S_{\text{test}}. \]
The final ranking will be determined based on this weighted score.
This challenge is hosted by the KSNVE (Korean Society for Noise and Vibration Engineering). For inquiries, please send an email to the Organizing Committee (icsv31aichallenge@gmail.com).
Challenge organizers
We look forward to your participation and are happy to assist with any questions!
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.