Interspeech2026-Audio-Encoder-Challenge

20-Hour Non-speech Dataset from DataoceanAI

1. Dataset Overview

This dataset consists of non-speech audio extracted from 8 source datasets, with a total duration of approximately 20 hours. It mainly contains environmental noise recorded in a variety of scenarios and is suitable for research in speech signal processing, noise suppression, and acoustic model training.

2. Dataset Composition

2.1 Overall Composition

Total Duration: approximately 20 hours
Data Format: WAV (PCM)
Sampling Rate: 16kHz, 44.1kHz, or 48kHz, depending on the source dataset
Bit Depth: 16bit
Channels: 1 (mono)
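
Because the sampling rate differs between source datasets, it can help to check each file's header before processing. The following is a minimal sketch using Python's standard `wave` module; the helper names `describe_wav` and `matches_spec` are hypothetical and are not part of any tooling shipped with the dataset.

```python
import wave

# Values taken from the "Overall Composition" list above.
EXPECTED_RATES = {16000, 44100, 48000}
EXPECTED_BIT_DEPTH = 16
EXPECTED_CHANNELS = 1  # mono


def describe_wav(path):
    """Read the WAV header and return its basic format parameters."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return {
            "sample_rate": rate,
            "bit_depth": wav.getsampwidth() * 8,  # sample width is in bytes
            "channels": wav.getnchannels(),
            "duration_s": frames / rate,
        }


def matches_spec(info):
    """Check a describe_wav() result against the format listed above."""
    return (
        info["sample_rate"] in EXPECTED_RATES
        and info["bit_depth"] == EXPECTED_BIT_DEPTH
        and info["channels"] == EXPECTED_CHANNELS
    )
```

Calling `matches_spec(describe_wav(path))` on each downloaded file is one way to spot files that need resampling or that were altered after download.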

2.2 Detailed Information of Each Original Dataset

King-ASR-457

Contribution Duration: 2.46 hours
Included Scenes:
1. BookStore: 0.615 hour
1. Gym: 0.615 hour
1. Subway: 0.615 hour
1. Restaurant: 0.615 hour
Technical Features: recorded by mobile, pure scene environmental noise, PCM 44.1kHz/16bit

King-ASR-610

Contribution Duration: 2.50 hours
Included Scenes:
1. Home Background Noise-1m: 0.83 hour
1. Home Background Noise-2m: 0.83 hour
1. Home Background Noise-5m: 0.83 hour
Technical Features: pure background noise, PCM 44.1kHz/16bit

King-ASR-719

Contribution Duration: 2.40 hours
Included Scenes:
1. Water Noise: 0.8 hour
1. Footstep Noise: 0.8 hour
1. Outdoor Window Noise: 0.8 hour
Technical Features: recorded by Huawei mobile, pure non-speech interference, PCM 16kHz/16bit

King-ASR-829

Contribution Duration: 2.50 hours
Included Scenes:
1. Subway Car Noise-PEAK: approximately 1.3 hour
1. Subway Car Noise-NON-PEAK: approximately 1.3 hour
Technical Features: pure noise segments, PCM 44.1kHz/16bit

King-ASR-862

Contribution Duration: 2.50 hours
Included Scenes:
1. CALL Howling: 0.83 hour
1. GAME Howling: 0.83 hour
1. LIVE Howling: 0.83 hour
Technical Features: pure howling without speech, PCM 48kHz/16bit

King-ASR-876

Contribution Duration: 2.50 hours
Included Scenes:
1. OUT-CAR Parking Lot Noise: 0.83 hour
1. OUT-CAR Roadside Noise: 0.83 hour
1. OTHERS Underground Parking Lot Noise: 0.83 hour
Technical Features: recorded by mobile, pure outdoor environmental noise, PCM 16kHz/16bit

King-ASR-955

Contribution Duration: 2.50 hours
Included Scenes:
1. Vehicle Mechanical Noise: 0.83 hour
1. Air Conditioning Running Noise: 0.83 hour
1. Car Window Open Wind Noise: 0.83 hour
Technical Features: vehicle mechanical/environmental noise, PCM 48kHz/16bit

King-ASR-958

Contribution Duration: 2.54 hours
Included Scenes:
1. Cafe: 0.635 hour
1. Hospital: 0.635 hour
1. Market: 0.635 hour
1. Walking Street: 0.635 hour
Technical Features: recorded by mobile, pure scene environmental noise, PCM 48kHz/16bit
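
As a quick cross-check of the figures above, the contribution durations can be summed and grouped by sampling rate. This is only an arithmetic sketch: the numbers are copied from the per-dataset entries above, and the names `SOURCES`, `total_hours`, and `hours_by_rate` are illustrative.

```python
# (dataset, contribution in hours, sampling rate in Hz), copied from the
# per-dataset entries above.
SOURCES = [
    ("King-ASR-457", 2.46, 44100),
    ("King-ASR-610", 2.50, 44100),
    ("King-ASR-719", 2.40, 16000),
    ("King-ASR-829", 2.50, 44100),
    ("King-ASR-862", 2.50, 48000),
    ("King-ASR-876", 2.50, 16000),
    ("King-ASR-955", 2.50, 48000),
    ("King-ASR-958", 2.54, 48000),
]


def total_hours(sources):
    """Sum of all contribution durations, rounded to two decimals."""
    return round(sum(hours for _, hours, _ in sources), 2)


def hours_by_rate(sources):
    """Total contribution duration per sampling rate, sorted by rate."""
    totals = {}
    for _, hours, rate in sources:
        totals[rate] = totals.get(rate, 0.0) + hours
    return {rate: round(hours, 2) for rate, hours in sorted(totals.items())}


if __name__ == "__main__":
    print(total_hours(SOURCES))
    print(hours_by_rate(SOURCES))
```

The listed contributions sum to 19.90 hours, which matches the stated total of approximately 20 hours: 4.90 hours at 16kHz, 7.46 hours at 44.1kHz, and 7.54 hours at 48kHz.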

3. Data Usage

This dataset is suitable for the following research and application scenarios:

Training and testing of speech noise suppression algorithms
Construction of acoustic environment classification models
Optimization of the noise robustness of speech recognition systems
Evaluation of audio signal processing algorithms
Improvement of environmental adaptability of human-computer interaction systems
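
For the noise suppression and noise robustness scenarios listed above, a common way to use non-speech recordings is to mix them into clean speech at a chosen signal-to-noise ratio (SNR). The sketch below illustrates that general technique; it is not a procedure prescribed by the data provider. It assumes both signals are already decoded to sequences of float samples at the same sampling rate (so files from this dataset may need resampling first), and the function names `rms` and `mix_at_snr` are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the mix has the requested SNR in dB.

    The noise is looped if it is shorter than the speech and truncated
    if it is longer.
    """
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")
    # Loop or truncate the noise so it covers the speech exactly.
    noise = [noise[i % len(noise)] for i in range(len(speech))]
    noise_rms = rms(noise)
    if noise_rms == 0:
        return list(speech)  # silent noise: nothing to add
    # Choose gain g so that 20 * log10(rms(speech) / (g * rms(noise))) == snr_db.
    gain = rms(speech) / (noise_rms * 10 ** (snr_db / 20))
    return [s + gain * n for s, n in zip(speech, noise)]
```

For example, `mix_at_snr(speech, noise, 10)` returns a signal in which the added noise sits 10 dB below the speech level. Clipping and level normalization are left out of this sketch and would need handling before writing 16bit output.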

4. Data Download Method

4.1 Download Process

Application Registration: Users who want to obtain the dataset must register first
Eligibility Review: Staff review the registration information to confirm that the usage conditions are met
Obtain Link: After the application is approved, staff send the data download link by private message

4.2 Notes

This dataset is for academic research and non-commercial use only
Users must comply with the data usage agreement and must not use the data for commercial purposes
For commercial use, please contact the original data provider for authorization
Store the data securely after download and do not redistribute it

5. Version Information

Dataset Version: V1.0
Release Date: 2024
Data Update Record: First release; no updates yet

The data provider reserves the right of final interpretation of this description document