Nexdata | Multilingual Casual Conversation Speech Data | 20,000 Hours | Spontaneous Speech | Audio Data

20,000 hours of off-the-shelf casual conversation speech data covering 30+ languages. Drawn from diverse domains such as self-media, conversations, live streams, and variety shows, the data reflects authentic, real-world interactions.

Attribute | Type | Example
Product Name | String | Multilingual Children Speech Data
Volume | String | 10,000 hours

Description

1. Specifications
  • Format: 16 kHz, 16-bit, WAV, mono channel
  • Recording environment: low background noise
  • Recording content: live streams, variety shows, speeches, etc.
  • Languages: English, French, German, Japanese, Portuguese, Dutch, Turkish, Korean, Vietnamese, Indonesian, Malay, Thai, Burmese, etc.
  • Annotation features: transcription text, timestamps, speaker ID, gender, noise
  • Accuracy rate: Word Accuracy Rate (WAR) 98%

2. About Nexdata
Nexdata owns off-the-shelf PB-level Large Language Model (LLM) data, 1 million hours of audio data, and 800 TB of annotated imagery data. This ready-to-go data supports instant delivery and can quickly improve the accuracy of AI models. For more details, please visit us at https://www.nexdata.ai/datasets/speechrecog?source=Datarade
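
The audio format stated in the specifications (16 kHz, 16-bit, WAV, mono) can be verified on delivered files before training. The following is a minimal sketch using Python's standard `wave` module. The function name `check_wav_format` and the constants are illustrative assumptions, not part of any Nexdata tooling.

```python
import wave

# Expected values taken from the "Format" line of the specifications.
EXPECTED_SAMPLE_RATE = 16_000  # 16 kHz
EXPECTED_SAMPLE_WIDTH = 2      # 16 bit = 2 bytes per sample
EXPECTED_CHANNELS = 1          # mono


def check_wav_format(path):
    """Return a list of mismatches between a WAV file and the listed format.

    Hypothetical helper for illustration; an empty list means the file
    matches the 16 kHz / 16-bit / mono format.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_SAMPLE_RATE:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append(f"sample width is {wav.getsampwidth() * 8} bit")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
    return problems
```

Calling `check_wav_format("sample.wav")` on each file (the file name here is a placeholder) and logging any non-empty result is one simple way to catch resampled or stereo files early.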

Country Coverage

(41 countries)
Asia (13)
Australia (2)
Europe (18)
North America (4)
South America (4)

Data Categories

  • Deep Learning (DL) Data
  • Transcription Data
  • Audio Data
  • Large Language Model (LLM) Data
  • Speech Data

Pricing

Starts at: $20K
One-off purchase: $20K
Monthly License: Not available
Yearly License: Not available
Usage-based: Not available

Volumes

Hours: 20K
