Natural Scene and Handwriting OCR Data | 500,000 Images | Computer Vision Data | AI Training Data

Off-the-shelf OCR data consisting of natural-scene images and handwriting images, covering 20 languages, multiple natural scenes, and multiple photographic angles.


Description

1. Overview

1) Natural Scenes

  • Data size: 200,000 images
  • Language: English, French, German, Italian, Portuguese, Russian, Spanish, Japanese, Korean, Indonesian, Malay, Vietnamese, Thai, Turkish, Arabic, Traditional Chinese, etc.
  • Collecting environment: shop plaques, stop boards, posters, tickets, road signs, comics, cover pictures, prompts/reminders, warnings, packing instructions, menus, building signs, etc.
  • Diversity: 20 languages, multiple natural scenes, multiple photographic angles (looking up, looking down, eye level)
  • Device: cellphone, camera
  • Image parameters: images are in .jpg format; annotation files are in .json format
  • Annotation content: line-level quadrilateral bounding boxes and transcriptions of the text
  • Accuracy: a bounding box is a qualified annotation when the error of each vertex of the quadrilateral is within 5 pixels; the accuracy of bounding boxes is not less than 97%; the text transcription accuracy is not less than 97%

2) Handwriting

  • Data size: 300,000 images
  • Language: English, French, German, Spanish, Arabic, Italian, Japanese, Korean, Traditional Chinese
  • Collecting environment: pure color background
  • Device: scanner
  • Photographic angle: eye level
  • Data format: images are in .png format
  • Data content: addresses, company names, and personal names; each image has 20 writing boxes
  • Accuracy rate: the collection content accuracy is not less than 97%

2. About Nexdata

Nexdata owns off-the-shelf PB-level Large Language Model (LLM) Data, 1 million hours of Audio Data, and 800TB of Annotated Imagery Data. The ready-to-go AI & ML Training Data supports instant delivery and quickly improves the accuracy of AI models. For more details, please visit us at https://www.nexdata.ai/datasets/ocr?source=Datarade
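
The per-vertex accuracy criterion stated for the Natural Scenes annotations (each vertex of the quadrilateral within 5 pixels) can be illustrated with a short Python sketch. The listing does not document the layout of the .json annotation files, so the record below, including the field names `image`, `lines`, `quad`, and `text`, is purely hypothetical. Measuring vertex error as Euclidean distance is also an assumption; the listing only says "within 5 pixels".

```python
import json
import math

# Hypothetical annotation record. The real .json schema is not documented in
# the listing, so these field names are assumptions for illustration only.
SAMPLE = """
{
  "image": "example_0001.jpg",
  "lines": [
    {"quad": [[10, 10], [110, 12], [109, 40], [9, 38]], "text": "OPEN 24 HOURS"}
  ]
}
"""

TOLERANCE_PX = 5.0  # per-vertex error bound stated in the listing


def quad_within_tolerance(annotated, reference, tol=TOLERANCE_PX):
    """Return True if every annotated vertex is within tol pixels of its
    reference vertex (Euclidean distance, an assumed error measure)."""
    if len(annotated) != 4 or len(reference) != 4:
        raise ValueError("a quadrilateral needs exactly four vertices")
    return all(math.dist(a, r) <= tol for a, r in zip(annotated, reference))


def box_accuracy(pairs, tol=TOLERANCE_PX):
    """Fraction of (annotated, reference) quadrilateral pairs that qualify."""
    if not pairs:
        return 0.0
    qualified = sum(quad_within_tolerance(a, r, tol) for a, r in pairs)
    return qualified / len(pairs)


record = json.loads(SAMPLE)
quad = record["lines"][0]["quad"]
```

Under these assumptions, a delivered batch would meet the stated bounding-box target when `box_accuracy` over a reviewed sample is at least 0.97.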

Country Coverage

(61 countries)
Africa (8)
Asia (14)
Australia (2)
Europe (24)
North America (4)
South America (9)

Data Categories

  • Annotated Imagery Data
  • Machine Learning (ML) Data
  • Deep Learning (DL) Data
  • Object Detection Data
  • Computer Vision Data

Pricing

Starts at: $10K
One-off purchase: $10K
Monthly License: Not available
Yearly License: Not available
Usage-based: Not available

Volumes

Images: 500K
