Category ➡️ Data Science
Subcategory ➡️ Computer Vision Engineer
Difficulty ➡️ Medium
Expected solution time ➡️ 6 hours. It is essential to complete your solution within this timeframe, as it is a critical performance indicator used by the hiring company to evaluate your work. The timer will begin when you click the start button and will stop upon your submission.
In a world moving steadily towards inclusivity and adapting to the needs of all, technological tools are pivotal in bridging gaps between communities. One such community consists of deaf and mute individuals who communicate through sign language. While it is a rich and expressive language, communication barriers still linger in many everyday situations. This challenge addresses that gap: devise a computer vision solution capable of interpreting sign language from real-time video.
Your role as a data scientist is paramount in this venture. By crafting an algorithm that can process videos and recognize the performed signs, you're unlocking doors to enhance communication and understanding between the deaf/mute community and the world surrounding them.
You will be furnished with a set of videos with lengths ranging from a few seconds to 30 seconds, each showcasing a specific sign or word in sign language. These videos will serve as the foundation to train and validate your model.
Remember: Both accuracy and speed are of the essence, as the goal is achieving real-time interpretation. Moreover, the model should be resilient enough to adapt to diverse individuals and conditions. Best of luck in this challenge of inclusivity and communication! 🍀🤟
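Because the clips range from a few seconds to 30 seconds, a common first step is to reduce every video to a fixed number of frames before feeding it to a model. A minimal sketch of uniform frame-index sampling; the function name and the 16-frame budget are our own choices, not part of the challenge:

```python
import numpy as np

def sample_frame_indices(total_frames: int, num_samples: int = 16) -> np.ndarray:
    """Return evenly spaced frame indices so clips of any length
    map to a fixed-size model input."""
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    # Spread num_samples indices from the first frame to the last.
    return np.linspace(0, total_frames - 1, num=num_samples).round().astype(int)

# A 3-second clip at 30 fps (90 frames) and a 30-second clip (900 frames)
# both reduce to the same 16-index budget.
short = sample_frame_indices(90)
long = sample_frame_indices(900)
```

These indices would then be used to seek and decode only the selected frames (e.g. with OpenCV's `VideoCapture`), keeping per-video preprocessing cost roughly constant regardless of clip length.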
| Category | Code |
|---|---|
| mal | 0 |
| de nada | 1 |
| bien | 2 |
| con permiso | 3 |
| no | 4 |
| sí | 5 |
| hola | 6 |
| como estás | 7 |
| gracias | 8 |
| perdón | 9 |
For the training dataset: Download train.zip
For the testing dataset: Download test.zip
The repository structure is provided and must be adhered to strictly:
nuwe-data-cv1/
├── data
│ └── labels_path_train.csv
├── model.py
├── predictions
│ ├── example_predictions.json
│ └── predictions.json
├── README.md
└── requirements.txt
train/: Folder with the video samples for training and validation.
README.md: This file, providing an overview of the challenge.
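Training starts from pairing each video with its category code via data/labels_path_train.csv. A hedged sketch of that step; the column names `path` and `label` below are hypothetical placeholders (the example parses an inline sample rather than the real file), so inspect the actual CSV header before reusing this:

```python
import csv
import io

# Hypothetical two-column layout for data/labels_path_train.csv --
# the real column names may differ; check the header first.
sample_csv = io.StringIO(
    "path,label\n"
    "train/video_001.mp4,6\n"
    "train/video_002.mp4,8\n"
)
rows = list(csv.DictReader(sample_csv))
video_paths = [row["path"] for row in rows]
targets = [int(row["label"]) for row in rows]
```

For the real file, replace the `StringIO` object with `open("data/labels_path_train.csv", newline="")`.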
Task 1: Your mission is to develop a computer vision model that, given a video of varying length, identifies the sign or word being conveyed in sign language. It is crucial that the model generalizes across varying environments, lighting conditions, and signers.
Submit a predictions.json file containing the model's classification of video samples. Ensure the file is formatted correctly, with the video file identifier as the key and the predicted category as the value. predictions.json:
{
"target":{
"video_244.mp4": 4,
"video_245.mp4": 3,
"video_246.mp4": 1,
"video_247.mp4": 5,
"video_248.mp4": 6,
"video_249.mp4": 2,
"video_250.mp4": 7,
"video_251.mp4": 3,
"video_252.mp4": 1
}
}
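One way to produce this file is to serialize a plain dict under the `"target"` key. A sketch; the two entries below are placeholders, not real predictions:

```python
import json
import os

# Placeholder predictions -- replace with your model's output,
# one entry per test video file.
predicted_codes = {"video_244.mp4": 4, "video_245.mp4": 3}

os.makedirs("predictions", exist_ok=True)  # the graded file must sit in /predictions
with open("predictions/predictions.json", "w", encoding="utf-8") as f:
    json.dump({"target": predicted_codes}, f, indent=4)
```

Keys are the video file names and values are the integer category codes from the table above, matching the required format exactly.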
Performance will be measured using the F1 Score to gauge precision and recall, offering a balanced view of the model's accuracy and robustness. Your predictions will be rigorously tested against unseen video samples to determine the F1 Score.
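The challenge statement does not say which F1 averaging mode is applied; a macro-averaged F1 (per-class F1, each class weighted equally) is one reasonable stand-in for local validation, sketched here without external dependencies:

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: compute F1 per class, then average with equal
    class weight. Assumes macro averaging; the official metric may differ."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores.append(
            2 * precision * recall / (precision + recall)
            if precision + recall else 0.0
        )
    return sum(scores) / len(scores)

score = macro_f1([0, 1, 2, 2], [0, 1, 2, 2])  # identical labels give a perfect score
```

`sklearn.metrics.f1_score(y_true, y_pred, average="macro")` computes the same quantity if scikit-learn is available.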
Task 1: 900 points
⚠️ Please note:
All submissions might undergo a manual code review process to ensure that the work has been conducted honestly and adheres to the highest standards of academic integrity. Any form of dishonesty or misconduct will be addressed seriously, and may lead to disqualification from the challenge. The file to be evaluated will be predictions.json. This file must be inside /predictions folder.
Timeline
1. Start the challenge & clone the repository
2. Solve the challenge & submit your solution
3. Apply to job offers