By Prem Nawaz Khan / @mpnkhan
Statistics
References: World Health Organization Report, WRD Report
In most cases, special input and output methods are needed to access information
const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
const recognition = new SpeechRecognition();
recognition.lang = 'en-US';
recognition.interimResults = false; // deliver final results only
recognition.addEventListener('result', (e) => {
  // Take the top alternative of the most recent result
  const last = e.results.length - 1;
  const text = e.results[last][0].transcript;
  console.log('This is the input text:', text);
});
recognition.start(); // begin listening (the browser asks for microphone access)
function synthVoice(text) {
  const synth = window.speechSynthesis;
  const utterance = new SpeechSynthesisUtterance();
  utterance.text = text;
  synth.speak(utterance);
}
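Both speech snippets above rely on browser APIs that are not available everywhere (recognition is still vendor-prefixed in some browsers), so it can help to check for support first. The following is a minimal sketch of such a check; the helper name `detectSpeechSupport` and its `root` parameter are assumptions made for illustration, not part of the Web Speech API.

```javascript
// Hypothetical helper: reports which Web Speech features a global object exposes.
// `root` defaults to the current global (window in a browser).
function detectSpeechSupport(root = globalThis) {
  // Speech-to-text: standard name first, then the webkit-prefixed fallback
  const Recognition = root.SpeechRecognition || root.webkitSpeechRecognition;
  return {
    recognition: typeof Recognition === 'function',
    // Text-to-speech needs both the synthesizer and the utterance constructor
    synthesis:
      root.speechSynthesis != null &&
      typeof root.SpeechSynthesisUtterance === 'function',
  };
}
```

A page could call `detectSpeechSupport()` once at startup and hide the microphone or "read aloud" controls when the corresponding flag is `false`.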
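// Ask for camera access and show the live stream in a <video> element
navigator.mediaDevices.getUserMedia({ video: true })
  .then((stream) => {
    video.srcObject = stream;
  });
// Later: copy the current video frame to a canvas and export it as an image
canvas.getContext('2d').drawImage(video, 0, 0);
img.src = canvas.toDataURL('image/webp');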
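One detail worth noting about the capture step: when a browser does not support the requested image type, `canvas.toDataURL` falls back to PNG. The sketch below shows one way to check which format a data URL actually carries; the helper name `dataUrlMimeType` is an assumption made for illustration.

```javascript
// Hypothetical helper: extracts the MIME type from a data URL,
// or returns null if the string is not a data URL with an explicit type.
function dataUrlMimeType(dataUrl) {
  // A data URL looks like "data:<mime>[;base64],<payload>"
  const match = /^data:([^;,]+)[;,]/.exec(dataUrl);
  return match ? match[1] : null;
}
```

For example, comparing `dataUrlMimeType(canvas.toDataURL('image/webp'))` against `'image/webp'` tells the page whether it received WebP or the PNG fallback.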
Smartphones & smart speakers - the best examples of AI's revival
Apple's Siri, Amazon Echo, Google Assistant, Cortana for Windows
Pros & Cons
<img src="niceFlower.jpg" alt="Lily in a Pond">
It is about educating content editors to enter alternative text in their content management system
and/or educating web developers to add alt text to images
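As a rough illustration of what checking for alt text can look like, the sketch below scans an HTML string for `<img>` tags that have no `alt` attribute at all. The function name `findImagesMissingAlt` is hypothetical, and the regular-expression approach is a simplification that assumes reasonably tidy markup; it is not a replacement for a full accessibility audit.

```javascript
// Hypothetical helper: returns the <img> tags in an HTML string
// that have no alt attribute. An empty alt="" is accepted, since
// that is the valid way to mark a purely decorative image.
function findImagesMissingAlt(html) {
  // Collect every <img ...> tag (assumes no ">" inside attribute values)
  const tags = html.match(/<img\b[^>]*>/gi) || [];
  // Keep only tags without a whitespace-preceded alt= attribute
  return tags.filter((tag) => !/\salt\s*=/i.test(tag));
}
```

In a live page the same idea can be expressed with `document.querySelectorAll('img:not([alt])')`.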
April 2017, Automatic Alt text extension for Chrome
Based on TensorFlow and the im2txt model
April 2018, around 25 APIs made available by Microsoft for public use
Using Facial recognition, we can
Recognizing Humans versus Bots
CAPTCHA sucks?
Examples like iPhone X Facial Unlock, Windows Hello
Automated captions using Lip Reading
Oxford University researchers partnered with Google on a new AI tool that reads lips, and the results were significant.
Trained with a dataset of more than 100,000 natural sentences.
Helpicto is an Android app that uses speech-to-text and the Microsoft Cognitive API to convert speech into a set of images for students with language disorders related to autism, dysphasia, or Alzheimer’s disease
The Cognitive API uses AI to split the sentence and sends back the intents, which are converted to a list of images
For example, for the spoken sentence “Do you want to eat an apple?”, Helpicto generates three images: the child himself, the action of eating, and a picture of an apple.
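The sentence-to-images step can be sketched in a few lines. This is only an illustration of the idea, not Helpicto's implementation: a hand-written keyword table stands in for the AI-based intent extraction, and the table contents, image file names, and the function name `sentenceToPictograms` are all assumptions.

```javascript
// Hypothetical keyword table standing in for the intent-extraction service.
// A Map avoids accidental hits on built-in object property names.
const PICTOGRAMS = new Map([
  ['you', 'child.png'],   // the child himself
  ['eat', 'eat.png'],     // the action of eating
  ['apple', 'apple.png'], // the object
]);

// Converts a recognized sentence into an ordered list of pictogram files.
function sentenceToPictograms(sentence) {
  return sentence
    .toLowerCase()
    .split(/[^a-z]+/)                       // split on anything that is not a letter
    .map((word) => PICTOGRAMS.get(word))    // look up each word
    .filter((image) => image !== undefined); // drop words without a pictogram
}
```

With this table, `sentenceToPictograms('Do you want to eat an apple?')` yields the three images from the example in order: `child.png`, `eat.png`, `apple.png`.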
Real Time American Sign Language Video Captioning using Deep Neural Networks
More info: Slides and NVIDIA Blog