My first meet up, GCP and ML APIs…

Karan Balani
3 min read · Jun 10, 2018


I have heard and seen a lot of people posting about meet-up events and hackathons organised by a number of tech companies, most famously Google, Facebook, Paytm, and so on.

This was my first meet-up, and I met a lot of people working with various technologies at different companies. It was a small gathering where I learnt about the new ML APIs launched by Google. This is also my first Medium story! Apologies for being hidden, but now you have found the gem. :P

I had goosebumps!

The moment the instructor asked us to give a brief intro about ourselves, I retreated into my own world of fear. I kept composing sentences for when my turn came, and in the end I delivered a totally different speech that was a mixture of what many others had already said! Oh yes, I have stage fright, you got it right. I'm working on it.

Okay, so enough talk, let's get into the event. It was mainly focused on the Vision API and the Speech API, with a brief intro to Kubernetes. I will show working demos of both APIs and also share the code. We will be working in Python and the terminal, with no impressive GUI. This won't be a long procedure either; I'll include only the required steps and results.

Google Vision API

Here, we will use this API to analyse images and get relevant labels for the input image, like smile, colour, mood, type of photography, lighting, and so on.

Things you'll need: Python 2.7.14 or above, OpenCV, google-cloud, and the code files shared below.

Command to get Google Cloud:

sudo pip install google-cloud

Command to get OpenCV:

sudo pip install opencv-python

The main code:

Smile XML:

https://gist.github.com/krnblni/62499bd87e6a1f13c07e8dcb165fde8f

Frontal Face XML:

https://gist.github.com/krnblni/62499bd87e6a1f13c07e8dcb165fde8f

Procedure: create a new API key for the Vision API in GCP and download the credentials JSON file. Then change the paths to the smile XML, frontal-face XML, image, and credentials JSON file in opencv-gcp.py, as shown in the image below (see image caption):

On lines 9, 13, 14, and 20 :)
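Since the original opencv-gcp.py lives in the gists above, here is only a minimal sketch of the same flow: OpenCV's Haar cascades look for smiling faces locally, and the Vision API returns labels for the image. All file names (credentials.json, test.jpg, the cascade XML paths) are placeholders, not the originals.

```python
# Minimal sketch of the opencv-gcp.py flow. Paths and names are placeholders.
import io


def count_smiles(gray_image, face_xml, smile_xml):
    """Detect faces with one cascade, then look for a smile inside each face."""
    import cv2  # imported here so the pure helper below works without OpenCV
    face_cascade = cv2.CascadeClassifier(face_xml)
    smile_cascade = cv2.CascadeClassifier(smile_xml)
    faces = face_cascade.detectMultiScale(gray_image, 1.3, 5)
    smiling = 0
    for (x, y, w, h) in faces:
        roi = gray_image[y:y + h, x:x + w]  # search for smiles only inside the face
        if len(smile_cascade.detectMultiScale(roi, 1.8, 20)) > 0:
            smiling += 1
    return smiling


def vision_labels(image_path):
    """Send the image to the Vision API and return (label, score) pairs."""
    from google.cloud import vision  # needs GOOGLE_APPLICATION_CREDENTIALS set
    client = vision.ImageAnnotatorClient()
    with io.open(image_path, "rb") as f:
        image = vision.Image(content=f.read())
    response = client.label_detection(image=image)
    return [(a.description, a.score) for a in response.label_annotations]


def top_labels(labels, limit=5):
    """Keep only the highest-scoring (label, score) pairs."""
    return sorted(labels, key=lambda pair: pair[1], reverse=True)[:limit]


# Typical usage (requires credentials.json, test.jpg, and the two XML files):
#   import os, cv2
#   os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "credentials.json"
#   gray = cv2.cvtColor(cv2.imread("test.jpg"), cv2.COLOR_BGR2GRAY)
#   print("Smiling faces:", count_smiles(gray, "frontalface.xml", "smile.xml"))
#   for label, score in top_labels(vision_labels("test.jpg")):
#       print("{}: {:.2f}".format(label, score))
```

The imports are kept inside the functions so you can read and reuse the helpers even before installing OpenCV and google-cloud.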

RESULTS:

Test Image and Results

Google Speech API

Now we will use the Speech API to detect what we are saying and convert it into text.

Requirements: Python 2.7.14 or above, google-cloud, PyAudio, PortAudio (for PyAudio), SpeechRecognition, Pocketsphinx, and the Python code below.

Commands:

Google Cloud: sudo pip install google-cloud
PyAudio: sudo pip install pyaudio
PortAudio: brew install portaudio
SpeechRecognition: sudo pip install SpeechRecognition
Pocketsphinx: sudo pip install pocketsphinx

The main code:
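The original script isn't reproduced here, so below is a minimal sketch of the usual SpeechRecognition flow: listen on the microphone once and send the audio to Google's recogniser. The function names are mine, not the original script's.

```python
# Minimal sketch of microphone speech-to-text with the SpeechRecognition library.

def recognize_once():
    """Listen on the default microphone once and return the transcript."""
    import speech_recognition as sr  # needs PyAudio/PortAudio for mic access
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)  # calibrate for room noise
        print("Say something...")
        audio = recognizer.listen(source)
    try:
        # Uses the Google web speech endpoint bundled with the library.
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        return None  # speech was unintelligible


def clean_transcript(text):
    """Collapse whitespace and lowercase, handy for comparing transcripts."""
    return " ".join(text.split()).lower()


# Typical usage (needs a working microphone):
#   print("You said:", recognize_once())
```

Pocketsphinx is the offline fallback: swap `recognize_google(audio)` for `recognize_sphinx(audio)` and no network is needed.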

RESULTS:

Thanks for riding along!

Well, that was all about getting started with the new Google ML APIs. All the code and files are also uploaded to GitHub in a repo by the organiser/host. LINK TO VISION REPO and SPEECH REPO.

So, it was a great experience, and I just can't resist sharing the group photo and the swag…

Group Photo
SWAGS! :P

If you liked this, hit the clap button and let me know how I can improve in the comments.

Until next time, bye! Thanks for reading, and congrats on making it this far. :)


Written by Karan Balani

Engineering, Infrastructure and Security @ FinBox
