Friday, 29 May 2015

Invoking Android Assistant using Speech Recognition

This post is the first step towards creating a comprehensive voice-based assistant application. We've chosen Android, using Android Studio on Linux, as our development platform, but the same approach can be extended to others as well. The post mostly covers the implementation, with links for installing the required dependencies.

Excited? Yeah, but why? Why now? Let's think about it. OK, touch screens are fab. You can pinch the screen to zoom into an image? Cool! You can swipe across to delete or archive stuff? Pretty impressive. You can swipe across the keyboard or, thanks to Alex Greaves, write in your own handwriting on the screen? That's sci-fi, man! Hi-5!

However, we live in a real physical world and not in some touch screen hologram. The most natural way for humans to communicate is by speaking, not by touching others, isn't it? If that is true, it makes more sense to operate the device by voice, just like we interact with other people.


So, let's begin by naming the assistant Sarah (sounds cool, has a western touch, and also evokes Saraswati, the Goddess of Knowledge in Hindu mythology!). Having said that, we'll start by breaking the making of the assistant down into a list of achievable modules. The most natural use of a phone is to open an app, isn't it? (OK, calling a person too, but technically calling is an application, so it is indeed a subset.) Let's see if we can achieve this using speech recognition technology.


Basic Requirements:

An Android mobile phone.

PocketSphinx for Android.

Pocketsphinx on Android is an excellent and up-to-date tutorial on installing PocketSphinx for Android in Android Studio.

Let's sketch out the basic functionality that we want in the app:
a) It should be running all the time.
b) It should open an app on demand from any screen.
One way to make sure both are satisfied is to use thread programming in Android.

So what is a thread?

An app is the abstraction of a running program, and on Android it normally runs in its own process. Threads are the units of execution within that process.
A process contains one or more threads. In single-threaded processes, the process contains one thread. You can say the thread is the process—there is one thing going on. In multithreaded processes, the process contains more than one thread—there's more than one thing going on.

How to implement it?

There are basically two ways of implementing threads in Android:

1) Providing a new class that extends Thread and overriding its run() method.
2) Providing a new instance of Thread with a Runnable object during its creation.

Leaving the specific details aside, let's agree on this: to create a new thread, the code to be executed in that thread goes in the run() method of a Runnable instance. A new Thread object is then created, passing a reference to the Runnable instance to its constructor. Finally, the start() method of the Thread object is called to start the thread running.

[code]
Runnable runnable = new Runnable() {
    @Override
    public void run() {
        // code to be executed in the new thread goes here
    }
};

Thread mythread = new Thread(runnable);
mythread.start();

[/code] 
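For comparison, here is a minimal sketch of both approaches from the list above, written as plain Java so that it can be tried outside Android. The class and method names (ThreadStylesDemo, WorkerThread, runBoth) are made up for illustration and do not come from PocketSphinx or the Android SDK.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class, for illustration only.
class ThreadStylesDemo {

    // Approach 1: extend Thread and override run().
    static class WorkerThread extends Thread {
        private final List<String> log;

        WorkerThread(List<String> log) {
            this.log = log;
        }

        @Override
        public void run() {
            log.add("subclass");
        }
    }

    // Starts a thread and blocks until it has finished.
    static void startAndWait(Thread thread) {
        thread.start();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Runs one thread of each style and returns what they recorded.
    static List<String> runBoth() {
        List<String> log = new CopyOnWriteArrayList<>();

        Thread first = new WorkerThread(log);

        // Approach 2: hand a Runnable to a plain Thread.
        Runnable task = () -> log.add("runnable");
        Thread second = new Thread(task);

        // Run them one after the other so the order of entries is predictable.
        startAndWait(first);
        startAndWait(second);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(runBoth());
    }
}
```

The Runnable style is usually the more flexible of the two, since the task is not tied to the Thread class and can be handed to other executors later.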


Now that we understand threads, let's put them to use in our case. PocketSphinx provides a pretty decent demo of the speech recognition interface. Let's, for now, tweak it to use a thread and open an application, say Facebook. The way to go is Intents.

INTENTS??

An Intent is an object that provides runtime binding between separate components (such as two activities). The intent represents an app’s "intent to do something." You can use intents for a wide variety of tasks, but most often they’re used to start another activity.

Now let's put an intent in the thread we've just written.

***************************************
private void start() {
    thread = new Thread() {
        @Override
        public void run() {
            // Look up the launch intent for the Facebook app (null if it is not installed).
            // Actually launching it is handled in the next snippet.
            Intent intent = getPackageManager().getLaunchIntentForPackage("com.facebook.katana");
        }
    };
    thread.start();
}
****************************************************
Now, let's assume the user doesn't have the app installed. It would be cooler to redirect the user to the app's store page so that they can download it, right?
Here's the module for that:

*******************************************
if (intent != null) {
    /* We found the activity, now start it */
    intent.addFlags(Intent.FLAG_ACTIVITY_NEW_TASK);
    startActivity(intent);
} else {
    /* Bring user to the market or let them choose an app? */
    intent = new Intent(Intent.ACTION_VIEW);
    intent.addFlags(Intent.FLAG_ACTIVITY_NEW_TASK);
    intent.setData(Uri.parse("market://details?id=" + "com.facebook.katana"));
    startActivity(intent);
}
********************************************
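Because the package name is typed twice in this fallback, once for the launch lookup and once in the market URI, it is easy for the two to drift apart. One way to guard against that is to keep the name in a single place. The following is only a sketch in plain Java: AppTarget is a hypothetical helper, not part of Android or PocketSphinx, and it simply reuses the "market://details?id=" scheme from the snippet above.

```java
// Hypothetical helper for illustration; not an Android or PocketSphinx class.
final class AppTarget {
    private final String packageName;

    AppTarget(String packageName) {
        if (packageName == null || packageName.isEmpty()) {
            throw new IllegalArgumentException("packageName must not be empty");
        }
        this.packageName = packageName;
    }

    // The name to pass to getLaunchIntentForPackage(...).
    String packageName() {
        return packageName;
    }

    // The string to pass to Uri.parse(...) when the app is not installed.
    String marketUri() {
        return "market://details?id=" + packageName;
    }

    public static void main(String[] args) {
        AppTarget facebook = new AppTarget("com.facebook.katana");
        System.out.println(facebook.packageName());
        System.out.println(facebook.marketUri());
    }
}
```

With something like this, both branches of the if/else read from the same object, so the lookup and the store link always refer to the same app.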

Combining both:
******************************************************
private void start() {
    thread = new Thread() {
        @Override
        public void run() {
            Intent intent = getPackageManager().getLaunchIntentForPackage("com.facebook.katana");
            if (intent != null) {
                /* We found the activity, now start it */
                intent.addFlags(Intent.FLAG_ACTIVITY_NEW_TASK);
                startActivity(intent);
            } else {
                /* Bring user to the market or let them choose an app? */
                intent = new Intent(Intent.ACTION_VIEW);
                intent.addFlags(Intent.FLAG_ACTIVITY_NEW_TASK);
                intent.setData(Uri.parse("market://details?id=" + "com.facebook.katana"));
                startActivity(intent);
            }
        }
    };
    thread.start();
}

*******************************************

Here's the GitHub link for the project, including the APK. Just say "Sarah open facebook" and it should open Facebook.
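The project's actual handling of the recognized phrase lives in the linked repository. Purely as an illustration, here is one way a phrase such as "Sarah open facebook" could be turned into a package name in plain Java. The CommandParser class, its trigger phrase handling and its lookup table are all assumptions made for this sketch; only the com.facebook.katana package name comes from the code above.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical parser for illustration; not taken from the project or from PocketSphinx.
final class CommandParser {
    // Assumed trigger phrase: the assistant's name followed by "open".
    private static final String TRIGGER = "sarah open ";

    // Assumed lookup table from spoken app name to Android package name.
    private static final Map<String, String> PACKAGES = new HashMap<>();
    static {
        PACKAGES.put("facebook", "com.facebook.katana");
    }

    // Returns the package name for a recognized phrase, or null if it is not understood.
    static String packageFor(String phrase) {
        if (phrase == null) {
            return null;
        }
        String text = phrase.trim().toLowerCase(Locale.ROOT);
        if (!text.startsWith(TRIGGER)) {
            return null;
        }
        String appName = text.substring(TRIGGER.length()).trim();
        return PACKAGES.get(appName);
    }

    public static void main(String[] args) {
        System.out.println(packageFor("Sarah open facebook"));
    }
}
```

The returned package name could then be passed to getLaunchIntentForPackage as in the combined snippet, and a null result treated as "command not understood".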
Possible errors if trying to run the code using Studio:

NDK missing / path not set:

PocketSphinx has some modules written in native code and hence requires the NDK to function. Download the NDK. In case you have the NDK but the problem still persists, the path is probably missing. Here's an SO question about the same.


Feel free to play around with the code and report bugs.