This is a quick demonstration of the Android AIR Bridge I have developed. The bridge works using socket communications together with AMF serialisation/de-serialisation. It builds on and adapts projects such as Merapi, BlazeDS, GraniteDS, Apache Commons and Apache Harmony.
I believe this opens up all functionality on Android to AIR applications and in fact allows you to develop fully mixed Android and AIR applications.
The demo video above showcases this two-way communication between Adobe AIR and the Android Voice Recognition functionality using the Motorola Xoom.
A button click on the AIR side activates the Voice Recognition service on the Android side. The result from the Voice Recognition is passed back into AIR, which then makes a remote service call that generates an intelligent response and plays it back as audio.
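The mechanics of that round trip can be sketched in plain Java. This is a simplified illustration only: the real bridge serialises AMF objects over the socket, whereas here a length-prefixed UTF string stands in for the payload and in-memory streams stand in for the socket, so the flow is self-contained. All class and method names are hypothetical.

```java
import java.io.*;

// Minimal sketch of the bridge's message framing, with a length-prefixed
// string payload standing in for real AMF serialisation. In the actual
// bridge each side would read/write these frames over a java.net.Socket.
public class BridgeFramingSketch {

    // Write one message frame: DataOutputStream.writeUTF prefixes the
    // string with its byte length, giving a simple wire format.
    public static byte[] encode(String message) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(message);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read one message frame back off the stream.
    public static String decode(byte[] frame) {
        try {
            DataInputStream in =
                    new DataInputStream(new ByteArrayInputStream(frame));
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // AIR side sends a command; the Android side decodes it.
        byte[] request = encode("START_VOICE_RECOGNITION");
        System.out.println(decode(request)); // prints START_VOICE_RECOGNITION
    }
}
```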
If you want to get started with developing mixed AIR/Android applications, a great starting point is James Ward's superb article Extending AIR for Android. In it he concludes that adapting the Merapi project to work with Android would provide a better bridging mechanism.
I thought getting that working would make an interesting garage-build challenge, and that is what you see in this demo. For any of you interested in experimenting with or using it, the code for the bridge is now available to check out at http://code.google.com/p/android-java-air-bridge/.
This is something I dreamt up and actually finished in July but haven’t yet had time to write up a post on it. The original idea was to use Flex/AIR on the desktop to control a Lego Mindstorms NXT robot, but then things got a little more ambitious by adding a bit of Augmented Reality together with real-time Speech Generation.
The first step was to replace the firmware on the Mindstorms brick with leJOS, which includes a Java Virtual Machine that allows the Mindstorms to be programmed in Java. This was a tense moment but actually went relatively smoothly.
I then created a Java application on my MacBook Pro that could control the robot via a Bluetooth link. I discovered that there is a nice little library called iCommand that is designed to let you do just this.
Once I had that connection up and running the next step was to make sure that I could use this application as a bridge to control the Robot from an AIR application.
I therefore created an interface to my desktop Java application using Merapi, the same Java/AIR bridge I used in a previous project for Flex speech recognition.
Using this, I then created an AIR application that could send and receive Merapi messages across the bridge, which allowed me to get sensor information back from the Mindstorms and to send messages to it for navigation and pincer/sensor control.
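The Java side’s handling of those messages can be sketched roughly as follows. This is not the actual Merapi API (the real bridge registers a message handler with the Merapi broker); it is a plain-Java stand-in with hypothetical names and command strings, showing how incoming message types might be routed to robot actions.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-in for the Java side of the bridge: incoming message
// types from the AIR app are mapped to robot actions. The command names
// and action descriptions are hypothetical.
public class RobotMessageRouter {

    private final Map<String, String> actions = new HashMap<>();

    public RobotMessageRouter() {
        // Hypothetical command set for navigation and pincer control.
        actions.put("FORWARD", "motors: both forward");
        actions.put("LEFT", "motors: right forward, left back");
        actions.put("RIGHT", "motors: left forward, right back");
        actions.put("PINCER_CLOSE", "pincer motor: close");
    }

    // Resolve a message type to the robot action it should trigger.
    public String handle(String messageType) {
        return actions.getOrDefault(messageType,
                "unknown command: " + messageType);
    }
}
```

In the real project the equivalent handler would drive the iCommand motor classes over Bluetooth rather than return a string.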
Having got all that working, I couldn’t resist putting in some Augmented Reality, so I added an iPhone to the robot and used that as a wireless webcam to get the video feed into my AIR application.
I then adapted and incorporated my previous augmented reality projects to allow me to have interactive avatars that I could switch at runtime. These are integrated with a remote Java server using BlazeDS to give them the artificial intelligence they need to provide realistic answers and speech.
Now around this time AIR 2 was released, so I decided to build the whole thing on that instead. Rather than having separate AIR and Java applications, the whole lot is bundled together and deployed as a native application. This is very cool and I think adds a whole new dimension to what can be done with AIR. What actually happens is that the AIR application contains, on its source path, the executable Java jar file for the Mindstorms controller. Once the AIR 2 application is launched, it launches the Java application as a native process and then communicates with it across the bridge using Merapi, serialising objects between Java and ActionScript.
It has been a while since my last blog post, as I have been busy working out of Paris and London on a very interesting and high-profile project, creating fully integrated Flex front ends as part of a global SAP implementation for one of the world’s leading advertising agency groups.
Anyway, once I finished the Talking Head I decided it would be fun to push the boat out a little further and create a full-body character with a wider range of interactions. The result is this augmented reality girl. Pressing the Cmd key gets her to dance, and you can get her to walk around using the arrow keys. She also has all the lip-synch functionality from the talking head.
The technology used is the same as for the head: in fact it is the same custom-written Java web application, deployed to Tomcat, using BlazeDS to communicate with the AIR application.
I also briefly looked at the new sound API available in the Flash 10.1 beta, but unfortunately it doesn’t look like you can feed it an mp3 byte array, as it requires a raw sound byte array instead. Although being able to manipulate and dynamically generate audio is a great new feature, I think it would be awesome to be able to generate mp3 byte arrays on a server and just load and play them directly from within the Flash Player. The drawback currently is that we still need to write the byte array out as an mp3 file to the local drive and then load it into a Sound object before playing it.
This is the second demonstration of the augmented reality talking head and we now have eyes that move to look at the marker and we also have blinking.
Some hurdles that have been overcome are:
1. The head does not jitter so much – initially the head was positioned 200px in front of its origin, and the further from the origin you get, the more things tend to jitter. So this has been corrected.
2. Lip-synch is improved by removing the timer and putting everything into the enterframe_handler. Running a timer in parallel with your enter-frame loop degrades performance.
3. Framerate has been improved in a number of ways. The stage quality is set to low. Interactivity is set to false on the viewport. The texture has been optimized so that it is not as large a file as before.
Furthermore, the Papervision3D part has now been upgraded to version 2.1, as in this version the whole org.papervision3d.core.animation.* package has been revamped completely to allow for the changes in the DAE class. The main change as it affects this project is that DAE animation is no longer frame-based but time-based. This means that whereas before I might go to frame 6, now, given a DAE frame rate of 30 frames per second, I would simply go to time 6/30, or 0.2 seconds.
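The conversion itself is a one-liner; sketched here in Java with the 30 fps DAE frame rate mentioned above.

```java
// The PV3D 2.1 change in a nutshell: DAE animation is addressed by time,
// not frame index, so a frame number is divided by the DAE frame rate.
public class DaeTime {

    // Convert a frame index into an animation time in seconds.
    public static double frameToTime(int frame, double framesPerSecond) {
        return frame / framesPerSecond;
    }

    public static void main(String[] args) {
        System.out.println(frameToTime(6, 30)); // frame 6 at 30 fps -> 0.2
    }
}
```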
More information about PV3D 2.1 can be found in Tim Knip’s blog post here: http://techblog.floorplanner.com/2009/05/26/papervision3d-21-alpha/
While I have been playing around with Merapi to add voice recognition for the talking head, I created a little Merapi application that allows me to move a target on the screen by saying the voice commands LEFT, RIGHT, MIDDLE.
I had seen a post about text-to-speech through Merapi on Rich Tretola’s blog, Everything Flex, and thought it would make sense to do it the other way around – in other words, voice recognition and voice control.
This is fairly simple to achieve. I decided to use the Sphinx 4 Speech Recognizer, which is an open-source Java project by Carnegie Mellon University and others.
I then wrote a client for that framework, added the Merapi jar files and broadcast a Merapi message whenever the Sphinx client detected speech.
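The step between recognition and the broadcast can be sketched like this. Sphinx hands the client a best-hypothesis string; here a hypothetical mapper turns the LEFT, RIGHT and MIDDLE commands into a target x position for the AIR screen (the stage width of 800 is an assumed value for illustration, and the class name is mine, not part of either framework).

```java
// Hypothetical mapper from a recognised voice command to a target
// x position on the AIR stage. The AIR side would receive this value
// in a Merapi message and move the on-screen target accordingly.
public class VoiceCommandMapper {

    // Returns the x position for the command, or -1 if unrecognised.
    public static int targetX(String hypothesis, int stageWidth) {
        switch (hypothesis.trim().toUpperCase()) {
            case "LEFT":   return 0;
            case "MIDDLE": return stageWidth / 2;
            case "RIGHT":  return stageWidth;
            default:       return -1; // not one of our three commands
        }
    }
}
```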
You can download the full source code by clicking here.
I have been working on this project for the past few weeks and it is now coming together nicely so I thought I would create a little blog entry about it. The idea is to create an Augmented Reality head with which one can have a conversation.
So far I have got as far as being able to type something and have the augmented reality head answer me back.
What takes place is that we have an AIR client (built using Cairngorm) communicating with a Java server side using Remote Objects over BlazeDS. The text is sent to the Java server application, where a text response is generated using AIML and a Java chatbot framework. This text response is passed to a text-to-speech (TTS) socket server to generate both an mp3 byte array and something called MBROLA input format: a stream of text symbols (phonemes), each with a duration in milliseconds, that represent visemes (mouth shapes).
The whole lot is packaged and sent back over the wire via BlazeDS to an augmented reality viewer created as an advanced Flex visual component (using Papervision3D and FLARToolkit). The model head was created in Maya and is an animated Collada with 13 different mouth shapes that have been mapped to the output received from the MBROLA stream.
To play the speech response in AIR, the mp3 byte array is written out as a temporary file, read into a Sound object and then played back. At the same time, the MBROLA stream is parsed into an ArrayCollection of frames (for the model head) and durations, which is then iterated over in the handler method of a timer.
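The MBROLA parsing step might look roughly like this, sketched in Java (the real project does it in ActionScript into an ArrayCollection). MBROLA lines start with a phoneme symbol followed by a duration in milliseconds, optionally followed by pitch targets, which this sketch ignores; the three-entry phoneme-to-viseme table is a hypothetical sample of the 13 mouth-shape mappings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch of parsing an MBROLA stream into (mouth shape, duration) pairs
// that a timer handler can iterate over to drive the model head.
public class MbrolaParser {

    // One animation step: which mouth shape to show, and for how long.
    public static final class Viseme {
        public final String mouthShape;
        public final int durationMs;
        Viseme(String mouthShape, int durationMs) {
            this.mouthShape = mouthShape;
            this.durationMs = durationMs;
        }
    }

    // Hypothetical sample of the phoneme-to-mouth-shape table; the real
    // project maps the full phoneme set onto 13 Collada mouth shapes.
    private static final Map<String, String> PHONEME_TO_SHAPE =
            Map.of("a", "open", "m", "closed", "_", "rest");

    public static List<Viseme> parse(String stream) {
        List<Viseme> result = new ArrayList<>();
        for (String line : stream.split("\n")) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length < 2) continue; // skip blank/malformed lines
            String shape = PHONEME_TO_SHAPE.getOrDefault(parts[0], "rest");
            result.add(new Viseme(shape, Integer.parseInt(parts[1])));
        }
        return result;
    }
}
```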
Coming soon, hopefully, will be speech recognition via the Merapi Java/AIR bridge, so that you can talk to the head.
Well, I have finally gone and done it! After much consideration and procrastination I have decided to set up my own blog, broadly to act as an interesting pointer to materials, code samples and books that you will find useful on your journey to developing full-stack Flex, Java Spring, Hibernate and Maven apps running on Tomcat and BlazeDS from scratch.