Google Voice Recognition API - How does learning work?

Discussion in 'Android Development' started by idoun, Apr 14, 2012.

  1. idoun

    idoun Newbie
    Thread Starter

    I am planning an app using the Android speech input API (Speech Input | Android Developers). Does anyone know how the learning occurs? For example, initially my phone could not recognize proper nouns, local locations, the uncommon names of my friends, etc., but now it recognizes all of these things, so it has clearly learned. Are all of these words now available to other users?

    I am envisioning an app where each of my users would contribute to a database built up by the other users, so that a proper noun someone else entered becomes recognizable for someone who has never used my app. Is this something I can count on happening automatically?

  2. wubbzy

    wubbzy Well-Known Member

    I'll share what I know, which is not terribly much, I'm afraid, but may get you started.

    There are two providers for Android - Vlingo and Google Voice. When you use speech-to-text, voice samples are sent to one of these providers' servers - yes, they are; you can read the fine print, or if you don't believe me, use Wireshark or similar to trace it out. Most of the information on these systems is proprietary, so it is hard to say. Check this out for an overview: Speech recognition - Wikipedia, the free encyclopedia

    As far as sending samples to your own database, I think it is doable; you'll have to work with MediaRecorder and/or the sensors (mic). Check out this source code, "VoiceNotes for Android": Sample App using Flex, AIR, and the Microphone API
  3. idoun

    idoun Newbie
    Thread Starter

    Thanks, this is definitely helpful.

    What I would like is for my app to understand a specialized vocabulary, and Google's built-in speech recognition has not done very well so far. I was hoping it would learn: if I used a technical term a few times (or had a user base using the same terms), deleted what it thought I said, and entered the correct spelling, it would associate the voice with the text and get it right the next time. But that has not happened after a number of attempts. Swype, by contrast, learns fairly fast.

    If Google voice recognition does not learn fast enough and does not have the specialized vocabulary, I would want to do it on my own, but I'm not sure that is feasible. The app you linked lets me use the microphone API to record speech, so I could say 1000 specialized terms and save the recordings to a SQL database. But how would I analyze those voice recordings? And even if I could analyze the sound, I'm not clear on how I would do speech-to-text, because I am still at the mercy of the Google speech recognition API.

    I'd appreciate it if anyone could steer me in the right direction, conceptually, for what I would need to do.
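
    One workaround for the specialized-vocabulary problem, rather than waiting for the recognizer to learn your terms, is to keep your own list of domain terms and snap whatever the recognizer returns to the closest known term by edit distance. The recognizer's N-best candidate list (on Android, the string list returned via RecognizerIntent.EXTRA_RESULTS) makes this more robust than using only the top hit. Below is a minimal plain-Java sketch of the matching step; the class name, vocabulary words, and the "heard" candidate strings are made-up examples, not part of any API:

    ```java
    import java.util.Arrays;
    import java.util.List;

    // Sketch: snap noisy recognizer output to a known specialized vocabulary.
    // On Android you would feed bestMatch() the candidate list obtained from
    // RecognizerIntent.EXTRA_RESULTS; here hard-coded strings stand in for it.
    public class VocabularyMatcher {

        // Classic Levenshtein edit distance between two strings.
        static int editDistance(String a, String b) {
            int[][] d = new int[a.length() + 1][b.length() + 1];
            for (int i = 0; i <= a.length(); i++) d[i][0] = i;
            for (int j = 0; j <= b.length(); j++) d[0][j] = j;
            for (int i = 1; i <= a.length(); i++) {
                for (int j = 1; j <= b.length(); j++) {
                    int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                    d[i][j] = Math.min(
                            Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                            d[i - 1][j - 1] + cost);
                }
            }
            return d[a.length()][b.length()];
        }

        // Return the vocabulary term closest to any recognizer candidate,
        // or null if nothing is within maxDistance edits.
        static String bestMatch(List<String> candidates, List<String> vocabulary,
                                int maxDistance) {
            String best = null;
            int bestScore = maxDistance + 1;
            for (String c : candidates) {
                for (String term : vocabulary) {
                    int score = editDistance(c.toLowerCase(), term.toLowerCase());
                    if (score < bestScore) {
                        bestScore = score;
                        best = term;
                    }
                }
            }
            return best;
        }

        public static void main(String[] args) {
            // Made-up example: the recognizer mis-hears a technical term.
            List<String> vocabulary = Arrays.asList("myocardium", "tachycardia");
            List<String> heard = Arrays.asList("my o cardium", "miocardium");
            System.out.println(bestMatch(heard, vocabulary, 3)); // prints "myocardium"
        }
    }
    ```

    This sidesteps training entirely: the recognizer only has to get acoustically close to your term, and your app does the last step of mapping it onto the exact spelling you care about.
    
    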
