Well incoming reports of the problem died off pretty suddenly both here and at the GPF Nexus Help Forum. It seems like there were several potential contributing factors (Assistant roll-out, Gboard update with new Voice Input flow, Android version/security updates, etc.), but the way things magically went back to normal suggests to me a server-side fix.
At any rate, glad to hear that this should be resolved - thanks for reporting back Steven!
Hello Codesplice! Thank you for giving me and others the sensation that the higherups are aware of the problem. However real your "Elite Recognized Moderator" status is, it is a great relief to hear something back from somebody in a position to do something.
That said, this problem just started back up for me. I am not sure when, exactly, it started, as I wasn't near WiFi until today and as we already know the bug only happens on WiFi.
I have been doing some tests by putting my phone on a network I control (linux box as router rather than ISP) and then picking through the tcpdumps with a traffic analyzer. It has been quite revealing, and I believe I have a complete understanding of the bug.
First, I use Voice Input to insert some text. My phone divvies up the audio into packets, opens a port 80 connection to Google, uses that to negotiate an HTTPS handshake, opens an HTTPS connection to Google, then sends the audio over it. After a very short delay, Google uses the already-existing HTTPS connection to send back a very small amount of data: almost certainly the transcription plaintext. But Google also tries to open a second connection, over a seemingly-random port, along which some additional information is sent. Those packets are also encrypted, so I don't know what's in them. But if those packets fail to arrive, then we see the undesired behavior being bemoaned in this thread. I can only assume that those packets include *instructions* on which alternate transcription to choose.
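For anyone who wants to repeat this kind of analysis on their own capture, here is a minimal Python sketch of the idea: find TCP connection attempts (bare SYNs) that arrive at the phone from outside instead of being opened by the phone. Everything in it is an assumption for illustration: the `Packet` record, the addresses, the ports and the sample data are hypothetical, and a real capture would first have to be exported from the traffic analyzer into records like these.

```python
from collections import namedtuple

# Hypothetical, simplified packet record; a real capture has many more fields.
# flags uses short strings: "S" = SYN, "SA" = SYN+ACK, "A" = ACK.
Packet = namedtuple("Packet", "src sport dst dport flags")

PHONE = "192.168.1.50"  # assumed LAN address of the phone


def remote_initiated(packets, phone_ip=PHONE):
    """Return (remote_ip, remote_port, phone_port) for each connection
    attempt that was opened toward the phone from outside."""
    seen = set()
    result = []
    for pkt in packets:
        # A bare SYN (no ACK) addressed to the phone is an inbound attempt.
        if pkt.flags == "S" and pkt.dst == phone_ip:
            key = (pkt.src, pkt.sport, pkt.dport)
            if key not in seen:  # ignore SYN retransmissions
                seen.add(key)
                result.append(key)
    return result


# Made-up sample: one phone-initiated flow, one remote-initiated attempt.
sample = [
    Packet(PHONE, 40001, "203.0.113.10", 443, "S"),
    Packet("203.0.113.10", 443, PHONE, 40001, "SA"),
    Packet(PHONE, 40001, "203.0.113.10", 443, "A"),
    Packet("203.0.113.20", 443, PHONE, 51234, "S"),
    Packet("203.0.113.20", 443, PHONE, 51234, "S"),  # retransmitted SYN
]

print(remote_initiated(sample))
```

With the made-up sample, only the second flow is reported, since the first one was opened by the phone itself.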
This makes sense, because the original transcription must happen very quickly in order to not feel sluggish, and yet (while properly functioning) the text stream can be changed even up to 3 or 4 seconds after the first transcription packets arrive. It's a sort of two-tiered response; the first transcription server creates the base data of every possible transcription, and then the second transcription server uses that data to fine-tune, perhaps after checking an enormous corpus of transcription data and doing computationally expensive analytics. That way it does not feel sluggish, but you still get good accuracy.
But if you want to have different machines doing the work, then my Android device must have two separate TCP connections, one for each server. It makes the first connection on its own, when it sends the audio to Google, and thereafter that connection is treated as Ongoing for the purposes of network routing and firewalls. But the second connection, well. Apparently it is Google's second server attempting to open a *new* connection to my phone. Which means that my router must not only allow for incoming connections, it must also forward that (unknown, seemingly-random) port to my phone ahead of time!
And absolutely *no one*'s router allows all incoming connections by default. The only way this might conceivably work is if Google published the algorithm for deciding what port to make the second connection on, and then users set up a script to forward traffic coming in on that port to their phone whenever the router detects outgoing Voice Input traffic, then switch it back shortly after. This is well beyond the capabilities of most ISP-issued routers, much less most end users.
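To make the router behavior described above concrete, here is a toy Python model of a stateful home router. It is only a sketch under loud assumptions: the `StatefulNat` class, its methods and all addresses are hypothetical, and it pretends the external port always equals the phone's local port, which real NAT does not guarantee. It shows the one point that matters here: replies on a connection the phone opened get through, while a brand-new inbound connection is dropped unless a port forward already exists.

```python
class StatefulNat:
    """Toy model of a home router: remembers outbound flows and drops
    unsolicited inbound traffic unless a static forward matches."""

    def __init__(self, forwards=None):
        # Hypothetical static port-forward table: {external_port: lan_ip}
        self.forwards = dict(forwards or {})
        # Tracked flows: (lan_ip, lan_port, remote_ip, remote_port)
        self.tracked = set()

    def outbound(self, lan_ip, lan_port, remote_ip, remote_port):
        """A LAN host opens a connection; remember it so replies pass."""
        self.tracked.add((lan_ip, lan_port, remote_ip, remote_port))

    def inbound(self, remote_ip, remote_port, dst_port):
        """Return the LAN IP that receives the packet, or None if dropped."""
        for lan_ip, lan_port, r_ip, r_port in self.tracked:
            if (r_ip, r_port, lan_port) == (remote_ip, remote_port, dst_port):
                return lan_ip  # reply on a flow the LAN host started
        # New inbound flow: delivered only if the port was forwarded.
        return self.forwards.get(dst_port)


router = StatefulNat()
router.outbound("192.168.1.50", 40001, "203.0.113.10", 443)

reply = router.inbound("203.0.113.10", 443, 40001)        # established flow
unsolicited = router.inbound("203.0.113.20", 443, 51234)  # new inbound flow
print(reply, unsolicited)
```

Creating the router with `StatefulNat(forwards={51234: "192.168.1.50"})` lets the second packet through, which is the "forward the port ahead of time" case that nobody could realistically configure for an unknown port.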
This hypothesis explains the data: why the bug is tied to WiFi being turned on.
But if this hypothesis is right, the problem ought to be far, far more widespread. The default behavior for most routers is to drop all incoming traffic not associated with an ongoing connection. So any Android device behind any factory-settings router ought to have this bug. Yet clearly this isn't the case. I also suspect that there ought to be a router somewhere between my phone and the 4G network, and that router probably also ought to block this traffic, and yet clearly it doesn't. So there are still some unanswered questions.
And yet. If I route my phone through a linux box, I can make the bug appear instantly with
Then I can make the bug vanish again with
So it must have something to do with that second connection.
If you could forward this to the appropriate place to make sure it gets seen, I would be very grateful (I cba). I also have the raw tcpdumps and wireshark analyses with my comments on request.