3s of lag seems a bit on the high side. Using cheap Huaweis here in the
lab, we typically see about 250-500ms of lag. Note that the cheap
Huaweis seem to have a 2KB hardware audio buffer, leading to (2048 bytes /
2 bytes per sample / 8kHz ~=) 128ms of jitter that we can't do anything about.
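To make the arithmetic above explicit, here is a small sketch. It assumes mono audio and that "2KB" means 2048 bytes; the function name is just for illustration.

```python
def buffer_latency_ms(buffer_bytes: int, bits_per_sample: int, sample_rate_hz: int) -> float:
    """Time needed to fill (or drain) one whole audio buffer, in milliseconds.

    Assumes a single (mono) channel of uncompressed PCM samples.
    """
    bytes_per_sample = bits_per_sample // 8
    samples = buffer_bytes / bytes_per_sample
    return 1000.0 * samples / sample_rate_hz


# 2048 bytes of 16-bit samples at 8 kHz
print(buffer_latency_ms(2048, 16, 8000))  # 128.0
```

If the buffer is really 2000 bytes the figure is 125ms instead, so "about 120ms" is the right ballpark either way.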
We'd like to implement some kind of echo canceler, but we haven't found anything that would be trivial to drop in. Our jitter buffer could probably use some improvements too; I'm not going to claim that it is in any way ideal.
We're not managing network bufferbloat right now. When the wifi channel is busy, the rate at which the kernel can send packets drops, and packets can easily start to back up in the network stack. This can add a significant amount of latency. If packets don't arrive in time, the link state routing engine may drop the link, which limits the number of outgoing packets, which brings the latency back down, and then the link comes back up again...
We should be using RTT estimates to throttle our transmit rate and keep latency low, and feeding them into our link state calculations so we can work out whether a link is really dead or everything on that interface is just slow. But we haven't built anything like that yet.
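To give a rough idea of what that could look like, here is a sketch of a smoothed RTT estimator using the same smoothing constants as TCP (RFC 6298). None of this exists in our code; the class, the `classify` helper and the 200ms target are all made up for illustration.

```python
class RttEstimator:
    """Smoothed RTT and RTT variance, using the RFC 6298 constants."""

    def __init__(self):
        self.srtt = None    # smoothed round trip time, ms
        self.rttvar = None  # smoothed deviation of the RTT, ms

    def update(self, sample_ms: float) -> None:
        if self.srtt is None:
            # The first sample seeds both values.
            self.srtt = sample_ms
            self.rttvar = sample_ms / 2
        else:
            # Variance is updated first, using the previous srtt.
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - sample_ms)
            self.srtt = 0.875 * self.srtt + 0.125 * sample_ms

    def timeout_ms(self) -> float:
        """How long to wait for a reply before suspecting the link."""
        return self.srtt + 4 * self.rttvar


def classify(est: RttEstimator, ms_since_last_reply: float, target_ms: float = 200.0) -> str:
    """Hypothetical policy: tell 'dead' apart from 'just slow'."""
    if est.srtt is None:
        return "unknown"
    if ms_since_last_reply > est.timeout_ms():
        return "dead"   # silent for much longer than the RTT history predicts
    if est.srtt > target_ms:
        return "slow"   # still alive, but we should throttle our transmit rate
    return "ok"
```

The point is that a slow interface pushes the timeout up along with the RTT, so a busy link gets throttled rather than declared dead.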
If we were limiting our transmit rate better, then instead of sending lots of small UDP packets we would try to aggregate more voice packets per UDP frame. Hopefully that would allow the voice call to continue, just with higher jitter.
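As a sketch of the aggregation idea: given a packet rate budget, work out how many codec frames need to share each UDP payload, and pack them with a tiny header. The 20ms frame duration and the wire layout (16-bit sequence number, frame count, one length byte per frame) are assumptions for the example, not our actual packet format.

```python
import math
import struct
from typing import List, Tuple

FRAME_MS = 20  # assumed codec frame duration


def frames_per_packet(max_packets_per_sec: float, frame_ms: int = FRAME_MS) -> int:
    """Fewest frames per packet that keeps us within the packet rate budget."""
    frames_per_sec = 1000 / frame_ms
    return max(1, math.ceil(frames_per_sec / max_packets_per_sec))


def pack_frames(first_seq: int, frames: List[bytes]) -> bytes:
    """Bundle several voice frames (each under 256 bytes) into one payload."""
    parts = [struct.pack("!HB", first_seq & 0xFFFF, len(frames))]
    for frame in frames:
        parts.append(struct.pack("!B", len(frame)))
        parts.append(frame)
    return b"".join(parts)


def unpack_frames(payload: bytes) -> Tuple[int, List[bytes]]:
    """Inverse of pack_frames: returns (first sequence number, frames)."""
    first_seq, count = struct.unpack_from("!HB", payload, 0)
    offset = 3
    frames = []
    for _ in range(count):
        (length,) = struct.unpack_from("!B", payload, offset)
        offset += 1
        frames.append(payload[offset:offset + length])
        offset += length
    return first_seq, frames
```

The cost is delay: with n frames per packet, the first frame waits (n - 1) * 20ms before it can be sent, which is where the extra jitter comes from.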
Rhizome, and hence messaging, doesn't care as much if the network path is unstable. It will just keep on trying until the message gets there, making messaging *much* more reliable.
We have some rough plans to implement a combination of push-to-talk / voicemail, using Rhizome and built into the messaging UI. But we don't have any concrete plans to start this work right now.
And one day we may implement a form of network coding to reduce the number of packets required for bi-directional traffic over multiple hops.
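For anyone who hasn't come across network coding, the textbook example is a relay XORing two packets that are heading in opposite directions. This is only an illustration of the general technique, not a design we have settled on, and the helper name is made up.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two packets, zero-padding the shorter one.

    A real scheme would also need to carry the original lengths so that
    the padding can be stripped again after decoding.
    """
    n = max(len(a), len(b))
    a = a.ljust(n, b"\0")
    b = b.ljust(n, b"\0")
    return bytes(x ^ y for x, y in zip(a, b))


# A and B talk to each other through relay R.
from_a = b"voice frame A->B"
from_b = b"voice frame B->A"

# R has received both packets and broadcasts a single coded packet.
coded = xor_bytes(from_a, from_b)

# Each end removes its own packet to recover the other side's.
assert xor_bytes(coded, from_a) == from_b
assert xor_bytes(coded, from_b) == from_a
```

Without coding the relay has to transmit twice (once towards each end), so the exchange takes four transmissions in total. With the XOR broadcast it takes three.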