[twitter] snapshot of cutting-edge public transformer model research

Wed Jan 26 11:22:16 PST 2022

I am lucky that one of the content-feed AIs out there gives me content
that is neither random nor gross. This Twitter feed surfaces
high-quality artificial intelligence research papers, tutorials, etc.,
with new ones almost every day. I hope it keeps up. I see the AK user
a lot in this feed.

arXiv.org

ShapeFormer: Transformer-based Shape Completion via Sparse Representation

We present ShapeFormer, a transformer-based network that produces a
distribution of object completions, conditioned o...

Read more at Twitter
<https://twitter.com/i/redirect?url=https%3A%2F%2Ftwitter.com%2Fi%2Ftopics%2Fnews%2Fe-1834110365%3Fcn%3DZmxleGlibGVfcmVjcw%253D%253D%26refsrc%3Demail&t=1+1643223174567&cn=ZmxleGlibGVfcmVjcw%3D%3D&sig=6dd62d3b19f23bd95a710fde80f8aecfc442dfcd&iid=4bc5bc5345354663ac8195bc6a777030&uid=1300187465689509891&nid=244+272830480>

----------

AK shared

arXiv.org

PONI: Potential Functions for ObjectGoal Navigation with...

State-of-the-art approaches to ObjectGoal navigation rely on
reinforcement learning and typically require significant...

Read more at Twitter
<https://twitter.com/i/redirect?url=https%3A%2F%2Ftwitter.com%2Fi%2Ftopics%2Fnews%2Fe524318864%3Fcn%3DZmxleGlibGVfcmVjcw%253D%253D%26refsrc%3Demail&t=1+1643223174567&cn=ZmxleGlibGVfcmVjcw%3D%3D&sig=2cba46b9094e58a934dcd84b8f832ab4c306248e&iid=4bc5bc5345354663ac8195bc6a777030&uid=1300187465689509891&nid=244+277024784>

----------

AK shared

arXiv.org

SPIRAL: Self-supervised Perturbation-Invariant Representation...

We introduce a new approach for speech pre-training named SPIRAL which
works by learning denoising representation of ...

Read more at Twitter
<https://twitter.com/i/redirect?url=https%3A%2F%2Ftwitter.com%2Fi%2Ftopics%2Fnews%2Fe-780219390%3Fcn%3DZmxleGlibGVfcmVjcw%253D%253D%26refsrc%3Demail&t=1+1643223174567&cn=ZmxleGlibGVfcmVjcw%3D%3D&sig=7c627670ff6d3547a6c40fdfee659c13ccc83fb9&iid=4bc5bc5345354663ac8195bc6a777030&uid=1300187465689509891&nid=244+281219088>

----------

AK shared

arXiv.org

Improving the fusion of acoustic and text representations in RNN-T

The recurrent neural network transducer (RNN-T) has recently become
the mainstream end-to-end approach for streaming ...

Read more at Twitter
<https://twitter.com/i/redirect?url=https%3A%2F%2Ftwitter.com%2Fi%2Ftopics%2Fnews%2Fe486494415%3Fcn%3DZmxleGlibGVfcmVjcw%253D%253D%26refsrc%3Demail&t=1+1643223174568&cn=ZmxleGlibGVfcmVjcw%3D%3D&sig=f4c2ee0a52bab617406ae14c9294380cafb01ebc&iid=4bc5bc5345354663ac8195bc6a777030&uid=1300187465689509891&nid=244+285413392>

----------

AK shared

arXiv.org

Zero-Shot Long-Form Voice Cloning with Dynamic Convolution Attention

With recent advancements in voice cloning, the performance of speech
synthesis for a target speaker has been rendered...

Read more at Twitter
<https://twitter.com/i/redirect?url=https%3A%2F%2Ftwitter.com%2Fi%2Ftopics%2Fnews%2Fe667714482%3Fcn%3DZmxleGlibGVfcmVjcw%253D%253D%26refsrc%3Demail&t=1+1643223174568&cn=ZmxleGlibGVfcmVjcw%3D%3D&sig=8e558273c59323f7627c6bc5536e9734dfb410f2&iid=4bc5bc5345354663ac8195bc6a777030&uid=1300187465689509891&nid=244+289607696>

----------

AK shared

arXiv.org

Transformer-Based Video Front-Ends for Audio-Visual Speech Recognition

Audio-visual automatic speech recognition (AV-ASR) extends the speech
recognition by introducing the video modality. ...

Read more at Twitter
<https://twitter.com/i/redirect?url=https%3A%2F%2Ftwitter.com%2Fi%2Ftopics%2Fnews%2Fe-1400674422%3Fcn%3DZmxleGlibGVfcmVjcw%253D%253D%26refsrc%3Demail&t=1+1643223174568&cn=ZmxleGlibGVfcmVjcw%3D%3D&sig=c3ca535be11bccd7f5926b54a1d460d520cf7dde&iid=4bc5bc5345354663ac8195bc6a777030&uid=1300187465689509891&nid=244+293802000>