This is the official repository for the paper DiVISe: Direct Visual-Input Speech Synthesis Preserving Speaker Characteristics And Intelligibility.
Code and weights are coming soon; we expect to release them by May 2025.
We have a brief demo page here.