i think stable diffusion can do video now (not sure) but it likely takes extra tools to keep something consistent
maybe it could be first person or there could be distinctive accessories