Please use this identifier to cite or link to this item:
http://bura.brunel.ac.uk/handle/2438/29612
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Islam, T | - |
dc.contributor.author | Miron, A | - |
dc.contributor.author | Liu, X | - |
dc.contributor.author | Li, Y | - |
dc.date.accessioned | 2024-08-27T11:03:23Z | - |
dc.date.available | 2024-08-27T11:03:23Z | - |
dc.date.issued | 2024-08-08 | - |
dc.identifier | ORCiD: Tasin Islam https://orcid.org/0000-0001-7568-9322 | - |
dc.identifier | ORCiD: Alina Miron https://orcid.org/0000-0002-0068-4495 | - |
dc.identifier | ORCiD: Xiaohui Liu https://orcid.org/0000-0003-1589-1267 | - |
dc.identifier | ORCiD: Yongmin Li https://orcid.org/0000-0003-1668-2440 | - |
dc.identifier | 287 | - |
dc.identifier.citation | Islam, T. et al. (2024) 'Dynamic Fashion Video Synthesis from Static Imagery', Future Internet, 16 (8), 287, pp. 1 - 21. doi: 10.3390/fi16080287. | en_US |
dc.identifier.uri | https://bura.brunel.ac.uk/handle/2438/29612 | - |
dc.description | Data Availability Statement: This paper did not generate any new data. | en_US |
dc.description.abstract | Online shopping for clothing has become increasingly popular. However, this trend comes with its own challenges: for example, customers find it difficult to make informed purchase decisions without trying on clothes to see how they move and flow. We address this issue by introducing FashionFlow, a new image-to-video generator that synthesises fashion videos showing how a garment moves and flows on a person. By utilising a latent diffusion model together with several other components, we synthesise a high-fidelity video conditioned on a fashion image. These components include pseudo-3D convolution, a VAE, CLIP, a frame interpolator and attention, which together generate a smooth video efficiently while preserving vital characteristics of the conditioning image. The contribution of our work is a model that synthesises videos from images. We show how a pre-trained VAE decoder processes the latent space to generate a video, and we demonstrate the effectiveness of our local and global conditioners, which help preserve the maximum amount of detail from the conditioning image. Our model is unique in producing spontaneous and believable motion from only one image, whereas other diffusion models are either text-to-video or image-to-video models that rely on pre-recorded pose sequences. Overall, our research demonstrates the successful synthesis of fashion videos featuring models posing from various angles and showcasing the movement of the garment. Our findings hold great promise for enhancing the online fashion industry's shopping experience. | en_US |
dc.description.sponsorship | Engineering and Physical Sciences Research Council (EPSRC) grant number EP/T518116/1. | en_US |
dc.format.extent | 1 - 21 | - |
dc.format.medium | Electronic | - |
dc.language | English | - |
dc.language.iso | en_US | en_US |
dc.rights | Copyright © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). | - |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | - |
dc.subject | diffusion models | en_US |
dc.subject | fashion synthesis | en_US |
dc.subject | generative AI | en_US |
dc.subject | image-to-video synthesis | en_US |
dc.title | Dynamic Fashion Video Synthesis from Static Imagery | en_US |
dc.type | Article | en_US |
dc.date.dateAccepted | 2024-08-07 | - |
dc.identifier.doi | https://doi.org/10.3390/fi16080287 | - |
dc.relation.isPartOf | Future Internet | - |
pubs.issue | 8 | - |
pubs.publication-status | Published online | - |
pubs.volume | 16 | - |
dc.identifier.eissn | 1999-5903 | - |
dc.rights.license | https://creativecommons.org/licenses/by/4.0/legalcode.en | - |
dc.rights.holder | The authors | - |
Appears in Collections: | Dept of Computer Science Research Papers |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
FullText.pdf | Copyright © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). | 17.11 MB | Adobe PDF | View/Open |
This item is licensed under a Creative Commons License
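The abstract above lists pseudo-3D convolution among the components used to generate video efficiently. As an illustrative aside, one common reading of "pseudo-3D" is a (2+1)D factorisation: a 2D spatial filter applied to each frame independently, followed by a 1D filter along the time axis. This is an assumption about what the paper means, since the record gives no implementation details; the sketch below (plain NumPy, valid-padding cross-correlation, no strides or channels) only shows the factorisation idea.

```python
import numpy as np

def pseudo3d_conv(video, k_spatial, k_temporal):
    """Factorised (2+1)D convolution sketch.

    video:      array of shape (T, H, W) -- T frames
    k_spatial:  2D kernel of shape (kh, kw), applied per frame
    k_temporal: 1D kernel of length kt, applied per pixel across frames
    Returns an array of shape (T - kt + 1, H - kh + 1, W - kw + 1).
    """
    T, H, W = video.shape
    kh, kw = k_spatial.shape
    kt = len(k_temporal)

    # Spatial pass: 2D cross-correlation on every frame independently.
    sh, sw = H - kh + 1, W - kw + 1
    spatial = np.zeros((T, sh, sw))
    for t in range(T):
        for i in range(sh):
            for j in range(sw):
                spatial[t, i, j] = np.sum(video[t, i:i+kh, j:j+kw] * k_spatial)

    # Temporal pass: 1D cross-correlation along the frame axis.
    out = np.zeros((T - kt + 1, sh, sw))
    for t in range(T - kt + 1):
        out[t] = np.tensordot(k_temporal, spatial[t:t+kt], axes=1)
    return out

# All-ones input with all-ones 3x3 spatial and length-2 temporal kernels:
# each output value is 3*3*2 = 18.
out = pseudo3d_conv(np.ones((4, 5, 5)), np.ones((3, 3)), np.ones(2))
```

Splitting the full 3D kernel this way replaces kh*kw*kt multiplications per output with kh*kw + kt, which is why such factorisations are a popular efficiency trick in video models.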