Transforming the Shopping Experience with Image-Based Virtual Try-On Technology

Mohamed Illiyas
Jun 6, 2024


Have you ever thought about how amazing it would be to try on clothes without actually going to the store? Yes, that’s exactly what image-based virtual try-on (VTON) technology is bringing to the table.

Hi Guys!

When we shop online, we usually see clothes on models or mannequins that stores use to show them off, which makes it hard to imagine how they’d look on us. Imagine shopping for clothes online and being able to see exactly how they’d look on you before clicking “Buy.” That’s not a dream anymore. It’s happening right now, and it’s going to be the way we shop in the future.

VTON — Virtual Try ON

So, what exactly is VTON? In simple terms, it’s a technology that allows you to see a virtual version of yourself wearing different outfits. This is helpful for the e-commerce industry, as it makes shopping a whole lot more fun. No more guessing if that dress will fit right or if those jeans will match your style — you can see it all in real-time!

Let’s look at a couple of scenarios where we can use VTON.

Scenario 1: Shopping for a Special Event

Imagine you are at your favorite clothing store shopping for a wedding. You roam through every section of the shop and want to try on everything, one outfit after another. But changing in and out of multiple outfits leaves you exhausted. Sometimes the attendants outside the trial rooms won’t allow us to try more than three garments at a time, or won’t allow us to try on white clothes we’re eager to see on ourselves. I’m sure you’ve faced at least one of these situations in your shopping life.

But what if the store offered in-store virtual try-on? You take a snap of yourself, scan the dress with the store’s virtual try-on app, and in seconds your digital self is wearing the dress on the screen: an instant personal virtual fitting room right in the store.

Scenario 2: Casual Everyday Shopping

Now, think about your everyday online shopping. You’re looking for a new pair of jeans. You usually wear skinny jeans, but you’re curious how you’d look in baggy jeans. Instead of buying both and returning one, you can use VTON to see which looks better on you. It saves time and even reduces product returns and the damage that comes with them.

The Big Challenges

The development of Virtual Try-On (VTON) technology didn’t happen overnight. It is the result of ongoing research and exploration, and at every stage there were blockers and shortcomings. Here are a few of VTON’s main challenges.

1. Making It Look Real

The first challenge is making the virtual try-on look as realistic as possible. Early methods used generative adversarial networks (GANs). While GANs are pretty cool, they often had trouble with details like garment folds, natural lighting, human pose angles, and realistic body shapes. Think about how awkward it would be if the dress you’re trying on in the app looks all wrinkled, or the lighting makes it look completely different from how it actually is. We definitely wouldn’t add that dress to our cart, right?

This is where latent diffusion models (LDMs) come into play. These models are like an upgraded version of GANs, producing images that look far more realistic and natural. It’s like going from a rough sketch to a high-definition photograph.
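To get a feel for why diffusion produces cleaner results, here is a toy sketch in plain Python (not the real model, and nothing like the actual OOTDiffusion code): a noisy “latent” is repeatedly nudged toward a clean target, which is the core idea behind the reverse diffusion process. The denoiser here is an oracle that already knows the clean values; in a real LDM, a trained UNet predicts them.

```python
import random

random.seed(0)
clean = [0.2, 0.8, 0.5, 0.9]                     # stand-in for a clean latent
noisy = [c + random.gauss(0, 1) for c in clean]  # fully noised latent

# Reverse diffusion: repeatedly move the sample a small step toward
# what the denoiser predicts the clean latent to be. A real LDM learns
# this predictor (a UNet); here an oracle that knows `clean` stands in.
sample = noisy
for _ in range(50):
    predicted_clean = clean                      # oracle "denoiser"
    sample = [s + 0.1 * (p - s) for s, p in zip(sample, predicted_clean)]

print([round(s, 3) for s in sample])  # close to `clean` after 50 steps
```

Each step shrinks the remaining noise by a constant factor, which is why the final sample lands very close to the clean latent; real models replace the oracle with a learned network and a carefully designed noise schedule.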

2. Keeping the Details

The second challenge is all about the details. When you’re trying on a virtual outfit, you want to see every little pattern, texture, and color just as you would in real life. Traditional methods often used a warping process (fitting one shape onto another) to fit the garment onto your virtual body. While this works to some extent, it can distort the details of the body and make the clothes look less appealing.
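To see why warping loses detail, here is a minimal sketch of the crudest possible warp: nearest-neighbour resampling of a tiny garment mask into a target bounding box. (This is an illustration of the general idea, not any specific method from the paper.) Notice how each source pixel just gets duplicated, so fine texture is smeared rather than preserved.

```python
def warp_to_box(garment, out_h, out_w):
    """Nearest-neighbour resample of a 2-D garment mask so it fits a
    target bounding box: the crude 'fit one shape onto another' idea."""
    in_h, in_w = len(garment), len(garment[0])
    warped = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            src_y = y * in_h // out_h   # map output pixel back to source
            src_x = x * in_w // out_w
            row.append(garment[src_y][src_x])
        warped.append(row)
    return warped

garment = [[1, 1],
           [1, 0]]
print(warp_to_box(garment, 4, 4))
```

Every output pixel simply copies its nearest source pixel, so a 2×2 pattern becomes blocky at 4×4; real warping methods use smoother transforms (e.g. thin-plate splines), but the same kind of detail loss is exactly what the UNet-based approach below avoids.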

Approach

Now, let’s talk about the approach that made VTON better than before.

Outfitting UNet

This UNet architecture ensures that the generated images look realistic and that the try-on effects are natural. It is designed to learn the detailed features of garments in the latent space in a single step: it learns all the little details of the garment in one go, which means no more awkward warping.

Outfitting Fusion

This feature uses self-attention layers (think of it like the technology paying close attention to details) to align the garment perfectly with your body or target.
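At the heart of such fusion layers is scaled dot-product self-attention. Here is a minimal, dependency-free sketch (identity projections for Q, K, and V, made-up numbers; the real layers are far larger): each token’s output becomes a softmax-weighted mix of all tokens, which is how garment features can “look at” body features and align with them.

```python
import math

def self_attention(x):
    """Scaled dot-product self-attention over a tiny feature sequence.
    Each token's output is a softmax-weighted mix of all tokens.
    Q, K, and V are all just `x` here to keep the toy minimal."""
    d = len(x[0])
    out = []
    for q in x:
        # similarity of this token to every token, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in x]
        # numerically stable softmax over the scores
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # weighted average of all token features
        out.append([sum(w * v[j] for w, v in zip(weights, x))
                    for j in range(d)])
    return out

# 2 "garment" tokens + 2 "body" tokens with 2-dim features (toy values)
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
print(self_attention(tokens))
```

Because every output is a convex combination of the inputs, attention can blend garment and body information smoothly instead of hard-pasting one onto the other.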

Outfitting Dropout

This one’s all about control. It adjusts the strength of garment features during training, making sure that the virtual try-on looks just right.
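The paper’s outfitting dropout enables classifier-free-guidance-style control. Here is a hedged sketch of that general recipe (the function names and numbers are my own, not the paper’s): during training the garment conditioning is occasionally zeroed out, and at inference the conditioned and unconditioned predictions are blended with a scale that tunes how strongly the garment drives the result.

```python
import random

def outfitting_dropout(garment_feats, p=0.1, rng=random):
    """During training, zero out the garment conditioning with
    probability p so the model also learns an unconditioned path."""
    if rng.random() < p:
        return [0.0] * len(garment_feats)
    return garment_feats

def guided(cond_out, uncond_out, scale=1.5):
    """Blend conditioned and unconditioned predictions at inference;
    scale > 1 strengthens the garment's influence on the output."""
    return [u + scale * (c - u) for c, u in zip(cond_out, uncond_out)]

# Toy predictions: scale=2.0 pushes the result past the conditioned one
print(guided([0.8, 0.2], [0.5, 0.5], scale=2.0))
```

Setting the scale to 1.0 recovers the plain conditioned prediction, while larger values exaggerate the garment features, which is exactly the kind of strength control the dropout mechanism is meant to provide.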

Limitations

Of course, no technology is perfect. VTON still has some limitations.

For example, it can struggle with cross-category try-ons, like putting a T-shirt on someone who’s currently wearing a long dress, or vice versa. This happens because the models are trained on paired images of people and garments, and they can get confused by such a change.

Another issue is that some details of your original image might get altered.

For example, if you have a tattoo or a watch, VTON might mask and repaint those areas, leading to a loss of information from the image. Sometimes it might even change the muscle definition in your arms.

Future Enhancements:

The first issue can be partially addressed by collecting more diverse datasets of individuals wearing different clothes in the same pose. The second may require pre-processing and post-processing to retain the original details while fitting the clothes onto the person. So, future research should focus on developing more practical processing methods to mitigate these issues.

Conclusion

Virtual try-on technology is going to change the way we approach the e-commerce industry. It brings convenience, fun, and accuracy to online shopping, helping consumers make better choices and reducing costs for retailers and marketers. As this technology evolves, we can look forward to an even more interesting shopping experience. We already have Lenskart, where we can try on glasses. Now let’s try on our clothes too!

Reference:

Research Paper: OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on

Github: https://github.com/levihsu/OOTDiffusion

Cheers!
