How Do Machine Learning Models Work in Virtual Fitting Room Apps?


Damian Kosowski

Updated Jul 10, 2023 • 18 min read
VTO

Machine learning and augmented reality have the potential to turn the world of retail on its head. How? Via virtual fitting room apps. These offer a personalized, realistic experience, and 36% of Americans are interested in trying them.

“Return rates for online shopping are over 60%, so the possibility of trying clothes on virtually before you physically have to do that will make a huge difference to consumer experience,” says Matthew Drinkwater, Head of the Fashion Innovation Agency at the London College of Fashion.

Enter virtual try-on technology, which combines machine learning and augmented reality (AR). Virtual fitting has many uses in retail, from sampling makeup to trying on outfits or shoes.

Virtual try-ons offer a way for people to see how items look and find a perfect fit for different body types, without actually visiting a store – especially useful if they prefer shopping online. Equally, some applications are found in-store too.

From Converse back in 2012 to Gucci and Burberry more recently, the technology is advancing quickly, enhancing customer experience and satisfaction while boosting sales. The range of retailers using augmented reality to implement virtual fitting room apps is varied and expanding.

What is virtual fitting room technology?

Virtual fitting room, or virtual try-on, is a general term for technology that lets customers visualize, in an accessible way, how items like clothes, accessories, shoes, jewelry, and makeup look on their own body.

It doesn't require leaving home or physically buying the item: a virtual fitting room app provides a convenient way of trying on items wherever and whenever a customer chooses.

Most of the time, virtual try-ons involve a dedicated mobile application as a part of the online store experience, but it’s also possible to use a desktop with a webcam, or even a single photo.

Sometimes, virtual fitting rooms take place in shops in the form of magic or smart mirrors. For example, Timberland implemented a virtual mirror. Regardless, the technology is the same.

Given an image or a live video feed, machine learning algorithms find objects like a t-shirt, shoes, feet, or ears – depending on what the customer wants to try on. Next, the virtual try-on item is rendered onto the image using augmented reality, blending the virtual item with the real-world image. And voilà, that’s the whole process.

Of course, retailers that want to create virtual fitting rooms have to handle a lot of technicalities, because the quality of the machine learning model must be good. Otherwise, it may fail to detect a given object. For example, training data that incorporates a range of ethnicities is crucial with makeup fitting room apps.

Moreover, using augmented reality is not straightforward. Why? Factors like object scaling, orientation, correct handling of occlusion, and proper lighting are just a few things to cover in order to provide a realistic experience.

Virtual experiences make it easier, faster, and more convenient to try on multiple clothes in different sizes, without searching for items in-store and dealing with cumbersome fitting rooms. Instead, it’s possible to see how you look in a top or wearing a pair of shoes at the touch of a button.

Virtual try-on apps also offer valuable help when shopping online, by providing suggestions about sizing. A customer doesn’t need to order the same item in two or three different sizes, only to return some (or all) of them later.

The same advantages apply when shopping in a mall, where a virtual mirror is present. Plus, there are no queues, because the whole process is much quicker.

What’s a virtual fitting room app?

Virtual fitting rooms are mobile, desktop, or web applications that allow consumers to use virtual try-on technology. From the customer’s point of view, they scroll through and open items they like – no different to a ‘normal’ online shopping experience in that respect. What’s different is the addition of a button to activate the virtual fitting room.

At that moment, a life-like 3D model of the item downloads onto their device, and the camera starts capturing a live feed or takes a single photo. Knowing something fits or looks good makes shoppers more confident about parting with their hard-earned money.

For brands, virtual fitting room apps can impact marketing, sales, and conversion rates by offering realistic, personalized AR experiences in seconds. The technology also offers data insights, enabling product recommendations, encouraging upselling, and boosting customer engagement.

Virtual try-ons also help retailers keep track of which goods are selling the most, and they reduce costs associated with returns. However, businesses need more than virtual fitting rooms and AR experiences to drive sales. Instead, they should form part of a company’s wider ecosystem.

Moreover, the technology itself must be effective and offer an authentic experience; otherwise, it defeats the objective of what you’re trying to achieve.

How does a virtual try-on app work?

From a technical point of view, you need a few things. First, you need to locate the person in the image and understand it: which part of the image is a t-shirt, a foot, or a face, and where exactly it sits. Next, you set up the key points you’re interested in, such as the edges of clothes or facial details like the mouth or eyes.

The selection of proper key points depends on the use of the virtual fitting room. For instance, they’re different for makeup and shoes, and most of the time they’re produced by a machine learning model trained for the specific case in question. When video is used, the key points must also be tracked, because as the person moves, the virtual items must move accordingly.
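
To make that concrete, here’s a minimal sketch of face key-point detection using the open-source MediaPipe library. The library choice, file name, and parameters are illustrative assumptions, not what any particular retailer ships:

```python
import cv2
import mediapipe as mp

# Illustrative model choice -- production apps may use proprietary detectors.
face_mesh = mp.solutions.face_mesh.FaceMesh(
    static_image_mode=False,   # video mode enables internal tracking
    max_num_faces=1,
    refine_landmarks=True,     # extra detail around eyes and lips
)

image = cv2.imread("selfie.jpg")   # hypothetical input frame
results = face_mesh.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

if results.multi_face_landmarks:
    h, w = image.shape[:2]
    # Landmarks are normalized to [0, 1]; convert to pixel coordinates.
    points = [(int(p.x * w), int(p.y * h))
              for p in results.multi_face_landmarks[0].landmark]
    print(f"Detected {len(points)} face key points")
```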

Now, it’s time to display the virtual items – for example, a 3D model of shoes or a 2D makeup look. Even though those cases are different, you need to account for the same problems. In the case of 3D models, proper orientation and scaling are needed. Scanning a 3D model of each shoe size is expensive, so it may be easier to scan just one and post-process it to fit a person.
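
As a rough illustration of that post-processing step, a uniform rescale can stretch one scanned model to an estimated real-world measurement. The function and numbers below are hypothetical, a simplification of what a production pipeline would do:

```python
import numpy as np

def scale_model_to_fit(vertices: np.ndarray, target_length_mm: float) -> np.ndarray:
    """Uniformly rescale a scanned model (N x 3 vertices) so its longest
    axis matches an estimated real-world measurement."""
    extents = vertices.max(axis=0) - vertices.min(axis=0)
    return vertices * (target_length_mm / extents.max())

# Hypothetical usage: one scanned shoe, rescaled to the foot length
# estimated during detection.
shoe = np.random.rand(1000, 3) * 300              # stand-in for a real scan
fitted = scale_model_to_fit(shoe, target_length_mm=265.0)
```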

With both 2D and 3D items, estimating light is a crucial step, because light intensity and direction affect the final color, shading, and cast shadows.

Light can be the difference between a poor and great user experience, and how realistic the rendered item looks.
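
A very crude way to approximate ambient light from the camera frame is shown below; production AR frameworks expose far richer light estimates, so treat this purely as a sketch:

```python
import cv2
import numpy as np

def estimate_ambient_light(frame_bgr: np.ndarray):
    """Rough ambient-light estimate: mean luminance in [0, 1] plus an
    average color tint to modulate the rendered item's base color."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    intensity = float(gray.mean()) / 255.0
    tint = frame_bgr.reshape(-1, 3).mean(axis=0) / 255.0   # B, G, R
    return intensity, tint

frame = np.full((720, 1280, 3), 180, dtype=np.uint8)   # stand-in camera frame
intensity, tint = estimate_ambient_light(frame)
# A renderer could multiply the item's albedo by `intensity * tint`.
```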

Occlusion handling is also important, requiring depth information to partly cover virtual items when needed. All these problems should be accounted for, to render virtual items as realistically as possible. Blending virtual items with real images using augmented reality isn’t simple, especially when you’re aiming for an authentic impression of a real item.

Luckily for both iOS and Android, there are specialized libraries that handle some of these issues, or make them easier to solve. That’s also the reason why most virtual try-on apps are available via smartphone. Thanks to quality cameras and additional sensors, phones provide a ton of valuable information while being convenient to use.

Regardless of the specifics, a fitting room application must be easy to use and look as realistic as possible.

Machine learning and AR drive virtual fitting rooms

These technologies are at the forefront of virtual try-on apps, from makeup and shoes to clothing. Read on for details and specific retail examples.

Virtual makeup try-on

Virtual fitting rooms dedicated to makeup – eye shadows, lipsticks, or even contact lenses – depend on models that can accurately detect specific face landmarks. Depending on the use case, these models detect a number of points, describing the geometry of the face, eyes, nose, or mouth.

Face landmark detection was previously based on classical machine learning techniques like decision tree ensembles. Gradually, however, they’ve been replaced or combined with deep convolutional networks, which offer greater accuracy.

Like other machine learning approaches, convolutional networks rely on training data – in the makeup case, face images. An appropriate level of variation in those images is necessary when training models for makeup try-on apps.

These models must be robust to different facial expressions and poses, because they’re usually deployed in real-time environments where users experiment freely with the app.

Additionally, people may wear accessories like sunglasses when using virtual try-on applications for makeup, so the training data should take that into account. That way, the final solution doesn’t render makeup on occluded regions of the face. Rigorous bias analysis should also be conducted, because in production these models will encounter a wide range of skin tones and other facial features.
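
As one illustration of building in that robustness, training images can be augmented with pose, lighting, and occlusion variation. The transforms and ranges below are assumptions, not any vendor’s recipe:

```python
import torchvision.transforms as T

# Illustrative augmentation pipeline for face images. Caveat: in a real
# landmark-training pipeline, geometric transforms (rotation, flip) must
# also be applied to the landmark coordinates, which a plain Compose
# over images alone does not do.
augment = T.Compose([
    T.RandomRotation(degrees=20),                    # head-pose variation
    T.ColorJitter(brightness=0.4, contrast=0.3,
                  saturation=0.3, hue=0.05),         # lighting / tone shifts
    T.RandomHorizontalFlip(p=0.5),
    T.ToTensor(),
    T.RandomErasing(p=0.3),                          # crude occluder stand-in
])
```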

Currently, most makeup try-on solutions are deployed on mobile devices, offering a real-time, immersive video try-on experience.

That does, however, introduce another type of challenge – handling potential jitter. Landmarks shouldn’t jump from one place to another in subsequent video frames.

Usually, a stabilization pipeline is introduced to handle that issue. A motion tracking algorithm may also be included – one that reduces unnecessary computation whenever there isn't much movement.
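
One simple stabilization approach is an exponential moving average over landmark positions. The sketch below is illustrative; real pipelines often use more sophisticated filters:

```python
import numpy as np

class LandmarkSmoother:
    """Exponential moving average over landmark positions -- one simple
    way to suppress jitter (production pipelines often use fancier
    filters, such as the One Euro filter)."""

    def __init__(self, alpha: float = 0.4):
        self.alpha = alpha      # lower alpha = smoother but laggier
        self._state = None

    def update(self, landmarks: np.ndarray) -> np.ndarray:
        if self._state is None:
            self._state = landmarks.astype(float)
        else:
            self._state = self.alpha * landmarks + (1 - self.alpha) * self._state
        return self._state

smoother = LandmarkSmoother()
# Per video frame: stable = smoother.update(raw_landmarks)
stable = smoother.update(np.random.rand(468, 2))   # stand-in landmarks
```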

Although the machine learning model is a crucial part of the virtual fitting room pipeline, it’s not the only component. After detecting landmarks, the makeup must be rendered.

The exact color of the eyeshadow, the subtle gloss of the lipstick, smooth transitions – all these things – must be preserved, to achieve a natural effect. A range of computer vision techniques such as texture transfer methods and blending help achieve that.
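
A minimal example of such blending: fill the lip region, located from the detected landmarks, with a flat color at partial opacity. The mask source and values here are placeholders:

```python
import cv2
import numpy as np

def apply_lip_color(frame_bgr, lip_mask, color_bgr=(60, 40, 180), opacity=0.55):
    """Blend a flat lipstick color into the masked lip region. `lip_mask`
    is a float array in [0, 1] -- e.g. a polygon filled from the lip
    landmarks, then Gaussian-blurred for soft edges."""
    color_layer = np.empty_like(frame_bgr)
    color_layer[:] = color_bgr
    alpha = (lip_mask * opacity)[..., None]        # per-pixel blend weight
    return (frame_bgr * (1 - alpha) + color_layer * alpha).astype(np.uint8)

frame = np.zeros((720, 1280, 3), dtype=np.uint8)       # stand-in frame
mask = np.zeros((720, 1280), dtype=np.float32)         # stand-in lip mask
out = apply_lip_color(frame, cv2.GaussianBlur(mask, (15, 15), 0))
```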

Beauty brands make good use of machine learning and AR to offer personalized makeovers. If you’re a body art fan, love sampling new nail polish, hair colors, and beauty products, check out a virtual cosmetics try-on app. Read on for a few examples.

Sephora virtual fitting room

This French multinational retailer is one player that’s entered the try-on makeup world. Their in-store and mobile app lets you virtually try on products like lipstick, eyeliner, or eyeshadow. Using face segmentation technology, augmented reality, and machine learning tools, you can see how different shades look without actually putting them on.

Sephora’s Virtual Artist allows app users to look at the screen and see themselves ‘wearing’ totally different makeup, over the top of their actual makeup. One of the features Sephora offers is miming a kiss at the screen, which signals the technology to change the makeup color, chosen from a large range shown along the bottom of the screen.

You can also control the app and choose colors via touch. How do virtual makeovers compare to the real thing? The virtual result is pretty close to real life, according to testing carried out by CNN Business: "The technology did a pretty good job matching up to reality."

MAC Cosmetics virtual try-on

Canadian company MAC partnered with YouCam (best known for a photo editing application) to introduce an online virtual try-on app that's convenient and personalized. Via pictures and live video, the nifty augmented reality technology allows users to swatch multiple products on themselves.

How? By creating realistic simulations to test on any skin tone, adapted to various makeup textures like matte, sheen, and gloss. MAC offers more than 200 shades of lip and eye colors, and says: “Through our Virtual Try-On feature, we’re able to bring a quintessential element of the in-store experience right to our consumers’ homes.”

Take part in a virtual lipstick try-on or virtually wear an eye shadow by heading to the 'Virtual Try On' page and picking a makeup product you'd like to sample. Next, enable your live camera or upload a photo of yourself (or a model if you'd prefer). You can test out a unique makeup look before you hand over any cash.

Ulta Beauty try-on app

Ulta Beauty is an American chain of beauty stores using machine learning and computer vision to offer app users a realistic, personal virtual try-on experience with true-to-life makeup textures, finishes, and colors. The mobile app supporting the technology is called GlamLab; it allows you to use either your camera or a photo to virtually sample makeup.

Using data insights, Ulta Beauty also gives people bespoke makeup suggestions. And in 2020, they expanded further to let people virtually try on brows, hair color, and lashes.

Shoes virtual fitting rooms

Virtual fitting apps for shoes require a different machine learning approach to virtual makeup try-on. Here, rather than finding point coordinates in 2D, you must find the orientation of the object, its size in all three dimensions, and its distance from the camera.

These parameters are often found indirectly by localizing a 3D bounding box – a cuboid containing the object you want to detect. In this case, the object is the user’s foot.

Although convolutional neural networks are usually the backbone of 3D object detection architectures, the setup to obtain the desired output is different to makeup. Often, post-processing techniques are implemented to recover the 3D coordinates from 2D projections of the bounding box vertices.
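
As a sketch of that post-processing, OpenCV’s solvePnP can recover an object’s rotation and translation from the 2D projections of its 3D bounding box corners, given the camera intrinsics. The box dimensions and pose below are synthetic placeholders:

```python
import cv2
import numpy as np

# Eight corners of the foot's 3D bounding box in its own frame (meters);
# the dimensions are illustrative placeholders.
L, W, H = 0.27, 0.10, 0.12
object_points = np.array([[x, y, z] for x in (0, L)
                          for y in (0, W) for z in (0, H)], dtype=np.float32)

camera_matrix = np.array([[1500, 0, 960],
                          [0, 1500, 540],
                          [0, 0, 1]], dtype=np.float32)

# Synthesize the 2D corner projections a detector would predict,
# using a known pose, then recover that pose with solvePnP.
true_rvec = np.array([0.1, 0.4, 0.0], dtype=np.float32)
true_tvec = np.array([0.0, 0.0, 0.8], dtype=np.float32)
image_points, _ = cv2.projectPoints(object_points, true_rvec, true_tvec,
                                    camera_matrix, None)

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, camera_matrix, None)
# rvec/tvec give the box's orientation and distance from the camera,
# fixing the scale and placement of the rendered shoe model.
```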

Lack of training data is a challenge for 3D object detection, and the process of collecting labeled data is costly compared to face landmark detection.

Instead, synthetic data may be generated, to increase the number of training examples and speed up the collection process.

To provide a proper augmented reality experience, occlusion must be handled in these application types as well. Rather than handling occlusion directly in the 3D object detection model itself, applications typically rely on a scene reconstruction: a model of the 3D world that must be generated to avoid rendering occluded parts of objects.

The scene reconstruction is built from a depth map, providing information on how far each point in the scene is from the camera. It’s possible to generate depth maps from photos using computer vision algorithms such as stereovision, or with the assistance of dedicated hardware like a lidar scanner. End-to-end deep learning approaches have also been deployed for this task.
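
Once a depth map is available, occlusion handling can reduce to a per-pixel depth test, as in this simplified sketch:

```python
import numpy as np

def composite_with_occlusion(frame, item_rgb, item_depth, scene_depth):
    """Per-pixel depth test: draw the rendered item only where it is
    closer to the camera than the real scene, so e.g. a trouser hem
    correctly hides the back of a virtual shoe. Depths in meters;
    np.inf marks pixels where no item was rendered."""
    visible = item_depth < scene_depth
    out = frame.copy()
    out[visible] = item_rgb[visible]
    return out

h, w = 720, 1280
frame = np.zeros((h, w, 3), dtype=np.uint8)      # stand-in camera frame
item_rgb = np.zeros((h, w, 3), dtype=np.uint8)   # stand-in rendered item
item_depth = np.full((h, w), np.inf)             # no item pixels here
scene_depth = np.ones((h, w))                    # e.g. from lidar or stereo
out = composite_with_occlusion(frame, item_rgb, item_depth, scene_depth)
```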

The final step in an AR shoe try-on application is rendering the shoe itself. Here, the solution is quite different from a virtual makeup try-on. Rather than a 2D texture with metadata for transferring color, you need a full 3D model of the shoe.

Source that from a computer graphics designer, who may well build the 3D model with the help of a professional 3D scanner, or reconstruct it from a set of photos using photogrammetry – an approach yielding increasingly impressive results.

A high-quality virtual shoe fitting room app accurately represents the shoe’s material and recreates how it interacts with light. Estimating lighting conditions and utilizing the texture and surface data provided in 3D models are important for obtaining a natural effect. By doing that, you offer a realistic virtual try-on experience and please users testing the look of their desired pair of footwear.

Nike try-on app

This collaboration between Nike and JD Sports enabled users to virtually try on shoes – specifically, Nike’s Air VaporMax. The AR shoe try-on experience was made possible via TikTok and its Creative Lab.

JD launched a 3D interactive social campaign, allowing TikTok users to virtually superimpose the sneaker model in a trio of colors.

Reebok virtual fitting app

Virtual shoe try-on was also embraced by another sneaker giant – Reebok. In May 2020, the company was preparing to launch the Nano X, a new training shoe.

At that time, in the midst of the global pandemic, they needed to think outside the box. Enter Reebok’s AR experience, whereby their commercial and experiential agency Bastion EBA worked with Wanna Kicks to create a virtual shoe try-on mobile app. The initiative created hype, with 3,600 virtual try-ons.

Virtual fitting rooms for clothes

The standard virtual try-on setup for a clothes fitting app takes two inputs: an image of the person who wants to try on a particular outfit, and an image of the clothing itself. Most existing solutions use that setup, so you’re working with images rather than real-time video. The level of difficulty depends on the posture of the person, interactions between clothes and body parts, and the amount of detail in the clothing.

Generative adversarial networks (GANs) are often deployed with virtual fitting room apps for clothing, to synthesize the person in the clothing. These models are part of a larger pipeline that needs to be built, where different body parts and clothes must be segmented.

For example, if the person wants to try on a long-sleeved shirt but the reference image shows the person wearing a t-shirt, the model must take that information into account, by generating a new texture to cover the arms appropriately.

In that case, segmentation involves marking particular pixels in the image as members of a semantically similar class – clothes or body parts. That way, the model also knows which parts of the image to preserve, avoiding altering the person’s identity.
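
For illustration, here’s a minimal segmentation sketch using a general-purpose pretrained model from torchvision. Real try-on pipelines use human-parsing models with clothing-specific classes, for which this generic ‘person’ model is only a stand-in:

```python
import torch
import torchvision
from torchvision import transforms as T
from PIL import Image

# Generic pretrained model as a stand-in; real try-on pipelines use
# human-parsing models with clothing-specific classes.
model = torchvision.models.segmentation.deeplabv3_resnet50(weights="DEFAULT")
model.eval()

preprocess = T.Compose([
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

image = Image.open("reference_photo.jpg").convert("RGB")  # hypothetical input
with torch.no_grad():
    logits = model(preprocess(image).unsqueeze(0))["out"][0]
labels = logits.argmax(0)          # per-pixel class index
person_mask = labels == 15         # class 15 = "person" in the VOC label set
```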

These pipelines contain algorithms and machine learning models that spatially transform the target clothing. While the target clothing is often presented in a standard layout in its product image, the range of poses we can expect from the user in the reference photo is wide.

Handling these poses requires a geometric transformation – or image warping, as it’s often called. With each pose, a natural deformation of the clothing will occur.

A proper solution should recreate this deformation while preserving the rich photo-realistic details of the clothing itself, including texture and logos.
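
The simplest concrete instance of image warping is a perspective warp between corner correspondences, sketched below with OpenCV. Research systems typically estimate denser thin-plate-spline warps with a network, but the mechanics are analogous; all values here are placeholders:

```python
import cv2
import numpy as np

# Map the flat product photo's corners onto four target points predicted
# for the wearer's pose; all coordinates here are placeholders.
garment = np.zeros((400, 300, 3), dtype=np.uint8)   # stand-in product photo
h, w = garment.shape[:2]

src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
dst = np.float32([[30, 50], [w - 60, 20], [w - 30, h - 15], [45, h - 40]])

M = cv2.getPerspectiveTransform(src, dst)           # 3x3 homography
warped = cv2.warpPerspective(garment, M, (w, h))    # deformed garment image
# `warped` would then be blended into the person image by the
# generative model, preserving texture and logos.
```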

Last but not least, occlusions may occur. For instance, the person in the reference image may hold their hand in front of the clothing, or the app may need to transfer one garment while preserving other wardrobe items – for example, trying on a blouse while the person wears a scarf. The previously mentioned combination of generative and segmentation models is designed to solve such challenges.

Farfetch virtual fitting app

In March 2021, British-Portuguese online luxury fashion retail platform Farfetch revealed it was planning to extend an AR experience pilot scheme with tech start-up Zeekit. The initiative was all about developing an immersive try-on experience. Zeekit’s Switch Model technology allowed Farfetch customers to virtually try on clothes on models of different sizes as they shopped.

Furthermore, Zeekit’s Fitting Room technology enabled private clients to choose and style whole outfits. A small cohort of these private clients had the chance to use their own image, styling themselves rather than a model in different outfits.

Additionally, in May 2021, Farfetch and Prada announced they were working with Snap Inc. to offer AR shopping features.

Kohl’s try-on app

Kohl’s, the largest department store chain in the United States, connected with customers in August 2020 via an augmented reality clothes try-on experience using Snapchat.

Kohl’s Augmented Reality (AR) Virtual Closet gave online shoppers the opportunity to use Snapchat’s Selfie Lens to virtually try on clothes like the Levi’s Trucker Jacket.

Machine learning and virtual fitting rooms

Technology is revolutionizing the retail industry, with machine learning and AR enabling virtual try-on. From clothes and makeup to shoes, potential customers can sample products and items without physically trying them on. That can take place from their home via a virtual fitting room, or in-store using a virtual mirror.

Creating a realistic, personal experience is the key to successful virtual fitting rooms. However, it poses a number of challenges that require a team with robust technical and business expertise.

From deep convolutional networks and 3D object detection architectures to generative adversarial networks and face segmentation technology, the mechanics are there, enabling developers and brands to forge ahead transforming the retail sector, and other industries.
