Lun.ua, a leading real estate aggregator in Ukraine, has reportedly launched the second version of its AI-based image recognition (IR) algorithm to fight duplicate ads.
In Ukraine, realtors advertise almost every available apartment on almost every portal to attract more buyers. As a result, a single flat can be represented by hundreds or even thousands of ads on dozens of websites.
Lun aggregates millions of ads from 100+ websites in Ukraine, so the problem of de-duplication is one of the most important to solve.
Before 2015, Lun used only textual parameters, such as address and price, to find similar ads and group them into one card. The rise of AI, ML, and convolutional neural networks led to the implementation of the first version of the image recognition (IR) algorithm in 2016.
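To illustrate what grouping by textual parameters can look like, here is a minimal Python sketch. It is not Lun's actual code: the field names (`id`, `address`, `price`), the sample ads, and the exact-match rule on a normalized address plus price are all assumptions made for the example.

```python
from collections import defaultdict


def normalize(text):
    # Lowercase and collapse whitespace so trivially different spellings match.
    return " ".join(text.lower().split())


def group_by_text(ads):
    """Group ads whose normalized address and price are identical.

    Hypothetical rule: the source does not say how Lun compared text fields.
    """
    cards = defaultdict(list)
    for ad in ads:
        key = (normalize(ad["address"]), ad["price"])
        cards[key].append(ad["id"])
    return dict(cards)


# Made-up sample ads, for illustration only.
ads = [
    {"id": 1, "address": "12 Khreshchatyk St, Kyiv", "price": 85000},
    {"id": 2, "address": "12  khreshchatyk st, kyiv", "price": 85000},
    {"id": 3, "address": "7 Saksahanskoho St, Kyiv", "price": 60000},
]
cards = group_by_text(ads)
```

In this sketch, ads 1 and 2 fall into one card and ad 3 into another. An exact match like this breaks as soon as a realtor words the address differently or changes the price slightly, which is one reason text alone is a weak signal for de-duplication.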
It could split all images into seven classes (kitchen, toilet, living room, bedroom, outdoor, facade, corridor) with an accuracy of 92%, which made it possible to separate images that are significant for detecting duplicates (indoor images) from insignificant ones (outdoor images).
The recently launched second version of the IR algorithm can split images into 28 classes with an accuracy of 98.6% (only 1.4% errors), which gives the de-duplication algorithm more accurate data for grouping ads into one card.
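The indoor/outdoor split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class labels are the seven first-version classes named in the text, the labels are assumed to come from some classifier that is not shown, the file names are invented, and treating "facade" as an outdoor class is a guess rather than something the source confirms.

```python
# Assumption: "facade" counts as outdoor alongside "outdoor".
OUTDOOR_CLASSES = {"outdoor", "facade"}


def significant_images(labeled_images):
    """Keep only images whose predicted class is indoor.

    labeled_images is a list of (file_name, predicted_class) pairs.
    """
    return [name for name, label in labeled_images
            if label not in OUTDOOR_CLASSES]


# Hypothetical classifier output for the photos of one ad.
photos = [
    ("a.jpg", "kitchen"),
    ("b.jpg", "facade"),
    ("c.jpg", "bedroom"),
    ("d.jpg", "outdoor"),
]
indoor = significant_images(photos)
```

Here only `a.jpg` and `c.jpg` would be passed on to duplicate detection. The idea is that a photo of a building facade can be shared by many different flats in the same building, while interior photos are far more specific to one flat.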