Style in the Long Tail: Discovering Unique Interests with Latent Variable Models in Large Scale Social E-commerce

Introduction

The site studied here is an online marketplace for handmade and vintage items, with over 30 million active users and 30 million active listings. The marketplace is known for its diverse and eclectic content (e.g. Figure 1); people come to find unusual items that match the peculiarities of their style. Indeed, the site in its entirety could be considered part of the e-commerce long tail: in addition to covering wide-ranging functions and styles, the handmade and vintage nature of the site means that most items for sale are unique.

Related Work
(1)Recommendation Systems

Recommender systems are nothing new, with the first papers on collaborative filtering appearing in the 1990s. The range of techniques available when building recommender systems is vast, too broad to cover here. For a good overview of common techniques, we urge the curious reader to read the survey of Adomavicius and Tuzhilin. Also of note is the work of Koren, Volinsky and others describing the approaches that won the Netflix Prize.

(2)Latent Dirichlet Allocation (LDA)

Latent Dirichlet Allocation (LDA) is an unsupervised, probabilistic, generative model that aims to find a low dimensional description that can summarize the contents of large document collections.
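For reference, the generative process that LDA assumes can be written out as below. This is the common smoothed formulation, not something stated in the text above; the symbols (K topics, D documents, N_d words in document d, Dirichlet hyperparameters alpha and beta) are notation chosen here for illustration.

```latex
\begin{align*}
\phi_k &\sim \mathrm{Dirichlet}(\beta), & k &= 1,\dots,K \\
\theta_d &\sim \mathrm{Dirichlet}(\alpha), & d &= 1,\dots,D \\
z_{d,n} \mid \theta_d &\sim \mathrm{Categorical}(\theta_d), & n &= 1,\dots,N_d \\
w_{d,n} \mid z_{d,n}, \phi &\sim \mathrm{Categorical}(\phi_{z_{d,n}})
\end{align*}
```

Here \phi_k is the word distribution of topic k and \theta_d is the topic mixture of document d. The low dimensional description of a document is its \theta_d, a vector of length K rather than of vocabulary size.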

Identifying User Interests
(1)Social E-commerce

There are four important entities:
“User”: Anyone registered on our website, including sellers.
“Seller”: A user who owns a shop.
“Shop”: A collection of items sold by the same seller. Each shop has its own online storefront.
“Listing”: Products/items listed in a shop, each with its unique listing id.
To give an idea of scale, we currently have approximately 1 million active sellers/shops, 30 million active listings, and 30 million active members.
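The four entities above could be modeled roughly as follows. This is a minimal sketch only: every class and field name (User, Shop, Listing, shop_id, favorite_listing_ids and so on) is hypothetical and chosen for illustration, not taken from the site's actual schema. A seller is represented here as a user whose shop_id is set.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass(frozen=True)
class Listing:
    listing_id: int  # unique per listing
    shop_id: int     # the shop this listing belongs to


@dataclass
class Shop:
    shop_id: int
    seller_id: int   # user_id of the owning seller (assumed one seller per shop)
    listing_ids: List[int] = field(default_factory=list)


@dataclass
class User:
    user_id: int
    shop_id: Optional[int] = None  # set only when the user is a seller
    favorite_listing_ids: Set[int] = field(default_factory=set)

    @property
    def is_seller(self) -> bool:
        # Every seller is also a user; the reverse does not hold.
        return self.shop_id is not None


# Hypothetical usage: one buyer, one seller with a single-listing shop.
buyer = User(user_id=1)
seller = User(user_id=2, shop_id=10)
shop = Shop(shop_id=10, seller_id=seller.user_id, listing_ids=[100])
listing = Listing(listing_id=100, shop_id=shop.shop_id)
buyer.favorite_listing_ids.add(listing.listing_id)
```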

(2)Inferring User Interests

Our use of LDA is based on the premise that users with similar interests will act upon similar listings. We chose to use the social action of “favoriting” listings as a reliable signal for user style. We use this in lieu of more traditional user intent signals, such as “purchasing”, which are commonly used in collaborative filter development.
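One way to make this premise concrete is to treat each user as a "document" whose "words" are the listings that user has favorited, and then fit LDA so that each user gets a mixture over latent interests. The sketch below assumes that mapping and uses a small collapsed Gibbs sampler written in pure Python for illustration; it is not the inference method or implementation described by the authors, and the function names, hyperparameters (alpha, beta, iters) and toy data are all assumptions.

```python
import random
from collections import defaultdict


def build_documents(favorites):
    """Group (user_id, listing_id) favorite events into one 'document' per user."""
    docs = defaultdict(list)
    for user_id, listing_id in favorites:
        docs[user_id].append(listing_id)
    return dict(docs)


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-user topic proportions."""
    rng = random.Random(seed)
    vocab_size = len({w for words in docs.values() for w in words})
    doc_topic = {u: [0] * num_topics for u in docs}             # count of topic k in user u
    topic_word = [defaultdict(int) for _ in range(num_topics)]  # count of listing w in topic k
    topic_total = [0] * num_topics                              # total assignments to topic k
    assign = {}

    # Random initial topic assignment for every favorite.
    for u, words in docs.items():
        assign[u] = []
        for w in words:
            k = rng.randrange(num_topics)
            assign[u].append(k)
            doc_topic[u][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1

    for _ in range(iters):
        for u, words in docs.items():
            for i, w in enumerate(words):
                # Remove the current assignment from the counts.
                k = assign[u][i]
                doc_topic[u][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample the topic from its conditional distribution.
                weights = [
                    (doc_topic[u][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[u][i] = k
                doc_topic[u][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed estimate of each user's interest mixture.
    theta = {}
    for u, counts in doc_topic.items():
        total = sum(counts) + num_topics * alpha
        theta[u] = [(c + alpha) / total for c in counts]
    return theta


# Toy favorites: two users favorite rings, two favorite lamps (made-up data).
favorites = [
    ("u1", "ring_a"), ("u1", "ring_b"), ("u1", "ring_c"),
    ("u2", "ring_a"), ("u2", "ring_b"), ("u2", "ring_c"),
    ("u3", "lamp_x"), ("u3", "lamp_y"), ("u3", "lamp_z"),
    ("u4", "lamp_x"), ("u4", "lamp_y"), ("u4", "lamp_z"),
]
docs = build_documents(favorites)
theta = fit_lda(docs, num_topics=2)
```

Each entry of theta is a user's distribution over the latent interests. Users who favorite overlapping listings tend to end up with similar distributions, which is the property the premise above relies on.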
