Recommendation models with ranking

nipun deelaka
4 min read · May 9, 2022


Today, a recommendation system (RS) is no longer the simple recommendation engine (RE) you first learned about.

Photo by Joshua Golde on Unsplash

Have you ever thought about the recommendation systems used by YouTube, Google (Ads), Amazon, Netflix, and others? The RS in those products is pretty complex; Facebook has even implemented a dedicated query language just to process recommendation queries efficiently. Since most of these products depend almost entirely on how advanced and dynamic their RS is, the companies keep customizing and improving it.

Simply put, most companies follow a multi-stage RS as standard practice and keep improving the versatility of their RS (this is a really good article on multi-stage RS). On the other hand, implementing a multi-stage RS for a small or medium web service, or for your own RS project, is a tedious process.

However, there are a few ways to build “personalized” recommendation systems without building your own query language. First, let’s discuss where the word “personalized” comes from. Basically, it means that in addition to suggesting a set of items based on your previous choices, the system ranks and filters those suggested items before showing them to the user. This ranking and filtering is done according to both business rules (such as maximizing profit) and personal preferences.

First, there must be a filtering stage to avoid recommending items that the user has already purchased. There could also be several other filtering steps based on user requirements, such as excluding expired items or applying price-range and transportation restrictions. In general, most RE libraries provide the necessary filtering, and where they don’t, it’s fairly easy to implement ourselves.
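A filtering stage like this can be sketched in a few lines of plain Python. The `Item` fields, the `filter_candidates` helper, and the sample data below are assumptions made up for illustration; a real service would read them from its own catalog and purchase history.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Item:
    item_id: str
    price: float
    expires_on: date


def filter_candidates(candidates, purchased_ids, today, max_price):
    """Drop items the user already bought, expired items, and items over the price limit."""
    return [
        item for item in candidates
        if item.item_id not in purchased_ids
        and item.expires_on >= today
        and item.price <= max_price
    ]


candidates = [
    Item("a", 10.0, date(2030, 1, 1)),
    Item("b", 25.0, date(2030, 1, 1)),   # already purchased by the user
    Item("c", 15.0, date(2020, 1, 1)),   # expired
    Item("d", 500.0, date(2030, 1, 1)),  # above the price limit
]
kept = filter_candidates(
    candidates, purchased_ids={"b"}, today=date(2025, 1, 1), max_price=100.0
)
print([item.item_id for item in kept])
```

Only item "a" survives all three rules here; each rule is just one more condition in the list comprehension, so adding a new restriction is cheap.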

The main challenge is to rank and sort the items suggested by the RE we built. At this point, many of you could ask, “Why do we need to rank (or personally rank) the set of items recommended by the model, when the model was trained on each user’s personal preferences?” The simple answer is that, most of the time, the order of the items is as important as the items themselves. A good example is Facebook: showing the most preferred posts at the top is more commercially valuable than simply retrieving all the posts the user would like.

So, let’s start building recommendation models that include the ranking phase as well. In general, item ranking is done by attaching a per-user score to each suggestion produced by the model. Then, in the post-processing layer, the items can be ordered by those scores. However, only a few recommendation models support this kind of item ranking.
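As a rough sketch of that post-processing layer, assume the model returns (item, score) pairs for one user. The `rank_items` helper and the optional `boosts` mapping (a stand-in for a business rule such as promoting high-margin items) are hypothetical names, not part of any library.

```python
def rank_items(scored_items, boosts=None, top_k=3):
    """Order (item, score) pairs by score, highest first, and keep the top_k.

    boosts is an optional {item: multiplier} mapping used as a simple business rule.
    """
    boosts = boosts or {}
    adjusted = [(item, score * boosts.get(item, 1.0)) for item, score in scored_items]
    adjusted.sort(key=lambda pair: pair[1], reverse=True)
    return adjusted[:top_k]


# Scores a model might have produced for one user (made-up numbers).
model_output = [("book", 0.42), ("lamp", 0.91), ("mug", 0.13), ("desk", 0.77)]

plain = rank_items(model_output)
boosted = rank_items(model_output, boosts={"book": 2.0})
print([item for item, _ in plain])
print([item for item, _ in boosted])
```

Without a boost the order follows the model scores alone; doubling the weight of "book" moves it ahead of "desk" but not ahead of "lamp", which shows how a business rule and personal preference can share one ordering step.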

This approach comes in handy when you want to make recommendations instantly or process data streams. It also has a few downsides: the choice of recommendation engine is limited to a few models, which rules out many advanced models such as DLRM, xDeepFM, and SAR.

The Bayesian Personalized Ranking (BPR) model, the Ranking Matrix Factorization model, and the ALS model are some popular REs in which ranking is already part of the model.

So, let’s discuss these models one by one, with some practical examples.

Bayesian Personalized Ranking model

Figure: from the original BPR publication

Simply put, it is a Bayesian classification model that compares two items for a given user and decides which one the user prefers more. So, in the preprocessing stage, it creates triplets of (user, positive item, negative item) to feed into the model, and it trains the user and item latent spaces to push the positive item as far apart from the negative item as possible for that user.
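The triplet idea can be sketched with a small, dependency-free training loop. This is a minimal illustration that assumes a plain matrix-factorization scorer and uniform negative sampling; the function names, the toy data, and the hyperparameters are all made up here and are not taken from the article or implementation referenced in this section.

```python
import math
import random


def train_bpr(positives, n_items, n_factors=4, steps=3000, lr=0.05, reg=0.01, seed=0):
    """Fit user and item factors by stochastic gradient ascent on the BPR objective."""
    rng = random.Random(seed)
    users = sorted(positives)
    P = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    Q = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(steps):
        # Sample one (user, positive item, negative item) triplet.
        u = rng.choice(users)
        i = rng.choice(sorted(positives[u]))
        j = rng.choice([item for item in range(n_items) if item not in positives[u]])
        # x_uij is the score gap between the positive and the negative item.
        x_uij = sum(p * (qi - qj) for p, qi, qj in zip(P[u], Q[i], Q[j]))
        g = 1.0 / (1.0 + math.exp(x_uij))  # equals 1 - sigmoid(x_uij)
        for f in range(n_factors):
            p, qi, qj = P[u][f], Q[i][f], Q[j][f]
            P[u][f] += lr * (g * (qi - qj) - reg * p)
            Q[i][f] += lr * (g * p - reg * qi)
            Q[j][f] += lr * (-g * p - reg * qj)
    return P, Q


def score(P, Q, user, item):
    """Preference score of `user` for `item` (dot product of their factors)."""
    return sum(p * q for p, q in zip(P[user], Q[item]))


# Toy implicit data: user 0 interacted with items 0 and 1, user 1 with items 2 and 3.
positives = {0: {0, 1}, 1: {2, 3}}
P, Q = train_bpr(positives, n_items=4)
ranking = sorted(range(4), key=lambda item: score(P, Q, 0, item), reverse=True)
print(ranking)
```

After training, user 0’s own items should sit at the top of the ranking, because every update widens the score gap between a sampled positive item and a sampled negative one. The same `score` function could be applied to only a short candidate list instead of the whole catalog.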

The model supports both implicit data (e.g., a binary record of whether the user watched a movie or not) and explicit data (e.g., a movie rating). The BPR model could also be used as a separate ranking layer, in which case the model is fed only a subset of items (the items recommended by the RE model) and a particular user.

This is a really good article to get an in-depth theoretical understanding of the BPR model.

You could follow the step-by-step explanation in that article to implement the BPR model. Also, this implementation provides a generic class that you could use directly.

Ranking Matrix Factorization model

Basically, this is a Matrix Factorization (MF) RE with a ranking regularization term. This way, the model outputs a score for each item in addition to the most probable items. In general, this method works only with explicit data, but the turicreate library implementation supports both explicit and implicit data (by using proxy measures).
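To make the idea concrete, here is a simplified, dependency-free sketch: ordinary MF trained with SGD on the observed ratings, plus an extra term that pulls the prediction for a randomly sampled unobserved item toward a low value. It is not turicreate’s implementation; the names `ranking_weight` and `unobserved_value`, the toy ratings, and the hyperparameters are assumptions chosen for illustration.

```python
import random


def train_ranking_mf(ratings, n_users, n_items, n_factors=3, epochs=300, lr=0.02,
                     reg=0.02, ranking_weight=0.1, unobserved_value=1.0, seed=0):
    """Train MF on explicit ratings with an extra penalty on unobserved items."""
    rng = random.Random(seed)
    mean = sum(ratings.values()) / len(ratings)
    P = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]

    def predict(u, i):
        return mean + sum(p * q for p, q in zip(P[u], Q[i]))

    def sgd_step(u, i, target, weight):
        # One gradient step on weight * (target - prediction)^2 plus L2 regularization.
        err = weight * (target - predict(u, i))
        for f in range(n_factors):
            p, q = P[u][f], Q[i][f]
            P[u][f] += lr * (err * q - reg * p)
            Q[i][f] += lr * (err * p - reg * q)

    for _ in range(epochs):
        for (u, i), r in ratings.items():
            sgd_step(u, i, r, 1.0)  # fit the observed rating
            # Ranking term: pull one random unobserved item toward a low score.
            unseen = [j for j in range(n_items) if (u, j) not in ratings]
            if unseen:
                sgd_step(u, rng.choice(unseen), unobserved_value, ranking_weight)
    return predict


# Toy explicit ratings: {(user, item): rating on a 1-5 scale}.
ratings = {(0, 0): 5, (0, 1): 1, (1, 0): 4, (1, 1): 2, (1, 2): 5, (2, 1): 1, (2, 2): 4}
predict = train_ranking_mf(ratings, n_users=3, n_items=3)
scores_for_user_0 = sorted(((predict(0, i), i) for i in range(3)), reverse=True)
print(scores_for_user_0)
```

The returned `predict` function gives a score for every (user, item) pair, so the same model can both suggest items and order them. A larger `ranking_weight` makes the model more cautious about items the user has never rated.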

For more theoretical details on the model, you can visit this page in the turicreate documentation.

You could follow this implementation of the Ranking MF model on the MovieLens dataset.

Alternating Least Squares (ALS) model

ALS is a very popular RE for data stream processing and for RS builds on distributed databases. The model was intended to run on Apache Spark clusters (RDDs), and most implementations target Apache Spark datasets. But there are some implementations (such as implicit) built for centralized datasets such as .csv files. In general, the ALS model runs on an explicit rating dataset, and therefore ALS-based ranking usually applies to explicit datasets.
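The alternating idea itself is small enough to show without Spark. The sketch below is a deliberately reduced version that assumes a single latent factor per user and per item, so each least-squares solve has a closed form; production implementations use many factors and distributed solvers. The function names and toy ratings are made up for illustration.

```python
def objective(ratings, p, q, reg):
    """Regularized squared error over the observed ratings."""
    fit = sum((r - p[u] * q[i]) ** 2 for (u, i), r in ratings.items())
    return fit + reg * (sum(x * x for x in p) + sum(x * x for x in q))


def als_rank_one(ratings, n_users, n_items, reg=0.1, iterations=10):
    """Alternating least squares with one latent factor per user and per item."""
    p = [1.0] * n_users
    q = [1.0] * n_items
    history = [objective(ratings, p, q, reg)]
    for _ in range(iterations):
        # Fix the item factors and solve each user's least-squares problem exactly.
        for u in range(n_users):
            rated = [(i, r) for (uu, i), r in ratings.items() if uu == u]
            p[u] = sum(r * q[i] for i, r in rated) / (sum(q[i] ** 2 for i, _ in rated) + reg)
        # Fix the user factors and solve for each item factor.
        for i in range(n_items):
            raters = [(u, r) for (u, ii), r in ratings.items() if ii == i]
            q[i] = sum(r * p[u] for u, r in raters) / (sum(p[u] ** 2 for u, _ in raters) + reg)
        history.append(objective(ratings, p, q, reg))
    return p, q, history


# Toy explicit ratings: {(user, item): rating}.
ratings = {(0, 0): 5, (0, 1): 1, (1, 0): 4, (1, 1): 2, (1, 2): 5, (2, 1): 1, (2, 2): 4}
p, q, history = als_rank_one(ratings, n_users=3, n_items=3)
ranked_for_user_0 = sorted(range(3), key=lambda i: p[0] * q[i], reverse=True)
print(ranked_for_user_0)
```

Because each half-step solves its subproblem exactly, the objective recorded in `history` never increases from one iteration to the next. The product `p[u] * q[i]` then serves as the score used to order items for a user.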

For more theoretical understanding, you could follow these white papers from Stanford University.

The Microsoft/recommenders library has implemented a distributed-training ALS model and has a very intuitive example of the model.

The implicit library also provides an ALS-based RE implementation.

At this point, you should have a clear understanding of which models to use when, and of the pros and cons of those models. I also encourage you to go over all the references provided throughout the article, especially the model implementations.

Thank you for your attention, and congratulations on becoming a practitioner of ranking recommendation systems.
