
Yves Peirsman

Natural Language Processing Enthusiast at NLP Town.


Last month I took the online course Text Retrieval & Search Engines on Coursera, and I’m using this blog to write down the main takeaways every week. In the fourth and final week of the course, we shifted our gaze to Web Search and Recommender Systems. In this blog post I share my notes on Recommender Systems.

Recommender systems don’t require you to search for content; instead, they proactively recommend relevant content to you. They are also called “filtering systems”, because they filter a large collection of candidate documents and keep only the useful ones. This filtering takes place in one of two ways. Content-based filtering looks at the items a user likes and checks whether a candidate item is similar to them. Collaborative filtering looks at the users who like the candidate item and checks whether the current user is similar to those users. The two modes can of course be combined.

A typical content-based filtering system consists of a number of components. First, it has a binary classifier with a user interest profile, which decides whether a document is relevant to a user or not. Second, it has an initialisation module that takes input from the user to build an initial user profile. Third, it has a learning module that allows the system to learn from the user’s behaviour over time. The system is evaluated with a utility function, which weighs the number of good documents it recommends against the number of bad ones it lets through.
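The linear utility described above can be sketched in a few lines. The weights below are illustrative defaults of my own choosing, not values prescribed by the course:

```python
def linear_utility(num_good: int, num_bad: int,
                   reward: float = 3.0, penalty: float = 2.0) -> float:
    """Credit each good recommendation and penalise each bad one.

    The reward/penalty weights (3 and 2) are illustrative assumptions;
    in practice they encode how costly a bad recommendation is
    relative to the benefit of a good one.
    """
    return reward * num_good - penalty * num_bad

# A run that recommends 10 documents, 7 of them relevant:
print(linear_utility(7, 3))  # 3*7 - 2*3 = 15.0
```

Making the penalty larger than the reward pushes the system towards precision; the reverse pushes it towards recall.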

A search-based information retrieval system can be turned into a recommender system by combining its ranking function with a score threshold above which documents are recommended to the user. Traditional feedback methods can then be used to improve the scoring. Of course, finding an appropriate threshold can be hard: judgements are only available on recommended documents, and the system needs to deal with the trade-off between exploration and exploitation. On the one hand, it can sometimes be useful to lower the threshold to explore the search space for potential interests of the user that are currently unknown. On the other hand, the system should focus on exploiting its current knowledge about the user rather than exploring too much. Some methods have been developed for finding an optimal threshold, such as Beta-Gamma Threshold Learning, which starts from the threshold that gives the maximum utility on the training data set and then heuristically lowers it to improve recall.
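The thresholding step, together with the exploration/exploitation trade-off, can be sketched as follows. The function names and the idea of an explicit “exploration margin” are my own illustration, not the course’s Beta-Gamma algorithm itself:

```python
def recommend(scored_docs, threshold, explore_margin=0.0):
    """Recommend documents whose retrieval score clears the threshold.

    Lowering the effective threshold by `explore_margin` trades some
    exploitation for exploration: surfacing borderline documents lets
    the system collect relevance judgements it would otherwise never
    see, since feedback only exists for recommended documents.
    """
    effective = threshold - explore_margin
    return [doc for doc, score in scored_docs if score >= effective]

docs = [("d1", 0.9), ("d2", 0.55), ("d3", 0.3)]
print(recommend(docs, threshold=0.6))                      # ['d1']
print(recommend(docs, threshold=0.6, explore_margin=0.1))  # ['d1', 'd2']
```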

In contrast to content-based filtering, collaborative filtering makes filtering decisions for an individual user based on the judgements of other users. It assumes that users with the same interest have similar preferences, and that users with similar preferences share the same interest. Obviously, it needs a sufficiently large number of user preferences to make its estimates; otherwise it suffers from the “cold start” problem. In mathematical terms, collaborative filtering tries to fill in the unknown values in a sparse user-by-object matrix. A simple memory-based implementation would normalise the ratings made by each user, to make the numbers comparable across users, and then compute the unknown ratings of a given user as a weighted average of the ratings of other users, where the weights depend on the similarity between the current user and those other users.
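A minimal sketch of this memory-based approach, on a toy rating matrix of my own invention (the normalisation here is mean-centring, one of several options):

```python
import math

# Toy user-by-item rating matrix; missing keys are unknown ratings.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 4, "B": 2, "C": 5},
    "carol": {"A": 1, "B": 5},          # carol has not rated "C"
}

def mean(user):
    vals = ratings[user].values()
    return sum(vals) / len(vals)

def similarity(u, v):
    """Cosine similarity over the items both users have rated."""
    shared = ratings[u].keys() & ratings[v].keys()
    if not shared:
        return 0.0
    dot = sum(ratings[u][i] * ratings[v][i] for i in shared)
    nu = math.sqrt(sum(ratings[u][i] ** 2 for i in shared))
    nv = math.sqrt(sum(ratings[v][i] ** 2 for i in shared))
    return dot / (nu * nv)

def predict(user, item):
    """Mean-centred, similarity-weighted average of other users' ratings."""
    num, den = 0.0, 0.0
    for other in ratings:
        if other == user or item not in ratings[other]:
            continue
        w = similarity(user, other)
        num += w * (ratings[other][item] - mean(other))
        den += abs(w)
    return mean(user) + (num / den if den else 0.0)

print(predict("carol", "C"))
```

Mean-centring is what makes the numbers comparable across users: a generous rater’s 4 and a harsh rater’s 2 can both mean “slightly above average”.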

A central question in collaborative filtering is how to determine the similarity between two users. Basic solutions are the Pearson correlation coefficient or the cosine between the vectors of user preferences. More advanced methods take the Inverse User Frequency into account, based on the assumption that agreement on a rare item is more informative than agreement on a popular one. In addition, they often fill in the unknown values iteratively: they first make a basic prediction of the missing values, use these figures to improve the similarity estimates between users, predict the unknown values again on the basis of the new similarities, and so on. Finally, actual recommender systems often rely on more information about the user than just their ratings, and combine collaborative filtering with content-based filtering.
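The Inverse User Frequency idea can be sketched as a weighted Pearson correlation, where each co-rated item is weighted by `log(n_users / n_raters)`, by analogy with IDF in retrieval. The data and function names are illustrative; note that an item rated by everyone gets weight zero and contributes nothing:

```python
import math

# Toy ratings. Item "A" is rated by everyone (IUF weight 0);
# "B" and "C" are rarer and therefore weigh more.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 4, "B": 2},
    "carol": {"A": 2, "C": 5},
    "dave":  {"A": 1, "B": 5, "C": 2},
}

def iuf(item):
    """Inverse User Frequency: log(total users / users who rated item)."""
    raters = sum(1 for u in ratings if item in ratings[u])
    return math.log(len(ratings) / raters)

def pearson_iuf(u, v):
    """IUF-weighted Pearson correlation over co-rated items."""
    shared = ratings[u].keys() & ratings[v].keys()
    w = {i: iuf(i) for i in shared}
    total = sum(w.values())
    if len(shared) < 2 or total == 0:
        return 0.0
    mu = sum(w[i] * ratings[u][i] for i in shared) / total
    mv = sum(w[i] * ratings[v][i] for i in shared) / total
    cov = sum(w[i] * (ratings[u][i] - mu) * (ratings[v][i] - mv)
              for i in shared)
    su = math.sqrt(sum(w[i] * (ratings[u][i] - mu) ** 2 for i in shared))
    sv = math.sqrt(sum(w[i] * (ratings[v][i] - mv) ** 2 for i in shared))
    return cov / (su * sv) if su and sv else 0.0

print(pearson_iuf("alice", "dave"))
```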

With this introduction to recommender systems, we came to the end of the online course Text Retrieval & Search Engines. Despite its short duration, this was one of the best courses I’ve taken on Coursera so far. The lectures were well-structured and focused, and even though most of the content remained fairly superficial, the course succeeded in giving a succinct overview of a complex field. There were many pointers to related literature, and it kept me interested throughout. Now it’s time to put some of these ideas into practice.