Currently, recommender systems are used extensively to find the right audience with the "right" content over various platforms. Recommendations generated by these systems aim to offer relevant items to users. Different approaches have been suggested to solve this problem mainly by using the rating history of the user or by identifying the preferences of similar users. Most of the existing recommendation systems are formulated in an identical fashion, where a model is trained to capture the underlying preferences of users over different kinds of items. Once it is deployed, the model suggests personalized recommendations precisely, and it is assumed that the preferences of users are perfectly reflected by the historical data. However, such user data might be limited in practice, and the characteristics of users may constantly evolve during their intensive interaction between recommendation systems.
Moreover, most of these recommender systems suffer from the cold-start problems where insufficient data for new users or products results in reduced overall recommendation output. In the current study, we have built a recommender system to recommend movies to users. Biclustering algorithm is used to cluster the users and movies simultaneously at the beginning to generate explainable recommendations, and these biclusters are used to form a gridworld where Q-Learning is used to learn the policy to traverse through the grid. The reward function uses the Jaccard Index, which is a measure of common users between two biclusters. Demographic details of new users are used to generate recommendations that solve the cold-start problem too.
Lastly, the implemented algorithm is examined with a real-world dataset against the widely used recommendation algorithm and the performance for the cold-start cases.