Content-Based (CB) vs. Collaborative Filtering (CF)
The content-based approach was the first to be proposed for recommender systems (RS), and it builds on techniques from the information retrieval field. In this approach, items are recommended to the user based on their similarity to items the user preferred in the past, or on their proximity to the user’s areas of interest. Recommendations rely only on the data observed about that user; no data from other users is needed. Content-based systems are therefore indifferent to how many users exist in the system and treat each user independently. As a result, they can serve users with unusual tastes and can recommend items, including new or unpopular ones, that are of particular interest to them. Since each item is represented by its important features, users can understand why an item is considered relevant to them.
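To make the idea concrete, here is a minimal sketch of content-based recommendation in Python. It assumes items are short text descriptions, represents each one as a TF-IDF vector, builds a user profile by summing the vectors of liked items, and ranks unseen items by cosine similarity to that profile. The item catalogue, the function names and the exact weighting formula are illustrative assumptions, not part of any particular system.

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """Map each item id to a sparse TF-IDF vector (dict: term -> weight)."""
    tokenized = {item: text.lower().split() for item, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many items does each term appear?
    doc_freq = Counter(t for words in tokenized.values() for t in set(words))
    vectors = {}
    for item, words in tokenized.items():
        counts = Counter(words)
        vectors[item] = {
            term: (count / len(words)) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in counts.items()
        }
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_profile(liked, vectors):
    """User profile = sum of the vectors of items the user liked."""
    profile = Counter()
    for item in liked:
        profile.update(vectors[item])
    return dict(profile)

def recommend(liked, vectors, k=3):
    """Rank items the user has not seen by similarity to the profile."""
    profile = build_profile(liked, vectors)
    candidates = [i for i in vectors if i not in liked]
    candidates.sort(key=lambda i: cosine(profile, vectors[i]), reverse=True)
    return candidates[:k]

def explain(liked, item, vectors):
    """Terms shared by the profile and the item: the 'why' of a recommendation."""
    profile = build_profile(liked, vectors)
    return sorted(t for t in vectors[item] if t in profile)

# Hypothetical catalogue, for illustration only.
catalogue = {
    "a": "python programming tutorial for beginners",
    "b": "advanced python programming patterns",
    "c": "italian pasta recipes",
    "d": "quick pasta dinner recipes",
}
item_vectors = tf_idf_vectors(catalogue)
top = recommend(["a"], item_vectors, k=1)
```

Note that only one user's history enters the computation, and that `explain` can list the shared terms behind each recommendation, which mirrors the transparency argument above.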
While content-based systems can provide accurate recommendations, especially for textual items, they suffer from major drawbacks:
• New user problem: content-based systems have to learn the user profile from historical transactions; therefore, for a new user with no history, the system can provide no recommendations.
• Machine-readable items: content-based systems are intended mainly for text-based items that a computer can parse automatically. Multimedia items (e.g., audio and video clips) are therefore largely excluded, even though such items are widespread on the internet. For such items to be recommended, a human editor must annotate them manually with machine-readable content, such as metadata or tags.
• Overfitting problem: the content-based system recommends items similar to the ones the user liked in the past. This can lead to a situation in which only items from a specific topic are recommended, while other relevant items that belong to different topics are ignored even though the user might be interested in them. An intelligent RS should be able to recommend items from diverse topics.
The content-based approach relies on the most representative set of keywords (features) extracted from the items, and therefore such systems cannot distinguish between a well-written and a badly written document that both use the same terms. This limitation stems from the fact that the approach does not take other users’ opinions about an item into consideration, but only the similarity of the item to other items the user previously found relevant.
In Collaborative Filtering (CF), the main idea is to automate the process of “word-of-mouth.” In the offline world we can seek advice from trusted people whom we know, while in the online world we can get advice from trusted people worldwide and exploit their collective wisdom and experience. In its original form, this approach requires users to name the sources they trust (we refer to those users as recommenders, sources or predictors), which involves significant user effort. The early CF recommender systems based their recommendations on data elicited by asking people to specify their “Shared-Interest” people manually, without any automatic process for detecting the similar behavior of other users.
Researchers then started investigating more automatic approaches, and in the middle of 1995 similarity-based CF was suggested. Here, similarity between users serves as a proxy for a source’s trustworthiness, so no user effort is needed: the process of finding trusted sources is performed automatically. A well-known early system built on this approach is MovieLens. Many web-based RS applications have since been developed for various domains, and many online stores (e.g., Amazon, MediaUnbound, Netflix, MoodLogic, CDNow and SongExplorer) use these techniques to recommend items that meet their customers’ preferences and hence increase their sales.
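As a rough illustration of similarity-based CF, the Python sketch below treats the Pearson correlation between two users' ratings as the proxy for trust and predicts a rating as a similarity-weighted average of the trusted users' mean-centred ratings. The ratings table, the function names and the choice to ignore users with non-positive correlation are assumptions made for this example; real systems differ in many details.

```python
import math

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def pearson(ratings_a, ratings_b):
    """Pearson correlation over the items both users have rated."""
    common = set(ratings_a) & set(ratings_b)
    if len(common) < 2:
        return 0.0  # too little overlap to judge similarity
    mean_a = mean(ratings_a[i] for i in common)
    mean_b = mean(ratings_b[i] for i in common)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    den = math.sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common)) * \
          math.sqrt(sum((ratings_b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0

def predict(user, item, ratings):
    """Predict the user's rating for an item from similar users' ratings."""
    user_mean = mean(ratings[user].values())
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = pearson(ratings[user], other_ratings)
        if sim <= 0:
            continue  # assumption: only positively correlated users are trusted
        # Weight the other user's deviation from their own mean by similarity.
        num += sim * (other_ratings[item] - mean(other_ratings.values()))
        den += sim
    return user_mean + num / den if den else user_mean

# Hypothetical ratings on a 1-5 scale, for illustration only.
ratings = {
    "alice": {"m1": 5, "m2": 4, "m3": 1},
    "bob":   {"m1": 5, "m2": 5, "m3": 1, "m4": 5},
    "carol": {"m1": 1, "m2": 2, "m3": 5, "m4": 1},
}
prediction = predict("alice", "m4", ratings)
```

In this toy data, bob's tastes track alice's while carol's run opposite, so only bob acts as a trusted source and the prediction for "m4" lands above alice's average rating. Nobody had to name their trusted sources by hand, which is the point of the automatic approach.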