Ibrahim Sana

All my thoughts

Content Based (CB) Vs. Collaborative Filtering (CF)

The content-based approach was the first approach proposed for recommender systems (RS); it is based on techniques from the information retrieval field. In this approach, items are recommended to the user based on their similarity to the ones the user preferred in the past, or on their proximity to the user’s areas of interest. Recommendations rely only on the data observed about that user, and there is no need for data provided by other users. Therefore, content-based systems do not depend on how many users exist in the system; they deal with each user independently. As a result, content-based systems can serve users with unusual tastes and can recommend new or unpopular items that are of particular interest to them. Since each item is represented by its important features, users can understand why items are considered relevant to them.
While content-based systems can provide accurate recommendations, especially for textual items, they suffer from major drawbacks:

• New user problem: content-based systems have to learn the user profile from historic transactions; therefore, for a new user the system can provide no recommendations.
• Machine-readable items: content-based systems are intended mainly for text-based items that a computer can parse automatically. Multimedia items (e.g., audio and video clips) are therefore out of reach, even though such items are widespread on the internet. For such items to be recommended, a human editor must describe them manually by adding machine-readable content such as metadata or tag terms.
• Over-fitting (overspecialization) problem: the content-based system recommends items similar to the ones the user liked in the past. This may lead to a situation in which only items from a specific topic are recommended, while relevant items from other topics that the user might also be interested in are ignored. An intelligent RS should be able to recommend items from diverse topics.

The content-based approach relies on the most representative set of keywords (features) extracted from the items, and therefore such systems cannot distinguish between a well-written and a badly written document that both use the same terms. This limitation stems from the fact that the approach does not take other users’ opinions of an item into consideration, but only the similarity of the item to other items the user previously found relevant.
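The content-based idea described above can be illustrated with a minimal sketch. It assumes items are short texts, represents each one as a TF-IDF keyword vector, builds the user profile by summing the vectors of liked items, and ranks the remaining items by cosine similarity to that profile. The function names, the whitespace tokenizer, and the sample items are all hypothetical choices made for illustration; this is one common way to realize the approach, not necessarily the method any particular system uses.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each item id to a sparse TF-IDF vector (dict: term -> weight)."""
    tokenized = {item: text.lower().split() for item, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many items does each term appear?
    doc_freq = Counter(t for tokens in tokenized.values() for t in set(tokens))
    vectors = {}
    for item, tokens in tokenized.items():
        counts = Counter(tokens)
        vectors[item] = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(docs, liked, top_n=2):
    """Rank items the user has not liked yet by similarity to the profile."""
    vectors = tfidf_vectors(docs)
    profile = Counter()  # user profile = sum of the liked items' vectors
    for item in liked:
        for term, weight in vectors[item].items():
            profile[term] += weight
    scores = {
        item: cosine(profile, vec)
        for item, vec in vectors.items()
        if item not in liked
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical catalogue: only the user's own history is needed.
docs = {
    "a": "python machine learning tutorial",
    "b": "deep learning with python",
    "c": "italian pasta recipes",
    "d": "machine learning for recommender systems",
}
print(recommend(docs, liked=["a"]))
```

Note that the sketch also exhibits the drawbacks listed above: with an empty `liked` list every score is zero (new user problem), and an item sharing no keywords with the profile, such as the recipes item, can never surface (overspecialization).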

In Collaborative Filtering (CF), the main idea is to automate the process of “word-of-mouth.” In the offline world we can seek advice from trusted people whom we know, while in the online world we can get advice from trusted people worldwide and exploit their collective wisdom and experience. In its original form, this approach requires users to name the sources they trust (we refer to those users as recommenders, sources or predictors), which involves significant user effort. The early CF recommender systems based their recommendations on data elicited by asking people to manually specify their “Shared-Interest” people, without any automatic process for detecting similar behavior among other users.

Researchers then started investigating more automatic approaches, and in the mid-1990s similarity-based CF was suggested, which uses similarity between users as a proxy for a source’s trustworthiness. No user effort is needed, since the process of finding trusted sources is performed automatically. One of the first systems to take this approach was the MovieLens system. Many web-based RS applications have since been developed for various domains, and many online stores (e.g., Amazon, MediaUnbound, Netflix, MoodLogic, CDNow, SongExplorer, etc.) use these techniques to recommend items that match their customers’ preferences and hence substantially increase their sales.
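To make similarity-based CF concrete, here is a minimal sketch of user-based prediction. It assumes ratings are stored as a nested dictionary, measures similarity with the Pearson correlation over co-rated items, and predicts a rating as the target user's mean plus a similarity-weighted average of the neighbors' deviations from their own means. The data, the function names, and the choice to ignore negatively correlated users are assumptions made for illustration, not a description of how MovieLens or any store listed above works.

```python
import math


def pearson(ratings_a, ratings_b):
    """Pearson correlation over the items both users have rated."""
    common = set(ratings_a) & set(ratings_b)
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    den = math.sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common)) * \
        math.sqrt(sum((ratings_b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0


def predict(ratings, user, item):
    """Predict user's rating for item from similar users' ratings."""
    user_ratings = ratings[user]
    user_mean = sum(user_ratings.values()) / len(user_ratings)
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = pearson(user_ratings, other_ratings)
        if sim <= 0:
            continue  # assumption: only positively correlated users count
        other_mean = sum(other_ratings.values()) / len(other_ratings)
        num += sim * (other_ratings[item] - other_mean)
        den += sim
    # Fall back to the user's own mean when no usable neighbor exists.
    return user_mean + num / den if den else user_mean


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob": {"m1": 4, "m2": 2, "m3": 3, "m4": 4},
    "carol": {"m1": 1, "m2": 5, "m3": 2, "m4": 1},
}
print(predict(ratings, "alice", "m4"))
```

Here similarity stands in for trust: bob rates like alice and so drives the prediction, while carol, whose tastes run opposite, is ignored. A real system would also clamp predictions to the rating scale and limit the neighborhood size.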

July 23, 2009 Posted by ibrahimabd | Uncategorized

Using social network in recommender systems

Researchers started investigating social networks mainly after the work of Stanley Milgram, a Harvard social psychologist. In 1967, Milgram conducted several experiments, known as the “small world experiment,” examining the average path length in social networks of people in the United States. Milgram concluded that any two people in the USA are linked in a social network by a mere “six degrees of separation,” meaning that two randomly chosen people are connected by a short chain of intermediate acquaintances regardless of their geographical and cultural distance. This phenomenon has been confirmed by sociological research.
The emergence of social networks (e.g., Myspace, Facebook) offered new opportunities for RS. Social networks and virtual communities are the main feature of Web 2.0, and they might make a huge contribution to RS. Recent studies recognized this potential and started investigating the impact of social relations on RS. To date, most of this research has focused on trust, while behavioral theory suggests that other social relations (e.g., communication frequency, reputation) also affect people’s advice-taking. As far as we know, these social relations have been neglected until now.

Users communicate via e-mail applications (e.g., Gmail, Google’s mail service) and produce interaction-based social ties. Users can also contact friends and search for new friends (e.g., on Facebook), producing friendship-based social ties; others may participate in online auctions in a C2C trading environment (e.g., eBay.com), producing reputation-based ties. These are a few examples of the different platforms that enable members to establish different social ties with each other. Such ties are more available today than ever before, and we believe that this information about users’ relationships could be exploited to improve the performance of recommender systems.

Today, with the usefulness of such networks and the rich information stored in them widely recognized, many applications have been built to exploit these advantages. On the internet there are more than 200,000,000 user accounts and over 141 different social networks. Eighteen sites have over 1,000,000 members. Popular examples of such networks are MySpace, which has 60,000,000 user accounts, and Friendster, which has over 27,000,000. Facebook is the most popular social network, with more than 90 million users and more than 13,000 social applications [26]. These networks were established for several purposes: blogging, business, dating, entertainment, etc.
The social networks that emerged created virtual communities. Community members share information such as photos, personal information, hobbies, professional knowledge, etc. (e.g. Friendster.com). Communities enable their members to conduct social relations similar to those that people conduct in the real world. Users in social networks collaborate to satisfy their own needs.

These facts encourage the use of social networks’ characteristics for generating more effective recommendations and present new opportunities for using the collaborative filtering approach. Many web applications enable people to find friends worldwide and to conduct reciprocal social relations with them (e.g., Facebook, Sleeper, MovieCritics and Real.com).

The internet provides an opportunity for people to interact with each other, and thus many types of social relationship are established among users; business, friendship and colleague relations are examples.
Research studies in the marketing and applied psychology fields have identified four salient social measures that are relevant in the advice-taking context: cognitive similarity, tie strength, trust, and social capital. It has also been shown that different types of social relations affect recipients’ advice-taking in different ways. Therefore, social relationships can be incorporated into recommender systems to provide users with more realistic recommendations. The collaborative filtering (CF) approach commonly used in recommendation systems emerged in the mid-1990s and has since become the de facto standard. Collaborative filtering tries to mimic the social process of advice-seeking through the users’ cognitive similarity. Although this method has proved useful for producing accurate recommendations, it produces inaccurate predictions in many real situations, especially on sparse data sets.
Collaborative filtering can therefore be enhanced by integrating other types of social measures (e.g., tie strength, trust, and friendship). Social measures differ, so we need to assess the effectiveness of the different types of social relations and identify the most valuable measure for recommender systems in each context. For evaluation purposes, we can develop social filtering models that incorporate these social measures and conduct an empirical experiment to test them. We can explore several social-based prediction methods and benchmark them against the traditional CF method.
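One simple way to picture such a social filtering model is to blend each neighbor's cognitive similarity with a social measure when weighting their ratings. The sketch below is only an illustration under stated assumptions: the linear blend controlled by `alpha`, the precomputed `similarity` and `tie_strength` tables with values in [0, 1], and all names and data are hypothetical and are not a model taken from the studies mentioned above. Setting `alpha=1.0` reduces it to similarity-only CF, which gives a baseline to benchmark against.

```python
def social_predict(ratings, similarity, tie_strength, user, item, alpha=0.5):
    """Predict a rating using a blend of similarity and tie strength.

    weight = alpha * similarity + (1 - alpha) * tie_strength
    (an assumed blending rule chosen for illustration).
    """
    user_ratings = ratings[user]
    user_mean = sum(user_ratings.values()) / len(user_ratings)
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = max(similarity.get((user, other), 0.0), 0.0)
        tie = tie_strength.get((user, other), 0.0)
        weight = alpha * sim + (1 - alpha) * tie
        if weight <= 0:
            continue
        other_mean = sum(other_ratings.values()) / len(other_ratings)
        num += weight * (other_ratings[item] - other_mean)
        den += weight
    # Fall back to the user's own mean when no neighbor carries weight.
    return user_mean + num / den if den else user_mean


# Hypothetical data: bob and carol look equally similar to alice,
# but alice has a much stronger social tie with carol.
ratings = {
    "alice": {"m1": 4, "m2": 2},
    "bob": {"m1": 4, "m2": 2, "m3": 5},
    "carol": {"m1": 5, "m2": 1, "m3": 1},
}
similarity = {("alice", "bob"): 0.8, ("alice", "carol"): 0.8}
tie_strength = {("alice", "bob"): 0.1, ("alice", "carol"): 0.9}

pure_cf = social_predict(ratings, similarity, tie_strength, "alice", "m3", alpha=1.0)
blended = social_predict(ratings, similarity, tie_strength, "alice", "m3", alpha=0.5)
print(pure_cf, blended)
```

With similarity alone the two neighbors cancel each other out, while the blended weight lets the stronger tie pull the prediction toward carol's opinion. Which social measure, and which value of `alpha`, actually improves accuracy is exactly what an empirical comparison would have to establish.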
Incorporating social relationships into information retrieval systems in general can yield more accurate results than pure IR systems. However, inferring social relationships from social networks (e.g., Facebook, MySpace and others), integrating them into IR systems, and finally evaluating the resulting model still require more work.

July 23, 2009 Posted by ibrahimabd | Collaborative Filtering, Mysql, Recommender System, Social Networks