Friday, 16 January 2015

SUPPORTING PRIVACY PROTECTION IN PERSONALIZED WEB SEARCH



ABSTRACT
Personalized web search (PWS) has demonstrated its effectiveness in improving the quality of various search services on the Internet. However, evidence shows that users’ reluctance to disclose their private information during search has become a major barrier to the wide adoption of PWS. We study privacy protection in PWS applications that model user preferences as hierarchical user profiles. We propose a PWS framework called UPS that can adaptively generalize profiles by queries while respecting user-specified privacy requirements. Our runtime generalization aims at striking a balance between two predictive metrics that evaluate the utility of personalization and the privacy risk of exposing the generalized profile. We present two greedy algorithms, namely GreedyDP and GreedyIL, for runtime generalization. We also provide an online prediction mechanism for deciding whether personalizing a query is beneficial. Extensive experiments demonstrate the effectiveness of our framework. The experimental results also reveal that GreedyIL significantly outperforms GreedyDP in terms of efficiency.




The solutions to PWS can generally be categorized into two types:
Ø Click-log-based methods
Ø Profile-based methods

Click-log-based methods

Ø Click-log-based methods are straightforward: they simply impose a bias toward pages the user clicked in his or her query history.
Ø They work only on repeated queries from the same user, which is a strong limitation on their applicability.
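The idea of biasing toward previously clicked pages can be sketched as follows. This is a minimal illustration, not the method of any particular system: the function name rerank_by_click_log, the click_log layout, the boost weight, and the example URLs are all assumptions made for the example.

```python
def rerank_by_click_log(query, results, click_log, boost=1.0):
    """Re-rank (url, base_score) pairs, favoring pages this user clicked
    before for the very same query string.

    click_log maps a query string to {url: number_of_past_clicks}.
    The additive boost is a hypothetical choice for illustration.
    """
    clicks = click_log.get(query, {})
    scored = [(url, score + boost * clicks.get(url, 0)) for url, score in results]
    # sorted() is stable, so results with equal scores keep their original order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical history: the user clicked the car page three times for "jaguar".
click_log = {"jaguar": {"cars.example/jaguar": 3}}
results = [("zoo.example/jaguar", 0.9), ("cars.example/jaguar", 0.5)]

repeated = rerank_by_click_log("jaguar", results, click_log)
unseen = rerank_by_click_log("jaguar habitat", results, click_log)
```

For the repeated query the clicked page moves to the top, while for the unseen query the ranking is unchanged, which illustrates the limitation noted above.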
Profile-based methods
Ø Profile-based methods improve the search experience with complicated user-interest models generated by user profiling techniques.
Ø They can potentially be effective for almost all sorts of queries, but are reported to be unstable under some circumstances.
PWS has recently become more effective at improving the quality of web search through the increasing use of personal and behavioral information to profile users. This information is usually gathered implicitly from query history, browsing history, click-through data, bookmarks, user documents, and so forth.




EXISTING SYSTEM
Profile based PWS
Ø A user profile is typically generalized only once, offline, and then used to personalize all queries from the same user indiscriminately.
Ø Such a “one profile fits all” strategy certainly has drawbacks given the variety of queries.
Ø Profile-based personalization may not even help to improve the search quality for some ad hoc queries, even though exposing the user profile to a server has already put the user’s privacy at risk.
Ø A better approach is to make an online decision on whether to personalize the query and what to expose in the user profile at runtime.
Customization of privacy requirements
Ø In existing work, all sensitive topics are detected using an absolute metric called surprisal, based on information theory, on the assumption that interests with less user document support are more sensitive.
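As a rough illustration of the surprisal idea, the sketch below scores each topic by the negative base-2 logarithm of its share of the user's supporting documents, so that rarely supported topics score higher. The topic names, document counts, and the cut-off value are hypothetical, and the exact formula used by the systems discussed here may differ.

```python
import math


def surprisal(topic_support, total_support):
    """Surprisal in bits: -log2 of the fraction of documents supporting a topic."""
    if topic_support <= 0 or total_support <= 0:
        raise ValueError("support counts must be positive")
    return -math.log2(topic_support / total_support)


# Hypothetical number of user documents supporting each interest.
support = {"Sports": 40, "Music": 8, "Health": 2}
total = sum(support.values())

scores = {topic: surprisal(count, total) for topic, count in support.items()}

# Hypothetical absolute cut-off: topics above it are treated as sensitive.
threshold = 3.0
sensitive = [topic for topic, bits in scores.items() if bits > threshold]
```

Because the cut-off is the same for every user, a metric like this leaves no room for a user to declare which topics he or she personally considers sensitive.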
Iterative user interactions
Ø Many personalization techniques refine the search results using metrics that require multiple user interactions, such as rank scoring and average rank.
Ø This paradigm is, however, infeasible for runtime profiling, as it will not only pose too much risk of privacy breach, but also demand prohibitive processing time for profiling.
Ø Thus, we need predictive metrics to measure the search quality and breach risk after personalization, without incurring iterative user interaction.

Disadvantages
Ø Existing profile-based PWS systems do not support runtime profiling.
Ø The existing methods do not take into account the customization of privacy requirements.
Ø Many personalization techniques require iterative user interactions when creating personalized search results.


PROPOSED SYSTEM

Ø We propose UPS (User customizable Privacy-preserving Search), a privacy-preserving personalized web search framework that can generalize profiles for each query according to user-specified privacy requirements.
Ø We develop two simple but effective generalization algorithms, GreedyDP and GreedyIL, to support runtime profiling. GreedyDP tries to maximize the discriminating power (DP), while GreedyIL attempts to minimize the information loss (IL).
Ø The framework assumes that the queries do not contain any sensitive information, and aims at protecting the privacy in individual user profiles while retaining their usefulness for PWS.
Ø UPS consists of an untrusted search engine server and a number of clients. Each client (user) accessing the search service trusts no one but himself or herself.
Ø The key component for privacy protection is an online profiler implemented as a search proxy running on the client machine itself.
Ø The proxy maintains both the complete user profile, as a hierarchy of nodes with semantics, and the user-specified (customized) privacy requirements, represented as a set of sensitive nodes.
Ø During the offline phase, a hierarchical user profile is constructed and customized with the user-specified privacy requirements.
Ø The online phase handles queries as follows. When a user issues a query qi on the client, the proxy generates a user profile at runtime in light of the query terms. The output of this step is a generalized user profile Gi that satisfies the privacy requirements. The generalization process is guided by two conflicting metrics, namely the personalization utility and the privacy risk, both defined for user profiles.
Ø The query and the generalized user profile are sent together to the PWS server for personalized search.
Ø The search results are personalized with the profile and delivered back to the query proxy.
Ø Finally, the proxy either presents the raw results to the user, or reranks them with the complete user profile.
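The online phase described above can be sketched as a greedy leaf-pruning loop on the client. This is a simplified illustration in the spirit of GreedyIL, not the published algorithm: the scoring functions are stand-ins (document support as information loss, the exposed share of sensitivity as risk), the query is not used to score topics, and all names (generalize, handle_query, fake_search, delta) and the example profile are assumptions.

```python
def leaves(parent, nodes):
    """Non-root nodes in `nodes` that have no child left in `nodes`."""
    has_child = {parent[n] for n in nodes if parent[n] is not None}
    return [n for n in nodes if parent[n] is not None and n not in has_child]


def risk(nodes, sensitivity):
    """Stand-in privacy risk: share of total sensitivity still exposed."""
    total = sum(sensitivity.values())
    if total == 0:
        return 0.0
    return sum(s for n, s in sensitivity.items() if n in nodes) / total


def generalize(parent, support, sensitivity, delta):
    """Prune the leaf with the smallest support (stand-in information loss)
    until the stand-in risk is at most delta or only the root remains."""
    nodes = set(parent)
    while risk(nodes, sensitivity) > delta:
        candidates = leaves(parent, nodes)
        if not candidates:
            break
        victim = min(candidates, key=lambda n: (support[n], n))
        nodes.remove(victim)
    return nodes


def handle_query(query, parent, support, sensitivity, delta, search):
    """Client-side proxy step: only the query and the generalized profile
    are passed to `search`, a placeholder for the untrusted server."""
    generalized = generalize(parent, support, sensitivity, delta)
    return search(query, generalized)


# Hypothetical complete profile: topic -> parent topic (the root has None).
parent = {"Profile": None, "Sports": "Profile", "Health": "Profile",
          "Football": "Sports", "Tennis": "Sports",
          "Fitness": "Health", "Diabetes": "Health"}
support = {"Profile": 42, "Sports": 30, "Health": 12,
           "Football": 20, "Tennis": 10, "Fitness": 9, "Diabetes": 3}
sensitivity = {"Diabetes": 1.0}  # user-specified sensitive node


def fake_search(query, profile_nodes):
    """Placeholder server: echoes what it was allowed to see."""
    return query, sorted(profile_nodes)


G = generalize(parent, support, sensitivity, delta=0.0)
seen_by_server = handle_query("insulin pump", parent, support, sensitivity,
                              0.0, fake_search)
```

In this example a single pruning step removes the sensitive leaf, and the placeholder server sees the remaining topics only. Re-ranking the returned results with the complete profile would happen on the client after this step and is omitted here.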

Advantages
Ø UPS provides runtime profiling, which in effect optimizes the personalization utility while respecting the user’s privacy requirements.
Ø UPS allows for customization of privacy needs.
Ø UPS does not require iterative user interaction.
Ø UPS provides an inexpensive mechanism for the client to decide whether to personalize a query.





Hardware requirements:
Processor         : Any processor above 500 MHz
RAM               : 128 MB
Hard Disk         : 10 GB
Compact Disk      : 650 MB
Input device      : Standard keyboard and mouse
Output device     : VGA and high-resolution monitor

Software requirements:
Operating System  : Windows family
Language          : JDK 1.5
Database          : MySQL 5.0
Tool              : HeidiSQL 3.0