Privacy-preserving document similarity detection
MetadataShow full item record
The document similarity detection is an important technique used in many applications. The existence of the tool that guarantees the privacy protection of the documents during the comparison will expand the area where this technique can be applied. The goal of this project is to develop a method for privacy-preserving document similarity detection capable to identify either semantically or syntactically similar documents. As the result two methods were designed, implemented, and evaluated. In the first method privacy-preserving data comparison protocol was applied for secure comparison. This original protocol was created as a part of this thesis. In the second method modified private-matching scheme was used. In both methods the Natural Language processing techniques were utilized to capture the semantic relations between documents. During the testing phase the first method was found to be too slow for the practical application. The second method, on the contrary, was rather fast and effective. It can be used for creation of the tool for detecting syntactical and semantic similarity in a privacy-preserving way.
Masteroppgave i informasjons- og kommunikasjonsteknologi IKT590 2011 – Universitetet i Agder, Grimstad