Evaluation of String Metrics in Relation to Text Comparison of Privacy Policies

The application of string metrics is a crucial part of text analysis, and the appropriate metric depends on the nature of the task. Regarding privacy policies, the question arises whether popular and less-known string distance metrics can aid in the task of mining privacy policies.

The goal of this thesis is to revisit existing string metrics and to compare how well they support analyzing and mining privacy policies.
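
To illustrate the kind of comparison meant here, the following minimal Python sketch computes two common string metrics on a pair of policy-like sentences: a character-based one (Levenshtein edit distance, normalized to a similarity) and a token-based one (Jaccard similarity on word sets). The choice of these two metrics, the function names, and the two example sentences are assumptions made for illustration only; they are not prescribed by the topic description.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalized_levenshtein(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def jaccard_tokens(a: str, b: str) -> float:
    """Jaccard similarity of the lowercased, whitespace-split word sets."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


# Hypothetical example sentences, not taken from any real policy.
policy_a = "We collect your email address to send updates."
policy_b = "We collect your e-mail address to send you updates."

print("normalized Levenshtein:", normalized_levenshtein(policy_a, policy_b))
print("token Jaccard:", jaccard_tokens(policy_a, policy_b))
```

Character-based metrics such as the first one are sensitive to small spelling variants, while token-based metrics such as the second one ignore word order; contrasting such behaviors on policy text is one possible starting point for the comparison.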

References:

  1. Deza, Michel Marie, and Elena Deza. Encyclopedia of Distances. 4th Edition. Springer, 2016.
  2. Gomaa, Wael H., and Aly A. Fahmy. A survey of text similarity approaches. International Journal of Computer Applications 68.13 (2013): 13-18.
  3. Schütze, Hinrich, Christopher D. Manning, and Prabhakar Raghavan. Introduction to information retrieval. Cambridge: Cambridge University Press, 2008.