
Text similarity/comparison

Native package - SequenceMatcher

https://docs.python.org/3/library/difflib.html
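As a minimal sketch of how SequenceMatcher can be used to score two strings, the helper below wraps its ratio() method. The function name similarity is made up for this note; only SequenceMatcher and ratio() come from difflib.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity score between 0.0 and 1.0 for two strings.

    `similarity` is a hypothetical helper name, not part of difflib.
    """
    # The first argument is the optional "junk" predicate; None means
    # no characters are treated as junk.
    return SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    # ratio() is 2 * M / T, where M is the number of matching characters
    # and T is the total length of both strings: 2 * 3 / 8 here.
    print(similarity("abcd", "bcde"))  # 0.75
```

A score of 1.0 means the strings are identical and 0.0 means they share nothing in common.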

There are many different string metrics, such as Levenshtein, Damerau-Levenshtein, Hamming distance, Jaro-Winkler, and Strike a Match.
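To give a feel for what such a metric computes, here is a small sketch of the simplest one in the list, Hamming distance: the number of positions at which two equal-length strings differ. The function name hamming_distance is an illustrative choice, not a library API.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ.

    `hamming_distance` is a hypothetical helper written for this note.
    """
    # Hamming distance is only defined for strings of the same length.
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    # Each mismatching pair contributes True (1) to the sum.
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    print(hamming_distance("karolin", "kathrin"))  # 3
```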

Levenshtein

  • much faster than SequenceMatcher
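For reference, a pure-Python sketch of Levenshtein distance (the minimum number of single-character insertions, deletions, and substitutions) is shown below. It only illustrates what is being computed; the speed claim above presumably refers to a compiled implementation, so this version should not be expected to beat SequenceMatcher. The function name levenshtein is an assumption made for this note.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance between two strings.

    `levenshtein` is a hypothetical helper, using the classic
    dynamic-programming approach with two rows of the cost table.
    """
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] is the distance between the empty prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Turning a[:i] into the empty string takes i deletions.
        current = [i]
        for j, cb in enumerate(b, start=1):
            insert_cost = current[j - 1] + 1
            delete_cost = previous[j] + 1
            # Substitution is free when the characters already match.
            substitute_cost = previous[j - 1] + (ca != cb)
            current.append(min(insert_cost, delete_cost, substitute_cost))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))  # 3
```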

Locality-sensitive hashing

Elasticsearch
