We present DISTSIM, a Scalable Distributed in-Memory Semantic Similarity Estimation framework for Knowledge Graphs. DISTSIM provides a multitude of state-of-the-art similarity estimators. We have developed the Similarity Estimation Pipeline by combining generic software modules. For large-scale RDF data, DISTSIM proposes MinHash with locality-sensitive hashing to achieve better scalability than all-pair similarity estimation. The modules of DISTSIM can be configured through a multitude of (hyper-)parameters, which allows adjusting the tradeoff between the amount of information taken into account and the processing time. Furthermore, the output of the Similarity Estimation Pipeline is native RDF.
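To illustrate why MinHash with locality-sensitive hashing avoids the all-pair comparison, the following is a minimal sketch in plain Python. It is not DISTSIM's implementation: the function names, the encoding of each entity as a set of "predicate|object" strings, and the parameter values (`num_hashes`, `bands`) are assumptions chosen for illustration only.

```python
import hashlib
from itertools import combinations


def _hash(token: str, seed: int) -> int:
    """Deterministic 64-bit hash of a token, parameterised by a seed."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(features, num_hashes=64):
    """MinHash signature: the minimum hash value per seeded hash function.

    `features` must be a non-empty set of strings (hypothetical encoding:
    one "predicate|object" string per triple of the entity).
    """
    return [min(_hash(f, seed) for f in features) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions estimates Jaccard similarity."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


def lsh_candidate_pairs(signatures, bands=16):
    """Group entities whose signatures agree on at least one whole band.

    Only entities sharing a bucket become candidate pairs, so most
    dissimilar pairs are never compared.
    """
    rows = len(next(iter(signatures.values()))) // bands
    buckets = {}
    for key, sig in signatures.items():
        for b in range(bands):
            band_key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets.setdefault(band_key, set()).add(key)
    pairs = set()
    for members in buckets.values():
        for a, b in combinations(sorted(members), 2):
            pairs.add((a, b))
    return pairs


if __name__ == "__main__":
    # Toy entities with made-up feature sets (assumed data, not from the paper).
    entities = {
        "ex:film1": {"ex:director|ex:p1", "ex:genre|ex:drama", "ex:actor|ex:a1"},
        "ex:film2": {"ex:director|ex:p1", "ex:genre|ex:drama", "ex:actor|ex:a2"},
        "ex:film3": {"ex:director|ex:p9", "ex:genre|ex:comedy", "ex:actor|ex:a7"},
    }
    sigs = {uri: minhash_signature(feats) for uri, feats in entities.items()}
    for a, b in sorted(lsh_candidate_pairs(sigs)):
        print(a, b, estimate_jaccard(sigs[a], sigs[b]))
```

In this sketch, more hash functions give a tighter similarity estimate at higher cost, and the number of bands controls how readily two entities become candidates. This mirrors, in simplified form, the kind of tradeoff between information taken into account and processing time that the parameters are meant to expose.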
ADVANTAGES:
· Representation of Semantic Similarity Estimation Experiments and their results in native RDF format.
· Integration of DISTSIM into the holistic SANSA stack over a set of generic modules.