ruby on rails - Elasticsearch fixed score based on content similarity
I am working on a tool to identify similar documents and mark them as duplicates.
To do so, I am using Elasticsearch to check the documents' content, so that Elasticsearch takes care of managing synonyms and possible typos, but I haven't yet come up with a query that reaches my goals.
So far I have come up with this query:
{
  "query": {
    "filtered": {
      "query": {
        "more_like_this": {
          "fields": [ "description" ],
          "like_text": "lorem ipsum dolor sit amet, consectetur adipiscing elit.",
          "min_term_freq": 1,
          "max_query_terms": 999,
          "min_doc_freq": 1
        }
      }
    }
  },
  "from": 0,
  "size": 999,
  "search_type": "dfs_query_then_fetch",
  "sort": [ { "_score": { "order": "desc" } } ]
}
But the score it gives me seems quite random. I would like to have a score of 100 when the contents are equal and 0 when they are different.
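I see where you are going, but out of the box the scoring is only going to be relevant to that particular query, because it is based on term frequencies and position. The score is great for ranking results within one query, but it is meaningless from query to query. So, you could wrap it in a constant score query.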
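As a rough sketch of what wrapping the query in a constant score might look like, the Ruby snippet below builds the request body as a hash. The helper name `constant_mlt_query` and its `boost` and `inner_key` parameters are made up for illustration. Which key `constant_score` expects for the wrapped clause (`filter` or `query`) is an assumption that depends on the Elasticsearch version, so check it against the documentation for your release. Note that every matching document then gets the same score, so this only separates "matched" from "did not match".

```ruby
require "json"

# Hypothetical helper (not from the original post): wraps the question's
# more_like_this clause in constant_score so that every matching document
# receives the same score, namely the boost.
# inner_key is an assumption: some Elasticsearch versions expect "query"
# here, others "filter".
def constant_mlt_query(text, field: "description", boost: 1.0, inner_key: "filter")
  {
    "query" => {
      "constant_score" => {
        inner_key => {
          "more_like_this" => {
            "fields" => [field],
            "like_text" => text,
            "min_term_freq" => 1,
            "max_query_terms" => 999,
            "min_doc_freq" => 1
          }
        },
        "boost" => boost
      }
    }
  }
end

body = constant_mlt_query("lorem ipsum dolor sit amet, consectetur adipiscing elit.")
puts JSON.pretty_generate(body)
```

The printed JSON could be sent as the body of a search request in place of the original query.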
If you are down to putting each term in its own query, I can provide an example of possibly solving this with multiple constant scores in a bool query inside a bool query.
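One possible reading of the per-term idea, sketched in Ruby: give each distinct term its own `constant_score` clause and combine them as `should` clauses of a `bool` query, so the summed score roughly tracks how many terms matched. Everything here is an assumption rather than the answerer's actual example: the helper name `per_term_constant_query`, the naive tokenizer, splitting a total of 100 evenly across terms, and the `inner_key` choice (`filter` or `query`, depending on Elasticsearch version). Older versions may also apply coordination or normalization factors that change the final number. The answer does not spell out the nesting it mentions, so only a single `bool` level is shown.

```ruby
require "json"

# Hypothetical helper: one constant_score clause per distinct term, combined
# in a bool "should". Each clause contributes total / number_of_terms, so a
# document matching every term would score about `total` (assuming no extra
# scoring factors are applied by the server).
def per_term_constant_query(text, field: "description", total: 100.0, inner_key: "filter")
  # Naive tokenization for illustration only; the real analyzer may differ.
  terms = text.downcase.scan(/[[:alnum:]]+/).uniq
  raise ArgumentError, "text contains no terms" if terms.empty?

  per_term = total / terms.size
  clauses = terms.map do |term|
    {
      "constant_score" => {
        inner_key => { "match" => { field => term } },
        "boost" => per_term
      }
    }
  end

  { "query" => { "bool" => { "should" => clauses } } }
end

body = per_term_constant_query("lorem ipsum dolor sit amet")
puts JSON.pretty_generate(body)
```

Using `match` rather than `term` for each clause is a deliberate choice in this sketch, so that the field's analyzer (synonyms and so on) still applies to each word.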