Article scoring algorithm by keywords
I am looking for an algorithm that can give a score to an article based on weighted keywords.
So suppose I have the following article:
Economic anxiety amid a dwindling oil and gas industry is raising hard questions about the future. It is shaping a Senate race in which a Democrat is seeking re-election in a state long dominated by Republicans.
And I have the following keywords, each given a weight of importance (from -100 to 100):
economic (50), senate (70), republicans (-100), democrats (100)
This means I want an article that is about the economy, the Senate and Democrats to get a high final score, while an article about Republicans should score low. One simple solution seems to be adding together the values of the keywords occurring in the article. But in reality, an article that has the word "democrats" 5 times and the word "republicans" only once in the text should still get a low ranking.
My question is: are there any efficient and effective algorithms for this problem?
If I have understood you correctly, you can do this by keeping track of the words you have already scored in a set. An illustration in Python:
```python
import string

article = """Economic anxiety amid a dwindling oil and gas industry is raising hard questions about the future. It is shaping a Senate race in which a Democrat is seeking re-election in a state long dominated by Republicans."""

keyword_score = {'economic': 50, 'senate': 70, 'republicans': -100, 'democrats': 100}

seen_keywords = set()
score = 0
for word in article.split():
    # Normalize case and drop surrounding punctuation so "Republicans." still matches.
    word = word.lower().strip(string.punctuation)
    if word in keyword_score and word not in seen_keywords:
        score += keyword_score[word]
        seen_keywords.add(word)

print(score)
```
That way, no keyword is scored more than once.
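To see what this changes for the case from the question (one keyword repeated five times, another appearing once), here is a rough sketch that puts the plain sum next to the set-based score. The helper names `tokenize`, `naive_score` and `unique_score` and the `sample` text are made up for illustration; only the keyword weights come from the question.

```python
import string
from collections import Counter

keyword_score = {'economic': 50, 'senate': 70, 'republicans': -100, 'democrats': 100}

def tokenize(text):
    """Lowercase the text and strip punctuation from the edges of each word."""
    return [w.lower().strip(string.punctuation) for w in text.split()]

def naive_score(text):
    """Sum the weight of every keyword occurrence, so repeats count each time."""
    counts = Counter(tokenize(text))
    return sum(weight * counts[kw] for kw, weight in keyword_score.items())

def unique_score(text):
    """Sum the weight of each distinct keyword at most once."""
    present = set(tokenize(text))
    return sum(weight for kw, weight in keyword_score.items() if kw in present)

# Hypothetical text: "democrats" five times, "republicans" once.
sample = "Democrats " * 5 + "Republicans"

print(naive_score(sample))   # 5 * 100 - 100 = 400
print(unique_score(sample))  # 100 - 100 = 0
```

With the set-based version the repeated word no longer drowns out the negative one. Whether the result is low enough for your ranking depends on the weights you pick, so treat this as a starting point rather than a finished scoring scheme.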