Most common words in a file

Hello,
I'm back!
Today I have a very simple task: read a file, find the ten most common words in its text, and print them. You can use the same idea elsewhere, e.g. in a Facebook timeline application that prints the most common words in your timeline to summarize your Facebook feed.

Algorithm steps:
1- Import the string module
2- Open the file
3- Create an empty dictionary
4- Read each line in the file
5- Remove any punctuation characters
6- Convert the line to lower case
7- Split the line into words
8- Loop through the words
9- Count how many times each word appears
10- Create an empty list and fill it with (count, word) pairs
11- Sort the list in descending order
12- Print the top ten as key, value
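
To make steps 5 to 7 concrete, here is a small sketch of what they do to a single line. It is written for Python 3, and the sample sentence is made up for illustration; it is not taken from romeo-full.txt.

```python
import string

# Hypothetical sample line, not taken from romeo-full.txt
sample = "Hello, hello! It's a test."

# Step 5: delete every punctuation character
no_punct = sample.translate(str.maketrans('', '', string.punctuation))
# Step 6: convert to lower case so "Hello" and "hello" count as one word
lowered = no_punct.lower()
# Step 7: split on whitespace
words = lowered.split()

print(words)  # ['hello', 'hello', 'its', 'a', 'test']
```

Note that removing punctuation also removes apostrophes, so "It's" becomes "its".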

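#author mohamed fawzy
# print the ten most common words in the text (Python 3)
import string

counts = dict()
# translation table that deletes every punctuation character
remove_punct = str.maketrans('', '', string.punctuation)
# 'with' closes the file when the block ends
with open('romeo-full.txt') as fhand:
    for line in fhand:
        line = line.translate(remove_punct)
        line = line.lower()
        words = line.split()
        for word in words:
            if word not in counts:
                counts[word] = 1
            else:
                counts[word] += 1
# sort dictionary by value: store (count, word) tuples so the count is compared first
lst = list()
for key, val in counts.items():
    lst.append((val, key))

lst.sort(reverse=True)
# each tuple is (count, word); print the word first, then its count
for val, key in lst[:10]:
    print(key, val)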
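
As an alternative to the hand-written dictionary and sort, the standard library's collections.Counter can do the counting and ranking. The following is only a sketch for Python 3: the function name top_words and the sample text are hypothetical, introduced here for illustration.

```python
import string
from collections import Counter

def top_words(text, n=10):
    """Return the n most common words in text as (word, count) pairs."""
    # Delete punctuation and lower-case the text, as in the steps above
    cleaned = text.translate(str.maketrans('', '', string.punctuation)).lower()
    # Counter counts each word; most_common(n) returns the n highest counts
    return Counter(cleaned.split()).most_common(n)

# Hypothetical sample text
sample = "the cat and the hat and the bat"
print(top_words(sample, 2))  # [('the', 3), ('and', 2)]
```

To run it on the file, read the whole text first, for example with open('romeo-full.txt') as fhand: print(top_words(fhand.read())). Words with equal counts may come out in a different order than with the reversed sort, because most_common keeps tied words in the order they were first seen.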


That's it :)

