Bingel [31]
3 years ago
8

When an author produces an index for his or her book, the first step is to decide which words should go into the index; the second is to produce a list of the pages where each word occurs. Instead of trying to choose words out of our heads, we decided to let the computer produce a list of all the unique words used in the manuscript and their frequency of occurrence. We could then go over the list and choose which words to put into the index.

The main object in this problem is a "word" with an associated frequency. The tentative definition of "word" here is a string of alphanumeric characters between markers, where the markers are white space and all punctuation marks; anything non-alphanumeric stops the reading. If we skip all disallowed characters before getting the string, we should have exactly what we want. Ignoring words of fewer than three letters will remove from consideration words such as "a", "is", "to", "do", and "by" that do not belong in an index.

In this project, you are asked to write a program that reads any text file and then lists all the "words" in alphabetical order, together with the number of times each one appears in the text. A "word" is as defined above and has at least three letters.

Computers and Technology
1 answer:
Igoryamba
3 years ago
7 0

Answer:

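import string

# Maps each word to the number of times it occurs
dic = {}

book=open("book.txt","r")

# Iterate over each line in the book
for line in book.readlines():
    tex = line
    tex = tex.lower()
    # Delete every punctuation character from the line
    tex=tex.translate(str.maketrans('', '', string.punctuation))
    new = tex.split()
    for word in new:
        # Ignore words of fewer than three letters
        if len(word) > 2:
            if word not in dic.keys():
                dic[word] = 1
            else:
                dic[word] = dic[word] + 1

# Print the words in alphabetical order with their counts
for word in sorted(dic):
    print(word, dic[word], '\n')

book.close()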

Explanation:

The code above was written in Python 3.

import string

First, import the modules you will need. The string module is imported for string.punctuation, the collection of ASCII punctuation characters that is stripped from each line.

dic = {}
book=open("book.txt","r")

# Iterate over each line in the book
for line in book.readlines():
    tex = line
    tex = tex.lower()
    tex=tex.translate(str.maketrans('', '', string.punctuation))
    new = tex.split()

An empty dictionary is then created. It stores each word together with its number of occurrences, with the word as the key and the count as the value (word : occurrences).

Next, the file to read from is opened and the code iterates over each line. Each line is converted to lowercase, its punctuation characters are removed, and it is split into a list of words that can be iterated over.
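
One caveat: the question defines a word as a run of alphanumeric characters, with every other character acting as a separator. Deleting punctuation, as the answer does, joins the pieces of a hyphenated or apostrophized word instead of splitting them. The sketch below compares the two behaviours on a made-up sample sentence. The helper name extract_words is hypothetical and not part of the original answer, and the pattern assumes ASCII text.

```python
import re
import string

sample = "A well-known author's index, to be sure."

# The answer's approach: delete punctuation, then split on whitespace.
deleted = sample.lower().translate(str.maketrans("", "", string.punctuation)).split()

def extract_words(line, min_length=3):
    """Return lowercase alphanumeric runs of at least min_length characters."""
    # [a-z0-9]+ matches maximal runs of ASCII letters and digits;
    # every other character acts as a separator.
    runs = re.findall(r"[a-z0-9]+", line.lower())
    return [w for w in runs if len(w) >= min_length]

print(deleted)                # ['a', 'wellknown', 'authors', 'index', 'to', 'be', 'sure']
print(extract_words(sample))  # ['well', 'known', 'author', 'index', 'sure']
```

Which behaviour is preferable depends on how strictly the question's definition is meant to be applied.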

    for word in new:
        if len(word) > 2:
            if word not in dic.keys():
                dic[word] = 1
            else:
                dic[word] = dic[word] + 1

For every word in the new list, if the length of the word is greater than 2 and the word is not already in the dictionary, the word is added to the dictionary with a value of 1.

If the word is already in the dictionary, its value is increased by 1.
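
The membership test and increment described above can also be written with collections.Counter from the standard library, which treats a missing key as zero. This is only an alternative sketch; the word list is made up for illustration.

```python
from collections import Counter

words = ["index", "word", "the", "index", "of", "word", "index"]

# Count only words longer than two characters, as in the answer's loop.
counts = Counter(w for w in words if len(w) > 2)

print(counts["index"], counts["word"], counts["the"])  # 3 2 1
```

A Counter is a dict subclass, so sorted(counts) still yields the words in alphabetical order.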

for word in sorted(dic):
    print(word, dic[word], '\n')

book.close()

The dictionary's keys (the words) are sorted alphabetically, and each word is printed with its count. Finally, the file is closed.
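
A with block closes the file automatically, even if an error occurs partway through reading, so the explicit close() call can be dropped. The sketch below wraps the same steps in two hypothetical functions, word_frequencies and print_index, which are not part of the original answer. UTF-8 encoding is an assumption about the input file.

```python
import string

def word_frequencies(path):
    """Count words longer than two characters, using the answer's cleaning steps."""
    strip_punctuation = str.maketrans("", "", string.punctuation)
    counts = {}
    # The with block closes the file when the loop finishes or an error is raised.
    with open(path, "r", encoding="utf-8") as book:
        for line in book:
            for word in line.lower().translate(strip_punctuation).split():
                if len(word) > 2:
                    counts[word] = counts.get(word, 0) + 1
    return counts

def print_index(counts):
    # sorted() on a dict yields its keys in alphabetical order.
    for word in sorted(counts):
        print(word, counts[word])
```

Calling print_index(word_frequencies("book.txt")) would print one word per line. print already ends each line with a newline, so the extra '\n' argument in the answer, which adds a blank line after every entry, is omitted here.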

Check the attachment to see the code in action.
