lys-0071 [83]
3 years ago
6

Modify the WordCount program so it outputs the wordcount for each distinct word in each file. So the output of this DocWordCount program should be of the form ‘word#####filename count’, where ‘#####’ serves as a delimiter between word and filename and tab serves as a delimiter between filename and count. Submit your source code in a file named DocWordCount.java.

Explanation: Consider two simple files file1.txt and file2.txt.

$ echo "Hadoop is yellow Hadoop" > file1.txt
$ echo "yellow Hadoop is an elephant" > file2.txt

Running ‘DocWordCount.java’ on these two files will give an output similar to that below, where ##### is a delimiter.

Output of DocWordCount.java

yellow#####file2.txt 1

Hadoop#####file2.txt 1

is#####file2.txt 1

elephant#####file2.txt 1

yellow#####file1.txt 1

Hadoop#####file1.txt 2

is#####file1.txt 1

an#####file2.txt 1

Initial code that needs to be modified:

package org.myorg;

import java.io.IOException;
import java.util.regex.Pattern;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.log4j.Logger;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;


public class WordCount extends Configured implements Tool {

    private static final Logger LOG = Logger.getLogger(WordCount.class);

    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new WordCount(), args);
        System.exit(res);
    }

    public int run(String[] args) throws Exception {
        Job job = Job.getInstance(getConf(), " wordcount ");
        job.setJarByClass(this.getClass());

        FileInputFormat.addInputPaths(job, args[0]);
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        return job.waitForCompletion(true) ? 0 : 1;
    }

    // Input: (byte offset, line of text). Output: (word, 1).
    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        private static final Pattern WORD_BOUNDARY = Pattern.compile("\\s*\\b\\s*");

        @Override
        public void map(LongWritable offset, Text lineText, Context context)
                throws IOException, InterruptedException {

            String line = lineText.toString();
            Text currentWord = new Text();

            for (String word : WORD_BOUNDARY.split(line)) {
                if (word.isEmpty()) {
                    continue;
                }
                currentWord = new Text(word);
                context.write(currentWord, one);
            }
        }
    }

    // Input: (word, [1, 1, ...]). Output: (word, total count).
    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text word, Iterable<IntWritable> counts, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable count : counts) {
                sum += count.get();
            }
            context.write(word, new IntWritable(sum));
        }
    }
}
Computers and Technology
1 answer:
stepladder [879] 3 years ago
8 0

Answer and Explanation:

The mapper looks up the name of the file it is reading and emits word#####filename as the key with a count of 1; the reducer sums the counts for each key. The default text output writes a tab between key and value, which gives the required ‘word#####filename count’ lines.

package PackageDemo;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class DocWordCount {

    // Delimiter between word and file name, as required by the question.
    private static final String DELIMITER = "#####";

    public static void main(String[] args) throws Exception {
        Configuration c = new Configuration();
        String[] files = new GenericOptionsParser(c, args).getRemainingArgs();
        Path input = new Path(files[0]);
        Path output = new Path(files[1]);

        Job j = Job.getInstance(c, "docwordcount");
        j.setJarByClass(DocWordCount.class);
        j.setMapperClass(MapForDocWordCount.class);
        j.setReducerClass(ReduceForDocWordCount.class);
        j.setOutputKeyClass(Text.class);
        j.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(j, input);
        FileOutputFormat.setOutputPath(j, output);
        System.exit(j.waitForCompletion(true) ? 0 : 1);
    }

    public static class MapForDocWordCount extends Mapper<LongWritable, Text, Text, IntWritable> {

        private static final IntWritable ONE = new IntWritable(1);

        @Override
        public void map(LongWritable key, Text value, Context con) throws IOException, InterruptedException {
            // Name of the file this line came from. The cast assumes the default
            // TextInputFormat, whose input splits are FileSplit instances.
            String fileName = ((FileSplit) con.getInputSplit()).getPath().getName();

            // Split on whitespace and keep the original case, as in the sample output.
            for (String word : value.toString().split("\\s+")) {
                if (word.isEmpty()) {
                    continue;
                }
                con.write(new Text(word + DELIMITER + fileName), ONE);
            }
        }
    }

    public static class ReduceForDocWordCount extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        public void reduce(Text word, Iterable<IntWritable> values, Context con) throws IOException, InterruptedException {
            // Sum the 1s emitted for this word#####filename key.
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            con.write(word, new IntWritable(sum));
        }
    }
}
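
The per-file counting that the question asks for can be checked without a cluster. What follows is a minimal plain-Java sketch, not a Hadoop job: the class name DocWordCountSketch and the in-memory map that stands in for the input files are assumptions made for illustration. It uses the two sample files from the question and prints one word#####filename line per distinct word, with a tab before the count.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java sketch (no Hadoop) of the per-file word counting in the question.
class DocWordCountSketch {

    static final String DELIMITER = "#####";

    // Counts each word per file; keys have the form word#####filename.
    // `files` maps a file name to its contents (a stand-in for real input files).
    static Map<String, Integer> count(Map<String, String> files) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, String> file : files.entrySet()) {
            for (String word : file.getValue().split("\\s+")) {
                if (word.isEmpty()) {
                    continue;
                }
                // Same role as the reducer: add 1 to the running total for this key.
                counts.merge(word + DELIMITER + file.getKey(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("file1.txt", "Hadoop is yellow Hadoop");
        files.put("file2.txt", "yellow Hadoop is an elephant");
        // Tab between key and count, as in the required output format.
        count(files).forEach((key, n) -> System.out.println(key + "\t" + n));
    }
}
```

The line order differs from a real Hadoop run, which sorts keys before the reduce step; the counts themselves should match the sample output in the question.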

You might be interested in
Describe the layout of an article on Wikipedia​
GuDViN [60]

This guide presents the typical layout of Wikipedia articles, including the sections an article usually has, ordering of sections, and formatting styles for various elements of an article. For advice on the use of wiki markup, see Help:Editing; for guidance on writing style, see Manual of Style.

Contents

1 Order of article elements

2 Body sections

2.1 Headings and sections

2.2 Names and orders for section headings

2.3 Section templates and summary style

2.4 Paragraphs

3 Standard appendices and footers

3.1 Headings

3.2 Works or publications

3.3 "See also" section

3.4 Notes and references

3.5 Further reading

3.6 External links

3.6.1 Links to sister projects

3.7 Navigation templates

4 Specialized layout

5 Formatting

5.1 Images

5.2 Horizontal rule

5.3 Collapsible content

6 See also

7 Notes

8 References

A simple article should have, at least, (a) a lead section and (b) references. The following list includes additional standardized sections in an article. A complete article need not have all, or even most, of these elements.

Articles longer than a stub are generally divided into sections, and sections over a certain length are generally divided into paragraphs; these divisions enhance the readability of the article. The names and orders of section headings are often determined by the relevant WikiProject, although articles should still follow good organizational and writing principles regarding sections and paragraphs.

5 0
2 years ago
Write a program that asks the user to enter a birth year. the program should then indicate which generation a person of that yea
USPshnik [31]
Something like the following should work. You did not say which language you are using, but you should be able to convert this to your language of choice.

<script type="text/javascript">
// Ask for a birth year and report the matching generation.
function checkGeneration() {
    var gen = ["Baby Boomer", "Generation X", "Xennials", "Generation Y"];
    var year = parseInt(window.prompt("Enter a birth year: "), 10);
    if (isNaN(year)) {
        alert("Please enter a valid year.");
    } else if (year <= 1964) {
        alert(gen[0]);
    } else if (year <= 1979) {
        alert(gen[1]);
    } else if (year <= 1985) {
        alert(gen[2]);
    } else if (year <= 1995) {
        alert(gen[3]);
    } else {
        alert("No generation listed for " + year + ".");
    }
}
checkGeneration();
</script>
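
As one possible conversion, here is the same lookup as a Java sketch. The cutoff years are copied from the script above; the class name GenerationSketch, the method name generationFor, and the "Unknown" fallback for years after 1995 are assumptions added for illustration.

```java
import java.util.Scanner;

// Java version of the generation lookup from the script above.
class GenerationSketch {

    // Returns the generation name for a birth year, using the script's cutoffs.
    static String generationFor(int birthYear) {
        if (birthYear <= 1964) {
            return "Baby Boomer";
        } else if (birthYear <= 1979) {
            return "Generation X";
        } else if (birthYear <= 1985) {
            return "Xennials";
        } else if (birthYear <= 1995) {
            return "Generation Y";
        }
        // Assumed fallback: the script lists no generation after 1995.
        return "Unknown";
    }

    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        System.out.print("Enter a birth year: ");
        System.out.println(generationFor(in.nextInt()));
    }
}
```

Keeping the lookup in its own method separates it from the input handling, so it can be checked with a few fixed years.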
7 0
3 years ago
Choose the answer.
Alik [6]
A home network, as well as a business network, would use a router to connect the LAN and the internet.
3 0
2 years ago
Five programs are currently being run in a computer. Program 1 is using 10 GiB of RAM, program 2 is using 5 GiB of RAM, program
muminat

Virtual memory could be used to let program 5 access RAM without any of the data from the other four programs being lost, because it gives each process its own memory space, isolated from the other processes.

How is virtual memory used instead of RAM?

A system that uses virtual memory sets aside a section of the hard drive to act like RAM. Data that does not fit in RAM is moved to that section and brought back when it is needed again.

With virtual memory, a system can load larger programs, or run more programs at the same time, and work as if it had more memory without having to buy more RAM.

Therefore, virtual memory lets program 5 run while the data of the other four programs is kept on disk rather than lost.
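
The idea of moving data out to disk and back can be illustrated with a toy model. This is only a sketch under simplifying assumptions: real virtual memory is handled by the operating system and the hardware, not by application code, and the class TinyPager, its least-recently-used eviction rule, and the map used as the "disk" are all made up for illustration.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of paging: a few RAM frames backed by a larger "disk" area.
class TinyPager {
    private final int frames; // how many pages fit in RAM at once
    // Access-ordered map: the first entry is the least recently used page.
    private final LinkedHashMap<Integer, String> ram = new LinkedHashMap<>(16, 0.75f, true);
    private final Map<Integer, String> disk = new HashMap<>();

    TinyPager(int frames) {
        if (frames < 1) {
            throw new IllegalArgumentException("need at least one frame");
        }
        this.frames = frames;
    }

    void write(int page, String data) {
        load(page);
        ram.put(page, data);
    }

    String read(int page) {
        load(page);
        return ram.get(page); // null if the page was never written
    }

    int residentPages() {
        return ram.size();
    }

    // Makes `page` resident, moving the least recently used page to disk if RAM is full.
    private void load(int page) {
        if (ram.containsKey(page)) {
            return;
        }
        if (ram.size() >= frames) {
            Iterator<Map.Entry<Integer, String>> it = ram.entrySet().iterator();
            Map.Entry<Integer, String> victim = it.next();
            disk.put(victim.getKey(), victim.getValue()); // saved, not lost
            it.remove();
        }
        ram.put(page, disk.remove(page)); // bring the page back from disk if it was there
    }

    public static void main(String[] args) {
        TinyPager pager = new TinyPager(2);
        for (int p = 1; p <= 5; p++) {
            pager.write(p, "data of program " + p);
        }
        // Page 1 was moved to disk earlier; reading it brings it back intact.
        System.out.println(pager.read(1));
    }
}
```

Only two pages are ever resident at once, yet every page written can still be read back, which is the property the answer relies on.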

Learn more about virtual memory from

brainly.com/question/13088640


6 0
1 year ago
What would you have to know about the pivot columns in an augmented matrix in order to know that the linear system is consistent
Scrat [10]

Answer:

The Rouché-Capelli Theorem. This theorem establishes a connection between how a linear system behaves and the ranks of its coefficient matrix (A) and its counterpart, the augmented matrix. The system is consistent exactly when

rank(A)=rank\left ( \left [ A|B \right ] \right )

and, if in addition n=rank(A), where n is the number of unknowns, it has one single solution. In terms of pivot columns: the system is consistent exactly when the last column of the augmented matrix is not a pivot column.

Explanation:

1) E.g., take the system

\left\{\begin{matrix}x-3y-2z=6 \\ 2x-4y-3z=8 \\ -3x+6y+8z=-5  \end{matrix}\right.

2) Writing it in matrix form:

A=\begin{pmatrix}1 & -3 &-2 \\  2& -4 &-3 \\ -3 &6  &8 \end{pmatrix} B=\begin{pmatrix}6\\ 8\\ -5\end{pmatrix}

3) The rank of A is 3, found through Gauss elimination (three pivots):

A \sim \begin{pmatrix}1 & -3 & -2 \\0 & 2 & 1 \\0 & 0 & \frac{7}{2}\end{pmatrix}, \quad rank(A)=3

4) The rank of (A|B) is also equal to 3, found through Gauss elimination:

(A|B)=\begin{pmatrix}1 & -3 &-2  &6 \\  2& -4 &-3  &8 \\  -3&6  &8  &-5 \end{pmatrix} \sim \begin{pmatrix}1 & -3 & -2 & 6 \\0 & 2 & 1 & -4 \\0 & 0 & \frac{7}{2} & 7\end{pmatrix}, \quad rank(A|B)=3

So rank(A)=rank(A|B)=3=n, and this linear system is consistent and has a unique solution, x=1, y=-3, z=2.
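
The rank comparison can also be checked numerically. The following is a minimal sketch, assuming floating-point Gaussian elimination with a fixed tolerance of 1e-9 is good enough for small matrices like this one; the class name RankCheck and its method names are made up for illustration. It uses the system from the example, with right-hand side 6, 8, -5.

```java
// Sketch: compare rank(A) with rank(A|B) to test a linear system for consistency.
class RankCheck {

    // Rank via Gaussian elimination with partial pivoting; works on a copy of m.
    static int rank(double[][] m) {
        double[][] a = new double[m.length][];
        for (int i = 0; i < m.length; i++) {
            a[i] = m[i].clone();
        }
        int rows = a.length;
        int cols = a[0].length;
        int rank = 0;
        for (int col = 0; col < cols && rank < rows; col++) {
            // Pick the remaining row with the largest entry in this column.
            int pivot = rank;
            for (int r = rank + 1; r < rows; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) {
                    pivot = r;
                }
            }
            if (Math.abs(a[pivot][col]) < 1e-9) {
                continue; // no pivot in this column
            }
            double[] tmp = a[pivot];
            a[pivot] = a[rank];
            a[rank] = tmp;
            // Clear the entries below the pivot.
            for (int r = rank + 1; r < rows; r++) {
                double factor = a[r][col] / a[rank][col];
                for (int c = col; c < cols; c++) {
                    a[r][c] -= factor * a[rank][c];
                }
            }
            rank++;
        }
        return rank;
    }

    // Consistent exactly when the coefficient and augmented matrices have equal rank.
    static boolean isConsistent(double[][] coefficients, double[][] augmented) {
        return rank(coefficients) == rank(augmented);
    }

    public static void main(String[] args) {
        double[][] a = {{1, -3, -2}, {2, -4, -3}, {-3, 6, 8}};
        double[][] ab = {{1, -3, -2, 6}, {2, -4, -3, 8}, {-3, 6, 8, -5}};
        System.out.println("rank(A) = " + rank(a));
        System.out.println("rank(A|B) = " + rank(ab));
        System.out.println("consistent: " + isConsistent(a, ab));
    }
}
```

For exact results on larger or ill-conditioned systems, rational arithmetic would be a safer choice than doubles.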

8 0
3 years ago