lys-0071 [83]
3 years ago

Modify the WordCount program so that it outputs the word count for each distinct word in each file. The output of this DocWordCount program should be of the form 'word#####filename count', where '#####' serves as the delimiter between word and filename and a tab serves as the delimiter between filename and count. Submit your source code in a file named DocWordCount.java.

Explanation: Consider two simple files, file1.txt and file2.txt.

$ echo "Hadoop is yellow Hadoop" > file1.txt
$ echo "yellow Hadoop is an elephant" > file2.txt

Running 'DocWordCount.java' on these two files will give output similar to that below, where ##### is the delimiter.

Output of DocWordCount.java

yellow#####file2.txt 1
Hadoop#####file2.txt 1
is#####file2.txt 1
elephant#####file2.txt 1
yellow#####file1.txt 1
Hadoop#####file1.txt 2
is#####file1.txt 1
an#####file2.txt 1
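
The expected counts can be checked without a Hadoop cluster. Below is a minimal plain-Java sketch, not the assignment solution: the class name DocWordCountSketch and its count method are hypothetical helpers made up for illustration, and the in-memory map of file names to contents stands in for the real input files. It reuses the word-boundary pattern from the starter code and builds the same composite key, word#####filename.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: mimics what the mapper and
// reducer compute together, using in-memory strings instead of files.
class DocWordCountSketch {
    static final String DELIMITER = "#####";
    // Same word-boundary pattern as the starter WordCount code.
    static final Pattern WORD_BOUNDARY = Pattern.compile("\\s*\\b\\s*");

    // files maps a file name to that file's text.
    // Returns composite key "word#####filename" -> number of occurrences.
    static Map<String, Integer> count(Map<String, String> files) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, String> file : files.entrySet()) {
            for (String line : file.getValue().split("\n")) {
                for (String word : WORD_BOUNDARY.split(line)) {
                    if (word.isEmpty()) {
                        continue; // the pattern yields empty tokens between words
                    }
                    // "Map" step: build the composite key.
                    // "Reduce" step: add 1 to the running total for that key.
                    counts.merge(word + DELIMITER + file.getKey(), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("file1.txt", "Hadoop is yellow Hadoop");
        files.put("file2.txt", "yellow Hadoop is an elephant");
        // A tab separates the composite key from the count.
        count(files).forEach((key, n) -> System.out.println(key + "\t" + n));
    }
}
```

Running main prints the same eight word/file pairs as the sample output, in sorted key order rather than the order Hadoop happens to produce.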

Initial code that needs to be modified:

package org.myorg;

import java.io.IOException;
import java.util.regex.Pattern;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.log4j.Logger;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;


public class WordCount extends Configured implements Tool {

    private static final Logger LOG = Logger.getLogger(WordCount.class);

    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new WordCount(), args);
        System.exit(res);
    }

    public int run(String[] args) throws Exception {
        Job job = Job.getInstance(getConf(), " wordcount ");
        job.setJarByClass(this.getClass());

        // args[0] is the input path (or comma-separated paths), args[1] the output directory
        FileInputFormat.addInputPaths(job, args[0]);
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        return job.waitForCompletion(true) ? 0 : 1;
    }

    // Emits (word, 1) for every word in the input line
    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        private static final Pattern WORD_BOUNDARY = Pattern.compile("\\s*\\b\\s*");

        public void map(LongWritable offset, Text lineText, Context context)
                throws IOException, InterruptedException {

            String line = lineText.toString();
            Text currentWord = new Text();

            for (String word : WORD_BOUNDARY.split(line)) {
                if (word.isEmpty()) {
                    continue;
                }
                currentWord = new Text(word);
                context.write(currentWord, one);
            }
        }
    }

    // Sums the counts emitted for each word
    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text word, Iterable<IntWritable> counts, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable count : counts) {
                sum += count.get();
            }
            context.write(word, new IntWritable(sum));
        }
    }
}
Computers and Technology
1 answer:
stepladder [879] 3 years ago

Answer and Explanation:

The only real change from WordCount is in the mapper: instead of emitting the word alone as the key, it emits word#####filename, where the file name is read from the input split. The reducer is unchanged, because it simply sums the values for each distinct key. The default text output format writes a tab between key and value, which gives the required 'word#####filename<TAB>count' lines.

package PackageDemo;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class DocWordCount {

    public static void main(String[] args) throws Exception {
        Configuration c = new Configuration();
        String[] files = new GenericOptionsParser(c, args).getRemainingArgs();
        Path input = new Path(files[0]);
        Path output = new Path(files[1]);

        Job j = Job.getInstance(c, "docwordcount");
        j.setJarByClass(DocWordCount.class);
        j.setMapperClass(MapForDocWordCount.class);
        j.setReducerClass(ReduceForDocWordCount.class);
        j.setOutputKeyClass(Text.class);
        j.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(j, input);
        FileOutputFormat.setOutputPath(j, output);

        System.exit(j.waitForCompletion(true) ? 0 : 1);
    }

    public static class MapForDocWordCount extends Mapper<LongWritable, Text, Text, IntWritable> {

        private static final String DELIMITER = "#####";
        private static final IntWritable ONE = new IntWritable(1);

        @Override
        public void map(LongWritable key, Text value, Context con) throws IOException, InterruptedException {
            // Name of the file this line came from (assumes a file-based input format)
            String fileName = ((FileSplit) con.getInputSplit()).getPath().getName();

            // Same word-boundary pattern as the starter code; it can yield empty tokens
            for (String word : value.toString().split("\\s*\\b\\s*")) {
                if (word.isEmpty()) {
                    continue;
                }
                // Composite key: word#####filename
                con.write(new Text(word + DELIMITER + fileName), ONE);
            }
        }
    }

    public static class ReduceForDocWordCount extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        public void reduce(Text word, Iterable<IntWritable> values, Context con) throws IOException, InterruptedException {
            // Sum the occurrences of this word#####filename key
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            con.write(word, new IntWritable(sum));
        }
    }
}
