1answer.
Ask question
Login Signup
Ask question
All categories
  • English
  • Mathematics
  • Social Studies
  • Business
  • History
  • Health
  • Geography
  • Biology
  • Physics
  • Chemistry
  • Computers and Technology
  • Arts
  • World Languages
  • Spanish
  • French
  • German
  • Advanced Placement (AP)
  • SAT
  • Medicine
  • Law
  • Engineering
lys-0071 [83]
3 years ago
6

Modify the WordCount program so it outputs the wordcount for each distinct word in each file. So the output of this DocWordCount

program should be of the form ‘word#####filename count’, where ‘#####’ serves as a delimiter between word and filename and tab serves as a delimiter between filename and count. Submit your source code in a file named DocWordCount.java.
Explanation: Consider two simple files file1.txt and file2.txt. $ echo "Hadoop is yellow Hadoop" > file1.txt $ echo "yellow Hadoop is an elephant" > file2.txt Running ‘DocWordCount.java’ on these two files will give an output similar to that below, where ##### is a delimiter.

Output of DocWordCount.java

yellow#####file2.txt 1

Hadoop#####file2.txt 1

is#####file2.txt 1

elephant#####file2.txt 1

yellow#####file1.txt 1

Hadoop#####file1.txt 2

is#####file1.txt 1

an#####file2.txt 1

Initial code that needs to be modified:

package org.myorg;

import java.io.IOException;
import java.util.regex.Pattern;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.log4j.Logger;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;


public class WordCount extends Configured implements Tool {

private static final Logger LOG = Logger .getLogger( WordCount.class);

public static void main( String[] args) throws Exception {
int res = ToolRunner .run( new WordCount(), args);
System .exit(res);
}

public int run( String[] args) throws Exception {
Job job = Job .getInstance(getConf(), " wordcount ");
job.setJarByClass( this .getClass());

FileInputFormat.addInputPaths(job, args[0]);
FileOutputFormat.setOutputPath(job, new Path(args[ 1]));
job.setMapperClass( Map .class);
job.setReducerClass( Reduce .class);
job.setOutputKeyClass( Text .class);
job.setOutputValueClass( IntWritable .class);

return job.waitForCompletion( true) ? 0 : 1;
}

public static class Map extends Mapper {
private final static IntWritable one = new IntWritable( 1);
private Text word = new Text();

private static final Pattern WORD_BOUNDARY = Pattern .compile("\\s*\\b\\s*");

public void map( LongWritable offset, Text lineText, Context context)
throws IOException, InterruptedException {

String line = lineText.toString();
Text currentWord = new Text();

for ( String word : WORD_BOUNDARY .split(line)) {
if (word.isEmpty()) {
continue;
}
currentWord = new Text(word);
context.write(currentWord,one);
}
}
}

public static class Reduce extends Reducer {
@Override
public void reduce( Text word, Iterable counts, Context context)
throws IOException, InterruptedException {
int sum = 0;
for ( IntWritable count : counts) {
sum += count.get();
}
context.write(word, new IntWritable(sum));
}
}
}
Computers and Technology
1 answer:
stepladder [879]3 years ago
8 0

Answer and Explanation:

package PackageDemo;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;

import org.apache.hadoop.fs.Path;

import org.apache.hadoop.io.IntWritable;

import org.apache.hadoop.io.LongWritable;

import org.apache.hadoop.io.Text;

import org.apache.hadoop.mapreduce.Job;

import org.apache.hadoop.mapreduce.Mapper;

import org.apache.hadoop.mapreduce.Reducer;

import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

public static void main(String [] args) throws Exception

{

Configuration c=new Configuration();

String[] files=new GenericOptionsParser(c,args).getRemainingArgs();

Path input=new Path(files[0]);

Path output=new Path(files[1]);

Job j=new Job(c,"wordcount");

j.setJarByClass(WordCount.class);

j.setMapperClass(MapForWordCount.class);

j.setReducerClass(ReduceForWordCount.class);

j.setOutputKeyClass(Text.class);

j.setOutputValueClass(IntWritable.class);

FileInputFormat.addInputPath(j, input);

FileOutputFormat.setOutputPath(j, output);

System.exit(j.waitForCompletion(true)?0:1);

}

public static class MapForWordCount extends Mapper<LongWritable, Text, Text, IntWritable>{

public void map(LongWritable key, Text value, Context con) throws IOException, InterruptedException

{

String line = value.toString();

String[] words=line.split(",");

for(String word: words )

{

Text outputKey = new Text(word.toUpperCase().trim());

IntWritable outputValue = new IntWritable(1);

con.write(outputKey, outputValue);

}

}

}

public static class ReduceForWordCount extends Reducer<Text, IntWritable, Text, IntWritable>

{

public void reduce(Text word, Iterable<IntWritable> values, Context con) throws IOException, InterruptedException

{

int sum = 0;

for(IntWritable value : values)

{

sum += value.get();

}

con.write(word, new IntWritable(sum));

}

}

}

You might be interested in
A wide variety of "apps" are available to customize devices. Which category of app does word processing software fall into? Ente
kkurt [141]
 word processing falls into the productivity education application.
6 0
3 years ago
Which feature of vitualization helps increase the IT productivity of a business?
laila [671]
A the answer is yes
7 0
3 years ago
Write a bool function isPalindrome that takes one string parameter and returns true if that string is a palindrome and false oth
kkurt [141]

Answer:

Here is the bool function isPalindrome:

bool isPalindrome(string s)

{  bool palindrome;      

 int j;

 int len = s.length();

 for (int i =0, j= len-1; i < len/2; i++, j--)  

   if (s[i] == s[j])

     palindrome = true;  

   else

     palindrome = false;

 return palindrome; }

Explanation:

The bool function isPalindrome takes a string as parameter and returns true if the string is a palindrome or false if the string is not palindrome. The string variable s holds the string value. In function isPalindrome, there is a bool type variable palindrome. bool is a data type which is used to hold values true or false which is then stored in the palindrome variable.

Next it is being found that how many characters are there in the input string using length() function which returns the length of the string. The length of the string is stored in int type variable len.

Next the for loop starts which has two variables i and j where i starts at 0 position of the input string and j starts at the last position of the string. These two variables move or traverse along the string this way. i variable stops when half the length of the input string is reached. At the end of every iteration i moves forward by the increment of its value to 1 and j moves backwards by the decrement  of its value to 1.

Every the body of  the loop executes, there is an if condition in the body of the loop which checks if the string is a palindrome. This is checked by matching the character at position/index i with the character at position j of the string. This will continue character by character until the i reached half the length of the input string and if every character in i was matched to every character in j then palindrome = true; statement is executed which returns true as the string is a palindrome otherwise palindrome = false; is returned as the string is not a palindrome.

For taking input string from the user and to check if the function isPalindrome() works correctly a main() function is written which takes input string from the user and reads in variable str calls isPalindrome() function. If the string is palindrome then the cout statement in main() function prints the message: String is a palindrome and if the input string is not a palindrome the the following message is displayed in the output screen: String is not a palindrome

int main()

{

string str;

cout<<"Enter a string: ";

cin>>str;

if(isPalindrome(str))

cout<<"String is a palindrome";

else

cout<<"String is not a palindrome";

}

The screenshot of the program along with the output is attached.

7 0
2 years ago
List the six external parts or "peripherals" of a computer system and identify which are output and which are input devices/ Lis
vitfil [10]
Keyboard-Input
Mouse-Input
<u></u>Monitor-Output
Speakers-Output
Printer-Output
Hard Drive<span>-Output</span>
3 0
3 years ago
Java what are synchronized functions.
adelina 88 [10]

<h2>answer:</h2>

Synchronized method is used to lock an object for any shared resource. When a thread invokes a synchronized method, it automatically acquires the lock for that object and releases it when the thread completes its task.','.

4 0
2 years ago
Other questions:
  • We will pass you 2 inputsan list of numbersa number, N, to look forYour job is to loop through the list and find the number spec
    12·1 answer
  • Learning about public speaking can help improve your ________________.
    15·1 answer
  • What is a 'balanced' dfd?
    5·1 answer
  • Making the data impossible to recover even by applying physical forensics methods is known as __________ of media.
    13·1 answer
  • If you do not want to keep a change made by the autocorrect feature, you can click the ____ button on the auto correction option
    12·1 answer
  • To give your app users the ability to open your app directly from other apps by clicking a link, you should use:.
    11·1 answer
  • Write a program that ask the user to enter air water or Steele and the distance that a sound wave will travel in the medium the
    9·1 answer
  • To ensure that comments identify the initials of the person making changes, you would need to _____.
    5·2 answers
  • PLEASE ANSWER FAST, WILL GIVE BRAINLIEST AND 20 POINTS
    11·2 answers
  • Tennis players are not allowed to swear when they are playing in Wimbledon
    6·2 answers
Add answer
Login
Not registered? Fast signup
Signup
Login Signup
Ask question!