lys-0071 [83]
3 years ago
6

Modify the WordCount program so that it outputs the word count for each distinct word in each file. The output of this DocWordCount program should be of the form 'word#####filename count', where '#####' serves as the delimiter between word and filename and a tab serves as the delimiter between filename and count. Submit your source code in a file named DocWordCount.java.
Explanation: Consider two simple files, file1.txt and file2.txt:

$ echo "Hadoop is yellow Hadoop" > file1.txt
$ echo "yellow Hadoop is an elephant" > file2.txt

Running DocWordCount on these two files gives output similar to that below, where ##### is the delimiter.

Output of DocWordCount.java

yellow#####file2.txt 1

Hadoop#####file2.txt 1

is#####file2.txt 1

elephant#####file2.txt 1

yellow#####file1.txt 1

Hadoop#####file1.txt 2

is#####file1.txt 1

an#####file2.txt 1
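
The expected output above can be reproduced without a cluster. The following is a minimal plain-Java sketch, not Hadoop code: the class name DocWordCountSketch and the method count are hypothetical, and the two file contents are hard-coded from the example. It assumes words are separated by whitespace and only illustrates how the composite key word#####filename is built and summed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch (no Hadoop) of the per-file word count described above.
// Class and method names are hypothetical, chosen for illustration only.
class DocWordCountSketch {

    // Delimiter between word and filename, taken from the question.
    static final String DELIMITER = "#####";

    // Builds one key per (word, filename) pair and sums its occurrences.
    // The "map" step (emit key with 1) and the "reduce" step (sum per key)
    // are folded into a single merge call here.
    static Map<String, Integer> count(Map<String, String> files) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, String> file : files.entrySet()) {
            // Assumption: words are separated by whitespace.
            for (String word : file.getValue().split("\\s+")) {
                if (word.isEmpty()) {
                    continue;
                }
                counts.merge(word + DELIMITER + file.getKey(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("file1.txt", "Hadoop is yellow Hadoop");
        files.put("file2.txt", "yellow Hadoop is an elephant");
        // A tab separates the key from the count, as the question requires.
        count(files).forEach((key, n) -> System.out.println(key + "\t" + n));
    }
}
```

In a real MapReduce job the same key would be emitted by the mapper and summed by the reducer; the order of the output lines may differ from the listing above.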

Initial code that needs to be modified:

package org.myorg;

import java.io.IOException;
import java.util.regex.Pattern;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.log4j.Logger;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;


public class WordCount extends Configured implements Tool {

  private static final Logger LOG = Logger.getLogger(WordCount.class);

  public static void main(String[] args) throws Exception {
    int res = ToolRunner.run(new WordCount(), args);
    System.exit(res);
  }

  public int run(String[] args) throws Exception {
    Job job = Job.getInstance(getConf(), " wordcount ");
    job.setJarByClass(this.getClass());

    FileInputFormat.addInputPaths(job, args[0]);
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    private static final Pattern WORD_BOUNDARY = Pattern.compile("\\s*\\b\\s*");

    public void map(LongWritable offset, Text lineText, Context context)
        throws IOException, InterruptedException {

      String line = lineText.toString();
      Text currentWord = new Text();

      for (String word : WORD_BOUNDARY.split(line)) {
        if (word.isEmpty()) {
          continue;
        }
        currentWord = new Text(word);
        context.write(currentWord, one);
      }
    }
  }

  public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text word, Iterable<IntWritable> counts, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable count : counts) {
        sum += count.get();
      }
      context.write(word, new IntWritable(sum));
    }
  }
}
Computers and Technology
1 answer:
stepladder [879]
3 years ago
8 0

Answer and Explanation:

package PackageDemo;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class DocWordCount {

    // Delimiter between word and filename, as required by the question.
    private static final String DELIMITER = "#####";

    public static void main(String[] args) throws Exception {
        Configuration c = new Configuration();
        String[] files = new GenericOptionsParser(c, args).getRemainingArgs();
        Path input = new Path(files[0]);
        Path output = new Path(files[1]);

        Job j = Job.getInstance(c, "docwordcount");
        j.setJarByClass(DocWordCount.class);
        j.setMapperClass(MapForWordCount.class);
        j.setReducerClass(ReduceForWordCount.class);
        j.setOutputKeyClass(Text.class);
        j.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(j, input);
        FileOutputFormat.setOutputPath(j, output);

        System.exit(j.waitForCompletion(true) ? 0 : 1);
    }

    public static class MapForWordCount extends Mapper<LongWritable, Text, Text, IntWritable> {

        @Override
        public void map(LongWritable key, Text value, Context con) throws IOException, InterruptedException {
            // Name of the file this line came from (assumes a file-based input format).
            String fileName = ((FileSplit) con.getInputSplit()).getPath().getName();

            // Split on whitespace and keep the original case, to match the expected output.
            for (String word : value.toString().split("\\s+")) {
                if (word.isEmpty()) {
                    continue;
                }
                // Key is word#####filename; the default text output writes a tab before the count.
                con.write(new Text(word + DELIMITER + fileName), new IntWritable(1));
            }
        }
    }

    public static class ReduceForWordCount extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        public void reduce(Text word, Iterable<IntWritable> values, Context con) throws IOException, InterruptedException {
            // Sum the occurrences of each word#####filename key.
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            con.write(word, new IntWritable(sum));
        }
    }
}

You might be interested in
Assume that cell F5 to F10 In a spreadsheet contains numeric value of salary earned by some workers and cell F12 contains the va
snow_tiger [21]

Answer:

=F5*$F$12+F5

Explanation:

To raise the salary in cell F5, multiply F5 by the percentage in F12 and add the result to the original salary in F5.

To drag the formula down from F5 to F10, write the F12 reference with dollar signs ($F$12) so that it stays fixed while the F5 reference shifts to F6, F7, and so on.

For example:

F12 = 5% = 0.05

F5 = 10,000

=F5*$F$12+F5

=10,000×0.05+10,000 = 10,500
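
The same arithmetic can be sketched in Java. This is only an illustration of the formula: the class name SalaryRaise, the method raised, and the extra salary values are hypothetical. The single rate variable plays the role of the fixed reference $F$12, and the loop plays the role of dragging the formula down the column.

```java
// Illustrative sketch of =F5*$F$12+F5 applied to a column of salaries.
// Names and sample values other than 10,000 and 5% are hypothetical.
class SalaryRaise {

    // Mirrors the spreadsheet formula: salary * rate + salary.
    static double raised(double salary, double rate) {
        return salary * rate + salary;
    }

    public static void main(String[] args) {
        double rate = 0.05; // fixed for every row, like $F$12
        double[] salaries = {10000, 12000, 15000}; // stand-ins for F5, F6, F7
        for (double salary : salaries) {
            System.out.println(salary + " -> " + raised(salary, rate));
        }
    }
}
```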

5 0
3 years ago
Hordes of surreptitiously infiltrated computers, linked and controlled remotely, also known as zombie networks are known as:
icang [17]
~Hello there! ^_^

Your answer: Hordes of surreptitiously infiltrated computers, linked and controlled remotely, also known as zombie networks, are known as botnets.

Hope this helps~

3 0
3 years ago
What is the maximum number of columns in a spreadsheet that you can sort in one instance in software like OpenOffice Calc?
MakcuM [25]
The correct answer is D
6 0
3 years ago
Which of the following statements invokes the GetDiscount function, passing it the contents of two Decimal variables named decSa
ryzh [129]

Answer:

c. decDiscount = GetDiscount(decSales, decRate)                                                                                                                          

Explanation:

Option a. is incorrect because it uses the Call keyword, which is not a valid way to invoke a function here.

Similarly, option b. is incorrect because it also uses the Call keyword to invoke GetDiscount(), and because it passes the contents of three variables (decSales, decRate and decDiscount), whereas the question states that only two are to be passed to GetDiscount().

Option c. is correct: it invokes GetDiscount(), passes it the contents of the two variables decSales and decRate, and assigns the result to the variable decDiscount. For example, if GetDiscount() calculates the discount from decSales and decRate, the value it returns is stored in decDiscount. So this is a valid way to invoke the function.

3 0
3 years ago
Program for bit stuffing...?
Olegator [25]

Answer: Program for bit stuffing in C

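#include <stdio.h>
#include <string.h>

int main(void)
{
    int count = 0;
    char data[50];

    printf("Enter the Bits: ");
    if (scanf("%49s", data) != 1)   // entering the bits, i.e. 0s and 1s (at most 49)
        return 1;

    printf("Data Bits Before Bit Stuffing: %s", data);
    printf("\nData Bits After Bit Stuffing: ");

    size_t len = strlen(data);
    for (size_t i = 0; i < len; i++)
    {
        // count consecutive 1s; any other bit resets the run
        if (data[i] == '1')
            count++;
        else
            count = 0;

        printf("%c", data[i]);

        // stuff a 0 after 4 consecutive 1s
        if (count == 4)
        {
            printf("0");
            count = 0;
        }
    }
    printf("\n");

    return 0;
}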

Explanation:

Bit stuffing is the insertion of non-information bits into the data during transmission of frames between sender and receiver. The program above stuffs a 0 bit after every 4 consecutive 1s. A count variable tracks the number of consecutive 1s, a char array stores the data bits, and a for loop iterates through the bits, stuffing a 0 after each run of 4 consecutive 1s.
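
For comparison, the same idea can be sketched in Java as a function that returns the stuffed string instead of printing it. The class name BitStuffing, the method stuff, and the maxOnes parameter are hypothetical. The run length is a parameter because the program above uses 4, while protocols such as HDLC stuff a 0 after five consecutive 1s.

```java
// Sketch of bit stuffing on a string of '0' and '1' characters.
// Names are hypothetical; the input is assumed to contain only 0s and 1s.
class BitStuffing {

    // Inserts a '0' after every run of maxOnes consecutive '1' bits.
    static String stuff(String bits, int maxOnes) {
        StringBuilder out = new StringBuilder();
        int count = 0;
        for (char bit : bits.toCharArray()) {
            out.append(bit);
            // Count consecutive 1s; any other bit resets the run.
            count = (bit == '1') ? count + 1 : 0;
            if (count == maxOnes) {
                out.append('0');
                count = 0;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Run length of 4, matching the C program above.
        System.out.println(stuff("0111110", 4));
    }
}
```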

4 0
3 years ago
Other questions:
  • What is the magnitude of the largest positive value you can place in a bool? a char? an int? a float?
    14·1 answer
  • Rikki has had several problems at work recently. Her printer isn't printing correctly, copies from the copy machine come out wit
    8·2 answers
  • Which of the following commands is more recommended while creating a bot?
    9·1 answer
  • What are the features of the Outline view in Word? Select three options.
    12·2 answers
  • 1) List at least five smaller behaviors you could break the complex behavior "brushing my teeth" into.
    14·2 answers
  • Can someone explain to me how to do circuit calculations
    11·1 answer
  • Write an algorithm which gets a number A, if it is even, prints even, and if it is odd prints odd.
    7·1 answer
  • What is company NDR?​
    11·1 answer
  • During the preventive maintenance phase of a project involving a hydraulic power system, an engineer must change a gasket on a p
    14·1 answer
  • Write a pseudocode that receives a positive number from the user, and then,
    14·1 answer