1answer.
Ask question
Login Signup
Ask question
All categories
  • English
  • Mathematics
  • Social Studies
  • Business
  • History
  • Health
  • Geography
  • Biology
  • Physics
  • Chemistry
  • Computers and Technology
  • Arts
  • World Languages
  • Spanish
  • French
  • German
  • Advanced Placement (AP)
  • SAT
  • Medicine
  • Law
  • Engineering
lys-0071 [83]
3 years ago
6

Modify the WordCount program so it outputs the wordcount for each distinct word in each file. So the output of this DocWordCount

program should be of the form ‘word#####filename count’, where ‘#####’ serves as a delimiter between word and filename and tab serves as a delimiter between filename and count. Submit your source code in a file named DocWordCount.java.
Explanation: Consider two simple files file1.txt and file2.txt. $ echo "Hadoop is yellow Hadoop" > file1.txt $ echo "yellow Hadoop is an elephant" > file2.txt Running ‘DocWordCount.java’ on these two files will give an output similar to that below, where ##### is a delimiter.

Output of DocWordCount.java

yellow#####file2.txt 1

Hadoop#####file2.txt 1

is#####file2.txt 1

elephant#####file2.txt 1

yellow#####file1.txt 1

Hadoop#####file1.txt 2

is#####file1.txt 1

an#####file2.txt 1

Initial code that needs to be modified:

package org.myorg;

import java.io.IOException;
import java.util.regex.Pattern;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.log4j.Logger;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;


public class WordCount extends Configured implements Tool {

private static final Logger LOG = Logger .getLogger( WordCount.class);

public static void main( String[] args) throws Exception {
int res = ToolRunner .run( new WordCount(), args);
System .exit(res);
}

public int run( String[] args) throws Exception {
Job job = Job .getInstance(getConf(), " wordcount ");
job.setJarByClass( this .getClass());

FileInputFormat.addInputPaths(job, args[0]);
FileOutputFormat.setOutputPath(job, new Path(args[ 1]));
job.setMapperClass( Map .class);
job.setReducerClass( Reduce .class);
job.setOutputKeyClass( Text .class);
job.setOutputValueClass( IntWritable .class);

return job.waitForCompletion( true) ? 0 : 1;
}

public static class Map extends Mapper {
private final static IntWritable one = new IntWritable( 1);
private Text word = new Text();

private static final Pattern WORD_BOUNDARY = Pattern .compile("\\s*\\b\\s*");

public void map( LongWritable offset, Text lineText, Context context)
throws IOException, InterruptedException {

String line = lineText.toString();
Text currentWord = new Text();

for ( String word : WORD_BOUNDARY .split(line)) {
if (word.isEmpty()) {
continue;
}
currentWord = new Text(word);
context.write(currentWord,one);
}
}
}

public static class Reduce extends Reducer {
@Override
public void reduce( Text word, Iterable counts, Context context)
throws IOException, InterruptedException {
int sum = 0;
for ( IntWritable count : counts) {
sum += count.get();
}
context.write(word, new IntWritable(sum));
}
}
}
Computers and Technology
1 answer:
stepladder [879]3 years ago
8 0

Answer and Explanation:

package PackageDemo;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;

import org.apache.hadoop.fs.Path;

import org.apache.hadoop.io.IntWritable;

import org.apache.hadoop.io.LongWritable;

import org.apache.hadoop.io.Text;

import org.apache.hadoop.mapreduce.Job;

import org.apache.hadoop.mapreduce.Mapper;

import org.apache.hadoop.mapreduce.Reducer;

import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

public static void main(String [] args) throws Exception

{

Configuration c=new Configuration();

String[] files=new GenericOptionsParser(c,args).getRemainingArgs();

Path input=new Path(files[0]);

Path output=new Path(files[1]);

Job j=new Job(c,"wordcount");

j.setJarByClass(WordCount.class);

j.setMapperClass(MapForWordCount.class);

j.setReducerClass(ReduceForWordCount.class);

j.setOutputKeyClass(Text.class);

j.setOutputValueClass(IntWritable.class);

FileInputFormat.addInputPath(j, input);

FileOutputFormat.setOutputPath(j, output);

System.exit(j.waitForCompletion(true)?0:1);

}

public static class MapForWordCount extends Mapper<LongWritable, Text, Text, IntWritable>{

public void map(LongWritable key, Text value, Context con) throws IOException, InterruptedException

{

String line = value.toString();

String[] words=line.split(",");

for(String word: words )

{

Text outputKey = new Text(word.toUpperCase().trim());

IntWritable outputValue = new IntWritable(1);

con.write(outputKey, outputValue);

}

}

}

public static class ReduceForWordCount extends Reducer<Text, IntWritable, Text, IntWritable>

{

public void reduce(Text word, Iterable<IntWritable> values, Context con) throws IOException, InterruptedException

{

int sum = 0;

for(IntWritable value : values)

{

sum += value.get();

}

con.write(word, new IntWritable(sum));

}

}

}

You might be interested in
Please help me I don't understand. It's Python.
Vitek1552 [10]
It’s , c probably sorry if I’m wrong
8 0
3 years ago
Read 2 more answers
Write the method makeNames that creates and returns a String array of new names based on the method’s two parameter arrays, arra
Gwar [14]

Answer:

  1. import java.util.Arrays;
  2. public class Main {
  3.    public static void main(String[] args) {
  4.        String [] first = {"David", "Mike", "Katie", "Lucy"};
  5.        String [] middle = {"A", "B", "C", "D"};
  6.        String [] names = makeNames(first, middle);
  7.        System.out.println(Arrays.toString(names));
  8.    }
  9.    public static String [] makeNames(String [] array1, String [] array2){
  10.           if(array1.length == 0){
  11.               return array1;
  12.           }
  13.           if(array2.length == 0){
  14.               return array2;
  15.           }
  16.           String [] newNames = new String[array1.length];
  17.           for(int i=0; i < array1.length; i++){
  18.               newNames[i] = array1[i] + " " + array2[i];
  19.           }
  20.           return newNames;
  21.    }
  22. }

Explanation:

The solution code is written in Java.

Firstly, create the makeNames method by following the method signature as required by the question (Line 12). Check if any one the input string array is with size 0, return the another string array (Line 14 - 20). Else, create a string array, newNames (Line 22). Use a for loop to repeatedly concatenate the string from array1 with a single space " " and followed with the string from array2 and set it as item of the newNames array (Line 24-26). Lastly, return the newNames array (Line 28).

In the main program create two string array, first and middle, and pass the two arrays to the makeNames methods as arguments (Line 5-6). The returned array is assigned to names array (Line 7). Display the names array to terminal (Line 9) and we shall get the sample output: [David A, Mike B, Katie C, Lucy D]

5 0
3 years ago
Assume that you have an array of integers named arr. Which of these code segments print the same results?
tresset_1 [31]

Answer:

II and III only

Explanation:

In Code segment II, the output of the array will be started form arr[0] and ends at the arr[length]. Because loop starts from 0 and ends at length of array. This will print the full array data.

In code segment III, the output will be all values of array as loop starts form first index and ends at last index.

On the other hand I code segment prints all array values except last value of the array. As the loop shows that, it will terminate at second last value of the array.

4 0
3 years ago
Write the SQL statements that define the relational schema (tables)for this database. Assume that person_id, play_id, birth_year
Sunny_sXe [5.5K]

Answer:

  • SQL statement that defines table for Actor

CREATE TABLE Actor(

person_id integer primary key,

name varchar2(40) not null,

birth_year integer check ((birth_year) <= 2019)

);

  • SQL statement that defines table for Play

CREATE TABLE Play(

play_id integer primary key,

title varchar2(60) not null,

author varchar2(60) not null,

year_written integer check ((year_written) <= 2019)

);

  • SQL statement that defines table for Role

CREATE TABLE Role (

person_id integer,

character_name varchar2(60) not null,

play_id integer,

constraint fk_person foreign key (person_id) references actor(person_id),

constraint fk_play foreign key (play_id) references play(play_id),

primary key (person_id, character_name, play_id)

);

Explanation:

Other information that were not added to the question are as below:

The following database contains information about three tables i.e. actors, plays, and roles they performed.

Actor (person_id, name, birth_year)

Play (play_id, title, author, year_written)

Role (person_id, character_name, play_id)

Where: Actor is a table of actors, their names, and the year they were born. Each actor has a unique person_id, which is a key.

Play is a table of plays, giving the title, author, and year written for each play. Each play has a unique play_id, which is a key.

Role records which actors have performed which roles (characters) in which plays.

Attributes person_id and play_id are foreign keys to Actor and Play respectively.

All three attributes make up the key since it is possible for a single actor to play more than one character in the same play

Further Explanation:

In SQL, in order to define relational schema (Tables) for a database, we use CREATE TABLE statement to create a new table in a database.  The column parameters specify the names of the columns of the table.  The datatype parameter specifies the type of data the column can hold (varchar, integer, date)

4 0
3 years ago
Type the correct answer in the box. Use numerals instead of words. If necessary, use / for the fraction bar. What will be the ou
lord [1]

Answer:

2

Explanation:

The output of the Java program is 2. The public Vehicle class is defined with the class variable 'counter'. When a Vehicle class object is instantiated, the counter variable increments by one.

In the program, the two instances of the class are created, incrementing the counter variable to two, the print statement outputs 2 as the result of the program.

8 0
3 years ago
Other questions:
  • Mike is reading about machine-dependent programming languages. Which languages are machine-dependent programming languages?
    11·1 answer
  • A coworker asks your opinion about how to minimize ActiveX attacks while she browses the Internet using Internet Explorer. The c
    14·1 answer
  • What is my credit card billing zip code??
    15·1 answer
  • Can a computer will work more efficiently if you perform disk optimization
    9·1 answer
  • Create a story that depicts the events leading up to a vehicle crash fatality, the crash itself, and the emotional aftermath on
    9·1 answer
  • Use the drop-down menus to complete each sentence about the layers of the atmosphere.
    14·2 answers
  • Question 1 of 20 Gus has decided to organize his inbox on June 26 by using folders and deleting irrelevant messages. He creates
    14·2 answers
  • Which of the following is not part of the processes involved in data valida
    11·1 answer
  • The profile picture that you plan to use to market your professional brand on social media networks should feature you only.
    12·1 answer
  • How media platform drives globalization.<br> Advantages and disadvantages of using media platforms?
    7·1 answer
Add answer
Login
Not registered? Fast signup
Signup
Login Signup
Ask question!