Lab – MapReduce Programming Framework

Exercise-1 Write a MapReduce program in Java for word count whose result is sorted in numerical order.

1. Open the Eclipse IDE. Copy the wordcount.java file into a new file named WordcountInt.java.

2. In this file the key must be treated as an IntWritable value. To do this, follow the steps below.

Note: This program can also be found in the javaMR directory inside the Documents directory, under the name WordcountInt.

3. Change the mapper so that its output key and value types are both IntWritable and every word is emitted as an IntWritable key.

4. Change the reducer to match, so that it receives and writes IntWritable keys.

5. In the main function, change the output key class to IntWritable.

6. Create the jar file.

7. Run the jar file on the bigdatafile-dihub.txt file.

8. The results now appear in numerical order.
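The effect of switching the key type can be sketched in plain Java, without Hadoop. This is not the lab's WordcountInt.java; it only imitates the map, sort, and reduce steps with standard collections, and it assumes that every token in the input parses as an integer. The class name, method names, and sample input are hypothetical. A TreeMap with String keys stands in for Text keys, which Hadoop sorts byte-wise (lexicographically), while a TreeMap with Integer keys stands in for IntWritable keys, which are compared numerically.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical stand-alone sketch; it does not use the Hadoop API.
public class NumericKeyOrderSketch {

    // Counts tokens with String keys. The sorted map orders keys
    // lexicographically, comparable to Text keys in the original word count.
    static Map<String, Integer> countWithStringKeys(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        StringTokenizer tokens = new StringTokenizer(text);
        while (tokens.hasMoreTokens()) {
            // "Map" emits (token, 1); merge() plays the reducer and sums the ones.
            counts.merge(tokens.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    // Counts tokens with Integer keys. The sorted map orders keys
    // numerically, comparable to IntWritable keys.
    // Assumption: every token is a valid integer.
    static Map<Integer, Integer> countWithIntKeys(String text) {
        Map<Integer, Integer> counts = new TreeMap<>();
        StringTokenizer tokens = new StringTokenizer(text);
        while (tokens.hasMoreTokens()) {
            counts.merge(Integer.parseInt(tokens.nextToken()), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String sample = "10 9 100 9 2 10 9"; // hypothetical input
        System.out.println(countWithStringKeys(sample)); // {10=2, 100=1, 2=1, 9=3}
        System.out.println(countWithIntKeys(sample));    // {2=1, 9=3, 10=2, 100=1}
    }
}
```

The counts are identical in both runs; only the order of the keys differs, which is the change the exercise asks for.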

 

Exercise-2 Find the sizes of the words in a given file using MapReduce, counting how many words have each length.

1. For this exercise, create a simple text file named simpleFile.txt (you can also find this file in the dataMR directory inside the Documents directory):

gedit simpleFile.txt

2. Upload this file to HDFS.

3. Open the Eclipse IDE and create a new program named WordSize.java.

Note: You can find the complete program code in the javaMR directory under the name WordSize.

4. In the mapper, create a key for each word based on its length.

5. Have the mapper write each key and its corresponding value so that the reducer can process them further.

6. In the reducer, just sum the values for each key. The reducer code stays the same as in the Wordcount program.

7. Export the program as a jar file and give the jar file a name.

8. Go to the terminal and run the jar on simpleFile.txt in HDFS.

9. Once the program has finished executing, check the content of the result file to see the outcome of the program.
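The mapper and reducer logic from steps 4 to 6 can be sketched in plain Java, without Hadoop. This is not the lab's WordSize.java; the class name, method names, and sample text are hypothetical, and the sketch assumes that words are separated by whitespace, as with StringTokenizer. The map method emits a (length, 1) pair for each word, and the reduce method sums the values for each key, in the same way as the word count reducer.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical stand-alone sketch; it does not use the Hadoop API.
public class WordSizeSketch {

    // "Mapper": for each word, emit the pair (word length, 1).
    static List<Map.Entry<Integer, Integer>> map(String text) {
        List<Map.Entry<Integer, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(text);
        while (tokens.hasMoreTokens()) {
            int length = tokens.nextToken().length();
            pairs.add(new AbstractMap.SimpleEntry<>(length, 1));
        }
        return pairs;
    }

    // "Reducer": sum the values that share a key. The TreeMap also keeps
    // the keys sorted, imitating the sort that happens before the reduce step.
    static Map<Integer, Integer> reduce(List<Map.Entry<Integer, Integer>> pairs) {
        Map<Integer, Integer> totals = new TreeMap<>();
        for (Map.Entry<Integer, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        String sample = "big data is big and hadoop is fun"; // hypothetical input
        System.out.println(reduce(map(sample))); // {2=2, 3=4, 4=1, 6=1}
    }
}
```

Each entry of the result reads as "word length = number of words of that length"; in the sample, four words ("big", "big", "and", "fun") have three letters.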