Numerical:-

Map Reduce :-

MovieLens Data

USER_ID MOVIE_ID RATING TIMESTAMP

196 242 3 881250949

186 302 3 891717742

196 377 1 878887116

244 51 2 880606923

166 346 1 886397596

186 474 4 884182806

186 265 2 881171488

Solution : –

Step 1 – First, we map the values. This happens in the first phase of the MapReduce model: each input row is turned into a USER_ID:MOVIE_ID pair.

196:242 ; 186:302 ; 196:377 ; 244:51 ; 166:346 ; 186:474 ; 186:265

Step 2 – After mapping, we shuffle and sort the values: pairs with the same key are grouped together, and the keys are put in order.

166:346 ; 186:302,474,265 ; 196:242,377 ; 244:51
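Steps 1 and 2 can be sketched in plain Python, without Hadoop. This is only an illustration on the seven sample rows above. The names `ROWS`, `map_phase`, and `shuffle_and_sort` are made up for this example, and the rows are assumed to be whitespace-separated.

```python
from collections import defaultdict

# The seven sample rows: USER_ID MOVIE_ID RATING TIMESTAMP
ROWS = [
    "196 242 3 881250949",
    "186 302 3 891717742",
    "196 377 1 878887116",
    "244 51 2 880606923",
    "166 346 1 886397596",
    "186 474 4 884182806",
    "186 265 2 881171488",
]


def map_phase(rows):
    """Step 1: emit one (user_id, movie_id) pair per input row."""
    for row in rows:
        user_id, movie_id, _rating, _timestamp = row.split()
        yield user_id, movie_id


def shuffle_and_sort(pairs):
    """Step 2: group values by key, then order the groups by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


mapped = list(map_phase(ROWS))
grouped = shuffle_and_sort(mapped)

# Print each group in the same key:value,value form used in Step 2
for user_id, movie_ids in grouped:
    print(f"{user_id}:{','.join(movie_ids)}")
```

In a real Hadoop job the framework performs the shuffle and sort itself; only the map and reduce functions are written by hand.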

Step 3 – After completing steps 1 and 2, we reduce each key's values to a single result.

Now, put all values together
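A minimal sketch of the reduce step is shown here. The text does not say which operation the reducer applies, so this example assumes it counts the movies rated by each user. `GROUPED` and `reduce_phase` are hypothetical names, and the movie IDs are taken from the sample rows.

```python
# Output of the shuffle-and-sort step: user_id -> list of movie_ids
GROUPED = {
    "166": ["346"],
    "186": ["302", "474", "265"],
    "196": ["242", "377"],
    "244": ["51"],
}


def reduce_phase(grouped):
    """Step 3: collapse each key's list of values into a single value."""
    # Assumption: the reducer counts the movies per user.
    # Any other aggregation (sum, max, ...) would follow the same shape.
    return {user_id: len(movie_ids) for user_id, movie_ids in grouped.items()}


result = reduce_phase(GROUPED)
print(result)
```

Under this assumption, each user ID ends up with exactly one value, which is the point of the reduce phase.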

Solution

CODE FOR MAPPER AND REDUCER TOGETHER:

Python3

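from mrjob.job import MRJob
from mrjob.step import MRStep


class RatingsBreak(MRJob):
    def steps(self):
        return [
            MRStep(mapper=self.mapper_get_ratings,
                   reducer=self.reducer_count_ratings)
        ]

    # MAPPER CODE
    def mapper_get_ratings(self, _, line):
        # Each input line is tab-separated: user ID, movie ID, rating, timestamp
        (user_id, movie_id, rating, timestamp) = line.split('\t')
        # Emit the rating as the key with a count of 1
        yield rating, 1

    # REDUCER CODE
    def reducer_count_ratings(self, key, values):
        # Sum the 1s to get how many times each rating occurs
        yield key, sum(values)


if __name__ == '__main__':
    RatingsBreak.run()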
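To check the logic without installing mrjob or Hadoop, the mapper and reducer can be imitated with plain generator functions. This sketch assumes the mapper emits `(rating, 1)` for each row and that the input is tab-separated; `ROWS`, `mapper`, and `reducer` are stand-in names for this example, not part of mrjob.

```python
# The seven sample rows, tab-separated: USER_ID MOVIE_ID RATING TIMESTAMP
ROWS = [
    "196\t242\t3\t881250949",
    "186\t302\t3\t891717742",
    "196\t377\t1\t878887116",
    "244\t51\t2\t880606923",
    "166\t346\t1\t886397596",
    "186\t474\t4\t884182806",
    "186\t265\t2\t881171488",
]


def mapper(line):
    """Emit (rating, 1) for one input line."""
    _user_id, _movie_id, rating, _timestamp = line.split("\t")
    yield rating, 1


def reducer(key, values):
    """Sum the 1s emitted for one rating."""
    yield key, sum(values)


# Shuffle: collect the mapper output by key
groups = {}
for line in ROWS:
    for key, value in mapper(line):
        groups.setdefault(key, []).append(value)

# Sort the keys and reduce each group to a single count
counts = {}
for key in sorted(groups):
    for rating, total in reducer(key, groups[key]):
        counts[rating] = total

print(counts)
```

The keys stay strings because they come straight from `split`, which is also how the ratings appear in the mrjob version.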

Map Reduce and its Phases with numerical example.
