YOUTUBE RECOMMENDATION SYSTEM - DATA SCIENCE

  • By
  • November 22, 2021
  • Data Science

Recommendation systems have become a vital part of almost every company these days. Every company wants to give its users a personalized experience, and recommendation systems are the best way to do that.

LET’S UNDERSTAND WHAT A RECOMMENDATION SYSTEM IS

Let’s say you want to buy a t-shirt from Amazon. You go to their website and type "black t-shirt".

You will see some black t-shirts on your screen. Simple, right?

Now let’s say you liked some t-shirts on the first page and opened them to take a closer look. Say you selected the third t-shirt from the left (the Black Panther one) and checked its reviews, ratings, etc.

Then you came back to the first page and selected some different black t-shirts, say ones with a round collar or from a particular brand.

If you pay attention, Amazon is collecting every piece of information and every click of yours. Whenever you go to a particular brand or pattern, Amazon learns a little more about your likes and dislikes.

It is like going to the nearest market to shop for a t-shirt with a new friend. The new friend does not know anything about your likes or dislikes, so he just observes you and notices every action of yours.

For Free, Demo classes Call: 8983120543
Registration Link: Click Here!

What patterns are you choosing? What brands are you choosing? What colors are you opting for? And so on.

Amazon is that uninvited friend who keeps a watch on you every time you buy something on its website.

Now the question is: why is Amazon collecting every bit of information about you?

The answer is very simple: Amazon wants to recommend products to you based on what you may like or what you may buy from its website. It’s a very beautiful idea if you think about it.

Let’s say you plan to buy a t-shirt, as we saw earlier. While searching, you really liked the black t-shirt and opened its page to see its price, reviews, etc. You do not intend to buy that t-shirt yet; you are just looking at it.

When you opened that black t-shirt's page, you saw a section titled PRODUCTS RELATED TO THIS ITEM.

If you like the black T-shirt with the panther stripes, there is a good chance that you will like these T-shirts as well.

What Amazon is doing internally is finding the 10 T-shirts most similar to the one you are looking at, based on a simple assumption:

“If you like this T-shirt, then there is a good chance that you will like similar T-shirts.”

That’s how Amazon sells its products to us.

Amazon reportedly gets about 40 billion dollars of business from its recommendation system alone. That is a very big number, and it is the reason Amazon keeps enhancing its recommendation system day by day with new technologies.

Let’s build a simple YouTube Recommendation System –

Before starting, let me make clear that the actual YouTube recommendation system is much more complex than what we are going to discuss here. My intention is to give you a flavor of how a recommendation system works internally.

Let’s get started –

Now the question is: what is YouTube recommending to us, and why does YouTube need a recommendation system?

YouTube earns money by showing us ads in between videos, so the longer a user stays on YouTube, the more money the company earns. Basically, our time is their money.

So they want to know what we like to watch, and that is why they use a recommendation system.

As we saw with Amazon, the important thing is our data. So how is YouTube collecting our data?

Our YouTube history, our location, our name, our email ID, even our Google search history: all of this is available to YouTube, and that is a lot of data.

For simplicity, let’s say YouTube only has our watch history, so it only knows what we have watched on YouTube.


Let’s see a concrete example so that you can imagine things better –

We have a data matrix in which user1, user2, user3, user4 and user5 are YouTube users, and vid1, vid2, vid3, vid4 and vid5 are the videos available on YouTube. We are assuming here that YouTube has only 5 users and just 5 videos. Wherever the matrix contains a 1, the user has seen that video, and wherever it contains a 0, the user has not seen that video. For example:

USER-1 HAS SEEN VIDEO-1, VIDEO-2, VIDEO-4, AND VIDEO-5 BUT HAS NOT SEEN VIDEO-3.

SIMILARLY, USER-5 HAS SEEN VIDEO-1 AND VIDEO-4 BUT HAS NOT SEEN VIDEO-2, VIDEO-3, AND VIDEO-5.
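To make the 1/0 encoding concrete, here is a minimal Python sketch. It encodes only the two rows spelled out in the text (user-1 and user-5). The names `watch_history` and `to_row` are made up for this illustration.

```python
# Column order of the data matrix.
VIDEOS = ["vid1", "vid2", "vid3", "vid4", "vid5"]

# Only the two users described in the text; the other rows are not given there.
watch_history = {
    "user1": {"vid1", "vid2", "vid4", "vid5"},
    "user5": {"vid1", "vid4"},
}

def to_row(watched, videos=VIDEOS):
    """Return the 0/1 row of the data matrix for one user."""
    return [1 if v in watched else 0 for v in videos]

matrix = {user: to_row(watched) for user, watched in watch_history.items()}

for user, row in matrix.items():
    print(user, row)
# user1 [1, 1, 0, 1, 1]
# user5 [1, 0, 0, 1, 0]
```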

Now let’s understand the problem statement –

  • Let’s say we have 5 videos: v1 (cricket), v2 (cooking), v3 (workout), v4 (cricket), and v5 (cricket).
  • Let’s say user-1 is named ‘Rohan’. Over the last week he watched v1 (cricket), v2 (cooking) and v5 (cricket) out of the 5 videos. The task is to recommend Rohan some videos that are similar to v1, v2, and v5.
  • So out of v3 (workout) and v4 (cricket), the recommendation system should be able to pick v4, because it is the most similar to v1 and v5.
  • So if we can find the ‘similarity’ between videos based on the data we have, our problem will be solved.
  • The concept is very simple: “if Rohan has watched cricket videos multiple times in the past, it is more likely that he will watch a cricket video in the future”.
  • So if v4 is a cricket video, then v1, v5, and v4 should be similar, and many users will have watched them together. We will use this statement to find the similarity.
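The intuition in the bullets above can be sketched in a few lines of Python. This sketch uses the category labels directly, only to illustrate the idea; the approach built in the rest of the article uses nothing but the watch data. The rule "prefer the category watched most often" is an assumption made for this illustration.

```python
from collections import Counter

# Video categories and Rohan's watch history, as given in the problem statement.
category = {"v1": "cricket", "v2": "cooking", "v3": "workout",
            "v4": "cricket", "v5": "cricket"}
watched = {"v1", "v2", "v5"}

# Candidates are the videos Rohan has not watched yet.
candidates = sorted(set(category) - watched)

# Count how often each category appears in Rohan's history.
history_counts = Counter(category[v] for v in watched)

# Illustrative rule (an assumption): pick the candidate whose category
# Rohan has watched most often. Counter returns 0 for unseen categories.
best = max(candidates, key=lambda v: history_counts[category[v]])

print(candidates)                 # ['v3', 'v4']
print(history_counts["cricket"])  # 2
print(best)                       # v4
```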

If there is a way to find the similarity between videos, our work will be done, but we need to make sure that the similarity comes only from the data matrix we have.

So we will use the simple concept of set intersection that we studied in class 10: the intersection of two sets contains only the elements that belong to both sets.

Using the intersection idea, the similarity between vi (video i) and vj (video j) is defined as follows: take the intersection of the users who have watched vi and the users who have watched vj, and count the number of users in it.
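Written as a formula, this definition looks roughly as follows. The symbol U_i is introduced here only for convenience and stands for the set of users who have watched video v_i.

```latex
\mathrm{similarity}(v_i, v_j) = \lvert U_i \cap U_j \rvert
```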

This is a very simple way to define the similarity function. There could be much better ways to define it, but in this article we are simplifying things to understand recommendation systems better.

Now, if we want to find the similarity between video-1 and video-5 (refer to the example), we can say that 3 users out of 5 have seen both videos (u1, u2 and u4), as both of them are cricket videos.

similarity(v1,v5) = {u1,u2,u4}, size is 3, similarity score is 3

similarity(v1,v4) = {u2,u3,u5}, size is 3, similarity score is 3

similarity(v1,v2) = {u1}, similarity score is 1

similarity(v1,v3) = {u4}, size is 1, similarity score is 1

similarity(v5,v4) = {u1,u2}, size is 2

similarity(v5,v3) = {u4}, size is 1

similarity(v2,v4) = {NULL}, size is 0 (no user has watched v2 and v4 together)

similarity(v2,v3) = {NULL}, size is 0 (no user has watched v2 and v3 together)

From this we can say that the similarity between v1 and v5 is 3, which is the joint-highest score. We also know from the problem statement that v1 and v5 are both cricket videos, so their similarity score should be high.
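The intersection-and-count idea maps directly onto Python sets. In the sketch below, the viewer sets in `watchers` are assumptions chosen to agree with the overlaps listed above, and only the videos needed for the pairs shown are included. The function names are also made up for this illustration.

```python
# Assumed viewer sets: which users have watched each video.
# These are illustrative values picked to match the overlaps listed above.
watchers = {
    "v1": {"u1", "u2", "u3", "u4", "u5"},
    "v2": {"u1"},
    "v3": {"u4"},
    "v5": {"u1", "u2", "u4"},
}

def common_users(vi, vj):
    """Users who have watched both videos (set intersection)."""
    return watchers[vi] & watchers[vj]

def similarity(vi, vj):
    """Similarity score = number of users who watched both videos."""
    return len(common_users(vi, vj))

print(sorted(common_users("v1", "v5")), similarity("v1", "v5"))
# ['u1', 'u2', 'u4'] 3
print(similarity("v1", "v2"))  # 1
print(similarity("v5", "v3"))  # 1
print(similarity("v2", "v3"))  # 0
```

Note that the score is symmetric: `similarity("v1", "v5")` and `similarity("v5", "v1")` give the same number, because set intersection does not depend on order.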

So if Rohan has not seen v3 (workout) and v4 (cricket), and the recommendation system has to decide which of these 2 videos to recommend to Rohan, how will it decide?

It will look at the similarity scores between each candidate and the videos Rohan has already watched. For example, Rohan has watched v1, so it will check the similarity score between v1 and v4, which is 3, and then the similarity between v1 and v3, which is 1.

Rohan has also watched v5, so it will check the similarity between v5 and v4, which is 2, and then the similarity between v5 and v3, which is 1.

Lastly, Rohan has watched v2, so it will check the similarity between v2 and v4, which is 0, and then the similarity between v2 and v3, which is also 0.

So it is very clear that out of v3 and v4, the recommendation system will choose v4 for Rohan, as its similarity scores with the videos Rohan has watched are higher than those of v3.
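The decision above can be sketched in Python using the scores exactly as quoted in the walkthrough. Adding up the per-video scores is one simple way to combine them. It is an assumption made for this sketch, since the text only says that v4's scores are higher.

```python
# Similarity scores as quoted in the walkthrough above:
# (video Rohan has watched, candidate video) -> score.
scores = {
    ("v1", "v4"): 3, ("v1", "v3"): 1,
    ("v5", "v4"): 2, ("v5", "v3"): 1,
    ("v2", "v4"): 0, ("v2", "v3"): 0,
}
watched = ["v1", "v5", "v2"]
candidates = ["v3", "v4"]

def total_score(candidate):
    # Assumption: combine the scores by summing over Rohan's watched videos.
    return sum(scores[(w, candidate)] for w in watched)

totals = {c: total_score(c) for c in candidates}
recommended = max(totals, key=totals.get)

print(totals)       # {'v3': 2, 'v4': 5}
print(recommended)  # v4
```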


I hope you got a flavor of how a recommendation system works. I want to clarify one thing again: this is a very basic recommendation system, and actual systems are much more complex.

Author:

Nishesh Kumar


© Copyright 2021 | Sevenmentor Pvt Ltd.
