Netflix-Data-Analysis

An exploratory data analysis of Netflix movies and TV shows which I created after watching lessons from Jovian

Project title: Netflix-Data-Analysis

The link for the Project:

Description

This project analyzes data about Netflix movies and TV series using libraries such as pandas, matplotlib and seaborn, which I learned in my Udacity course. In this project I explore how many Netflix titles are Movies and how many are TV Shows, which rating most titles fall under, which rating is most common for Movie and TV Show titles, and which country had the most releases. I write code to import the data and answer interesting questions about it by computing descriptive statistics.

About the Dataset

The dataset was taken from Kaggle Datasets, where it is freely available: https://www.kaggle.com/shivamb/netflix-shows

With this dataset I try to visualize what sort of movies and shows Netflix produces and in which countries it invests. I hope you enjoy the visualizations.

The dataset file used for this project is netflix_titles.csv. It contains 6234 rows, each holding data about one Movie or TV Show.
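
As a rough sketch of the import step, the snippet below reads a CSV with pandas and reports the row count and the number of missing values per column. The function names load_titles and summarize are hypothetical, and the three inline rows are made-up placeholders (not real Netflix data) so the sketch runs without the download. The column names are assumed to match the Kaggle file.

```python
import io

import pandas as pd


def load_titles(source):
    """Read the titles CSV; `source` may be a file path or a file-like object."""
    return pd.read_csv(source)


def summarize(df):
    """Return the row count, column names and missing-value count per column."""
    return {
        "rows": len(df),
        "columns": list(df.columns),
        "missing": df.isna().sum().to_dict(),
    }


# Made-up placeholder rows, only here to keep the sketch self-contained.
sample_csv = io.StringIO(
    "show_id,type,title,country,rating,duration\n"
    "1,Movie,Example A,United States,TV-MA,90 min\n"
    "2,TV Show,Example B,India,TV-14,2 Seasons\n"
    "3,Movie,Example C,,PG,45 min\n"
)
sample = load_titles(sample_csv)
summary = summarize(sample)
print(summary["rows"], summary["missing"])
```

With the real file, load_titles("netflix_titles.csv") should give a row count matching the 6234 rows mentioned above.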

I used Python 3 for this analysis. The libraries/packages I used in this project are as follows:

numpy (imported as np, one of the best-known packages for working with arrays in Python)
pandas (widely used for data analysis and for building dataframes)
matplotlib (a visualization library that makes the analysis fun and interactive)
seaborn (adds more colours to matplotlib visualizations)
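
To show how these libraries fit together, here is a minimal sketch that counts titles per rating with pandas and draws a bar chart with matplotlib. The function name plot_rating_counts and the four sample ratings are placeholders, not taken from the project notebook. The "Agg" backend is chosen only so the sketch runs without a display; in a notebook that line can be dropped.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; assumption for headless runs

import matplotlib.pyplot as plt
import pandas as pd


def plot_rating_counts(df):
    """Draw one bar per rating, with bar height equal to the number of titles."""
    counts = df["rating"].value_counts()
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.bar(counts.index.astype(str), counts.values)
    ax.set_xlabel("Rating")
    ax.set_ylabel("Number of titles")
    ax.set_title("Titles per rating")
    return ax


# Placeholder ratings, only for illustration.
sample = pd.DataFrame({"rating": ["TV-MA", "TV-MA", "TV-14", "PG"]})
ax = plot_rating_counts(sample)
```

seaborn's countplot could draw a similar chart directly from the rating column.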

Inferences and Conclusion

The analysis gives an overview of the movies and TV shows on Netflix. The observations above cover a lot of information, from the ratings of each content type to where most Netflix titles are released.

With that, we’ve come to the end of this analysis. If you are a binge watcher, I am sure you have enjoyed it. The following conclusions are drawn from the analysis:

Movies are a more common type of content than TV Shows
TV-MA is the rating given to the most Movies
The United States is the country with the most releases
The longest Movie runs 312 mins and the shortest runs 3 mins
The longest TV Show has 9 seasons and the shortest has 1 season
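
The conclusions above come down to a handful of descriptive statistics. The following is a hedged sketch of how they could be computed, not the exact notebook code. It assumes the columns type, rating, country and duration, with duration stored as text such as "90 min" or "2 Seasons" and country possibly holding several comma-separated names. The function name headline_stats and the sample rows are made up for illustration.

```python
import pandas as pd


def headline_stats(df):
    """Compute the type counts, top rating, top country and duration extremes."""
    # Pull the leading number out of strings like "90 min" or "2 Seasons".
    duration_num = df["duration"].str.extract(r"(\d+)", expand=False).astype(float)
    movies = df["type"] == "Movie"
    shows = df["type"] == "TV Show"
    # Count each country once per title, even when several are listed together.
    countries = df["country"].dropna().str.split(",").explode().str.strip()
    return {
        "titles_by_type": df["type"].value_counts().to_dict(),
        "top_movie_rating": df.loc[movies, "rating"].mode().iloc[0],
        "top_country": countries.value_counts().idxmax(),
        "movie_minutes": (duration_num[movies].min(), duration_num[movies].max()),
        "show_seasons": (duration_num[shows].min(), duration_num[shows].max()),
    }


# Placeholder rows, not real Netflix data.
sample = pd.DataFrame(
    {
        "type": ["Movie", "Movie", "Movie", "TV Show", "TV Show"],
        "rating": ["TV-MA", "TV-MA", "PG", "TV-14", "TV-MA"],
        "country": ["United States", "United States, India", "India", "United States", None],
        "duration": ["90 min", "120 min", "45 min", "1 Season", "3 Seasons"],
    }
)
stats = headline_stats(sample)
print(stats)
```

Counting the raw country column without splitting would treat "United States, India" as its own country, so the split step is a design choice that may differ from the original analysis.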

References and Future Work

Future Work

There is a lot of scope for improvement and additions to this project in the future. With the data provided, plus extra datasets, we can look into:

Looking up the ratings of Movies and TV shows from Wikipedia
Which director made the most Movies/TV shows
In which year the most Netflix content was released
Which type of content falls into which category
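
Two of these questions can probably be answered from the existing file alone. The sketch below assumes the dataset has director and release_year columns and that "year" means release_year rather than the date a title was added to Netflix. The function names and sample rows are hypothetical.

```python
import pandas as pd


def top_director(df):
    """Return the director credited on the most titles, splitting co-directors."""
    directors = df["director"].dropna().str.split(",").explode().str.strip()
    return directors.value_counts().idxmax()


def busiest_release_year(df):
    """Return the release year with the largest number of titles."""
    return int(df["release_year"].value_counts().idxmax())


# Placeholder rows, not real Netflix data.
sample = pd.DataFrame(
    {
        "director": ["Dir A", "Dir A, Dir B", None, "Dir B", "Dir A"],
        "release_year": [2018, 2019, 2019, 2017, 2019],
    }
)
print(top_director(sample), busiest_release_year(sample))
```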

References

Netflix Titles Dataset: https://www.kaggle.com/shivamb/netflix-shows
Kaggle Datasets (Choose Dataset of your choice): https://www.kaggle.com/datasets
Pandas user guide: https://pandas.pydata.org/docs/user_guide/index.html
Matplotlib user guide: https://matplotlib.org/3.3.1/users/index.html
Seaborn user guide & tutorial: https://seaborn.pydata.org/tutorial.html
Data analysis guide: https://jovian.ml/aakashns/python-pandas-data-analysis
Stackoverflow Community (Get answers of any problems): https://stackoverflow.com/questions
Python solutions in Geeksforgeeks (Solutions made easy): https://www.geeksforgeeks.org/python-programming-language/
Opendatasets Python library (Choosing and using datasets in python made easy): https://github.com/JovianML/opendatasets