Data comes in many forms and is used for many purposes. A very common form is the table, with information organised into rows and columns. Tabular data is often available for download as CSV, for example from Kaggle. (CSV stands for comma-separated values; it is a widely used, open data format.) In this challenge, you should write code to download and import a large dataset. Once you have imported it, look for interesting patterns in the data. Can you display this information graphically so that it is easier to understand? You could use a standard data visualisation approach such as a bar chart or a scatter graph, or perhaps you can find a more interesting approach. Try sketching some ideas on paper before deciding how to code it up.
- What is the ‘best’ Pokemon? Dataset: https://www.kaggle.com/rounakbanik/pokemon
- What are the 10 most popular UK musical artists and what are the 10 least popular? Dataset: https://www.kaggle.com/pieca111/music-artists-popularity
- What are the weather trends recorded in recent years at Edinburgh Airport? Dataset: https://en.tutiempo.net/climate/ws-31600.html
We have suggested a few datasets in the examples above. Alternatively, see if you can find something cool in this list of fun datasets on Kaggle: https://www.kaggle.com/rtatman/fun-beginner-friendly-datasets
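As a starting point for the import step, here is a minimal sketch using only Python's standard `csv` module. The column names (`artist`, `listeners`), the rows, and the helper names `load_rows` and `top_and_bottom` are all made up for illustration; a real dataset will have its own columns, so check its header row first.

```python
import csv
import io

# Made-up placeholder rows standing in for a downloaded CSV file.
# These values are NOT taken from any real dataset.
SAMPLE_CSV = """artist,listeners
Artist A,1200
Artist B,450
Artist C,9800
Artist D,30
Artist E,2750
"""


def load_rows(text_file):
    """Read CSV rows into a list of dicts, converting 'listeners' to int."""
    rows = []
    for row in csv.DictReader(text_file):
        row["listeners"] = int(row["listeners"])
        rows.append(row)
    return rows


def top_and_bottom(rows, n):
    """Return the n rows with the most listeners and the n with the fewest."""
    ranked = sorted(rows, key=lambda r: r["listeners"], reverse=True)
    return ranked[:n], ranked[-n:]


rows = load_rows(io.StringIO(SAMPLE_CSV))
top, bottom = top_and_bottom(rows, 2)
print("Most popular:", [r["artist"] for r in top])
print("Least popular:", [r["artist"] for r in bottom])
```

To read a file you have downloaded, you could pass `open("your_file.csv", newline="", encoding="utf-8")` to `load_rows` instead of the `io.StringIO` object (the filename here is a placeholder).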
Exploratory data analysis in Python: https://nbviewer.jupyter.org/github/Tanu-N-Prabhu/Python/blob/master/Exploratory_data_Analysis.ipynb
These intro videos on The Coding Train are also helpful: https://thecodingtrain.com/Courses/data-and-apis/
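For the visualisation step, one possible approach is a horizontal bar chart. The sketch below assumes the third-party `matplotlib` library is installed; the names, values, and output filename `popularity.png` are placeholders rather than real data.

```python
import matplotlib

matplotlib.use("Agg")  # render to a file, so no display window is needed
import matplotlib.pyplot as plt

# Made-up placeholder values, already sorted from largest to smallest.
names = ["Artist C", "Artist E", "Artist A", "Artist B", "Artist D"]
listeners = [9800, 2750, 1200, 450, 30]

fig, ax = plt.subplots(figsize=(6, 3))
ax.barh(names, listeners)   # one horizontal bar per name
ax.invert_yaxis()           # put the first (largest) entry at the top
ax.set_xlabel("Listeners")
ax.set_title("Artist popularity (placeholder data)")
fig.tight_layout()
fig.savefig("popularity.png")  # hypothetical output filename
```

Horizontal bars leave room for long labels such as artist names. A scatter graph (`ax.scatter`) may suit questions that compare two numeric columns instead.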