Making data understandable

Data comes in many forms and is used for many purposes. A very common form is the table, which organises information into rows and columns. Tabular data is often available to download as CSV, for example from Kaggle. (CSV stands for comma-separated values; it is a widely used, open data format.) In this challenge, you should write code to download and import a large dataset. Once you have imported it, look for interesting patterns in the data. Can you display this information graphically so that it is easier to understand? You could use a standard data visualisation approach such as a bar chart or a scatter graph, or perhaps you can find a more interesting approach. Try sketching some ideas on paper before deciding how to code them up.
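As a rough starting point, here is a minimal Python sketch of the import-then-plot idea. It assumes matplotlib is installed, and it uses a tiny made-up CSV string (the `animal` and `habitat` columns are invented for illustration) in place of a real downloaded file. It counts how many rows fall into each category and saves a bar chart to a hypothetical file called habitat_counts.png.

```python
import csv
import io
from collections import Counter

import matplotlib
matplotlib.use("Agg")  # draw to a file, so no display window is needed
import matplotlib.pyplot as plt

# Made-up sample data standing in for a downloaded CSV file.
SAMPLE_CSV = """animal,habitat
otter,river
heron,river
fox,woodland
badger,woodland
kingfisher,river
hare,meadow
"""


def count_by_column(csv_text, column):
    """Count how many rows share each value in the given column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


counts = count_by_column(SAMPLE_CSV, "habitat")
labels = sorted(counts)                      # category names, alphabetical
values = [counts[label] for label in labels]

fig, ax = plt.subplots()
ax.bar(labels, values)
ax.set_xlabel("habitat")
ax.set_ylabel("number of rows")
ax.set_title("Rows per habitat (made-up data)")
fig.savefig("habitat_counts.png")
```

With a real dataset you would open the downloaded file instead of wrapping a string in `io.StringIO`, and choose a column that actually exists in your data.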

Example projects

Resources

We have suggested a couple of datasets in the examples. Alternatively, see if you can find something cool in this list of fun datasets on Kaggle: https://www.kaggle.com/rtatman/fun-beginner-friendly-datasets

Pandas (https://pandas.pydata.org/) is an excellent Python-based tool for importing data in various formats. See https://pythonbasics.org/read-csv-with-pandas/
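A minimal sketch of importing CSV data with pandas might look like the following. The data here is made up and held in a string so that the example runs on its own; the column names (`city`, `year`, `rainfall_mm`) are assumptions for illustration only. `pd.read_csv` also accepts a file path or a URL, so with a real dataset you would pass that instead.

```python
import io

import pandas as pd

# Made-up sample data; note the last row has a missing rainfall value.
SAMPLE_CSV = """city,year,rainfall_mm
Leeds,2019,820.5
Leeds,2020,901.0
York,2019,640.2
York,2020,
"""

# read_csv parses the header row and infers a type for each column.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

print(df.shape)    # (number of rows, number of columns)
print(df.dtypes)   # the type pandas inferred for each column
print(df.head())   # the first few rows
```

Checking the shape and the inferred types straight after import is a cheap way to spot a wrong separator or a column that was read as text by mistake.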

Exploratory data analysis in Python: https://nbviewer.jupyter.org/github/Tanu-N-Prabhu/Python/blob/master/Exploratory_data_Analysis.ipynb
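To give a flavour of what exploratory analysis can involve, here is a small sketch using pandas on made-up data (the columns are invented for illustration). It checks for missing values, summarises a numeric column, and compares groups. These are only a few possible first steps, not a complete method.

```python
import io

import pandas as pd

# Made-up sample data; the last row has a missing rainfall value.
SAMPLE_CSV = """city,year,rainfall_mm
Leeds,2019,820.5
Leeds,2020,901.0
York,2019,640.2
York,2020,
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# How many values are missing in each column?
missing = df.isna().sum()

# Count, mean, min, max and quartiles for one numeric column.
summary = df["rainfall_mm"].describe()

# Average rainfall per city; missing values are skipped by default.
mean_by_city = df.groupby("city")["rainfall_mm"].mean()

print(missing)
print(summary)
print(mean_by_city)
```

Grouped summaries like `mean_by_city` are often a natural input for a bar chart, since each group becomes one bar.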

Data visualisation is not Python’s strongest point, although seaborn (https://seaborn.pydata.org/) is worth looking at. By contrast, JavaScript makes it easy to show graphics in the browser. For starting points, have a look at these libraries:

These intro videos on Coding Train are also helpful: https://thecodingtrain.com/Courses/data-and-apis/