Prepping Data
Let’s download, import, and clean our primary Canadian immigration dataset with the pandas read_excel() method before building any visualization.
df_can = pd.read_excel('https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DV0101EN-SkillsNetwork/Data%20Files/Canada.xlsx',
Waffle Charts
A waffle chart is an interesting visualization that is normally created to display progress toward goals. It is commonly an effective option when you are trying to add interesting visualization features to a visual that consists mainly of cells, such as an Excel dashboard.
import matplotlib as mpl
Unfortunately, unlike in R, waffle charts are not built into the major Python visualization libraries such as Matplotlib. Therefore, we will learn how to create them from scratch.
Let’s revisit the previous case study about Denmark, Norway, and Sweden.
# let's create a new dataframe for these three countries
Step 1. The first step in creating a waffle chart is determining the proportion of each category with respect to the total.
# compute the proportion of each category with respect to the total
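Since the code cell above is cut off, here is a minimal sketch of what Step 1 computes. The per-country totals are an assumption: they were chosen to be consistent with the proportions printed in this post and are not read from the dataset here. The names totals and category_proportions are illustrative.

```python
# Illustrative totals per country (assumption: chosen to agree with the
# proportions printed in this post, not loaded from the Excel file).
totals = {"Denmark": 3901, "Norway": 2327, "Sweden": 5866}

# overall total across the three categories
total_values = sum(totals.values())

# proportion of each category with respect to the overall total
category_proportions = {
    name: value / total_values for name, value in totals.items()
}

for name, proportion in category_proportions.items():
    print(f"{name}: {proportion}")
```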
Denmark: 0.32255663965602777
Norway: 0.1924094592359848
Sweden: 0.48503390110798744
Step 2. The second step is defining the overall size of the waffle chart.
width = 40 # width of chart
Total number of tiles is 400
Step 3. The third step is using the proportion of each category to determine its respective number of tiles.
# compute the number of tiles for each category
Denmark: 129
Norway: 77
Sweden: 194
Based on the calculated proportions, Denmark will occupy 129 tiles of the waffle chart, Norway will occupy 77 tiles, and Sweden will occupy 194 tiles.
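The arithmetic behind those tile counts can be sketched as follows. The proportions are the ones printed in Step 1; the height of 10 is inferred from the 400-tile total and the width of 40, and the variable names are illustrative rather than the original cell's.

```python
# proportions as printed in Step 1
category_proportions = {
    "Denmark": 0.32255663965602777,
    "Norway": 0.1924094592359848,
    "Sweden": 0.48503390110798744,
}

width = 40   # width of chart, in tiles
height = 10  # height of chart, in tiles (inferred: 400 tiles / 40 columns)
total_num_tiles = width * height

# round each category's share of the grid to a whole number of tiles
tiles_per_category = {
    name: round(proportion * total_num_tiles)
    for name, proportion in category_proportions.items()
}

# rounding can make the counts drift from the grid size, so check explicitly
assert sum(tiles_per_category.values()) == total_num_tiles

for name, tiles in tiles_per_category.items():
    print(f"{name}: {tiles}")
```

The explicit sum check matters because independent rounding does not guarantee that the counts add up to the grid size for every input; here they happen to.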
Step 4. The fourth step is creating a matrix that resembles the waffle chart and populating it.
# initialize the waffle chart as an empty matrix
Waffle chart populated!
Let’s take a peek at what the matrix looks like.
waffle_chart
array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2., 2.,
2., 2., 2., 2., 2., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,
3., 3., 3., 3., 3., 3., 3., 3.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2., 2.,
2., 2., 2., 2., 2., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,
3., 3., 3., 3., 3., 3., 3., 3.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2., 2.,
2., 2., 2., 2., 2., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,
3., 3., 3., 3., 3., 3., 3., 3.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2., 2.,
2., 2., 2., 2., 2., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,
3., 3., 3., 3., 3., 3., 3., 3.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2., 2.,
2., 2., 2., 2., 2., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,
3., 3., 3., 3., 3., 3., 3., 3.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2., 2.,
2., 2., 2., 2., 2., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,
3., 3., 3., 3., 3., 3., 3., 3.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2., 2.,
2., 2., 2., 2., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,
3., 3., 3., 3., 3., 3., 3., 3.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2., 2.,
2., 2., 2., 2., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,
3., 3., 3., 3., 3., 3., 3., 3.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2., 2.,
2., 2., 2., 2., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,
3., 3., 3., 3., 3., 3., 3., 3.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2., 2., 2.,
2., 2., 2., 2., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,
3., 3., 3., 3., 3., 3., 3., 3.]])
As expected, the matrix consists of three categories and the total number of each category’s instances matches the total number of tiles allocated to each category.
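For reference, here is one way the matrix could be populated and then checked. This is a reconstruction that fills the grid column by column, which is consistent with the printed matrix, but it is not necessarily the exact code of the truncated cell; boundaries and tile_index are names chosen for this sketch.

```python
import numpy as np

width, height = 40, 10
tiles_per_category = [129, 77, 194]  # Denmark, Norway, Sweden (from Step 3)

# initialize the waffle chart as an empty matrix
waffle_chart = np.zeros((height, width))

# cumulative tile counts mark where each category's block ends
boundaries = np.cumsum(tiles_per_category)

tile_index = 0
for col in range(width):
    for row in range(height):
        # 1-based category label: number of boundaries already passed, plus one
        category = int(np.searchsorted(boundaries, tile_index, side="right")) + 1
        waffle_chart[row, col] = category
        tile_index += 1

# verify that each label occurs as often as its allocated tile count
labels, counts = np.unique(waffle_chart, return_counts=True)
print(dict(zip(labels.tolist(), counts.tolist())))
```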
Step 5. Map the waffle chart matrix into a visual.
# instantiate a new figure object
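A minimal sketch of this mapping step, assuming Matplotlib's matshow is used to render the matrix. The matrix is rebuilt compactly with np.repeat and a column-major reshape instead of explicit loops, and the coolwarm colormap is an illustrative choice.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# rebuild the Step 4 matrix: repeat each label, then fill column by column
tiles_per_category = [129, 77, 194]
waffle_chart = np.repeat([1, 2, 3], tiles_per_category).reshape((10, 40), order="F")

# instantiate a new figure and draw the matrix as a colored grid
fig, ax = plt.subplots()
colormap = plt.cm.coolwarm
image = ax.matshow(waffle_chart, cmap=colormap)
fig.colorbar(image)
```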
Step 6. Prettify the chart.
# instantiate a new figure object

Step 7. Create a legend and add it to the chart.
# instantiate a new figure object
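Steps 6 and 7 can be sketched together as below. This is one possible approach, not necessarily what the truncated cells do: minor ticks placed on the tile borders give the white grid lines, and the legend is built from proxy patches. The legend colors are computed from the default matshow normalization (labels 1 to 3 mapped onto 0 to 1), and the label format is an illustrative choice.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; drop this line in a notebook
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt
import numpy as np

categories = ["Denmark", "Norway", "Sweden"]
tiles_per_category = [129, 77, 194]
height, width = 10, 40
waffle_chart = np.repeat([1, 2, 3], tiles_per_category).reshape((height, width), order="F")

fig, ax = plt.subplots()
colormap = plt.cm.coolwarm
ax.matshow(waffle_chart, cmap=colormap)

# Step 6: minor ticks on tile borders, white grid lines, no major ticks
ax.set_xticks(np.arange(-0.5, width, 1), minor=True)
ax.set_yticks(np.arange(-0.5, height, 1), minor=True)
ax.grid(which="minor", color="w", linestyle="-", linewidth=2)
ax.set_xticks([])
ax.set_yticks([])

# Step 7: one legend patch per category, colored like its tiles
n = len(categories)
handles = [
    mpatches.Patch(color=colormap(i / (n - 1)), label=f"{name} ({tiles})")
    for i, (name, tiles) in enumerate(zip(categories, tiles_per_category))
]
ax.legend(handles=handles, loc="lower center", ncol=n, bbox_to_anchor=(0.5, -0.3))
```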

And there you go! What a good-looking, delicious waffle chart, don’t you think?
Packing it into a function
Now it would be very inefficient to repeat these seven steps every time we wish to create a waffle chart. So let’s combine all seven steps into one function called create_waffle_chart. This function would take the following parameters as input:
- categories: Unique categories or classes in dataframe.
- values: Values corresponding to categories or classes.
- height: Defined height of waffle chart.
- width: Defined width of waffle chart.
- colormap: Matplotlib colormap used to color the categories.
- value_sign: In order to make our function more generalizable, we will add this parameter to address signs that could be associated with a value such as %, $, and so on. value_sign has a default value of empty string.
def create_waffle_chart(categories, values, height, width, colormap, value_sign=''):
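Only the signature of the function survives above, so the body below is a hedged sketch of how the seven steps might be combined. The return value, the ValueError on a rounding mismatch, the legend placement, and the choice to put '%' after the value and any other sign before it are all assumptions of this sketch.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; drop this line in a notebook
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt
import numpy as np


def create_waffle_chart(categories, values, height, width, colormap, value_sign=""):
    """Draw a waffle chart; returns (figure, tiles_per_category)."""
    categories = list(categories)
    values = list(values)
    total_values = float(sum(values))
    total_num_tiles = width * height
    print(f"Total number of tiles is {total_num_tiles}")

    # Steps 1-3: proportion of each category, rounded to whole tiles
    tiles_per_category = [round(v / total_values * total_num_tiles) for v in values]
    for name, tiles in zip(categories, tiles_per_category):
        print(f"{name}: {tiles}")
    if sum(tiles_per_category) != total_num_tiles:
        raise ValueError("rounded tile counts do not add up to width * height")

    # Step 4: label matrix, filled column by column
    labels = np.repeat(np.arange(1, len(values) + 1), tiles_per_category)
    waffle_chart = labels.reshape((height, width), order="F")

    # Steps 5-6: draw the matrix and add white grid lines on the tile borders
    fig, ax = plt.subplots()
    ax.matshow(waffle_chart, cmap=colormap)
    ax.set_xticks(np.arange(-0.5, width, 1), minor=True)
    ax.set_yticks(np.arange(-0.5, height, 1), minor=True)
    ax.grid(which="minor", color="w", linestyle="-", linewidth=2)
    ax.set_xticks([])
    ax.set_yticks([])

    # Step 7: legend with one patch per category
    n = len(categories)
    handles = []
    for i, (name, value) in enumerate(zip(categories, values)):
        if value_sign == "%":
            label = f"{name} ({value}{value_sign})"
        else:
            label = f"{name} ({value_sign}{value})"
        handles.append(mpatches.Patch(color=colormap(i / max(n - 1, 1)), label=label))
    ax.legend(handles=handles, loc="lower center", ncol=n, bbox_to_anchor=(0.5, -0.3))
    return fig, tiles_per_category
```

Raising an error when the rounded counts do not fill the grid is a deliberate choice in this sketch: it surfaces the problem instead of silently leaving tiles unassigned.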
Now to create a waffle chart, all we have to do is call the function create_waffle_chart. Let’s define the input parameters:
width = 40 # width of chart
And now let’s call our function to create a waffle chart.
create_waffle_chart(categories, values, height, width, colormap)
Total number of tiles is 400
Denmark: 129
Norway: 77
Sweden: 194
There also seems to be a new Python package for generating waffle charts, called PyWaffle, though its repository looked like it was still being built at the time of writing. Feel free to check it out and play with it.