How to merge columns, value count them and then plot the results?

Question

How do I get from a dataframe in which several columns hold the same kind of values and need to be merged:

import pandas as pd
df1 = pd.DataFrame({'firstcolumn': ['ab', 'ca', 'da', 'ta', 'la'],
                    'secondcolumn': ['ab', 'ca', 'ta', 'da', 'sa'],
                    'index': [2011, 2012, 2011, 2012, 2012]})

to a crosstab that tells me, for each year, how many times each value was collected?

Index  ab  ca  da  ta  sa  la
2011    2   0   1   1   0   0
2012    0   2   1   1   1   1

Also, how could I then plot the table?

I'm looking at your desired example output, and it doesn't make any sense. There are two counts of 'ca' but they are both in 2012, not 2011. The counts of 'ta' similarly should be in 2011, not 2012, if I am reading your dataset correctly. If I am not, please explain. — kingledion, Jan 10 '18 at 17:54
Thank you, I modified accordingly to reflect the output. Let me know if now it makes sense. My objective is to organise multiple categorical variables (first and second columns) into a value count per year. — Nicola, Jan 10 '18 at 20:45

Nicola · Accepted Answer · 2018-01-13T16:10:02.657


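It can be done by melting the two value columns into a single column with pd.melt and then counting the values per year with pd.crosstab: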

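import pandas as pd
import matplotlib.pyplot as plt
# Stack firstcolumn and secondcolumn into one "Score" column, keeping the year ("index") on each row
melted = pd.melt(df1, id_vars=["index"], var_name="Var", value_name="Score").dropna()
# Count how often each value occurs in each year
table = pd.crosstab(index=melted["index"], columns=melted["Score"])
# In a Jupyter notebook, run the magic %matplotlib inline first to show the plot inline
table.plot.bar()  # grouped bars: one group per year, one bar per value
plt.show()  # needed when running as a plain Python script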
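
For anyone who wants to check the result outside a notebook, here is a minimal self-contained sketch that rebuilds df1 from the question and runs the same melt and crosstab steps. The final column reordering is my own addition, only there to match the layout of the table in the question; pd.crosstab itself returns the columns in sorted order.

```python
import pandas as pd

# Data copied from the question
df1 = pd.DataFrame({
    "firstcolumn": ["ab", "ca", "da", "ta", "la"],
    "secondcolumn": ["ab", "ca", "ta", "da", "sa"],
    "index": [2011, 2012, 2011, 2012, 2012],
})

# Stack both value columns into one long "Score" column, keeping the year on each row
melted = pd.melt(df1, id_vars=["index"], var_name="Var", value_name="Score").dropna()

# Count the occurrences of each value per year
table = pd.crosstab(index=melted["index"], columns=melted["Score"])

# Optional: reorder the columns to match the question (crosstab sorts them alphabetically)
table = table[["ab", "ca", "da", "ta", "sa", "la"]]
print(table)
```

If stacked bars per year are preferred over grouped ones, table.plot.bar(stacked=True) should do it.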

edited Jan 13 '18 at 16:10

answered Jan 10 '18 at 21:12

Nicola

