
It's a similar question to

Export pandas to dictionary by combining multiple row values

But in this case I want something different.

from pandas import DataFrame

df = DataFrame([
           ['A', 123, 1], 
           ['B', 345, 5], 
           ['C', 712, 4],
           ['B', 768, 2], 
           ['B', 768, 3], 
           ['A', 123, 9], 
           ['C', 178, 6], 
           ['C', 178, 5],  
           ['A', 321, 3]], 
           columns=['maingroup', 'subgroup', 'selectedCol'])

I'd like the output to be:

{
 'A': {'123':[1, 9], '321':[3]},
 'B': {'345':[5], '768':[2, 3]},
 'C': {'712':[4], '178':[6, 5]}
}

acb

2 Answers


Using dict comprehension with nested groupby:

d = {k: f.groupby('subgroup')['selectedCol'].apply(list).to_dict()
     for k, f in df.groupby('maingroup')}

Output:

{'A': {123: [1, 9], 321: [3]},
 'B': {345: [5], 768: [2, 3]},
 'C': {178: [6, 5], 712: [4]}}
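Note that the subgroup keys above are integers, while the question asks for string keys such as '123'. A minimal sketch of one way to get string keys, assuming the same df as in the question (the variable names main, sub, frame, vals and nested are only illustrative):

```python
from pandas import DataFrame

df = DataFrame([
           ['A', 123, 1],
           ['B', 345, 5],
           ['C', 712, 4],
           ['B', 768, 2],
           ['B', 768, 3],
           ['A', 123, 9],
           ['C', 178, 6],
           ['C', 178, 5],
           ['A', 321, 3]],
           columns=['maingroup', 'subgroup', 'selectedCol'])

# Outer loop: one sub-frame per maingroup.
# Inner loop: one Series of selectedCol values per subgroup,
# with the subgroup label converted to a string key.
nested = {
    main: {str(sub): vals.tolist()
           for sub, vals in frame.groupby('subgroup')['selectedCol']}
    for main, frame in df.groupby('maingroup')
}

print(nested)
```

groupby sorts the group labels by default, so the inner keys may come out in a different order than in the question, which does not matter for dictionary equality.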
acb

This is a bit complicated, but maybe someone has a better solution. In the meantime here we go:

df = df.groupby(['subgroup']).agg({'selectedCol': list, 'maingroup': 'first'})
df = df.groupby(['maingroup']).agg(dict)
df.to_json(orient='columns')

I did it in three steps:

  • first, merge the selectedCol values of each subgroup into a list (keeping the subgroup's maingroup)

  • then group by maingroup to create the second-level dictionary

  • finally, export to JSON

There might be a cleverer way to do this by playing around with the orient parameter in the to_json method.
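Grouping by subgroup alone, as above, relies on every subgroup value belonging to a single maingroup, which holds for the sample data. As a hedged alternative (not the code from this answer), here is a sketch that groups by both columns and serializes with the standard library's json.dumps instead of to_json. The names lists, nested and as_json are only illustrative:

```python
import json

from pandas import DataFrame

df = DataFrame([
           ['A', 123, 1],
           ['B', 345, 5],
           ['C', 712, 4],
           ['B', 768, 2],
           ['B', 768, 3],
           ['A', 123, 9],
           ['C', 178, 6],
           ['C', 178, 5],
           ['A', 321, 3]],
           columns=['maingroup', 'subgroup', 'selectedCol'])

# Step 1: one list of selectedCol values per (maingroup, subgroup) pair.
lists = df.groupby(['maingroup', 'subgroup'])['selectedCol'].agg(list)

# Step 2: fold the two-level index into a nested dictionary.
# JSON object keys are always strings, so the subgroup is converted here;
# int() makes sure the values are plain Python integers for json.dumps.
nested = {}
for (main, sub), values in lists.items():
    nested.setdefault(main, {})[str(sub)] = [int(v) for v in values]

# Step 3: serialize the nested dictionary to a JSON string.
as_json = json.dumps(nested)
print(as_json)
```

This avoids calling agg(dict), so it may be less sensitive to the pandas version, at the cost of an explicit Python loop.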

Edit: which part of the code fails: step 1, 2 or 3? It runs in my notebook, so check your pandas version.

Or maybe try this:

df.groupby(['subgroup']).agg({'selectedCol': list, 'maingroup': 'first'}).groupby(['maingroup']).agg(dict)
RonsenbergVI