Export pandas to dictionary by combining multiple row values

Question

I have a pandas DataFrame df that looks like this:

name    value1     value2
A       123         1
B       345         5
C       712         4
B       768         2
A       318         9
C       178         6
A       321         3

I want to convert this into a dictionary keyed by name, where each value is a list of single-entry dictionaries (value1 as the key, value2 as the value), one for every row that has that name.

So the output would look like this:

{
 'A': [{'123':1}, {'318':9}, {'321':3}],
 'B': [{'345':5}, {'768':2}],
 'C': [{'712':4}, {'178':6}]
}

So far, I have managed to get a dictionary with name as the key and a list holding the values of only one row per name by doing

df.set_index('name').transpose().to_dict(orient='list')

How do I get my desired output? Is there a way to aggregate all the values for the same name column and get them in the form I want?

Any chance you could copy and paste the code which you used to create `df`? — ignoring_gravity, May 29 '18 at 16:20
@Lupacante The dict is created from a file, so this is as it is read from it. The data needs to be in a format like that for some further processing, which is sadly not in my control. — sfactor, May 29 '18 at 16:24

score 10 · Accepted Answer · answered May 29 '18 at 16:29

Does this do what you want it to?

from pandas import DataFrame

df = DataFrame([['A', 123, 1], ['B', 345, 5], ['C', 712, 4], ['B', 768, 2], ['A', 318, 9], ['C', 178, 6], ['A', 321, 3]], columns=['name', 'value1', 'value2'])

d = {}
for i in df['name'].unique():
    d[i] = [{df['value1'][j]: df['value2'][j]} for j in df[df['name']==i].index]

This returns

{'A': [{123: 1}, {318: 9}, {321: 3}],
 'B': [{345: 5}, {768: 2}],
 'C': [{712: 4}, {178: 6}]}
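
A single pass over the rows avoids re-filtering the frame once per name. The following is a minimal sketch rather than part of the original answer; it assumes the same df as above, and the name `result` is chosen purely for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [['A', 123, 1], ['B', 345, 5], ['C', 712, 4], ['B', 768, 2],
     ['A', 318, 9], ['C', 178, 6], ['A', 321, 3]],
    columns=['name', 'value1', 'value2'],
)

result = {}
# .tolist() converts the NumPy integers to plain Python ints
rows = zip(df['name'].tolist(), df['value1'].tolist(), df['value2'].tolist())
for name, v1, v2 in rows:
    # Create the list the first time a name is seen, then append a one-entry dict
    result.setdefault(name, []).append({v1: v2})

print(result)
```

If the keys have to be strings, as written in the question's desired output, `{str(v1): v2}` could be used instead.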

Great this is exactly what I needed. Thanks a lot. Accepting. — sfactor, May 29 '18 at 16:35

score 4 · Answer 2 · edited Mar 14 '21 at 09:41

The to_dict() method uses the column names as dictionary keys, so you'll need to reshape your DataFrame slightly. Setting the key column as the index and then transposing the DataFrame is one way to achieve this.

For a frame whose key column is called 'ID', this can be done with the following line:

>>> df.set_index('ID').T.to_dict('list')
{'p': [1, 3, 2], 'q': [4, 3, 2], 'r': [4, 0, 9]}
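
On the question's frame, where names repeat, the transposed columns are no longer unique, so only one row per name survives. A small sketch of that behaviour, assuming the df from the question (rebuilt here so the snippet is self-contained); the warning filter is included because recent pandas versions appear to warn about non-unique columns, and `one_row_per_name` is an illustrative name:

```python
import warnings

import pandas as pd

df = pd.DataFrame(
    [['A', 123, 1], ['B', 345, 5], ['C', 712, 4], ['B', 768, 2],
     ['A', 318, 9], ['C', 178, 6], ['A', 321, 3]],
    columns=['name', 'value1', 'value2'],
)

with warnings.catch_warnings():
    # Silence the duplicate-column warning for this demonstration only
    warnings.simplefilter('ignore')
    one_row_per_name = df.set_index('name').T.to_dict('list')

# Each name maps to a single [value1, value2] pair; the other rows are lost
print(one_row_per_name)
```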

For this question it is better to use groupby, which gives each name a list of [value1, value2] pairs:

df.groupby('name')[['value1','value2']].apply(lambda g: g.values.tolist()).to_dict()
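
Note that this groupby line yields lists of [value1, value2] pairs rather than the single-entry dictionaries the question asks for. One possible follow-up step is sketched below, assuming df is the frame from the question; `pairs` and `result` are illustrative names, not part of the answer.

```python
import pandas as pd

df = pd.DataFrame(
    [['A', 123, 1], ['B', 345, 5], ['C', 712, 4], ['B', 768, 2],
     ['A', 318, 9], ['C', 178, 6], ['A', 321, 3]],
    columns=['name', 'value1', 'value2'],
)

# Step 1: the groupby from the answer, giving a list of pairs per name
pairs = (
    df.groupby('name')[['value1', 'value2']]
      .apply(lambda g: g.values.tolist())
      .to_dict()
)

# Step 2: turn each [value1, value2] pair into a one-entry dict
result = {name: [{v1: v2} for v1, v2 in rows] for name, rows in pairs.items()}

print(result)
```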

answered May 29 '18 at 16:21 by Aditya · edited Mar 14 '21 at 09:41

I've already done this if you read the last part of the question. My requirement is a little different. – sfactor May 29 '18 at 16:25
I am working on that – Aditya May 29 '18 at 16:26
I guess passing orient='index' and a little bit of modification will get you what you want – Aditya May 29 '18 at 16:28

score 3 · Answer 3 · edited Aug 26 '18 at 18:19


df.groupby('name')[['value1','value2']].apply(lambda g: g.values.tolist()).to_dict()

If you need a list of tuples explicitly:

df.groupby('name')[['value1','value2']].apply(lambda g: list(map(tuple, g.values.tolist()))).to_dict()

answered Aug 26 '18 at 18:13 by nemo · edited Aug 26 '18 at 18:19 by Stephen Rauch

This is faster than the accepted solution for large datasets. – cyram Apr 02 '20 at 09:36

score 0 · Answer 4 · answered Apr 02 '20 at 10:07

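Building on @nemo's answer (above), which will be faster than the accepted solution, this returns for each name a list of dictionaries keyed by column name (for example {'value1': 123, 'value2': 1}):

def formatRecords(g):
    keys = ['value1', 'value2']
    result = []
    # Select the value columns explicitly so the grouping column is never zipped in
    for item in g[keys].values.tolist():
        item = dict(zip(keys, item))
        result.append(item)
    return result

# Restrict the groups to the value columns before applying the formatter
df_dict = df.groupby('name')[['value1', 'value2']].apply(formatRecords).to_dict()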
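
To match the question's desired output literally, with value1 as a string key mapping to value2, iterating over the groups directly is one option. This is a sketch and not the answerer's code; it assumes the question's df, and `by_name` is a hypothetical name.

```python
import pandas as pd

df = pd.DataFrame(
    [['A', 123, 1], ['B', 345, 5], ['C', 712, 4], ['B', 768, 2],
     ['A', 318, 9], ['C', 178, 6], ['A', 321, 3]],
    columns=['name', 'value1', 'value2'],
)

by_name = {
    # One single-entry dict per row of the group, with value1 cast to str
    name: [{str(v1): v2}
           for v1, v2 in zip(g['value1'].tolist(), g['value2'].tolist())]
    for name, g in df.groupby('name')
}

print(by_name)
```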
