Create a pandas DataFrame from a dict
Turning a dictionary into a pandas dataframe is a simple task, yet every time I need to do it I find myself going back to the docs and Stack Overflow to remember how! So I decided to write a short instructions page for future reference.
dictionaries
Often data comes to you in the form of a dictionary object. In this post I deal with two types of dictionaries, distinguished by orientation:
- Type 1: each dictionary item is a key-value pair to be converted into a row, example
dict1 = {'a': 1, 'b': 2, 'c': 3}
- Type 2: dictionary items are columns, example
dict2 = {'name': ['a', 'b', 'c'], 'value': [1, 2, 3]}
type 1: items are rows
In the first case, I want to convert each item (key-value pair) in the dictionary into a row in the dataframe. Use the dictionary items() method to get the key-value pairs as tuples, and specify the column names when creating the dataframe.
Type 1 dictionaries tend to be long, and you usually run into them when interacting with libraries. For example, when getting a property of the nodes in a graph, each key is a node name and each value is the value of that property for the node.
import pandas as pd
dict1 = {'a': 1, 'b': 2, 'c': 3}
# items() yields (key, value) tuples; list() materializes them so each tuple becomes a row
df1 = pd.DataFrame(list(dict1.items()), columns=['name', 'value'])
print(df1)
output:
  name  value
0    a      1
1    b      2
2    c      3
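As an alternative for Type 1 dictionaries, the same table can be built by going through a pandas Series first. This is only a sketch of one other route, not a replacement for the approach above; the variable name df1_alt is made up for illustration.

```python
import pandas as pd

dict1 = {'a': 1, 'b': 2, 'c': 3}

# A Series built from a dict uses the keys as its index and the values as its data.
# rename_axis names the index 'name'; reset_index then moves it into a regular column.
df1_alt = (
    pd.Series(dict1, name='value')
    .rename_axis('name')
    .reset_index()
)
print(df1_alt)
```

This yields the same two columns, name and value, without spelling out the list of tuples.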
type 2: items are columns
This type has a columnar structure, so the dictionary is usually not long (it has one item per column). Each dictionary value stores an array of column values, and the corresponding key becomes the column name in the dataframe.
In this case you just pass the dictionary object as is when creating the dataframe. There is no need to specify the column names, as they are already given by the keys of the dictionary.
import pandas as pd
dict2 = {'name': ['a', 'b', 'c'], 'value': [1, 2, 3]}
# keys become column names, lists become column values
df2 = pd.DataFrame(dict2)
print(df2)
output:
  name  value
0    a      1
1    b      2
2    c      3
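Two details of the Type 2 case are worth a quick sketch: the columns argument can be used to choose the column order, and the lists must all have the same length. The names df2_reordered, bad and raised below are made up for illustration.

```python
import pandas as pd

dict2 = {'name': ['a', 'b', 'c'], 'value': [1, 2, 3]}

# With a dict as input, columns= selects which keys to use and in what order.
df2_reordered = pd.DataFrame(dict2, columns=['value', 'name'])
print(df2_reordered)

# Lists of different lengths cannot be lined up into columns,
# so pandas raises a ValueError instead of building the dataframe.
bad = {'name': ['a', 'b', 'c'], 'value': [1, 2]}
try:
    pd.DataFrame(bad)
    raised = False
except ValueError:
    raised = True
print(raised)
```

If your lists may differ in length, it is safer to check or pad them before creating the dataframe.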
- mutaz