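A Detailed Guide to to_dict in pandas (Python)

Overview: the to_dict method in pandas converts a DataFrame into a dictionary (or, for one option, into a list of dictionaries).

The orient parameter selects one of six output layouts: 'dict', 'list', 'series', 'split', 'records', and 'index'. Each is described in turn below.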

									Help on method to_dict in module pandas.core.frame:

									to_dict(orient='dict') method of pandas.core.frame.DataFrame instance

									 Convert DataFrame to dictionary.

									 Parameters

									 ----------

									 orient : str {'dict', 'list', 'series', 'split', 'records', 'index'}

									 Determines the type of the values of the dictionary.

									 - dict (default) : dict like {column -> {index -> value}}

									 - list : dict like {column -> [values]}

									 - series : dict like {column -> Series(values)}

									 - split : dict like

									  {index -> [index], columns -> [columns], data -> [values]}

									 - records : list like

									  [{column -> value}, ... , {column -> value}]

									 - index : dict like {index -> {column -> value}}

									  .. versionadded:: 0.17.0

									 Abbreviations are allowed. `s` indicates `series` and `sp`

									 indicates `split`.

									 Returns

									 -------

									 result : dict like {column -> {index -> value}}
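To make the six layouts in the help text concrete, the following minimal sketch applies every orient value to a tiny DataFrame. The two-row frame is a hand-built stand-in rather than the dataset used later, and the names `orients` and `results` are only illustrative.

```python
import pandas as pd

# Hypothetical two-row frame, used only to keep the printed output short.
data = pd.DataFrame(
    {"pclass": ["3rd", "1st"], "age": [31.0, 32.0]},
    index=[1086, 12],
)

orients = ["dict", "list", "series", "split", "records", "index"]

# Collect the result of every orientation so the layouts can be compared.
results = {orient: data.to_dict(orient=orient) for orient in orients}

for orient, result in results.items():
    # 'records' yields a list; the other five yield a dict.
    print(f"{orient:8s} -> {type(result).__name__}")
    print(result)
```

Printing the results side by side shows which orientations keep the index labels ('dict', 'series', 'split', 'index') and which drop them ('list', 'records').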

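1. orient='dict'

'dict' is the default value. With data being a DataFrame, the result is a dictionary of the form {column -> {index -> value}}, that is, a two-level nested dictionary:

- For each column, the index labels and the values are collected into an inner dictionary

- The column name becomes the outer key, and that inner dictionary becomes its value

Lookup form: data_dict[key1][key2]

- data_dict is the name given to the result of orient='dict'

- key1 is the outer key, a column name

- key2 is the inner key, an index label of the original DataFrame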

									In[9]: data

									Out[9]: 

									      pclass        age     embarked                      home.dest     sex

									1086     3rd  31.194181      UNKNOWN                        UNKNOWN    male

									12       1st  31.194181    Cherbourg                  Paris, France  female

									1036     3rd  31.194181      UNKNOWN                        UNKNOWN    male

									833      3rd  32.000000  Southampton  Foresvik, Norway Portland, ND    male

									1108     3rd  31.194181      UNKNOWN                        UNKNOWN    male

									562      2nd  41.000000    Cherbourg                   New York, NY    male

									437      2nd  48.000000  Southampton   Somerset / Bernardsville, NJ  female

									663      3rd  26.000000  Southampton                        UNKNOWN    male

									669      3rd  19.000000  Southampton                        England    male

									507      2nd  31.194181  Southampton               Petworth, Sussex    male

									In[10]: data_dict=data.to_dict(orient= 'dict')

									In[11]: data_dict

									Out[11]: 

									{'age': {12: 31.19418104265403,

									 437: 48.0,

									 507: 31.19418104265403,

									 562: 41.0,

									 663: 26.0,

									 669: 19.0,

									 833: 32.0,

									 1036: 31.19418104265403,

									 1086: 31.19418104265403,

									 1108: 31.19418104265403},

									 'embarked': {12: 'Cherbourg',

									 437: 'Southampton',

									 507: 'Southampton',

									 562: 'Cherbourg',

									 663: 'Southampton',

									 669: 'Southampton',

									 833: 'Southampton',

									 1036: 'UNKNOWN',

									 1086: 'UNKNOWN',

									 1108: 'UNKNOWN'},

									 'home.dest': {12: 'Paris, France',

									 437: 'Somerset / Bernardsville, NJ',

									 507: 'Petworth, Sussex',

									 562: 'New York, NY',

									 663: 'UNKNOWN',

									 669: 'England',

									 833: 'Foresvik, Norway Portland, ND',

									 1036: 'UNKNOWN',

									 1086: 'UNKNOWN',

									 1108: 'UNKNOWN'},

									 'pclass': {12: '1st',

									 437: '2nd',

									 507: '2nd',

									 562: '2nd',

									 663: '3rd',

									 669: '3rd',

									 833: '3rd',

									 1036: '3rd',

									 1086: '3rd',

									 1108: '3rd'},

									 'sex': {12: 'female',

									 437: 'female',

									 507: 'male',

									 562: 'male',

									 663: 'male',

									 669: 'male',

									 833: 'male',

									 1036: 'male',

									 1086: 'male',

									 1108: 'male'}}
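The lookup form data_dict[key1][key2] described above can be tried with a short sketch. It assumes a three-row subset of the sample data, retyped by hand with the age values rounded as displayed; the variable names other than data and data_dict are illustrative.

```python
import pandas as pd

# Three rows retyped from the sample above (ages rounded as displayed).
data = pd.DataFrame(
    {
        "pclass": ["3rd", "1st", "3rd"],
        "age": [31.194181, 31.194181, 32.0],
        "embarked": ["UNKNOWN", "Cherbourg", "Southampton"],
    },
    index=[1086, 12, 833],
)

data_dict = data.to_dict(orient="dict")

# Outer key: column name. Inner key: index label of the original frame.
age_of_12 = data_dict["age"][12]
port_of_833 = data_dict["embarked"][833]
print(age_of_12, port_of_833)

# Each inner dictionary holds one whole column, keyed by index label.
print(sorted(data_dict["pclass"].items()))
```

Because the inner keys are index labels rather than positions, data_dict["age"][0] would raise a KeyError here: no row is labelled 0.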

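2. orient='list'

This is similar to case 1, except that the inner container is a list, giving the form {column -> [values]}

Lookup form: data_list[keys][index]

data_list is the name given to the result of orient='list'

keys is a column name, such as 'age' or 'embarked' in this example

index is an integer position, running from 0 to the number of rows minus 1; the original index labels are not kept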

									In[19]: data_list=data.to_dict(orient='list')

									In[20]: data_list

									Out[20]: 

									{'age': [31.19418104265403,

									 31.19418104265403,

									 31.19418104265403,

									 32.0,

									 31.19418104265403,

									 41.0,

									 48.0,

									 26.0,

									 19.0,

									 31.19418104265403],

									 'embarked': ['UNKNOWN',

									 'Cherbourg',

									 'UNKNOWN',

									 'Southampton',

									 'UNKNOWN',

									 'Cherbourg',

									 'Southampton',

									 'Southampton',

									 'Southampton',

									 'Southampton'],

									 'home.dest': ['UNKNOWN',

									 'Paris, France',

									 'UNKNOWN',

									 'Foresvik, Norway Portland, ND',

									 'UNKNOWN',

									 'New York, NY',

									 'Somerset / Bernardsville, NJ',

									 'UNKNOWN',

									 'England',

									 'Petworth, Sussex'],

									 'pclass': ['3rd',

									 '1st',

									 '3rd',

									 '3rd',

									 '3rd',

									 '2nd',

									 '2nd',

									 '3rd',

									 '3rd',

									 '2nd'],

									 'sex': ['male',

									 'female',

									 'male',

									 'male',

									 'male',

									 'male',

									 'female',

									 'male',

									 'male',

									 'male']}
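Since orient='list' drops the index labels, a row has to be addressed by position. The sketch below, again on a hand-typed three-row subset of the sample, shows the lookup and one possible way to map a label back to its position using Index.get_loc; the helper variable position is an illustrative name.

```python
import pandas as pd

# Three rows retyped from the sample above (ages rounded as displayed).
data = pd.DataFrame(
    {
        "pclass": ["3rd", "1st", "3rd"],
        "age": [31.194181, 31.194181, 32.0],
        "embarked": ["UNKNOWN", "Cherbourg", "Southampton"],
    },
    index=[1086, 12, 833],
)

data_list = data.to_dict(orient="list")

# Column name first, then the integer position of the row.
first_age = data_list["age"][0]
print(first_age)

# The labels are gone from data_list, so translate a label to a position
# through the original frame's index (unique labels give a plain int).
position = data.index.get_loc(833)
print(position, data_list["embarked"][position])
```

The values in each list follow the row order of the DataFrame, which is why the position obtained from the original index can be reused.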

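3. orient='series'

The result has the form {column -> Series(values)}

Lookup form: data_series[key1][key2], or data_series[key1] to get a whole column as a Series

data_series is the name given to the result

key1 is a column name, such as 'age' or 'embarked' in this example

key2 (optional) is an index label of the original DataFrame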

									In[21]: data_series=data.to_dict(orient='series')

									In[22]: data_series

									Out[22]: 

									{'age': 1086 31.194181

									 12 31.194181

									 1036 31.194181

									 833 32.000000

									 1108 31.194181

									 562 41.000000

									 437 48.000000

									 663 26.000000

									 669 19.000000

									 507 31.194181

									 Name: age, dtype: float64, 'embarked': 1086 UNKNOWN

									 12 Cherbourg

									 1036 UNKNOWN

									 833 Southampton

									 1108 UNKNOWN

									 562 Cherbourg

									 437 Southampton

									 663 Southampton

									 669 Southampton

									 507 Southampton

									 Name: embarked, dtype: object, 'home.dest': 1086    UNKNOWN

									 12   Paris, France

									 1036    UNKNOWN

									 833 Foresvik, Norway Portland, ND

									 1108    UNKNOWN

									 562   New York, NY

									 437 Somerset / Bernardsville, NJ

									 663    UNKNOWN

									 669    England

									 507   Petworth, Sussex

									 Name: home.dest, dtype: object, 'pclass': 1086 3rd

									 12 1st

									 1036 3rd

									 833 3rd

									 1108 3rd

									 562 2nd

									 437 2nd

									 663 3rd

									 669 3rd

									 507 2nd

									 Name: pclass, dtype: object, 'sex': 1086 male

									 12 female

									 1036 male

									 833 male

									 1108 male

									 562 male

									 437 female

									 663 male

									 669 male

									 507 male

									 Name: sex, dtype: object}
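With orient='series' each value is still a pandas Series, so label lookups and Series methods remain available. The following is a minimal sketch on a hand-typed three-row subset of the sample; it uses .loc and .iloc to make the difference between label and position explicit, which is a stylistic choice rather than something the original example does.

```python
import pandas as pd

# Three rows retyped from the sample above (ages rounded as displayed).
data = pd.DataFrame(
    {
        "pclass": ["3rd", "1st", "3rd"],
        "age": [31.194181, 31.194181, 32.0],
        "embarked": ["UNKNOWN", "Cherbourg", "Southampton"],
    },
    index=[1086, 12, 833],
)

data_series = data.to_dict(orient="series")

# key1 alone returns the whole column as a Series.
age_column = data_series["age"]
print(type(age_column).__name__)

# key2 is an index label; .loc makes the label-based lookup explicit.
age_of_12 = age_column.loc[12]
# .iloc is the positional counterpart (first row here).
first_age = age_column.iloc[0]
print(age_of_12, first_age)

# Series methods still work on the dictionary values.
oldest = age_column.max()
print(oldest)
```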

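4. orient='split'

The result has the form {index -> [index], columns -> [columns], data -> [values]}: the index labels, the column names, and the row values are separated into three entries of one dictionary

The parts are accessed as data_split['index'], data_split['data'], and data_split['columns']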

									data_split=data.to_dict(orient='split')

									data_split

									Out[38]: 

									{'columns': ['pclass', 'age', 'embarked', 'home.dest', 'sex'],

									 'data': [['3rd', 31.19418104265403, 'UNKNOWN', 'UNKNOWN', 'male'],

									 ['1st', 31.19418104265403, 'Cherbourg', 'Paris, France', 'female'],

									 ['3rd', 31.19418104265403, 'UNKNOWN', 'UNKNOWN', 'male'],

									 ['3rd', 32.0, 'Southampton', 'Foresvik, Norway Portland, ND', 'male'],

									 ['3rd', 31.19418104265403, 'UNKNOWN', 'UNKNOWN', 'male'],

									 ['2nd', 41.0, 'Cherbourg', 'New York, NY', 'male'],

									 ['2nd', 48.0, 'Southampton', 'Somerset / Bernardsville, NJ', 'female'],

									 ['3rd', 26.0, 'Southampton', 'UNKNOWN', 'male'],

									 ['3rd', 19.0, 'Southampton', 'England', 'male'],

									 ['2nd', 31.19418104265403, 'Southampton', 'Petworth, Sussex', 'male']],

									 'index': [1086, 12, 1036, 833, 1108, 562, 437, 663, 669, 507]}
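Because the three parts of the 'split' layout line up with the data, index, and columns arguments of the DataFrame constructor, the dictionary can be turned back into a frame. The sketch below illustrates this on a hand-typed three-row subset of the sample; the name rebuilt is illustrative.

```python
import pandas as pd

# Three rows retyped from the sample above (ages rounded as displayed).
data = pd.DataFrame(
    {
        "pclass": ["3rd", "1st", "3rd"],
        "age": [31.194181, 31.194181, 32.0],
        "embarked": ["UNKNOWN", "Cherbourg", "Southampton"],
    },
    index=[1086, 12, 833],
)

data_split = data.to_dict(orient="split")

print(data_split["columns"])   # column names, in order
print(data_split["index"])     # index labels, in row order
print(data_split["data"][1])   # values of the second row

# Rebuild a frame from the three parts.
rebuilt = pd.DataFrame(
    data_split["data"],
    index=data_split["index"],
    columns=data_split["columns"],
)
print(rebuilt)
```

The i-th entry of data_split['data'] belongs to the i-th label in data_split['index'], so the two lists must be read together.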

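5. orient='records'

The result has the form [{column -> value}, ... , {column -> value}]

The outer container is a list; each element is a dictionary built from one row of the original data, and the index labels are not included

Lookup form: data_records[index][key1], where index is the integer row position and key1 is a column name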

									data_records=data.to_dict(orient='records')

									data_records

									Out[41]: 

									[{'age': 31.19418104265403,

									 'embarked': 'UNKNOWN',

									 'home.dest': 'UNKNOWN',

									 'pclass': '3rd',

									 'sex': 'male'},

									 {'age': 31.19418104265403,

									 'embarked': 'Cherbourg',

									 'home.dest': 'Paris, France',

									 'pclass': '1st',

									 'sex': 'female'},

									 {'age': 31.19418104265403,

									 'embarked': 'UNKNOWN',

									 'home.dest': 'UNKNOWN',

									 'pclass': '3rd',

									 'sex': 'male'},

									 {'age': 32.0,

									 'embarked': 'Southampton',

									 'home.dest': 'Foresvik, Norway Portland, ND',

									 'pclass': '3rd',

									 'sex': 'male'},

									 {'age': 31.19418104265403,

									 'embarked': 'UNKNOWN',

									 'home.dest': 'UNKNOWN',

									 'pclass': '3rd',

									 'sex': 'male'},

									 {'age': 41.0,

									 'embarked': 'Cherbourg',

									 'home.dest': 'New York, NY',

									 'pclass': '2nd',

									 'sex': 'male'},

									 {'age': 48.0,

									 'embarked': 'Southampton',

									 'home.dest': 'Somerset / Bernardsville, NJ',

									 'pclass': '2nd',

									 'sex': 'female'},

									 {'age': 26.0,

									 'embarked': 'Southampton',

									 'home.dest': 'UNKNOWN',

									 'pclass': '3rd',

									 'sex': 'male'},

									 {'age': 19.0,

									 'embarked': 'Southampton',

									 'home.dest': 'England',

									 'pclass': '3rd',

									 'sex': 'male'},

									 {'age': 31.19418104265403,

									 'embarked': 'Southampton',

									 'home.dest': 'Petworth, Sussex',

									 'pclass': '2nd',

									 'sex': 'male'}]
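The records layout can be sketched as follows on a hand-typed three-row subset of the sample. Calling reset_index() first, to carry the index labels along as an ordinary column, is an addition of this sketch and not part of the original example; records_with_index is an illustrative name.

```python
import pandas as pd

# Three rows retyped from the sample above (ages rounded as displayed).
data = pd.DataFrame(
    {
        "pclass": ["3rd", "1st", "3rd"],
        "age": [31.194181, 31.194181, 32.0],
        "embarked": ["UNKNOWN", "Cherbourg", "Southampton"],
    },
    index=[1086, 12, 833],
)

data_records = data.to_dict(orient="records")

# Row position first, then the column name.
second_row_port = data_records[1]["embarked"]
print(second_row_port)

# The index labels are dropped. If they are needed, one option is to
# turn the index into a column (named 'index' for an unnamed index).
records_with_index = data.reset_index().to_dict(orient="records")
print(records_with_index[1])
```

A list of row dictionaries is a common shape for JSON payloads or for iterating row by row in plain Python.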

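6. orient='index'

The result has the form {index -> {column -> value}}. The lookup is the reverse of the 'dict' case: the outer key is an index label and the inner key is a column name, that is, data_index[index_label][column_name]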

									data_index=data.to_dict(orient='index')

									data_index

									Out[43]: 

									{12: {'age': 31.19418104265403,

									 'embarked': 'Cherbourg',

									 'home.dest': 'Paris, France',

									 'pclass': '1st',

									 'sex': 'female'},

									 437: {'age': 48.0,

									 'embarked': 'Southampton',

									 'home.dest': 'Somerset / Bernardsville, NJ',

									 'pclass': '2nd',

									 'sex': 'female'},

									 507: {'age': 31.19418104265403,

									 'embarked': 'Southampton',

									 'home.dest': 'Petworth, Sussex',

									 'pclass': '2nd',

									 'sex': 'male'},

									 562: {'age': 41.0,

									 'embarked': 'Cherbourg',

									 'home.dest': 'New York, NY',

									 'pclass': '2nd',

									 'sex': 'male'},

									 663: {'age': 26.0,

									 'embarked': 'Southampton',

									 'home.dest': 'UNKNOWN',

									 'pclass': '3rd',

									 'sex': 'male'},

									 669: {'age': 19.0,

									 'embarked': 'Southampton',

									 'home.dest': 'England',

									 'pclass': '3rd',

									 'sex': 'male'},

									 833: {'age': 32.0,

									 'embarked': 'Southampton',

									 'home.dest': 'Foresvik, Norway Portland, ND',

									 'pclass': '3rd',

									 'sex': 'male'},

									 1036: {'age': 31.19418104265403,

									 'embarked': 'UNKNOWN',

									 'home.dest': 'UNKNOWN',

									 'pclass': '3rd',

									 'sex': 'male'},

									 1086: {'age': 31.19418104265403,

									 'embarked': 'UNKNOWN',

									 'home.dest': 'UNKNOWN',

									 'pclass': '3rd',

									 'sex': 'male'},

									 1108: {'age': 31.19418104265403,

									 'embarked': 'UNKNOWN',

									 'home.dest': 'UNKNOWN',

									 'pclass': '3rd',

									 'sex': 'male'}}
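The relationship between the 'index' and 'dict' layouts can be checked directly: swapping the two keys should give the same value. This is a minimal sketch on a hand-typed three-row subset of the sample; keys_are_swapped is an illustrative name.

```python
import pandas as pd

# Three rows retyped from the sample above (ages rounded as displayed).
data = pd.DataFrame(
    {
        "pclass": ["3rd", "1st", "3rd"],
        "age": [31.194181, 31.194181, 32.0],
        "embarked": ["UNKNOWN", "Cherbourg", "Southampton"],
    },
    index=[1086, 12, 833],
)

data_index = data.to_dict(orient="index")
data_dict = data.to_dict(orient="dict")

# Outer key: index label. Inner key: column name.
row_12 = data_index[12]
print(row_12)
print(data_index[833]["embarked"])

# Every value is reachable in both layouts with the keys swapped.
keys_are_swapped = all(
    data_index[label][column] == data_dict[column][label]
    for column in data.columns
    for label in data.index
)
print(keys_are_swapped)
```

The 'index' layout is convenient when whole rows are looked up by label, while 'dict' suits column-first access.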