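Introduction
DataFrame is a pandas data type; ndarray is a NumPy data type; list and dict are built-in Python data types.
Series is also a pandas data type. A Series is a fixed-length, ordered dictionary, because it maps index labels to values.
The examples below make these data representations clearer.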
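To make the "ordered dictionary" description concrete, here is a minimal sketch. The sample values and variable names are made up for illustration and are not part of the original examples.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
ser = pd.Series([1.5, 1.7, 3.6], index=['one', 'two', 'three'])

# Dict-like access: look up a value by its index label
value_two = ser['two']

# Dict-like membership test checks the index labels, not the values
has_two = 'two' in ser
has_value = 1.7 in ser

# Order is preserved, so positional access also works
first_value = ser.iloc[0]

print(value_two, has_two, has_value, first_value)
```

Note that `in` tests the index labels, just as `in` on a dict tests the keys.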
1. list to others
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
# list
data = [[2000, 'Ohino', 1.5], [2001, 'Ohino', 1.7], [2002, 'Ohino', 3.6], [2001, 'Nevada', 2.4], [2002, 'Nevada', 2.9]]  # type(data) is list
# list to Series
ser = Series(data, index=['one', 'two', 'three', 'four', 'five'])
# list to DataFrame
df = DataFrame(data, index=['one', 'two', 'three', 'four', 'five'], columns=['year', 'state', 'pop'])
# list to ndarray
ndarray = np.array(data)
Output:
# Series
one       [2000, Ohino, 1.5]
two       [2001, Ohino, 1.7]
three     [2002, Ohino, 3.6]
four     [2001, Nevada, 2.4]
five     [2002, Nevada, 2.9]
dtype: object
# DataFrame
       year   state  pop
one    2000   Ohino  1.5
two    2001   Ohino  1.7
three  2002   Ohino  3.6
four   2001  Nevada  2.4
five   2002  Nevada  2.9
# ndarray
[['2000' 'Ohino' '1.5']
 ['2001' 'Ohino' '1.7']
 ['2002' 'Ohino' '3.6']
 ['2001' 'Nevada' '2.4']
 ['2002' 'Nevada' '2.9']]
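The ndarray output above contains only strings, because NumPy picks a single common dtype for mixed int, str, and float values. The sketch below shows that behavior and one possible way around it, passing `dtype=object`. The shortened two-row data and the names `as_str` and `as_obj` are illustrative assumptions.

```python
import numpy as np

# Same kind of data as above: int, str and float mixed in each row
data = [[2000, 'Ohino', 1.5], [2001, 'Nevada', 2.4]]

# Default conversion: NumPy picks one common dtype, here a Unicode string type
as_str = np.array(data)

# dtype=object keeps the original Python objects (int, str, float)
as_obj = np.array(data, dtype=object)

print(as_str.dtype.kind)    # 'U' means Unicode string
print(type(as_obj[0, 0]))   # <class 'int'>
print(type(as_obj[0, 2]))   # <class 'float'>
```

An object array gives up most of NumPy's speed advantages, so for mixed columns a DataFrame is usually the more natural container.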
2. ndarray to others
# ndarray to DataFrame
# (the result is named df, not pd, so the pandas module alias is not overwritten)
df = DataFrame(ndarray, index=['one', 'two', 'three', 'four', 'five'],
               columns=['year', 'state', 'pop'])
# ndarray to list
mylist = ndarray.tolist()
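Because the array built from the mixed list holds strings, a DataFrame created from it holds strings in every column as well. A minimal sketch of casting the numeric columns back follows. The two-row data and the names `arr` and `nums` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# A string-typed array like the one produced from the mixed list above
arr = np.array([[2000, 'Ohino', 1.5], [2001, 'Nevada', 2.4]])

df = pd.DataFrame(arr, columns=['year', 'state', 'pop'])

# Every column holds strings at this point; cast the numeric ones explicitly
df = df.astype({'year': int, 'pop': float})

# tolist() on a numeric array returns nested lists of plain Python numbers
nums = np.array([[1, 2], [3, 4]]).tolist()

print(df.dtypes)
print(nums)
```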
3. dict to others
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
# dict
data = {'name': ['Li', 'Zhang', 'Wang'], 'year': [2000, 2001, 2002]}  # type(data) is dict
# dict to Series
# If index is not specified, the keys of data become the index of the Series
ser = Series(data)
print('ser\n', ser)
# dict to DataFrame
# If columns is not specified, the keys of data become the columns of the DataFrame
df = DataFrame(data)
print('df\n', df)
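The sketch below contrasts a few dict shapes. The `scores` values and the `r1`/`r2` row labels are hypothetical. `DataFrame.from_dict` with `orient='index'` is shown as one option when the keys should become row labels rather than columns.

```python
import pandas as pd

# dict of scalars: each key becomes an index label of the Series
scores = {'Li': 90, 'Zhang': 85, 'Wang': 77}  # hypothetical values
ser = pd.Series(scores)

# dict of lists: each key becomes a column of the DataFrame
data = {'name': ['Li', 'Zhang', 'Wang'], 'year': [2000, 2001, 2002]}
df = pd.DataFrame(data)

# orient='index' treats each key as a row label instead
by_row = pd.DataFrame.from_dict({'r1': [1, 2], 'r2': [3, 4]},
                                orient='index', columns=['a', 'b'])

print(ser)
print(df)
print(by_row)
```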
4. Series to others
Selecting a single column from a DataFrame yields a Series.
# Series to ndarray
# requires pandas 0.24 or later
arr = ser.to_numpy()
# or
arr = np.array(ser)
# Series to dict
dt = ser.to_dict()
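As a small self-contained sketch of the point above, the snippet below takes one column of a DataFrame and converts the resulting Series in several directions. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Li', 'Zhang', 'Wang'],
                   'year': [2000, 2001, 2002]})

# Selecting one column returns a Series
col = df['year']

arr = col.to_numpy()    # Series to ndarray (pandas 0.24+)
lst = col.tolist()      # Series to list
dt = col.to_dict()      # Series to dict, keyed by index label
frame = col.to_frame()  # Series back to a one-column DataFrame

print(type(col).__name__, arr, lst, dt, list(frame.columns))
```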
5. DataFrame to others
# DataFrame
data = DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]), columns=['a', 'b', 'c'])
print(data)
# DataFrame to ndarray
arr = data.values
print('arr\n', arr)
print(type(arr))
# DataFrame to dict (named dt so the built-in dict is not shadowed)
dt = data.to_dict()
print(dt)
DataFrame.to_dict(self, orient='dict', into=<class 'dict'>)
The orient parameter also allows other output shapes, such as dicts of lists or of Series:
orient : str {'dict', 'list', 'series', 'split', 'records', 'index'}
    Determines the type of the values of the dictionary.
    'dict' (default) : dict like {column -> {index -> value}}
    'list' : dict like {column -> [values]}
    'series' : dict like {column -> Series(values)}
    'split' : dict like {'index' -> [index], 'columns' -> [columns], 'data' -> [values]}
    'records' : list like [{column -> value}, … , {column -> value}]
    'index' : dict like {index -> {column -> value}}
    Abbreviations are allowed. s indicates series and sp indicates split.
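The orient options listed above are easier to compare on a tiny frame. This is a minimal sketch with made-up data. It spells out the full orient names, and the last line shows one common way to get a plain list of rows.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 4], 'b': [2, 5]}, index=['x', 'y'])

as_dict = df.to_dict()                     # {'a': {'x': 1, 'y': 4}, 'b': {'x': 2, 'y': 5}}
as_list = df.to_dict(orient='list')        # {'a': [1, 4], 'b': [2, 5]}
as_records = df.to_dict(orient='records')  # [{'a': 1, 'b': 2}, {'a': 4, 'b': 5}]
as_index = df.to_dict(orient='index')      # {'x': {'a': 1, 'b': 2}, 'y': {'a': 4, 'b': 5}}
as_split = df.to_dict(orient='split')      # keys: 'index', 'columns', 'data'

# A plain list of rows, without going through to_dict
rows = df.values.tolist()                  # [[1, 2], [4, 5]]

for name, value in [('dict', as_dict), ('list', as_list),
                    ('records', as_records), ('index', as_index),
                    ('split', as_split), ('rows', rows)]:
    print(name, value)
```

'records' is often convenient for JSON-style output, while 'list' keeps the data column-oriented.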