Using a function to get the mean/median/mode/quartiles/quantiles of each column (Python)

I'm new to Jupyter Notebook and would like to know how to get the quantiles of a column inside a function:

DataFrame:

num_likes | num_post | ... | 
464.0     | 142.0    | ... |
364.0     | 125.0    | ... |
487.0     | 106.0    | ... |
258.0     | 123.0    | ... |
125.0     | 103.0    | ... |

My function:

def myFunction(x):
    q22 = dataframe["num_likes"].quantile(0.22)
    q45 = dataframe["num_likes"].quantile(0.45)
    qc = q45 - q22
    k = 3

    if x >= q45 + k * qc:
        return q45 + k * qc
    elif x <= q22 - k * qc:
        return q22 - k * qc

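For now, since I don't know how to do this properly, I ended up running the function separately for every column I have. I also tried running it like this, but it doesn't seem to work: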

data["num_likes"].apply(lambda x : myFunction(x))[:5]

The result also looks wrong, because I don't see any values returned:

    num_likes | num_post | ... | 
    NaN       | None     | ... |
    NaN       | None     | ... |
    NaN       | None     | ... |
    NaN       | None     | ... |
    NaN       | None     | ... |

You get None because neither branch of your if/elif block evaluates to true for those values, so myFunction falls through and returns None. Did you mean to use if/else?
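To make the fall-through concrete, here is a minimal sketch. The function name `cap` and its `lower`/`upper` parameters are hypothetical, chosen only for illustration; they are not part of the question's code.

```python
def cap(x, lower, upper):
    # Same shape as the question's function: two branches, no else.
    if x >= upper:
        return upper
    elif x <= lower:
        return lower
    # No branch matched: Python implicitly returns None here.

print(cap(15, 0, 10))  # 10 (upper branch)
print(cap(-3, 0, 10))  # 0 (lower branch)
print(cap(5, 0, 10))   # None (fell through both branches)
```

Any value strictly between the two bounds takes the third path, which is probably what happens for every row in the question's data.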

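Beyond that, I would do a few things differently to clean up what you have. First, q22, q45, and qc only need to be computed once (based on the logic above); they can be passed into the function instead of being recomputed on every call. Second, you don't need a lambda here: apply() accepts a Python callable (your function), and additional arguments can be passed as shown below.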

import pandas as pd

df = pd.DataFrame({
    'num_likes': [464.0, 364.0, 487.0, 258.0, 125.0],
    'num_post': [142.0, 125.0, 106.0, 123.0, 103.0]
})

def myFunction(x, q22, q45, qc):
    k = 3

    if x >= q45 + k * qc:
        return q45 + k * qc
    elif x <= q22 - k * qc:
        return q22 - k * qc
    else:
        return -1

q22 = df["num_likes"].quantile(0.22)
q45 = df["num_likes"].quantile(0.45)
qc = q45 - q22

# pass additional arguments in a tuple; they will be forwarded to myFunction
df.num_likes.apply(myFunction, args=(q22, q45, qc))

# this returns a Series, which can be assigned to a new column
# 0   -1
# 1   -1
# 2   -1
# 3   -1
# 4   -1
# Name: num_likes, dtype: int64
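
To avoid repeating this for every column, the bounds can be computed for all columns at once. The following is only a sketch under one assumption: that the goal is to cap out-of-range values at the bounds and leave in-range values unchanged (rather than returning -1 for them). The helper name `cap_outliers` and its parameters are hypothetical; `DataFrame.quantile` and `DataFrame.clip` are standard pandas methods.

```python
import pandas as pd

df = pd.DataFrame({
    'num_likes': [464.0, 364.0, 487.0, 258.0, 125.0],
    'num_post': [142.0, 125.0, 106.0, 123.0, 103.0],
})

def cap_outliers(frame, low=0.22, high=0.45, k=3):
    # quantile() on a DataFrame returns one value per column (a Series)
    q_low = frame.quantile(low)
    q_high = frame.quantile(high)
    spread = q_high - q_low

    # per-column bounds, same formula as in the question
    lower = q_low - k * spread
    upper = q_high + k * spread

    # axis=1 aligns the bound Series with the column labels
    return frame.clip(lower=lower, upper=upper, axis=1)

capped = cap_outliers(df)
```

With the sample data above, every value already lies inside its column's bounds, so `capped` holds the same values as `df`; only values beyond a bound would be replaced by that bound.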
