python - Pandas groupby aggregation with percentages - OStack Q&A-Knowledge Sharing Community

I have the following dataframe:

import pandas as pd
import numpy as np
np.random.seed(123)
n = 10
df = pd.DataFrame({"val": np.random.randint(1, 10, n), 
                   "cat": np.random.choice(["X", "Y", "Z"], n)})

   val cat
0    3   Z
1    3   X
2    7   Y
3    2   Z
4    4   Y
5    7   X
6    2   X
7    1   X
8    2   X
9    1   Y

I want to know what percentage of the total val column sum each category (X, Y, and Z) accounts for. I can aggregate df like this:

total_sum = df.val.sum()  # 32
s = df.groupby("cat").val.sum().div(total_sum) * 100

#this is the desired result in % of total val
cat
X    46.875  #15/32
Y    37.500  #12/32
Z    15.625  #5/32
Name: val, dtype: float64

However, I find it rather surprising that pandas does not seem to have a percentage/frequency aggregation, something like df.groupby("cat").val.freq(), that could be used in place of df.groupby("cat").val.sum() or df.groupby("cat").val.mean(). I assumed this is a common operation, and Series.value_counts implements it with normalize=True, but I cannot find anything similar for groupby aggregation. Am I missing something here, or is there indeed no out-of-the-box function?
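
For reference, here is a minimal sketch of the computation wrapped so the total does not have to be calculated separately. The frame is rebuilt from the printed table rather than from the random seed, so the numbers are deterministic. The helper name group_share is hypothetical and not part of pandas; it only combines groupby(...).sum() with a division by the sum of the grouped result. The last line shows what value_counts(normalize=True) returns for comparison: shares of row counts per category, not shares of the val sum.

```python
import pandas as pd

# Rebuild the frame from the table shown in the question (no random seed needed).
df = pd.DataFrame({
    "val": [3, 3, 7, 2, 4, 7, 2, 1, 2, 1],
    "cat": ["Z", "X", "Y", "Z", "Y", "X", "X", "X", "X", "Y"],
})


def group_share(frame, by, col):
    """Hypothetical helper (not a pandas API): percent of col's total per group."""
    sums = frame.groupby(by)[col].sum()
    # Dividing by the sum of the group sums equals dividing by the column total.
    return sums.div(sums.sum()).mul(100)


pct = group_share(df, "cat", "val")
# X 46.875, Y 37.5, Z 15.625 (same as the desired result in the question)

# The same idea as a single chain, using Series.pipe instead of a helper.
pct_chain = df.groupby("cat")["val"].sum().pipe(lambda s: s / s.sum() * 100)

# For comparison only: share of ROWS per category, not share of the val sum.
count_share = df["cat"].value_counts(normalize=True).mul(100)
```

Whether the helper or the pipe chain reads better is a matter of taste; both avoid holding total_sum in a separate variable.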
