Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Assigning a value to a DataFrame column based on the value of a random variable

I have a DataFrame like this:

    df = pd.DataFrame(columns=['count', 'color'])

For each row that has count > 0, I want to assign 'red' to color if

    np.random.binomial(1,prob)==1

I know how to do it with a for loop. I also know that, without this condition, I could assign 'red' without a for loop, like this:

    df.loc[df['count']>0, ['color']]='red'

Is it possible to have both the filter on count and the condition on prob without the for loop?



1 Answer

You can do this:

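    prob = 0.2                                      # probability of turning a row red
    rows = np.where(d['counts'] > 0)[0]             # positions of rows with counts > 0
    draws = np.random.binomial(1, prob, len(rows))  # one Bernoulli draw per selected row

    # assign 'red' where the draw is 1, 'blue' otherwise
    # (positions only match labels because d has the default RangeIndex)
    d.loc[rows, 'color'] = ['red' if i == 1 else 'blue' for i in draws]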

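I am assuming data like that generated here (note that the column is named counts, not count as in the question):

    import numpy as np
    import pandas as pd

    a = np.random.randint(0, 10, 100)            # 100 random counts in [0, 9]
    b = ['blue'] * 100                           # every row starts out blue
    d = pd.DataFrame({'counts': a, 'color': b})  # built from a dict so counts stays an integer column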
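
As a possible alternative, the filter on count and the random condition can be combined into a single boolean mask, which leaves non-matching rows untouched and does not depend on the index being 0..n-1. This is only a sketch: the helper name assign_red, its rng parameter, and the five-row example frame are made up for illustration, and the column names follow the question (count, color).

```python
import numpy as np
import pandas as pd

def assign_red(df, prob, rng=None):
    """Return a copy of df where rows with count > 0 turn 'red' with probability prob."""
    rng = np.random.default_rng() if rng is None else rng
    eligible = (df['count'] > 0).to_numpy()            # filter on count
    draws = rng.binomial(1, prob, size=len(df)) == 1   # one Bernoulli draw per row
    out = df.copy()
    out.loc[eligible & draws, 'color'] = 'red'         # both conditions must hold
    return out

# Hypothetical example data with a non-default index
df = pd.DataFrame({'count': [0, 3, 5, 0, 2], 'color': ['blue'] * 5},
                  index=list('abcde'))
result = assign_red(df, prob=0.5, rng=np.random.default_rng(0))
```

Drawing one value per row (rather than one per eligible row) wastes a few draws but keeps the two conditions aligned without any index bookkeeping. Rows with count equal to 0 always keep their original color.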
