   patient_id test_result  has_cancer
0       79452    Negative       False
1       81667    Positive        True
2       76297    Negative       False
3       36593    Negative       False
4       53717    Negative       False
5       67134    Negative       False
6       40436    Negative       False
How do I count the False or True values in a column in Python?
I had been trying:
# number of patients with cancer
number_of_patients_with_cancer = (df["has_cancer"] == True).count()
print(number_of_patients_with_cancer)
So you need value_counts:
df.has_cancer.value_counts()
Out:
False    6
True     1
Name: has_cancer, dtype: int64
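For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the frame from the question and compares the original attempt with value_counts. The values are copied from the post; the names attempt and counts are only illustrative. As far as I can tell, the attempt goes wrong because count() counts the non-null entries of the comparison mask rather than the True ones.

```python
import pandas as pd

# Rebuild the frame from the question (values copied from the post).
df = pd.DataFrame({
    "patient_id": [79452, 81667, 76297, 36593, 53717, 67134, 40436],
    "test_result": ["Negative", "Positive", "Negative", "Negative",
                    "Negative", "Negative", "Negative"],
    "has_cancer": [False, True, False, False, False, False, False],
})

# count() counts every non-null entry of the mask, True or False,
# so this is simply the number of rows.
attempt = (df["has_cancer"] == True).count()

# value_counts() tallies each distinct value separately.
counts = df["has_cancer"].value_counts()

print(attempt)           # 7
print(counts.to_dict())  # {False: 6, True: 1}
```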
If has_cancer has NaNs:
false_count = (~df.has_cancer).sum()
If has_cancer does not have NaNs, you can skip the negation and subtract the True count from the number of rows:
false_count = len(df) - df.has_cancer.sum()
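To make the NaN caveat concrete, here is a small sketch on a made-up column with one missing entry (the Series and variable names are hypothetical, not from the question). It assumes the column ends up as object dtype, which is what pandas normally does when booleans are mixed with NaN; in that case ~ may not act as a plain logical negation, so the sketch counts with explicit comparisons instead.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing entry (object dtype).
has_cancer = pd.Series([True, False, np.nan, False])

# Comparisons against NaN evaluate to False, so missing rows
# are left out of both counts.
true_count = (has_cancer == True).sum()
false_count = (has_cancer == False).sum()
missing_count = has_cancer.isna().sum()

# Subtracting from the length lumps the missing row in with the Falses.
length_minus_true = len(has_cancer) - true_count

print(true_count, false_count, missing_count)  # 1 2 1
print(length_minus_true)                       # 3
```

So the length-based shortcut is only safe once you know there are no missing values.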
And similarly, if you want just the count of True values, that is
true_count = df.has_cancer.sum()
If you want both, it is
fc, tc = df.has_cancer.value_counts().sort_index().tolist()
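One thing to watch with the two-value unpacking: if the column happens to contain only one of the two values, value_counts returns a single entry and the unpacking fails. A possible workaround, sketched below, is to reindex onto both labels first. The helper name false_true_counts is made up for this example.

```python
import pandas as pd

def false_true_counts(column):
    """Return (false_count, true_count), using 0 for a value that never occurs."""
    counts = column.value_counts().reindex([False, True], fill_value=0)
    fc, tc = counts.tolist()
    return fc, tc

mixed = pd.Series([False, True, False, False, False, False, False])
all_false = pd.Series([False, False, False])

print(false_true_counts(mixed))      # (6, 1)
print(false_true_counts(all_false))  # (3, 0)
```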
0     True
1    False
2    False
3    False
4    False
5    False
6    False
7    False
8    False
9    False
If the pandas Series above is called example, then
example.sum()
outputs 1, since there is only one True value in the series. To get the count of False values:
len(example) - example.sum()
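Put together as a runnable sketch, with the ten-row Series built by hand to match the one shown above (true_count and false_count are just illustrative names):

```python
import pandas as pd

# One True followed by nine False, as in the Series above.
example = pd.Series([True] + [False] * 9)

true_count = example.sum()               # each True adds 1, each False adds 0
false_count = len(example) - true_count  # valid because there are no NaNs

print(true_count, false_count)  # 1 9
```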
number_of_patients_with_cancer = df.has_cancer[df.has_cancer==True].count()
Just sum the column for a count of the Trues. False is just a special case of 0 and True a special case of 1. The False count would be your row count minus that, unless you've got NaNs in there.
Treat the data frame above as df:
True_Count = df[df.has_cancer == True]
len(True_Count)