These are the dtypes of my dataset:
df.dtypes
customer_id int64
device_id object
...
email object
email_counts object
...
white_collar_count object
dtype: object
I am converting everything to numeric:
df = df.convert_objects(convert_numeric=True)
After that, df.dtypes is:
customer_id int64
device_id float64
...
email float64
email_counts float64
...
white_collar_count float64
dtype: object
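In current pandas, a comparable convert-everything step can be sketched with pd.to_numeric and errors='coerce'. This is only an illustration: the frame below is a hypothetical stand-in (the real values are not shown in the question), and it is an assumption that convert_objects treated the data the same way. It shows how a text column such as email can end up as float64: every value that cannot be parsed becomes NaN.

```python
import pandas as pd

# Hypothetical stand-in for the question's data; the real values are not shown.
df = pd.DataFrame({'customer_id': [1, 2, 3],
                   'email': ['a@example.com', 'b@example.com', 'c@example.com'],
                   'email_counts': ['2', '5', '1']})

# errors='coerce' replaces every value that cannot be parsed with NaN,
# so a column of plain text turns into an all-NaN float column.
converted = df.apply(pd.to_numeric, errors='coerce')
print(converted.dtypes)
print(converted['email'].isna().all())
```

The original email strings are lost in this kind of blanket conversion, which is why such columns need to be excluded.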
I want to make an exception for email and device_id, so that df.dtypes is:
customer_id int64
device_id object
...
email object
email_counts float64
...
white_collar_count float64
dtype: object
Use columns.difference to exclude the columns in the list from the conversion:
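import pandas as pd

# Toy frame; astype(str) stores every column as strings, like the question's data.
feature_exist = pd.DataFrame({'A': list('abcdef'),
                              'B': [4, 5, 4, 5, 5, 4],
                              'C': [7, 8, 9, 4, 2, 3],
                              'D': [1, 3, 5, 7, 1, 0],
                              'email': [5, 3, 6, 9, 2, 4],
                              'F': list('aaabbb')}).astype(str)
print(feature_exist)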
A B C D email F
0 a 4 7 1 5 a
1 b 5 8 3 3 a
2 c 4 9 5 6 a
3 d 5 4 7 9 b
4 e 5 2 1 2 b
5 f 4 3 0 4 b
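# All column labels except 'email'; add 'device_id' to the list to skip it as well.
cols = feature_exist.columns.difference(['email'])
# Note: convert_objects is deprecated and absent from pandas 1.0 and newer.
feature_exist[cols] = feature_exist[cols].convert_objects(convert_numeric=True)
print(feature_exist.dtypes)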
A object
B int64
C int64
D int64
email object
F object
dtype: object
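Since convert_objects is not available in recent pandas versions, the same idea can be sketched with pd.to_numeric. This is a minimal sketch, not the only way to do it: to_numeric_if_possible and keep_as_is are hypothetical names introduced here, and the helper assumes that a column should stay untouched whenever any of its values fails to parse.

```python
import pandas as pd

# Same toy frame as above: every column starts out as strings.
feature_exist = pd.DataFrame({'A': list('abcdef'),
                              'B': [4, 5, 4, 5, 5, 4],
                              'C': [7, 8, 9, 4, 2, 3],
                              'D': [1, 3, 5, 7, 1, 0],
                              'email': [5, 3, 6, 9, 2, 4],
                              'F': list('aaabbb')}).astype(str)


def to_numeric_if_possible(col):
    """Return col parsed as numbers, or col unchanged if any value fails to parse."""
    try:
        return pd.to_numeric(col)
    except (ValueError, TypeError):
        return col


# Columns to leave untouched; for the question's data this would be
# ['email', 'device_id'].
keep_as_is = ['email']
cols = feature_exist.columns.difference(keep_as_is)

# Convert one column at a time so each column gets its own new dtype.
for col in cols:
    feature_exist[col] = to_numeric_if_possible(feature_exist[col])

print(feature_exist.dtypes)
```

With this sketch, B, C and D should come out as integer columns, while A, F and the excluded email column keep their string values.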