
python - How do I drop rows where every column except the date column is NaN?

Reposted. Author: 行者123. Updated: 2023-12-04 08:54:14

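I am trying to remove NaN values from my csv file, but I only want to remove the rows in which all columns are empty. A picture of the rows I want to delete is attached below.
File link: https://filebin.net/ou93iqiinss02l0g
[image: screenshot of the rows to delete]
Basically, if columns B, C, D, E, F, G and H are all NaN, I want to delete the entire row.
I tried the following code, but it deletes everything: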

import pandas as pd

df = pd.read_csv("testing.csv")
df = df.dropna(thresh = 7)
The final result should look like this:
[image: screenshot of the expected result]
Data
,Open,High,Low,Close,Adj Close,Volume,Singapore
2015-10-01,2795.399902,3104.719971,2765.439941,2998.350098,2998.350098,0.0,
2015-11-01,2976.719971,3043.850098,2843.949951,2855.939941,2855.939941,0.0,
2015-12-01,2862.790039,2911.439941,2793.389893,2882.72998,2882.72998,0.0,
2016-01-01,2889.22998,2890.209961,2529.01001,2629.110107,2629.110107,0.0,
2016-02-01,2637.050049,2684.790039,2528.439941,2666.51001,2666.51001,0.0,
2016-03-01,2666.709961,2906.800049,2654.97998,2840.899902,2840.899902,0.0,
2016-04-01,2820.659912,2964.100098,2783.419922,2838.52002,2838.52002,158708700.0,
2016-05-01,2842.860107,2848.899902,2713.469971,2791.060059,2791.060059,0.0,
2016-06-01,2787.98999,2881.919922,2703.47998,2840.929932,2840.929932,0.0,
2016-07-01,2848.449951,2958.899902,2830.0,2868.689941,2868.689941,0.0,
2016-08-01,2875.590088,2898.27002,2810.8798829999996,2820.590088,2820.590088,0.0,
2016-09-01,2821.929932,2911.840088,2791.3798829999996,2869.469971,2869.469971,0.0,
2016-10-01,2879.850098,2901.72998,2783.330078,2813.8701170000004,2813.8701170000004,0.0,
2016-11-01,2814.080078,2915.419922,2760.969971,2905.169922,2905.169922,0.0,
2016-12-01,2913.649902,2980.77002,2857.909912,2880.76001,2880.76001,0.0,
2017-01-01,2887.0,3065.1298829999996,2869.659912,3046.800049,3046.800049,0.0,
2017-02-01,3045.939941,3138.969971,3030.649902,3096.610107,3096.610107,4018227800.0,
2017-03-01,3106.300049,3188.02002,3104.330078,3175.110107,3175.110107,5462555700.0,
2017-04-01,3180.27002,3189.810059,3113.899902,3175.439941,3175.439941,4292226700.0,
2017-05-01,3183.429932,3275.389893,3183.409912,3210.820068,3210.820068,5080433500.0,
2017-06-01,3214.1201170000004,3270.919922,3196.48999,3226.47998,3226.47998,4414015100.0,
2017-07-01,3228.909912,3354.709961,3196.139893,3329.52002,3329.52002,5085548600.0,
2017-08-01,3321.5,3349.090088,3244.22998,3277.26001,3277.26001,4856835500.0,
2017-09-01,3274.389893,3275.139893,3193.409912,3219.909912,3219.909912,3840282400.0,
2017-10-01,3233.949951,3392.149902,3230.810059,3374.080078,3374.080078,4261116400.0,
2017-11-01,3377.1899409999996,3449.320068,3341.300049,3433.540039,3433.540039,4789747800.0,
2017-12-01,3441.850098,3469.360107,3370.219971,3402.919922,3402.919922,3386126700.0,
2018-01-01,3406.4799799999996,3611.6899409999996,3403.8701170000004,3533.98999,3533.98999,4727173600.0,
2018-02-01,3536.929932,3574.5900880000004,3340.550049,3517.9399409999996,3517.9399409999996,6143735500.0,
2018-03-01,3493.4399409999996,3555.9799799999996,3382.780029,3427.969971,3427.969971,4963081900.0,
2018-04-01,3439.040039,3628.429932,3338.959961,3613.929932,3613.929932,4599803900.0,
2018-05-01,3624.1999509999996,3641.649902,3428.179932,3428.179932,3428.179932,5918362800.0,
2018-06-01,3423.5,3492.3400880000004,3237.77002,3268.699951,3268.699951,5500961400.0,
2018-07-01,3277.429932,3341.419922,3176.26001,3319.850098,3319.850098,5029346600.0,
2018-08-01,3331.050049,3347.97998,3187.830078,3213.47998,3213.47998,5005791600.0,
2018-09-01,3209.969971,3265.01001,3102.72998,3257.050049,3257.050049,4158150600.0,
2018-10-01,3262.429932,3272.8798829999996,2955.679932,3018.800049,3018.800049,5516696000.0,
2018-11-01,3045.679932,3132.419922,3007.310059,3117.610107,3117.610107,4457632700.0,
2018-12-01,3154.219971,3192.8798829999996,3000.449951,3068.76001,3068.76001,3627597800.0,
2019-01-01,3072.98999,3250.27002,2993.419922,3190.169922,3190.169922,4467841200.0,
2019-02-01,3194.219971,3286.080078,3174.0,3212.689941,3212.689941,3786000800.0,
2019-03-01,3210.840088,3251.719971,3156.790039,3212.8798829999996,3212.8798829999996,4128594600.0,
2019-04-01,3229.110107,3415.179932,3227.6201170000004,3400.1999509999996,3400.1999509999996,4447727600.0,
2019-05-01,3389.5200200000004,3397.179932,3110.51001,3117.76001,3117.76001,4319537800.0,
2019-06-01,3111.51001,3336.080078,3104.030029,3321.610107,3321.610107,4160448600.0,
2019-07-01,3339.580078,3386.649902,3299.889893,3300.75,3300.75,4489792100.0,
2019-08-01,3282.790039,3311.26001,3040.159912,3106.52002,3106.52002,5146051500.0,
2019-09-01,3092.25,3216.8701170000004,3074.040039,3119.98999,3119.98999,4116898900.0,
2019-10-01,3130.110107,3235.23999,3068.830078,3229.8798829999996,3229.8798829999996,4402690200.0,
2019-11-01,3227.600098,3285.719971,3182.050049,3193.919922,3193.919922,7055882400.0,
2019-12-01,3198.27002,3239.23999,3144.070068,3222.830078,3222.830078,4536740600.0,
2020-01-01,3230.47998,3283.889893,3144.100098,3153.72998,3153.72998,4951167700.0,
2020-02-01,3131.02002,3233.860107,3008.459961,3011.080078,3011.080078,5320489700.0,
2020-02-21,,,,,,,24.0
2020-02-25,,,,,,,
2020-02-28,,,,,,,22.0
2020-03-01,2988.350098,3047.790039,2208.419922,2481.22998,2481.22998,7767702900.0,
2020-03-02,,,,,,,
2020-03-03,,,,,,,
2020-03-06,,,,,,,23.0
2020-03-10,,,,,,,
2020-03-13,,,,,,,21.0
2020-03-17,,,,,,,
2020-03-20,,,,,,,24.0
2020-03-23,,,,,,,
2020-03-24,,,,,,,
2020-03-27,,,,,,,27.0
2020-03-30,,,,,,,
2020-03-31,,,,,,,
2020-04-01,2468.169922,2671.580078,2380.840088,2624.22998,2624.22998,7238328000.0,
2020-04-03,,,,,,,37.0
2020-04-06,,,,,,,
2020-04-07,,,,,,,
2020-04-10,,,,,,,73.0
2020-04-13,,,,,,,
2020-04-14,,,,,,,
2020-04-17,,,,,,,85.0
2020-04-20,,,,,,,
2020-04-21,,,,,,,
2020-04-24,,,,,,,90.0
2020-04-27,,,,,,,
2020-04-28,,,,,,,
2020-05-01,2555.669922,2611.73999,2489.939941,2510.75,2510.75,7367276100.0,90.0
2020-05-05,,,,,,,
2020-05-15,,,,,,,
2020-05-21,,,,,,,
2020-05-22,,,,,,,92.0
2020-05-25,,,,,,,
2020-05-26,,,,,,,
2020-05-30,,,,,,,
2020-06-01,2519.419922,2839.389893,2516.459961,2589.909912,2589.909912,8396435700.0,
2020-06-05,,,,,,,89.0
2020-06-08,,,,,,,
2020-06-15,,,,,,,
2020-06-16,,,,,,,
2020-06-19,,,,,,,92.0
2020-06-22,,,,,,,
2020-06-25,,,,,,,
2020-07-01,2604.080078,2707.669922,2511.02002,2529.820068,2529.820068,4876221500.0,
2020-07-03,,,,,,,
2020-07-06,,,,,,,
2020-07-07,,,,,,,90.0
2020-07-12,,,,,,,
2020-07-14,,,,,,,
2020-07-20,,,,,,,92.0
2020-07-26,,,,,,,
2020-07-27,,,,,,,
2020-07-31,,,,,,,
2020-08-01,2522.530029,2602.330078,2478.389893,2532.51001,2532.51001,6347053700.0,
2020-08-03,,,,,,,88.0
2020-08-07,,,,,,,
2020-08-10,,,,,,,
2020-08-12,,,,,,,
2020-08-14,,,,,,,90.0
2020-08-17,,,,,,,
2020-08-25,,,,,,,
2020-08-28,,,,,,,90.0
2020-08-31,,,,,,,
2020-09-01,2521.810059,2546.8701170000004,2476.820068,2490.090088,2490.090088,2000718800.0,
2020-09-11,2481.080078,2492.419922,2476.820068,2490.090088,2490.090088,0.0,

Best answer

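  • Use pandas.read_csv with parse_dates and index_col both pointing at the unnamed date column at position 0.
  • Use .dropna with how='all', which drops any row that is entirely NaN. The index is not considered, which is why the date column is set as the index.
  • The dates technically do not have to be parsed as datetime, but this is financial data, so it should be in a proper datetime format for time-series analysis and so that it plots correctly. The date column must be the index for .dropna to work easily in this way.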

    import pandas as pd

    # parse the unnamed first column as dates and use it as the index
    df = pd.read_csv('testing.csv', parse_dates=[0], index_col=0)

    # drop na
    df = df.dropna(how='all')

    # save file
    df.to_csv('test_updated.csv', index=True)

    # display(df)
    Open High Low Close Adj Close Volume Singapore
    2015-10-01 2795.39990 3104.71997 2765.43994 2998.35010 2998.35010 0.00000e+00 NaN
    2015-11-01 2976.71997 3043.85010 2843.94995 2855.93994 2855.93994 0.00000e+00 NaN
    2015-12-01 2862.79004 2911.43994 2793.38989 2882.72998 2882.72998 0.00000e+00 NaN
    2016-01-01 2889.22998 2890.20996 2529.01001 2629.11011 2629.11011 0.00000e+00 NaN
    2016-02-01 2637.05005 2684.79004 2528.43994 2666.51001 2666.51001 0.00000e+00 NaN
    2016-03-01 2666.70996 2906.80005 2654.97998 2840.89990 2840.89990 0.00000e+00 NaN
    2016-04-01 2820.65991 2964.10010 2783.41992 2838.52002 2838.52002 1.58709e+08 NaN
    2016-05-01 2842.86011 2848.89990 2713.46997 2791.06006 2791.06006 0.00000e+00 NaN
    2016-06-01 2787.98999 2881.91992 2703.47998 2840.92993 2840.92993 0.00000e+00 NaN
    2016-07-01 2848.44995 2958.89990 2830.00000 2868.68994 2868.68994 0.00000e+00 NaN
    2016-08-01 2875.59009 2898.27002 2810.87988 2820.59009 2820.59009 0.00000e+00 NaN
    2016-09-01 2821.92993 2911.84009 2791.37988 2869.46997 2869.46997 0.00000e+00 NaN
    2016-10-01 2879.85010 2901.72998 2783.33008 2813.87012 2813.87012 0.00000e+00 NaN
    2016-11-01 2814.08008 2915.41992 2760.96997 2905.16992 2905.16992 0.00000e+00 NaN
    2016-12-01 2913.64990 2980.77002 2857.90991 2880.76001 2880.76001 0.00000e+00 NaN
    2017-01-01 2887.00000 3065.12988 2869.65991 3046.80005 3046.80005 0.00000e+00 NaN
    2017-02-01 3045.93994 3138.96997 3030.64990 3096.61011 3096.61011 4.01823e+09 NaN
    2017-03-01 3106.30005 3188.02002 3104.33008 3175.11011 3175.11011 5.46256e+09 NaN
    2017-04-01 3180.27002 3189.81006 3113.89990 3175.43994 3175.43994 4.29223e+09 NaN
    2017-05-01 3183.42993 3275.38989 3183.40991 3210.82007 3210.82007 5.08043e+09 NaN
    2017-06-01 3214.12012 3270.91992 3196.48999 3226.47998 3226.47998 4.41402e+09 NaN
    2017-07-01 3228.90991 3354.70996 3196.13989 3329.52002 3329.52002 5.08555e+09 NaN
    2017-08-01 3321.50000 3349.09009 3244.22998 3277.26001 3277.26001 4.85684e+09 NaN
    2017-09-01 3274.38989 3275.13989 3193.40991 3219.90991 3219.90991 3.84028e+09 NaN
    2017-10-01 3233.94995 3392.14990 3230.81006 3374.08008 3374.08008 4.26112e+09 NaN
    2017-11-01 3377.18994 3449.32007 3341.30005 3433.54004 3433.54004 4.78975e+09 NaN
    2017-12-01 3441.85010 3469.36011 3370.21997 3402.91992 3402.91992 3.38613e+09 NaN
    2018-01-01 3406.47998 3611.68994 3403.87012 3533.98999 3533.98999 4.72717e+09 NaN
    2018-02-01 3536.92993 3574.59009 3340.55005 3517.93994 3517.93994 6.14374e+09 NaN
    2018-03-01 3493.43994 3555.97998 3382.78003 3427.96997 3427.96997 4.96308e+09 NaN
    2018-04-01 3439.04004 3628.42993 3338.95996 3613.92993 3613.92993 4.59980e+09 NaN
    2018-05-01 3624.19995 3641.64990 3428.17993 3428.17993 3428.17993 5.91836e+09 NaN
    2018-06-01 3423.50000 3492.34009 3237.77002 3268.69995 3268.69995 5.50096e+09 NaN
    2018-07-01 3277.42993 3341.41992 3176.26001 3319.85010 3319.85010 5.02935e+09 NaN
    2018-08-01 3331.05005 3347.97998 3187.83008 3213.47998 3213.47998 5.00579e+09 NaN
    2018-09-01 3209.96997 3265.01001 3102.72998 3257.05005 3257.05005 4.15815e+09 NaN
    2018-10-01 3262.42993 3272.87988 2955.67993 3018.80005 3018.80005 5.51670e+09 NaN
    2018-11-01 3045.67993 3132.41992 3007.31006 3117.61011 3117.61011 4.45763e+09 NaN
    2018-12-01 3154.21997 3192.87988 3000.44995 3068.76001 3068.76001 3.62760e+09 NaN
    2019-01-01 3072.98999 3250.27002 2993.41992 3190.16992 3190.16992 4.46784e+09 NaN
    2019-02-01 3194.21997 3286.08008 3174.00000 3212.68994 3212.68994 3.78600e+09 NaN
    2019-03-01 3210.84009 3251.71997 3156.79004 3212.87988 3212.87988 4.12859e+09 NaN
    2019-04-01 3229.11011 3415.17993 3227.62012 3400.19995 3400.19995 4.44773e+09 NaN
    2019-05-01 3389.52002 3397.17993 3110.51001 3117.76001 3117.76001 4.31954e+09 NaN
    2019-06-01 3111.51001 3336.08008 3104.03003 3321.61011 3321.61011 4.16045e+09 NaN
    2019-07-01 3339.58008 3386.64990 3299.88989 3300.75000 3300.75000 4.48979e+09 NaN
    2019-08-01 3282.79004 3311.26001 3040.15991 3106.52002 3106.52002 5.14605e+09 NaN
    2019-09-01 3092.25000 3216.87012 3074.04004 3119.98999 3119.98999 4.11690e+09 NaN
    2019-10-01 3130.11011 3235.23999 3068.83008 3229.87988 3229.87988 4.40269e+09 NaN
    2019-11-01 3227.60010 3285.71997 3182.05005 3193.91992 3193.91992 7.05588e+09 NaN
    2019-12-01 3198.27002 3239.23999 3144.07007 3222.83008 3222.83008 4.53674e+09 NaN
    2020-01-01 3230.47998 3283.88989 3144.10010 3153.72998 3153.72998 4.95117e+09 NaN
    2020-02-01 3131.02002 3233.86011 3008.45996 3011.08008 3011.08008 5.32049e+09 NaN
    2020-02-21 NaN NaN NaN NaN NaN NaN 24.0
    2020-02-28 NaN NaN NaN NaN NaN NaN 22.0
    2020-03-01 2988.35010 3047.79004 2208.41992 2481.22998 2481.22998 7.76770e+09 NaN
    2020-03-06 NaN NaN NaN NaN NaN NaN 23.0
    2020-03-13 NaN NaN NaN NaN NaN NaN 21.0
    2020-03-20 NaN NaN NaN NaN NaN NaN 24.0
    2020-03-27 NaN NaN NaN NaN NaN NaN 27.0
    2020-04-01 2468.16992 2671.58008 2380.84009 2624.22998 2624.22998 7.23833e+09 NaN
    2020-04-03 NaN NaN NaN NaN NaN NaN 37.0
    2020-04-10 NaN NaN NaN NaN NaN NaN 73.0
    2020-04-17 NaN NaN NaN NaN NaN NaN 85.0
    2020-04-24 NaN NaN NaN NaN NaN NaN 90.0
    2020-05-01 2555.66992 2611.73999 2489.93994 2510.75000 2510.75000 7.36728e+09 90.0
    2020-05-22 NaN NaN NaN NaN NaN NaN 92.0
    2020-06-01 2519.41992 2839.38989 2516.45996 2589.90991 2589.90991 8.39644e+09 NaN
    2020-06-05 NaN NaN NaN NaN NaN NaN 89.0
    2020-06-19 NaN NaN NaN NaN NaN NaN 92.0
    2020-07-01 2604.08008 2707.66992 2511.02002 2529.82007 2529.82007 4.87622e+09 NaN
    2020-07-07 NaN NaN NaN NaN NaN NaN 90.0
    2020-07-20 NaN NaN NaN NaN NaN NaN 92.0
    2020-08-01 2522.53003 2602.33008 2478.38989 2532.51001 2532.51001 6.34705e+09 NaN
    2020-08-03 NaN NaN NaN NaN NaN NaN 88.0
    2020-08-14 NaN NaN NaN NaN NaN NaN 90.0
    2020-08-28 NaN NaN NaN NaN NaN NaN 90.0
    2020-09-01 2521.81006 2546.87012 2476.82007 2490.09009 2490.09009 2.00072e+09 NaN
    2020-09-11 2481.08008 2492.41992 2476.82007 2490.09009 2490.09009 0.00000e+00 NaN
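The answer above relies on the date column being the index. As an alternative sketch that is not part of the original answer: if the dates need to stay in a regular column, `dropna` also accepts a `subset` of columns to check. The column name `Date` and the shortened sample below are assumptions made only for illustration.

```python
import io

import pandas as pd

# Hypothetical, shortened stand-in for testing.csv (fewer columns, rounded values)
csv_text = """,Open,Close,Singapore
2020-02-01,3131.02,3011.08,
2020-02-21,,,24.0
2020-02-25,,,
2020-03-01,2988.35,2481.23,
"""

# Without index_col, the unnamed first column is read as "Unnamed: 0"
df = pd.read_csv(io.StringIO(csv_text))
df = df.rename(columns={"Unnamed: 0": "Date"})  # "Date" is an assumed name

# Check every column except the date column
value_cols = df.columns.drop("Date")

# Drop a row only when all of the checked columns are NaN
cleaned = df.dropna(how="all", subset=value_cols)
print(cleaned)
```

Here only the 2020-02-25 row is removed, because it is the only one with no value outside the date column. `df.dropna(thresh=1, subset=value_cols)` expresses the same rule as "keep rows with at least one non-NaN value among these columns".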
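Plotting
  • The plot uses pandas.DataFrame.plot, which uses matplotlib as the default plotting backend.
  • Note that lines are not drawn between points separated by NaN values, so .dropna is applied before plotting.

  • Volume is not plotted because its scale (y values) is much larger.
  • 'Singapore' is plotted separately because its values are lower and it has few data points, so it looks odd as a line plot.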

    import matplotlib.pyplot as plt

    fig, (ax1, ax2) = plt.subplots(nrows=2, figsize=(9, 10))

    df[['Open', 'High', 'Low', 'Close', 'Adj Close']].dropna().plot(ax=ax1)
    ax2.scatter(df.index, 'Singapore', data=df, label='Singapore')
    ax2.legend()
    plt.show()
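The answer plots 'Singapore' as a scatter. If a connected line is preferred instead, one possible sketch (not from the original answer) is to drop the NaN values of that single column first, so that consecutive observations are joined. The series below is a short excerpt of the 'Singapore' column from the data above, and the `Agg` backend is selected only so the sketch runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs

import matplotlib.pyplot as plt
import pandas as pd

# Excerpt of the 'Singapore' column, including two of its NaN rows
dates = pd.to_datetime(
    ["2020-02-21", "2020-02-25", "2020-02-28", "2020-03-02", "2020-03-06"]
)
singapore = pd.Series([24.0, None, 22.0, None, 23.0], index=dates, name="Singapore")

fig, ax = plt.subplots(figsize=(9, 4))

# Remove the gaps so the remaining points are connected by a line
connected = singapore.dropna()
connected.plot(ax=ax, marker="o", label="Singapore")
ax.legend()
```

The markers keep the individual observations visible, which helps when the series has only a few points.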

Regarding "python - How do I drop rows where every column except the date column is NaN?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/63922425/
