Python: Pandas DataFrame -- Convert String Time Column in mm:ss Format to Total Minutes in Float Format

Reposted by: bug小助手 | Updated: 2023-10-25 21:27:58

Let's say I have a pandas DataFrame with a time-related column called "Time". This column holds strings that represent minutes and seconds. For example, the first row value 125:19 represents 125 minutes and 19 seconds. Its datatype is string.

I want to convert this value to total minutes in a new column "Time_minutes". So 125:19 should become 125.316666666667 which should be a float datatype.

In a similar vein, if the value is 0:00 then the corresponding "Time_minutes" column should show 0 (float datatype).

I've done this in SQL using lambdas and index functions. But is there an easier, more straightforward way to do this in Python?

Recommended answers

One possible solution is to use .str.split:

df["Converted"] = (s := df["Time"].str.split(":")).str[0].astype(float) + (s.str[1].astype(float) / 60)
print(df)

Prints:

     Time   Converted
0  125:19  125.316667
1    0:00    0.000000
2    0:30    0.500000
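
If the conversion is needed in more than one place, the split approach can be wrapped in a small helper. The following is only a minimal sketch, not part of the original answer: the function name mmss_to_minutes is hypothetical, and it assumes every value contains exactly one colon.

```python
import pandas as pd


def mmss_to_minutes(times: pd.Series) -> pd.Series:
    """Convert strings like "125:19" (minutes:seconds) to float minutes.

    Hypothetical helper; assumes each value contains exactly one colon.
    """
    parts = times.str.split(":")          # each element becomes [mm, ss]
    minutes = parts.str[0].astype(float)  # first list item: minutes
    seconds = parts.str[1].astype(float)  # second list item: seconds
    return minutes + seconds / 60


df = pd.DataFrame({"Time": ["125:19", "0:00", "0:30"]})
df["Time_minutes"] = mmss_to_minutes(df["Time"])
print(df)
```

The new column has a float dtype, so 0:00 becomes 0.0 rather than the integer 0.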

Option 1

If performance is a concern and you are certain that each string ends with ":ss", you can slice via Series.str with [:-3] (the minutes) and [-2:] (the seconds), convert each part to float with Series.astype, and chain Series.div on the seconds part to divide by 60.

import pandas as pd

data = {'Time': ['123:19','0:00','0:30']}
df = pd.DataFrame(data)
                          
df['Time_minutes'] = (df['Time'].str[:-3].astype(float) +
                      df['Time'].str[-2:].astype(float).div(60))

df
     Time  Time_minutes
0  123:19    123.316667
1    0:00      0.000000
2    0:30      0.500000

This will be faster than any option that uses Series.str.split.
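
The slicing approach depends on the seconds part always having two digits: a value such as "5:7" would be cut at the wrong position. Below is a minimal sketch of a guard, assuming the pattern \d+:\d{2} describes valid input. The sample values and the variable names (is_valid, result) are made up for illustration.

```python
import pandas as pd

times = pd.Series(["123:19", "0:00", "5:7"])  # "5:7" has one-digit seconds

# True only where the whole string is digits, a colon, then two digits.
is_valid = times.str.fullmatch(r"\d+:\d{2}")

# Convert the valid rows by slicing; rows that fail the check become NaN.
valid = times[is_valid]
minutes = (valid.str[:-3].astype(float)
           + valid.str[-2:].astype(float).div(60))
result = minutes.reindex(times.index)
print(result)
```

The extra check costs some of the speed advantage, so it is mainly worthwhile when the input format is not guaranteed.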

Option 2

Alternatively, relying on Series.str.split, you can set the expand parameter to True, which returns the result as a pd.DataFrame. You can then divide by [1, 60], which leaves the first column (the "minutes" part) unchanged through division by 1, and apply DataFrame.sum with axis=1.

df['Time_minutes'] = (df['Time'].str.split(':', expand=True)
                      .astype(float).div([1, 60]).sum(axis=1))
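
To see what each step of the chain produces, it can be unrolled into intermediate variables. This is only an illustrative sketch: the names parts, scaled and total are not from the answer.

```python
import pandas as pd

df = pd.DataFrame({"Time": ["123:19", "0:00", "0:30"]})

# Step 1: one column per part; the columns are labelled 0 and 1.
parts = df["Time"].str.split(":", expand=True).astype(float)

# Step 2: the list is matched against the columns, so column 0 (minutes)
# is divided by 1 and column 1 (seconds) is divided by 60.
scaled = parts.div([1, 60])

# Step 3: add the two columns row by row.
total = scaled.sum(axis=1)

print(scaled)
print(total)
```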

Option 3

A slightly faster variation on "Option 2" is to apply DataFrame.pipe to the result of Series.str.split with expand=True and work with its columns 0 and 1 inside a lambda function.

df['Time_minutes'] = (df['Time'].str.split(':', expand=True)
                      .pipe(lambda x: x[0].astype(float) + 
                            x[1].astype(float).div(60)))

Options 2 and 3 both avoid the need to create an intermediate variable, such as s in the answer by @AndrejKesely. Both are also marginally faster than that answer.

Performance comparison

import timeit

mysetup = """
import pandas as pd
import numpy as np

np.random.seed(1)

data = {'Time': (np.random.rand(1_000)*100).round(2)}
df = pd.DataFrame(data)
df['Time'] = (df['Time'].apply(lambda x: "{:.2f}".format(x))
              .str.replace('.',':', regex=False))
"""

func_dict = {'Option 1 (slice)': "df['Time'].str[:-3].astype(float) + df['Time'].str[-2:].astype(float).div(60)",
             'Option 2 (expand)': "df['Time'].str.split(':', expand=True).astype(float).div([1, 60]).sum(axis=1)",
             'Option 3 (expand-pipe)': "df['Time'].str.split(':', expand=True).pipe(lambda x: x[0].astype(float) + x[1].astype(float).div(60))",
             'Option 4 (intermediate var)': '(s := df["Time"].str.split(":")).str[0].astype(float) + (s.str[1].astype(float) / 60)'}

for k, v in func_dict.items():
    print(f"{k}: {timeit.timeit(setup=mysetup, stmt=v, number=1_000)}")

# in seconds
Option 1 (slice): 1.1033934000879526
Option 2 (expand): 1.5235498000402004
Option 3 (expand-pipe): 1.456193899968639
Option 4 (intermediate var): 1.8184985001571476
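
The timings compare speed only. A quick sanity check that the four expressions agree could look like the sketch below. It runs on made-up sample values, and the dictionary keys and variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Time": ["123:19", "0:00", "0:30", "59:59"]})

s = df["Time"].str.split(":")          # list-valued Series
e = df["Time"].str.split(":", expand=True)  # two-column DataFrame

results = {
    "slice": (df["Time"].str[:-3].astype(float)
              + df["Time"].str[-2:].astype(float).div(60)),
    "expand": e.astype(float).div([1, 60]).sum(axis=1),
    "expand-pipe": e.pipe(lambda x: x[0].astype(float)
                          + x[1].astype(float).div(60)),
    "intermediate var": s.str[0].astype(float) + s.str[1].astype(float) / 60,
}

# Compare every variant against the slicing result; the Series names
# differ between variants, so name checking is switched off.
baseline = results["slice"]
for label, series in results.items():
    pd.testing.assert_series_equal(series, baseline, check_names=False)
    print(f"{label}: matches")
```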
