
python - What is the cost of calling str() on a string?

Reposted. Author: 太空狗. Updated: 2023-10-29 18:20:52

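What is the cost (if any) of calling the str function on an object that is already a string? The use case is normalizing an array of objects of different types by converting them all to strings, which can be implemented naively like this:

def arr_2_strarr(arr):
    return [str(val) for val in arr]

But if str() incurs too much overhead and my arr consists mostly of strings, I might consider using:

def arr_2_strarr2(arr):
    return [str(val) if not isinstance(val, basestring) else val for val in arr]

Any suggestions?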

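Best answer

Calling str on a string object is trivial: it simply returns the original string object. Adding an explicit isinstance call will usually be slower.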
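A minimal check of that claim, assuming CPython 3 (where str plays the role of Python 2's basestring). The class name Tagged is hypothetical and is only there to show that the two approaches can differ for str subclasses:

```python
# Assumes CPython 3; `Tagged` is a hypothetical subclass used only for illustration.
s = "hello"
same_object = str(s) is s  # for an exact str, str() hands back the very same object


class Tagged(str):
    """A str subclass that overrides nothing."""


t = Tagged("hello")
converted = str(t)                          # a plain str with equal contents
kept = t if isinstance(t, str) else str(t)  # the isinstance variant keeps the subclass

print(same_object, type(converted).__name__, type(kept).__name__)
```

So the isinstance variant is not only a speed question: it leaves subclass instances untouched, while a bare str() call normalizes them to plain str.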

If you want to test this on your real data, take a look at the timeit module.
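As a rough sketch of what such a test might look like on Python 3, the snippet below times both strategies on a mostly-string list. The sample data and the function names convert_all and convert_checked are made up for illustration; substitute your own data before drawing conclusions:

```python
from timeit import repeat

# Hypothetical sample: mostly strings with a few integers mixed in.
arr = ["a", "b", "c", "d", 1, "e", "f", 2, "g", "h"] * 100


def convert_all(arr):
    return [str(val) for val in arr]


def convert_checked(arr):
    return [val if isinstance(val, str) else str(val) for val in arr]


# Both strategies must agree before their speeds are worth comparing.
assert convert_all(arr) == convert_checked(arr)

for func in (convert_all, convert_checked):
    # repeat() returns one total time per repetition; look at the smallest.
    times = repeat(lambda: func(arr), number=200, repeat=3)
    print("{0:16} {1:.4f}s".format(func.__name__, min(times)))
```

The absolute numbers depend on the machine and Python version, so only the relative ordering on your own data is meaningful.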

By the way, you should remove the not from your second version:

[val if isinstance(val, basestring) else str(val) for val in arr]

You can speed things up slightly by caching str:

def arr_2_strarr(arr, str=str):
    return [str(val) for val in arr]

Happy micro-optimizing. :)


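Why cache str? Well, every time you use a name, Python has to look it up. Inside a function it first looks in the local namespace; if the name isn't found there, it looks in the globals, and finally in the builtins. Because str is a builtin, an uncached lookup has to fall through the local and global namespaces on every use; it would be inefficient to "import" all the builtins into every function. By writing

def arr_2_strarr(arr, str=str):

we create a local name str bound to the builtin str type. Because it is a default argument, the lookup and binding happen once, when the function definition is executed, rather than on every call to the function.

So each time we call arr_2_strarr, the interpreter finds the local str immediately, which saves a small amount of time.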
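One way to see the difference without timing anything is to inspect the function objects. This is only a sketch for CPython; use_builtin and use_cached are hypothetical names standing in for the two variants of arr_2_strarr:

```python
def use_builtin(val):
    return str(val)           # `str` is resolved through globals, then builtins


def use_cached(val, str=str):
    return str(val)           # `str` is an ordinary local variable here


# The default was evaluated once, when the `def` statement ran.
default_is_builtin = use_cached.__defaults__ == (str,)

# co_varnames lists local names (arguments included); co_names lists
# the global/builtin names that the function body refers to.
local_in_cached = "str" in use_cached.__code__.co_varnames
global_in_builtin = "str" in use_builtin.__code__.co_names

print(default_is_builtin, local_in_cached, global_in_builtin)
# prints: True True True
```

A side effect of this trick is that callers can pass their own str argument, so it changes the function's signature as well as its speed.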


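Here is some timeit code that compares the various strategies. It runs on both Python 2 and Python 3, although on Python 3 it substitutes str for basestring, because basestring does not exist in Python 3.

The code first runs the functions on lists of various sizes containing integer data, then on string data created by converting that integer data to strings.

Each line of output gives the times taken to execute the given number of loops in each of 3 repetitions, sorted from fastest to slowest. As the timeit repeat docs mention, the main number to look at in each run is the smallest one.

The results for all the functions on a given list size and type are also sorted from fastest to slowest.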

''' Compare the speeds of direct string conversion
with testing first via isinstance

See https://stackoverflow.com/q/44439323/4014959

Written by PM 2Ring 2017.06.09

Python 2 / 3 compatible
'''

from __future__ import print_function, division
from timeit import Timer
import sys

# Python 3 doesn't have basestring
if sys.version_info[0] > 2:
    basestring = str

# The functions to test
def plain(arr):
    return [str(val) for val in arr]

def cached(arr, str=str):
    return [str(val) for val in arr]

def teststr(arr):
    return [val if isinstance(val, str) else str(val) for val in arr]

def testbase(arr):
    return [val if isinstance(val, basestring) else str(val) for val in arr]

def testbasenot(arr):
    return [str(val) if not isinstance(val, basestring) else val for val in arr]

funcs = (
    plain,
    cached,
    teststr,
    testbase,
    testbasenot,
)

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

def verify(arr):
    results = [func(arr) for func in funcs]
    first, results = results[0], results[1:]
    return all(first == u for u in results)

def time_test(loops, reps):
    ''' Print timing stats for all the functions '''
    timings = []
    for func in funcs:
        fname = func.__name__
        setup = 'from __main__ import arr, ' + fname
        cmd = fname + '(arr)'
        t = Timer(cmd, setup)
        result = t.repeat(reps, loops)
        result.sort()
        timings.append((result, fname))

    timings.sort()
    for result, fname in timings:
        print('{0:12} {1}'.format(fname, result))

# Check that all functions return the same results
if 0:
    print('Testing all functions')
    arr = list(range(10))
    print(arr, verify(arr))
    arr = list('abcdefghij')
    print(arr, verify(arr))

# Do the timing tests
reps = 3
loops = 1 << 16
for i in range(1, 11):
    n = 1 << i
    # Build a data array of integers
    arr = range(n)
    print('\n{0}: Size={1}, Loops={2}'.format(i, n, loops))
    print('* Integer')
    time_test(loops, reps)

    # Convert the data array contents to strings
    arr = cached(arr)
    print('\n* String')
    time_test(loops, reps)
    loops >>= 1

Typical Python 2 output

1: Size=2, Loops=65536
* Integer
cached [0.17268610000610352, 0.19634914398193359, 0.2058720588684082]
plain [0.17906594276428223, 0.18797492980957031, 0.24009895324707031]
teststr [0.32513308525085449, 0.33270597457885742, 0.35080599784851074]
testbasenot [0.32793092727661133, 0.33176803588867188, 0.33498501777648926]
testbase [0.32964491844177246, 0.33154511451721191, 0.33760714530944824]

* String
cached [0.1619560718536377, 0.1628870964050293, 0.16448402404785156]
teststr [0.16335082054138184, 0.16484308242797852, 0.17012500762939453]
plain [0.16956901550292969, 0.1711430549621582, 0.18457293510437012]
testbase [0.22378706932067871, 0.2255101203918457, 0.22593879699707031]
testbasenot [0.22855901718139648, 0.22941207885742188, 0.23271608352661133]

2: Size=4, Loops=32768
* Integer
cached [0.12796807289123535, 0.12807202339172363, 0.12817001342773438]
plain [0.13622713088989258, 0.14297294616699219, 0.14868402481079102]
teststr [0.27701020240783691, 0.27812099456787109, 0.2795259952545166]
testbasenot [0.27815794944763184, 0.28220701217651367, 0.29373884201049805]
testbase [0.2804868221282959, 0.28186416625976562, 0.31699705123901367]

* String
cached [0.12131500244140625, 0.12241697311401367, 0.13379192352294922]
teststr [0.12839889526367188, 0.1314079761505127, 0.14053797721862793]
plain [0.13051795959472656, 0.14696002006530762, 0.18504786491394043]
testbase [0.18404412269592285, 0.1844489574432373, 0.19633579254150391]
testbasenot [0.18416285514831543, 0.18494606018066406, 0.18553614616394043]

3: Size=8, Loops=16384
* Integer
cached [0.10957002639770508, 0.11252093315124512, 0.11768913269042969]
plain [0.11848998069763184, 0.11958003044128418, 0.1292269229888916]
testbase [0.26231694221496582, 0.26471304893493652, 0.26625895500183105]
teststr [0.26410102844238281, 0.2641758918762207, 0.26569199562072754]
testbasenot [0.26910495758056641, 0.26967120170593262, 0.2741539478302002]

* String
cached [0.102294921875, 0.10357999801635742, 0.1050269603729248]
teststr [0.10852217674255371, 0.10861611366271973, 0.1127161979675293]
plain [0.11173510551452637, 0.11183404922485352, 0.12115597724914551]
testbasenot [0.16488981246948242, 0.16509699821472168, 0.16648602485656738]
testbase [0.16622614860534668, 0.16688108444213867, 0.16962814331054688]

4: Size=16, Loops=8192
* Integer
cached [0.10548806190490723, 0.10568594932556152, 0.10611891746520996]
plain [0.11526799201965332, 0.1160120964050293, 0.12486004829406738]
teststr [0.25309896469116211, 0.25549888610839844, 0.25838899612426758]
testbasenot [0.25410699844360352, 0.27252411842346191, 0.32510590553283691]
testbase [0.25414609909057617, 0.26968812942504883, 0.27393984794616699]

* String
cached [0.092885017395019531, 0.096045970916748047, 0.10643196105957031]
teststr [0.098433017730712891, 0.098783016204833984, 0.10051798820495605]
plain [0.10081005096435547, 0.10222005844116211, 0.12018895149230957]
testbasenot [0.15373396873474121, 0.15472292900085449, 0.15676999092102051]
testbase [0.15490198135375977, 0.15572404861450195, 0.15599799156188965]

5: Size=32, Loops=4096
* Integer
cached [0.10568094253540039, 0.10743498802185059, 0.1115870475769043]
plain [0.1163330078125, 0.11633419990539551, 0.12796401977539062]
teststr [0.25122308731079102, 0.26527810096740723, 0.26579189300537109]
testbase [0.25309586524963379, 0.25563716888427734, 0.25917816162109375]
testbasenot [0.25465011596679688, 0.25907588005065918, 0.26110982894897461]

* String
cached [0.085406064987182617, 0.086378097534179688, 0.08651280403137207]
teststr [0.092473983764648438, 0.09324193000793457, 0.093439817428588867]
plain [0.096549034118652344, 0.097501993179321289, 0.10462403297424316]
testbase [0.14794015884399414, 0.14966106414794922, 0.15016818046569824]
testbasenot [0.14796280860900879, 0.14940309524536133, 0.15308189392089844]

6: Size=64, Loops=2048
* Integer
cached [0.10838603973388672, 0.1089630126953125, 0.11129999160766602]
plain [0.11764693260192871, 0.11851096153259277, 0.12583494186401367]
teststr [0.2550208568572998, 0.25540995597839355, 0.26316595077514648]
testbase [0.25723910331726074, 0.25930881500244141, 0.26207089424133301]
testbasenot [0.25864100456237793, 0.25901007652282715, 0.26875495910644531]

* String
cached [0.086635112762451172, 0.087384939193725586, 0.099885940551757812]
plain [0.096493959426879883, 0.12469196319580078, 0.13684391975402832]
teststr [0.096681118011474609, 0.098448991775512695, 0.10569310188293457]
testbase [0.14573216438293457, 0.14696693420410156, 0.14700508117675781]
testbasenot [0.14776277542114258, 0.14852094650268555, 0.15462112426757812]

7: Size=128, Loops=1024
* Integer
cached [0.10915207862854004, 0.11011981964111328, 0.1127631664276123]
plain [0.11721491813659668, 0.11830401420593262, 0.1254270076751709]
testbase [0.25789499282836914, 0.26130795478820801, 0.26179313659667969]
teststr [0.25840306282043457, 0.25889492034912109, 0.26300287246704102]
testbasenot [0.26443600654602051, 0.26498103141784668, 0.26691412925720215]

* String
cached [0.083537101745605469, 0.084954023361206055, 0.086431980133056641]
teststr [0.091158866882324219, 0.09123992919921875, 0.091590166091918945]
plain [0.091225862503051758, 0.092115163803100586, 0.099261045455932617]
testbase [0.14569401741027832, 0.14622306823730469, 0.14650607109069824]
testbasenot [0.14774990081787109, 0.14930200576782227, 0.15020990371704102]

8: Size=256, Loops=512
* Integer
cached [0.10824894905090332, 0.10865211486816406, 0.10895800590515137]
plain [0.11750102043151855, 0.12690877914428711, 0.12890195846557617]
teststr [0.25457501411437988, 0.25542402267456055, 0.25692200660705566]
testbasenot [0.25513482093811035, 0.25664496421813965, 0.25999689102172852]
testbase [0.25680398941040039, 0.25924396514892578, 0.26179695129394531]

* String
cached [0.080662012100219727, 0.081827878952026367, 0.081900119781494141]
teststr [0.089673995971679688, 0.097939014434814453, 0.15471792221069336]
plain [0.094327926635742188, 0.095342159271240234, 0.097375154495239258]
testbasenot [0.14262199401855469, 0.14278602600097656, 0.14302182197570801]
testbase [0.14464497566223145, 0.14674210548400879, 0.16207790374755859]

9: Size=512, Loops=256
* Integer
cached [0.10789299011230469, 0.1092069149017334, 0.110015869140625]
plain [0.11702799797058105, 0.1181950569152832, 0.12698101997375488]
testbase [0.25504207611083984, 0.25520896911621094, 0.25734806060791016]
testbasenot [0.25715017318725586, 0.25747489929199219, 0.25850796699523926]
teststr [0.25783085823059082, 0.25882315635681152, 0.26154208183288574]

* String
cached [0.078849077224731445, 0.079813003540039062, 0.084489107131958008]
teststr [0.086745977401733398, 0.087059974670410156, 0.087485074996948242]
plain [0.088322877883911133, 0.088804960250854492, 0.097378969192504883]
testbasenot [0.14128994941711426, 0.14266705513000488, 0.1427910327911377]
testbase [0.14152097702026367, 0.14231991767883301, 0.14392399787902832]

10: Size=1024, Loops=128
* Integer
cached [0.10892415046691895, 0.11003899574279785, 0.11008000373840332]
plain [0.1192779541015625, 0.12048506736755371, 0.12956619262695312]
teststr [0.25335502624511719, 0.25642204284667969, 0.25892996788024902]
testbase [0.25525593757629395, 0.25550699234008789, 0.25794696807861328]
testbasenot [0.25932693481445312, 0.25960803031921387, 0.26134610176086426]

* String
cached [0.078451156616210938, 0.080369949340820312, 0.080511093139648438]
teststr [0.084844112396240234, 0.085949897766113281, 0.096578836441040039]
plain [0.086302042007446289, 0.087638139724731445, 0.096364974975585938]
testbase [0.14068913459777832, 0.14274501800537109, 0.15559101104736328]
testbasenot [0.14075493812561035, 0.15553092956542969, 0.19578790664672852]

Typical Python 3 output

1: Size=2, Loops=65536
* Integer
plain [0.2957206170030986, 0.2959696320031071, 0.2991539639988332]
cached [0.3058611470005417, 0.30598287599787, 0.3073535650000849]
testbase [0.38803433800057974, 0.39307209699836676, 0.393392562000372]
testbasenot [0.3888578799997049, 0.3951267439988442, 0.42909636100011994]
teststr [0.41290506400036975, 0.41541150199918775, 0.4488242949992127]

* String
testbase [0.23906823500146857, 0.23946705200069118, 0.24624350399972172]
testbasenot [0.24037985899849446, 0.24200722000023234, 0.2462738950016501]
plain [0.25742501500280923, 0.2644229819998145, 0.26711930600140477]
teststr [0.2635171010006161, 0.3559218000009423, 0.3784064870014845]
cached [0.2687887559986848, 0.2711959320004098, 0.38138879500183975]

2: Size=4, Loops=32768
* Integer
cached [0.21332427200104576, 0.21363574399947538, 0.21528891600246425]
plain [0.22395663199858973, 0.22762144099760917, 0.23422862100051134]
testbasenot [0.31939790100295795, 0.32413787499899627, 0.32422161499926005]
testbase [0.3209382370005187, 0.3213516770010756, 0.3215230670029996]
teststr [0.3372085839982901, 0.33786465500088525, 0.33847540900023887]

* String
testbasenot [0.17031173299983493, 0.17143720199965173, 0.17724975699820789]
testbase [0.170390128998406, 0.17118954800025676, 0.18865150499914307]
cached [0.18190538799899514, 0.18262020299880533, 0.183105569001782]
plain [0.18666503399799694, 0.18781541300268145, 0.1955128590016102]
teststr [0.18973677000030875, 0.19112570400102413, 0.19168143299975782]

3: Size=8, Loops=16384
* Integer
cached [0.17012267099926248, 0.18160372200145503, 0.2275817529989581]
plain [0.1890079689983395, 0.1963043950017891, 0.2016476179996971]
testbasenot [0.28168991999700665, 0.2821743839995179, 0.286649605997809]
testbase [0.28295213199817226, 0.28760008400058723, 0.2906435440017958]
teststr [0.2958552290001535, 0.2989299110013235, 0.31747390199961956]

* String
testbase [0.13354753000021446, 0.13377505199969164, 0.14039257600234123]
cached [0.1352838150014577, 0.1353432000032626, 0.13798289999976987]
testbasenot [0.14252334699995117, 0.14301740500013693, 0.1445914210016781]
plain [0.15130633899752866, 0.15166569000211894, 0.1616801599993778]
teststr [0.15267008800219628, 0.1545946529986395, 0.15590016200076207]

4: Size=16, Loops=8192
* Integer
cached [0.144755126999371, 0.14782401300180936, 0.1484048439997423]
plain [0.1726092749995587, 0.1740606339990336, 0.1815100200001325]
testbase [0.26685525399807375, 0.27029573199979495, 0.2716258750006091]
testbasenot [0.2702714350016322, 0.2723204169997189, 0.27288546099953237]
teststr [0.28515160999813816, 0.28523068700087606, 0.2878553769987775]

* String
cached [0.11515368599793874, 0.11579233700103941, 0.11688366999806021]
testbase [0.12178990400207113, 0.13090817400006927, 0.13304468899877975]
testbasenot [0.13121789299839293, 0.14976675499929115, 0.1521548589989834]
teststr [0.13410512400150765, 0.1354981399999815, 0.147247362001508]
plain [0.13691626099898713, 0.1384456069972657, 0.1426525679999031]

5: Size=32, Loops=4096
* Integer
cached [0.13246865899782279, 0.13320018100057496, 0.134628559997509]
plain [0.1636957459995756, 0.16763203899972723, 0.1752369269997871]
testbase [0.26010187700012466, 0.2606812570011243, 0.2647345440018398]
testbasenot [0.2620696090016281, 0.26230394700178294, 0.26258907899682526]
teststr [0.27685887300322065, 0.2787095199964824, 0.28293989099984174]

* String
cached [0.10246079200078384, 0.10416977099885116, 0.10755630499988911]
testbasenot [0.10829716499938513, 0.10918466699877172, 0.10935586699997657]
testbase [0.11739019699962228, 0.11808202800239087, 0.11899654000080773]
plain [0.12601002500014147, 0.12718953500007046, 0.13454839599944535]
teststr [0.13366336599938222, 0.13407608800116577, 0.13510101700012456]

6: Size=64, Loops=2048
* Integer
cached [0.12591946799875586, 0.127094235002005, 0.13223557899982552]
plain [0.160616523000499, 0.16232994500023779, 0.1691623620026803]
testbase [0.2534341589998803, 0.2556092949998856, 0.2571690379991196]
testbasenot [0.2560774869998568, 0.2574564010028553, 0.2606996459981019]
teststr [0.268248238000524, 0.2702014210008201, 0.27107579600124154]

* String
cached [0.09791737100022146, 0.09819723300097394, 0.10752435399990645]
testbasenot [0.1057888709983672, 0.10588572099732119, 0.16173565400094958]
testbase [0.10636284599968349, 0.1179599219976808, 0.12130766799964476]
plain [0.12285572399923694, 0.12589510299949325, 0.13114397300159908]
teststr [0.13122114399811835, 0.13273253399893292, 0.14575592999972287]

7: Size=128, Loops=1024
* Integer
cached [0.12404713899741182, 0.12496110600113752, 0.12496385000122245]
plain [0.15980284800025402, 0.16046370399999432, 0.16711239899814245]
testbasenot [0.25531527800194453, 0.25563639699976193, 0.2586420219995489]
testbase [0.25544935799916857, 0.2558138679996773, 0.257172014000389]
teststr [0.2699256220003008, 0.2712909309993847, 0.27702098800000385]

* String
cached [0.09376715399776003, 0.09393715400074143, 0.09975314399707713]
testbasenot [0.10510071799944853, 0.10511873200084665, 0.10523289399861824]
testbase [0.11240010600158712, 0.11325187799957348, 0.11632439300228725]
plain [0.12139380200096639, 0.12202585699924384, 0.1315958569975919]
teststr [0.12834531499902369, 0.12949470400053542, 0.12955383699954837]

8: Size=256, Loops=512
* Integer
cached [0.12225364700134378, 0.12283446399669629, 0.1285843859986926]
plain [0.15971405900199898, 0.16198832800000673, 0.16777605400056927]
testbase [0.2507534860014857, 0.2527904779999517, 0.25378678199922433]
testbasenot [0.25323686200135853, 0.2547167230004561, 0.25919888999851537]
teststr [0.2652072370001406, 0.2658402630004275, 0.2674206650008273]

* String
cached [0.0906629850032914, 0.0985801380011253, 0.09929232800277532]
testbase [0.10155730300175492, 0.1042869699995208, 0.11276149599871133]
testbasenot [0.10197166099897004, 0.11451221999959671, 0.15595895300066331]
plain [0.11898361400017166, 0.12018223199993372, 0.12760113599870238]
teststr [0.12645652200080804, 0.12671815700014122, 0.14095144699967932]

9: Size=512, Loops=256
* Integer
cached [0.12672984500022721, 0.1462409830019169, 0.2653043659993273]
plain [0.161721200998727, 0.17296033000093303, 0.19699998799842433]
testbase [0.25432757399903494, 0.25851125400004094, 0.258548003002943]
testbasenot [0.25619441399976495, 0.25656893900304567, 0.25998359599907417]
teststr [0.2719232039999042, 0.2744571339972026, 0.2751794379983039]

* String
cached [0.08841608199873008, 0.08848714099804056, 0.09124958899701596]
testbasenot [0.09962382599769626, 0.10016373899998143, 0.10028601600060938]
testbase [0.10713129000214394, 0.10752918499929365, 0.10952026399900205]
plain [0.1163020489984774, 0.12190789400119684, 0.1264930679972167]
teststr [0.1242994140011433, 0.12458201900153654, 0.12523995000083232]

10: Size=1024, Loops=128
* Integer
cached [0.12827690600170172, 0.1294701549995807, 0.13387694999983069]
plain [0.16636216699771467, 0.16866590399877168, 0.17549873600000865]
testbasenot [0.25435296399882645, 0.25515673799964134, 0.2605281959986314]
testbase [0.26351416900070035, 0.26398584699927596, 0.2651360300005763]
teststr [0.26816077799958293, 0.26908816800278146, 0.2715630999991845]

* String
cached [0.08827024300262565, 0.09090095799911069, 0.09729095900183893]
testbase [0.10063145499952952, 0.1010660120009561, 0.10904535399822635]
testbasenot [0.10313185999984853, 0.11444468399713514, 0.14796407999892836]
plain [0.11569941500056302, 0.11579339799936861, 0.12615105800068704]
teststr [0.12353994099976262, 0.12515813500067452, 0.13752399999793852]

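These timings were performed on a rather old 32-bit single-core 2GHz machine with 2GB of RAM, running a Debian derivative of Linux. I used Python 2.6.6 and Python 3.6.0. Your results may vary. ;) In any case, these results should only be used as a rough guide. timeit does a good job of timing only the thing we want to time, but it has no control over other processes that also want the CPU.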

A similar question about the cost of calling str() on a string can be found on Stack Overflow: https://stackoverflow.com/questions/44439323/
