
python - Creating a custom Counter object with a special set function

Reposted. Author: 太空宇宙. Updated: 2023-11-03 15:25:26

In "Adding a single character to add keys in Counter", @AshwiniChaudhary gave a great answer that creates a new Counter class with a different set function (__setitem__):

from collections import Counter

class CustomCounter(Counter):
    def __setitem__(self, key, value):
        if len(key) > 1 and not key.endswith(u"\uE000"):
            key += u"\uE000"
        super(CustomCounter, self).__setitem__(key, value)

To let the user choose the char/str that gets appended to the key, I tried:

from collections import Counter, defaultdict

class AppendedStrCounter(Counter):
    def __init__(self, str_to_append):
        self._appended_str = str_to_append
        super(AppendedStrCounter, self).__init__()
    def __setitem__(self, key, value):
        if len(key) > 1 and not key.endswith(self._appended_str):
            key += self._appended_str
        super(AppendedStrCounter, self).__setitem__(tuple(key), value)

But it returns an empty counter:

>>> class AppendedStrCounter(Counter):
...     def __init__(self, str_to_append):
...         self._appended_str = str_to_append
...         super(AppendedStrCounter, self).__init__()
...     def __setitem__(self, key, value):
...         if len(key) > 1 and not key.endswith(self._appended_str):
...             key += self._appended_str
...         super(AppendedStrCounter, self).__setitem__(tuple(key), value)
...
>>> AppendedStrCounter('foo bar bar blah'.split())
AppendedStrCounter()

That's because I was missing the iter argument in __init__(), so the word list was never passed on to Counter:

from collections import Counter, defaultdict

class AppendedStrCounter(Counter):
    def __init__(self, iter, str_to_append):
        self._appended_str = str_to_append
        super(AppendedStrCounter, self).__init__(iter)
    def __setitem__(self, key, value):
        if len(key) > 1 and not key.endswith(self._appended_str):
            key += self._appended_str
        super(AppendedStrCounter, self).__setitem__(tuple(key), value)

[out]:

>>> AppendedStrCounter('foo bar bar blah'.split(), u'\ue000')
AppendedStrCounter({('f', 'o', 'o', '\ue000'): 1, ('b', 'a', 'r', '\ue000'): 1, ('b', 'l', 'a', 'h', '\ue000'): 1})

But the value for 'bar' is wrong: it should be 2, not 1.

Is passing iter through __init__() the right way to initialize the Counter?

Best answer

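As Felix's comment pointed out, collections.Counter doesn't document how its __init__ method adds keys or sets values, only that it does. Since it isn't explicitly designed for subclassing, the wisest course is probably not to subclass it.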
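One plausible way the 'bar' count ends up as 1 can be sketched without touching Counter at all. The snippet below assumes the counting loop reads the old value with get() and then writes through __setitem__; that is an assumption about undocumented behaviour, not something the documentation promises. SuffixDict and count_into are hypothetical names used only for illustration.

```python
class SuffixDict(dict):
    """Hypothetical dict subclass that rewrites keys on write only."""

    def __setitem__(self, key, value):
        # The stored key gets a marker, but get() still looks up the raw key.
        super().__setitem__(key + '*', value)


def count_into(mapping, iterable):
    # Assumed counting strategy: read with get(), then write with item assignment.
    for item in iterable:
        mapping[item] = mapping.get(item, 0) + 1
    return mapping


counts = count_into(SuffixDict(), ['foo', 'bar', 'bar'])
print(counts)
```

Because the read uses the raw key and the write uses the rewritten key, the second 'bar' never sees the first one, so every count stays at 1.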

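The collections.abc module exists to provide easy-to-subclass abstract classes for Python's built-in types, including dict (a MutableMapping, in ABC terms). So, if all you need is a "Counter-like class" (as opposed to "a subclass of Counter that satisfies built-in functions like isinstance and issubclass"), you can create your own MutableMapping that has a Counter, and then "man-in-the-middle" the initializer and the three methods that Counter adds to a typical dict: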

import collections
import collections.abc


def _identity(s):
    '''
    Default mutator function.
    '''
    return s


class CustomCounter(collections.abc.MutableMapping):
    '''
    Overrides the 5 methods of a MutableMapping:
    __getitem__, __setitem__, __delitem__, __iter__, __len__

    ...and the 3 non-Mapping methods of Counter:
    elements, most_common, subtract
    '''

    def __init__(self, values=None, *, mutator=_identity):
        self._mutator = mutator
        if values is None:
            self._counter = collections.Counter()
        else:
            # Mutate every value before the wrapped Counter sees it.
            values = (self._mutator(v) for v in values)
            self._counter = collections.Counter(values)
        return

    def __getitem__(self, item):
        return self._counter[self._mutator(item)]

    def __setitem__(self, item, value):
        self._counter[self._mutator(item)] = value
        return

    def __delitem__(self, item):
        del self._counter[self._mutator(item)]
        return

    def __iter__(self):
        return iter(self._counter)

    def __len__(self):
        return len(self._counter)

    def __repr__(self):
        return ''.join([
            self.__class__.__name__,
            '(',
            repr(dict(self._counter)),
            ')'
        ])

    def elements(self):
        return self._counter.elements()

    def most_common(self, n=None):
        # n defaults to None, matching collections.Counter.most_common.
        return self._counter.most_common(n)

    def subtract(self, values):
        if isinstance(values, collections.abc.Mapping):
            values = {self._mutator(k): v for k, v in values.items()}
            return self._counter.subtract(values)
        else:
            values = (self._mutator(v) for v in values)
            return self._counter.subtract(values)


def main():
    def mutator(s):
        # Asterisks are easier to print than '\ue000'.
        return '*' + s + '*'

    words = 'the lazy fox jumps over the brown dog'.split()

    # Test None (allowed by collections.Counter).
    ctr_none = CustomCounter(None)
    assert 0 == len(ctr_none)

    # Test typical dict and collections.Counter methods.
    ctr = CustomCounter(words, mutator=mutator)
    print(ctr)
    assert 1 == ctr['dog']
    assert 2 == ctr['the']
    assert 7 == len(ctr)
    del ctr['lazy']
    assert 6 == len(ctr)
    ctr.subtract(['jumps', 'dog'])
    assert 0 == ctr['dog']
    assert 6 == len(ctr)
    ctr.subtract({'the': 5, 'bogus': 100})
    assert -3 == ctr['the']
    assert -100 == ctr['bogus']
    assert 7 == len(ctr)
    return


if "__main__" == __name__:
    main()

Output (line-wrapped for readability):

CustomCounter({
    '*brown*': 1,
    '*lazy*': 1,
    '*the*': 2,
    '*over*': 1,
    '*jumps*': 1,
    '*fox*': 1,
    '*dog*': 1
})

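I added a keyword-only argument, mutator, to the initializer. It stores the function that converts whatever real-world thing you pass in into the "mutated" version that actually gets counted. Note that this may mean CustomCounter no longer stores "hashable objects", but rather "hashable objects that don't make the mutator crash".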
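A small sketch of that last caveat, using a plain collections.Counter fed through a mutator the same way the initializer feeds it. The add_marker function is a hypothetical mutator that only works on strings; the integer 42 is hashable and would be fine in an ordinary Counter, but it breaks the mutator.

```python
import collections


def add_marker(s):
    # Hypothetical mutator: string concatenation, so it only accepts str.
    return s + '\ue000'


# Strings pass through the mutator and are counted under the marked key.
ok = collections.Counter(add_marker(v) for v in ['bar', 'bar', 'foo'])

# A hashable non-string value makes the mutator raise before counting.
try:
    collections.Counter(add_marker(v) for v in ['bar', 42])
    failed = False
except TypeError:
    failed = True

print(ok['bar\ue000'], failed)
```

If mixed key types matter, the mutator itself would have to decide what to do with values it cannot transform.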

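Also, if the standard library's Counter ever gains new methods, you'll have to update CustomCounter to "override" them. (You could perhaps work around this by using __getattr__ to pass any unknown attributes through to self._counter, but any keys in the arguments would then reach the Counter in their raw, "unmutated" form.)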
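Here is a rough sketch of that __getattr__ workaround and its pitfall. ForwardingCounter is a hypothetical, cut-down wrapper written for this illustration; it deliberately does not define subtract, so the call falls through to the wrapped Counter.

```python
import collections
import collections.abc


class ForwardingCounter(collections.abc.MutableMapping):
    """Hypothetical wrapper that forwards unknown attributes to a Counter."""

    def __init__(self, values=(), *, mutator=lambda s: s):
        self._mutator = mutator
        self._counter = collections.Counter(mutator(v) for v in values)

    def __getitem__(self, key):
        return self._counter[self._mutator(key)]

    def __setitem__(self, key, value):
        self._counter[self._mutator(key)] = value

    def __delitem__(self, key):
        del self._counter[self._mutator(key)]

    def __iter__(self):
        return iter(self._counter)

    def __len__(self):
        return len(self._counter)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith('_'):
            # Avoid infinite recursion if _counter is not set yet.
            raise AttributeError(name)
        return getattr(self._counter, name)


ctr = ForwardingCounter(['dog', 'dog'], mutator=lambda s: '*' + s + '*')
ctr.subtract(['dog'])  # forwarded as-is, so the key is NOT mutated
raw = dict(ctr._counter)
print(raw)
```

The forwarded subtract creates a separate 'dog' entry instead of decrementing '*dog*', which is exactly the unmutated-key problem described above.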

Finally, as I noted earlier, it's not actually a subclass of collections.Counter, in case other code is looking specifically for one.
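A quick check of what that means in practice, using a hypothetical minimal wrapper (WrappedCounter, with no mutator) in place of the full class:

```python
import collections
import collections.abc


class WrappedCounter(collections.abc.MutableMapping):
    """Hypothetical stand-in: has a Counter rather than being one."""

    def __init__(self, values=()):
        self._counter = collections.Counter(values)

    def __getitem__(self, key):
        return self._counter[key]

    def __setitem__(self, key, value):
        self._counter[key] = value

    def __delitem__(self, key):
        del self._counter[key]

    def __iter__(self):
        return iter(self._counter)

    def __len__(self):
        return len(self._counter)


ctr = WrappedCounter(['a', 'b', 'a'])
is_counter = isinstance(ctr, collections.Counter)
is_dict = isinstance(ctr, dict)
is_mapping = isinstance(ctr, collections.abc.MutableMapping)
print(is_counter, is_dict, is_mapping)
```

Code that checks against the MutableMapping ABC will accept the wrapper; code that checks for Counter or dict specifically will not.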

The original question, "python - Creating a custom Counter object with a special set function", can be found on Stack Overflow: https://stackoverflow.com/questions/43196793/
