Suppose I have saved this data frame in Parquet format:
import numpy as np
import pandas as pd

data = pd.DataFrame(dict(
    a=[1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    b=[1.0, 1.0, 1.0, np.NaN, 0.0, np.NaN],
    c=[0.9, np.NaN, 1.0, 0.0, 0.0, 0.0]
))
data.to_parquet('data.parquet')
along with a dictionary that tells me which value to use for imputing each column. I can then write a preprocessing function:
import tensorflow as tf

impute_dictionary = dict(b=1.0, c=0.0)

def preprocessing_fn(inputs):
    outputs = inputs.copy()
    for key, value in impute_dictionary.items():
        outputs[key] = tf.where(
            tf.math.is_nan(outputs[key]),
            tf.constant(value, shape=outputs[key].shape),
            outputs[key]
        )
    return outputs
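As a sanity check of the intended semantics (outside the TFT pipeline entirely), the same per-column imputation can be sketched in plain pandas; the toy data and fill values mirror `data` and `impute_dictionary` above:

```python
import numpy as np
import pandas as pd

# Same toy data and imputation values as in the question.
data = pd.DataFrame(dict(
    a=[1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    b=[1.0, 1.0, 1.0, np.nan, 0.0, np.nan],
    c=[0.9, np.nan, 1.0, 0.0, 0.0, 0.0]
))
impute_dictionary = dict(b=1.0, c=0.0)

# fillna accepts a column -> value mapping, which is exactly the
# behaviour preprocessing_fn is meant to reproduce tensor-wise.
imputed = data.fillna(impute_dictionary)
print(imputed['b'].tolist())  # [1.0, 1.0, 1.0, 1.0, 0.0, 1.0]
```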
and use it in an Apache Beam pipeline:
import tempfile

import apache_beam as beam
import tensorflow_transform as tft  # needed for tft.coders below
import tensorflow_transform.beam as tft_beam
from tensorflow_transform.tf_metadata import dataset_metadata, schema_utils

temp = tempfile.gettempdir()
RAW_DATA_FEATURE_SPEC = dict(
    [(name, tf.io.FixedLenFeature([], tf.float32)) for name in ['a', 'b', 'c']]
)
RAW_DATA_METADATA = dataset_metadata.DatasetMetadata(
    schema_utils.schema_from_feature_spec(RAW_DATA_FEATURE_SPEC)
)

with beam.Pipeline() as pipeline:
    with tft_beam.Context(temp_dir=tempfile.mkdtemp()):
        raw_data = pipeline | 'ReadTrainData' >> beam.io.ReadFromParquet('data.parquet')
        raw_dataset = (raw_data, RAW_DATA_METADATA)
        transformed_dataset, transform_fn = (
            raw_dataset | tft_beam.AnalyzeAndTransformDataset(preprocessing_fn)
        )
        transformed_data, transformed_metadata = transformed_dataset
        transformed_data_coder = tft.coders.ExampleProtoCoder(transformed_metadata.schema)
I get this error: TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
It seems that outputs[key].shape is (None,).
Any suggestions?
Package versions:
tensorflow==2.1.0
tensorflow-transform==0.21.0
pandas==1.0.0
numpy==1.18.1
apache-beam==2.19.0
Full error message:
WARNING: Logging before flag parsing goes to stderr.
W0204 10:36:03.793034 140735593104256 interactive_environment.py:113] Dependencies required for Interactive Beam PCollection visualization are not available, please use: `pip install apache-beam[interactive]` to install necessary dependencies to enable all data visualization features.
W0204 10:36:03.793169 140735593104256 interactive_environment.py:125] You have limited Interactive Beam features since your ipython kernel is not connected any notebook frontend.
W0204 10:36:03.929135 140735593104256 impl.py:360] Tensorflow version (2.1.0) found. Note that Tensorflow Transform support for TF 2.0 is currently in beta, and features such as tf.function may not work as intended.
W0204 10:36:03.929914 140735593104256 impl.py:360] Tensorflow version (2.1.0) found. Note that Tensorflow Transform support for TF 2.0 is currently in beta, and features such as tf.function may not work as intended.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-5-465b3f61784c> in <module>
17 raw_data = pipeline | 'ReadTrainData' >> beam.io.ReadFromParquet('data.parquet')
18 raw_dataset = (raw_data, RAW_DATA_METADATA)
---> 19 transformed_dataset, transform_fn = (raw_dataset | tft_beam.AnalyzeAndTransformDataset(preprocessing_fn))
20 transformed_data, transformed_metadata = transformed_dataset
21 transformed_data_coder = tft.coders.ExampleProtoCoder(transformed_metadata.schema)
/usr/local/lib/python3.7/site-packages/apache_beam/transforms/ptransform.py in __ror__(self, left, label)
547 pvalueish = _SetInputPValues().visit(pvalueish, replacements)
548 self.pipeline = p
--> 549 result = p.apply(self, pvalueish, label)
550 if deferred:
551 return result
/usr/local/lib/python3.7/site-packages/apache_beam/pipeline.py in apply(self, transform, pvalueish, label)
575 transform.type_check_inputs(pvalueish)
576
--> 577 pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
578
579 if type_options is not None and type_options.pipeline_type_check:
/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py in apply(self, transform, input, options)
193 m = getattr(self, 'apply_%s' % cls.__name__, None)
194 if m:
--> 195 return m(transform, input, options)
196 raise NotImplementedError(
197 'Execution of [%s] not implemented in runner %s.' % (transform, self))
/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py in apply_PTransform(self, transform, input, options)
223 def apply_PTransform(self, transform, input, options):
224 # The base case of apply is to call the transform's expand.
--> 225 return transform.expand(input)
226
227 def run_transform(self,
/usr/local/lib/python3.7/site-packages/tensorflow_transform/beam/impl.py in expand(self, dataset)
861 # e.g. caching the values of expensive computations done in AnalyzeDataset.
862 transform_fn = (
--> 863 dataset | 'AnalyzeDataset' >> AnalyzeDataset(self._preprocessing_fn))
864
865 if Context.get_use_deep_copy_optimization():
/usr/local/lib/python3.7/site-packages/apache_beam/transforms/ptransform.py in __ror__(self, pvalueish, _unused)
987
988 def __ror__(self, pvalueish, _unused=None):
--> 989 return self.transform.__ror__(pvalueish, self.label)
990
991 def expand(self, pvalue):
/usr/local/lib/python3.7/site-packages/apache_beam/transforms/ptransform.py in __ror__(self, left, label)
547 pvalueish = _SetInputPValues().visit(pvalueish, replacements)
548 self.pipeline = p
--> 549 result = p.apply(self, pvalueish, label)
550 if deferred:
551 return result
/usr/local/lib/python3.7/site-packages/apache_beam/pipeline.py in apply(self, transform, pvalueish, label)
534 try:
535 old_label, transform.label = transform.label, label
--> 536 return self.apply(transform, pvalueish)
537 finally:
538 transform.label = old_label
/usr/local/lib/python3.7/site-packages/apache_beam/pipeline.py in apply(self, transform, pvalueish, label)
575 transform.type_check_inputs(pvalueish)
576
--> 577 pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
578
579 if type_options is not None and type_options.pipeline_type_check:
/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py in apply(self, transform, input, options)
193 m = getattr(self, 'apply_%s' % cls.__name__, None)
194 if m:
--> 195 return m(transform, input, options)
196 raise NotImplementedError(
197 'Execution of [%s] not implemented in runner %s.' % (transform, self))
/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py in apply_PTransform(self, transform, input, options)
223 def apply_PTransform(self, transform, input, options):
224 # The base case of apply is to call the transform's expand.
--> 225 return transform.expand(input)
226
227 def run_transform(self,
/usr/local/lib/python3.7/site-packages/tensorflow_transform/beam/impl.py in expand(self, dataset)
808 input_values, input_metadata = dataset
809 result, cache = super(AnalyzeDataset, self).expand((input_values, None,
--> 810 None, input_metadata))
811 assert not cache
812 return result
/usr/local/lib/python3.7/site-packages/tensorflow_transform/beam/impl.py in expand(self, dataset)
681 copied_inputs = impl_helper.copy_tensors(input_signature)
682
--> 683 output_signature = self._preprocessing_fn(copied_inputs)
684
685 # At this point we check that the preprocessing_fn has at least one
<ipython-input-2-205d9abf4136> in preprocessing_fn(inputs)
9 outputs[key] = tf.where(
10 tf.math.is_nan(outputs[key]),
---> 11 tf.constant(value, shape=outputs[key].shape),
12 outputs[key]
13 )
/usr/local/lib/python3.7/site-packages/tensorflow_core/python/framework/constant_op.py in constant(value, dtype, shape, name)
256 """
257 return _constant_impl(value, dtype, shape, name, verify_shape=False,
--> 258 allow_broadcast=True)
259
260
/usr/local/lib/python3.7/site-packages/tensorflow_core/python/framework/constant_op.py in _constant_impl(value, dtype, shape, name, verify_shape, allow_broadcast)
294 tensor_util.make_tensor_proto(
295 value, dtype=dtype, shape=shape, verify_shape=verify_shape,
--> 296 allow_broadcast=allow_broadcast))
297 dtype_value = attr_value_pb2.AttrValue(type=tensor_value.tensor.dtype)
298 const_tensor = g._create_op_internal( # pylint: disable=protected-access
/usr/local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py in make_tensor_proto(values, dtype, shape, verify_shape, allow_broadcast)
446 # If shape is None, numpy.prod returns None when dtype is not set, but
447 # raises exception when dtype is set to np.int64
--> 448 if shape is not None and np.prod(shape, dtype=np.int64) == 0:
449 nparray = np.empty(shape, dtype=np_dt)
450 else:
<__array_function__ internals> in prod(*args, **kwargs)
/usr/local/lib/python3.7/site-packages/numpy/core/fromnumeric.py in prod(a, axis, dtype, out, keepdims, initial, where)
2960 """
2961 return _wrapreduction(a, np.multiply, 'prod', axis, dtype, out,
-> 2962 keepdims=keepdims, initial=initial, where=where)
2963
2964
/usr/local/lib/python3.7/site-packages/numpy/core/fromnumeric.py in _wrapreduction(obj, ufunc, method, axis, dtype, out, **kwargs)
88 return reduction(axis=axis, out=out, **passkwargs)
89
---> 90 return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
91
92
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
Best answer
The problem was that I set the shape in tf.constant(value, shape=outputs[key].shape). Since the batch dimension is unknown at graph-construction time, that shape is (None,). I should simply use tf.constant(value, dtype=tf.float32) and let the scalar broadcast.
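A minimal sketch of why the fix works, using NumPy's np.where as a stand-in: tf.where follows the same broadcasting rule, so a scalar constant is broadcast against the (None,) batch dimension and no static shape needs to be supplied.

```python
import numpy as np

b = np.array([1.0, 1.0, 1.0, np.nan, 0.0, np.nan])

# A scalar fill value broadcasts to whatever shape the condition has,
# which is why tf.constant(value, dtype=tf.float32) works even when
# the tensor's static shape is (None,).
imputed = np.where(np.isnan(b), 1.0, b)
print(imputed.tolist())  # [1.0, 1.0, 1.0, 1.0, 0.0, 1.0]
```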
A similar question about "python - TensorFlow: TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'" can be found on Stack Overflow: https://stackoverflow.com/questions/60048397/