I have an SFrame that looks like this when printed with sf.print_rows(10):
+--------------+---------------+-------+-------------------------------+
| Dataset | Domain | Score | Sent1 |
+--------------+---------------+-------+-------------------------------+
| STS2012-gold | surprise.OnWN | 5.0 | render one language in ano... |
| STS2012-gold | surprise.OnWN | 3.25 | nations unified by shared ... |
| STS2012-gold | surprise.OnWN | 3.25 | convert into absorbable su... |
| STS2012-gold | surprise.OnWN | 4.0 | devote or adapt exclusivel... |
| STS2012-gold | surprise.OnWN | 3.25 | elevated wooden porch of a... |
| STS2012-gold | surprise.OnWN | 4.0 | either half of an archery bow |
| STS2012-gold | surprise.OnWN | 3.333 | a removable device that is... |
| STS2012-gold | surprise.OnWN | 4.75 | restrict or confine |
| STS2012-gold | surprise.OnWN | 0.5 | orient, be positioned |
| STS2012-gold | surprise.OnWN | 4.75 | Bring back to life, return... |
+--------------+---------------+-------+-------------------------------+
+-------------------------------+-------------------------------+
| Sent2 | Sent1_tokenized |
+-------------------------------+-------------------------------+
| restate (words) from one l... | [render, one, language, in... |
| a group of nations having ... | [nations, unified, by, sha... |
| soften or disintegrate by ... | [convert, into, absorbable... |
| devote oneself to a specia... | [devote, or, adapt, exclus... |
| a porch that resembles the... | [elevated, wooden, porch, ... |
| either of the two halves o... | [either, half, of, an, arc... |
| a supplementary part or ac... | [a, removable, device, tha... |
| place limits on (extent or... | [restrict, or, confine] |
| be opposite. | [orient,, be, positioned] |
| cause to become alive again. | [Bring, back, to, life,, r... |
+-------------------------------+-------------------------------+
+-------------------------------+-----------+-----------+----------------------+
| Sent2_tokenized | Sent1_len | Sent2_len | NGRAM-cosChar2ngrams |
+-------------------------------+-----------+-----------+----------------------+
| [restate, (words), from, o... | 6 | 8 | 0.82090085 |
| [a, group, of, nations, ha... | 8 | 7 | 0.53250804 |
| [soften, or, disintegrate,... | 11 | 11 | 0.43274232 |
| [devote, oneself, to, a, s... | 10 | 8 | 0.47759567 |
| [a, porch, that, resembles... | 6 | 9 | 0.38885689 |
| [either, of, the, two, hal... | 6 | 12 | 0.55555556 |
| [a, supplementary, part, o... | 10 | 5 | 0.44963552 |
| [place, limits, on, (exten... | 3 | 6 | 0.27124449 |
| [be, opposite.] | 3 | 2 | 0.43528575 |
| [cause, to, become, alive,... | 8 | 5 | 0.37047929 |
+-------------------------------+-----------+-----------+----------------------+
+----------------------+----------------------+----------------------+
| NGRAM-cosChar3ngrams | NGRAM-cosChar4ngrams | NGRAM-cosChar5ngrams |
+----------------------+----------------------+----------------------+
| 0.74964917 | 0.71490469 | 0.67925959 |
| 0.36701702 | 0.28941438 | 0.23635427 |
| 0.25899951 | 0.21053227 | 0.17058877 |
| 0.26248718 | 0.20518234 | 0.14285714 |
| 0.17107978 | 0.12049505 | 0.09320546 |
| 0.40754381 | 0.24715577 | 0.11547005 |
| 0.21997067 | 0.17554945 | 0.15450786 |
| 0.13284223 | 0.09284767 | 0.048795 |
| 0.31426968 | 0.17149859 | 0.09449112 |
| 0.0632772 | 0.03402069 | 0.0 |
+----------------------+----------------------+----------------------+
+---------------------+---------------------+---------------------+---------------------+
[19097 rows x 134 columns]
But when I try to save it to CSV with sf.save('trainers.csv', format='csv'), it throws an error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-23-f82bcb3fa197> in <module>()
----> 1 sts.save('trainers.csv', format='csv')
/usr/local/lib/python2.7/dist-packages/graphlab/data_structures/sframe.pyc in save(self, filename, format)
2924 self.export_json(url)
2925 else:
-> 2926 raise ValueError("Unsupported format: {}".format(format))
2927
2928 def export_csv(self, filename, delimiter=',', line_terminator='\n',
/usr/local/lib/python2.7/dist-packages/graphlab/cython/context.pyc in __exit__(self, exc_type, exc_value, traceback)
47 if not self.show_cython_trace:
48 # To hide cython trace, we re-raise from here
---> 49 raise exc_type(exc_value)
50 else:
51 # To show the full trace, we do nothing and let exception propagate
RuntimeError: Runtime Exception. Traceback (most recent call last):
File "<ipython-input-5-e29b4d4eba06>", line 20, in <lambda>
ZeroDivisionError: division by zero
I printed n rows at a time, e.g. sf.print_rows(10) and sf.print_rows(100), and the error is thrown at sf.print_rows(129):
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-24-13550768dbcd> in <module>()
----> 1 sts.print_rows(129)
/usr/local/lib/python2.7/dist-packages/graphlab/data_structures/sframe.pyc in print_rows(self, num_rows, num_columns, max_column_width, max_row_width, output_file)
2226 max_row_width = max(max_row_width, max_column_width + 1)
2227
-> 2228 printed_sf = self._imagecols_to_stringcols(num_rows)
2229 row_of_tables = printed_sf.__get_pretty_tables__(wrap_text=False,
2230 max_rows_to_display=num_rows,
/usr/local/lib/python2.7/dist-packages/graphlab/data_structures/sframe.pyc in _imagecols_to_stringcols(self, num_rows)
2250 if t in image_column_names:
2251 printed_sf[t] = self[t].astype(str)
-> 2252 return printed_sf.head(num_rows)
2253
2254 def __str_impl__(self, num_rows=10, footer=True):
/usr/local/lib/python2.7/dist-packages/graphlab/data_structures/sframe.pyc in head(self, n)
2454 tail, print_rows
2455 """
-> 2456 return SFrame(_proxy=self.__proxy__.head(n))
2457
2458 def to_dataframe(self):
graphlab/cython/cy_sframe.pyx in graphlab.cython.cy_sframe.UnitySFrameProxy.head()
graphlab/cython/cy_sframe.pyx in graphlab.cython.cy_sframe.UnitySFrameProxy.head()
RuntimeError: Runtime Exception. Traceback (most recent call last):
File "<ipython-input-5-e29b4d4eba06>", line 20, in <lambda>
ZeroDivisionError: division by zero
So I ran sf.fillna(c, 0) on every column:
for c in sts.column_names():
    sts = sts.fillna(c, 0)
That throws another error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-26-e63cf73308dd> in <module>()
1 for c in sts.column_names():
----> 2 sts = sts.fillna(c, 0)
/usr/local/lib/python2.7/dist-packages/graphlab/data_structures/sframe.pyc in fillna(self, column, value)
5652 raise TypeError("Must give column name as a str")
5653 ret = self[self.column_names()]
-> 5654 ret[column] = ret[column].fillna(value)
5655 return ret
5656
/usr/local/lib/python2.7/dist-packages/graphlab/data_structures/sarray.pyc in fillna(self, value)
2439
2440 with cython_context():
-> 2441 return SArray(_proxy = self.__proxy__.fill_missing_values(value))
2442
2443 def topk_index(self, topk=10, reverse=False):
/usr/local/lib/python2.7/dist-packages/graphlab/cython/context.pyc in __exit__(self, exc_type, exc_value, traceback)
47 if not self.show_cython_trace:
48 # To hide cython trace, we re-raise from here
---> 49 raise exc_type(exc_value)
50 else:
51 # To show the full trace, we do nothing and let exception propagate
RuntimeError: Runtime Exception. Default value must be convertible to column type
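How can I find the specific rows that raise an error when saving a Graphlab SFrame?
How can I fix such a row? Can I use fillna() to replace the problematic columns in that row? I can't really use dropna() to discard these rows, because I need to keep track of the problematic rows.
How can I find the rows that give me these errors (the ZeroDivisionErrors), and how can I correct them or fill those columns with zeros?
But even with dropna(), I end up with the same error:
sf.dropna()
sf.save('trainers.csv', format='csv')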
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-28-f82bcb3fa197> in <module>()
----> 1 sts.save('trainers.csv', format='csv')
/usr/local/lib/python2.7/dist-packages/graphlab/data_structures/sframe.pyc in save(self, filename, format)
2924 self.export_json(url)
2925 else:
-> 2926 raise ValueError("Unsupported format: {}".format(format))
2927
2928 def export_csv(self, filename, delimiter=',', line_terminator='\n',
/usr/local/lib/python2.7/dist-packages/graphlab/cython/context.pyc in __exit__(self, exc_type, exc_value, traceback)
47 if not self.show_cython_trace:
48 # To hide cython trace, we re-raise from here
---> 49 raise exc_type(exc_value)
50 else:
51 # To show the full trace, we do nothing and let exception propagate
RuntimeError: Runtime Exception. Traceback (most recent call last):
File "<ipython-input-5-e29b4d4eba06>", line 20, in <lambda>
ZeroDivisionError: division by zero
Strangely, I cannot iterate over the SFrame when I try this:
for i in sf:
    print i
It throws this error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-29-d2d0035d7bbe> in <module>()
----> 1 for i in sts:
2 print i
/usr/local/lib/python2.7/dist-packages/graphlab/data_structures/sframe.pyc in generator()
3712 def generator():
3713 elems_at_a_time = 262144
-> 3714 self.__proxy__.begin_iterator()
3715 ret = self.__proxy__.iterator_get_next(elems_at_a_time)
3716 column_names = self.column_names()
graphlab/cython/cy_sframe.pyx in graphlab.cython.cy_sframe.UnitySFrameProxy.begin_iterator()
graphlab/cython/cy_sframe.pyx in graphlab.cython.cy_sframe.UnitySFrameProxy.begin_iterator()
RuntimeError: Runtime Exception. Traceback (most recent call last):
File "<ipython-input-5-e29b4d4eba06>", line 10, in <lambda>
TypeError: 'NoneType' object is not iterable
It gets stranger: I cannot retrieve a specific row with sf[num], but I can take a sub-SFrame and then retrieve that same row from it. So this:
print sf[25]
breaks and throws:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-62-6bc8898704c0> in <module>()
----> 1 print sts[25]
/usr/local/lib/python2.7/dist-packages/graphlab/data_structures/sframe.pyc in __getitem__(self, key)
3595 ub = min(sf_len, lb + block_size)
3596
-> 3597 val_list = list(SFrame(_proxy = self.__proxy__.copy_range(lb, 1, ub)))
3598 self._cache["getitem_cache"] = (lb, ub, val_list)
3599 return val_list[key - lb]
graphlab/cython/cy_sframe.pyx in graphlab.cython.cy_sframe.UnitySFrameProxy.copy_range()
graphlab/cython/cy_sframe.pyx in graphlab.cython.cy_sframe.UnitySFrameProxy.copy_range()
RuntimeError: Runtime Exception. Traceback (most recent call last):
File "<ipython-input-5-e29b4d4eba06>", line 10, in <lambda>
TypeError: 'NoneType' object is not iterable
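But when I extract a subset first and then print, it works. The code below retrieves the element at index 25 that raised the error above:
x = sf[:30]
print x[25]
Is there a reason why the earlier code with sf[25] throws the NoneType error? sf[0] through sf[24] work, but any index from 25 upward does not.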
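Apparently, iterating over the SFrame in chunks like this and dumping each row as a str sort of works:
fout = open('superbad.txt', 'w')
sflen = len(sf)
i = 0
while i < sflen:
    # Take the next chunk of at most 100 rows.
    m = i+100 if i+100 < sflen else sflen
    x = sf[i:m]
    for j in x:
        fout.write(str(j) +'\n\n')
    i = m  # advance to the next chunk; without this the loop never ends
fout.close()
That's weird. Why does iterating in chunks and dumping to strings work?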
Best answer
The problem is a divide-by-zero error raised while running an apply (somewhere before the save):
RuntimeError: Runtime Exception. Traceback (most recent call last):
File "<ipython-input-5-e29b4d4eba06>", line 20, in <lambda>
ZeroDivisionError: division by zero
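This happens because of lazy evaluation ( https://en.wikipedia.org/wiki/Lazy_evaluation ). As an example, suppose I start with an SFrame that has a single column and apply a division to it:
import graphlab as gl
sf = gl.SFrame({'x': range(10000, -1, -1)})
# Assign the result back so the lazy computation becomes part of sf.
sf['x'] = sf['x'].apply(lambda x: 1.0/x)
At this point the last row of the SFrame holds the value 1.0/0, which is an error, but it has not been evaluated yet. The save method triggers materialization, i.e. the actual computation of every row in the data, and that is when the error occurs. You can force materialization yourself by calling __materialize__:
sf.__materialize__()
This raises the following error.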
RuntimeError: Runtime Exception. Traceback (most recent call last):
File "<ipython-input-55-5af90e232e2d>", line 1, in <lambda>
ZeroDivisionError: float division by zero
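The same deferred failure can be mimicked in plain Python with a generator. This is only an analogy for SFrame's lazy evaluation, not GraphLab code; the names xs, lazy and first_rows are made up for the illustration.

```python
from itertools import islice

# Same values as the SFrame example: 10000 down to 0, so the last value is 0.
xs = range(10000, -1, -1)

# Defining the generator performs no division yet, so nothing is raised here.
lazy = (1.0 / x for x in xs)

# Looking at the first few values is fine: only those rows are computed.
first_rows = list(islice(lazy, 5))

# Consuming everything forces the computation of every remaining row,
# which is the moment the division by zero finally surfaces.
try:
    rest = list(lazy)
    error = None
except ZeroDivisionError as exc:
    error = exc

print(first_rows[0], type(error).__name__)
```

The error appears far from the line that defined the computation, which is the same effect seen when save reports a failure that really belongs to an earlier apply.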
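Lazy evaluation and query planning are important performance optimizations, and they are one of the reasons SFrame is fast and scalable. Unfortunately, harder error tracking is one of the annoyances that come with them, but once you understand how it works you get used to it.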
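One way to reduce that annoyance is to make the applied function defensive, so that a bad row yields a default value instead of an exception. The sketch below is plain Python under stated assumptions: the lambdas that actually failed in the question are not shown, so safe_inverse and its default of 0.0 are hypothetical stand-ins for whatever division they perform.

```python
def safe_inverse(x, default=0.0):
    """Return 1.0/x, or `default` when x is zero or missing (hypothetical helper)."""
    if x is None or x == 0:
        return default
    return 1.0 / x

# Toy column containing one zero and one missing value.
values = [4, 2, 0, None, 1]

# Compute the guarded result for every row.
inverses = [safe_inverse(v) for v in values]

# Keep track of which rows needed the default, instead of dropping them.
bad_rows = [i for i, v in enumerate(values) if v is None or v == 0]

print(inverses, bad_rows)
```

With an SFrame, a function like this could presumably be passed to apply in place of the original lambda; that wiring is an assumption and is not tested here.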
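The head() function does not trigger full materialization, so you can call it on as many rows as you like, increasing the count until you hit the error.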
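Instead of increasing the row count by hand, the search can be done by bisection. The following is a minimal sketch in plain Python, assuming that evaluating the first k rows fails exactly when k reaches past the first bad row; first_failing_prefix and evaluate_prefix are hypothetical names, and a list stands in for the lazy SFrame column.

```python
def first_failing_prefix(n_rows, evaluate_prefix):
    """Return the smallest k in 1..n_rows for which evaluate_prefix(k) raises,
    or None if evaluating all n_rows succeeds."""
    def fails(k):
        try:
            evaluate_prefix(k)
            return False
        except Exception:
            return True

    if not fails(n_rows):
        return None
    lo, hi = 1, n_rows  # invariant: fails(hi) is True, no k below lo fails
    while lo < hi:
        mid = (lo + hi) // 2
        if fails(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

# Stand-in data: the zero sits in the very last row, as in the example above.
xs = list(range(10000, -1, -1))

def evaluate_prefix(k):
    # Eagerly compute the first k rows; raises if any of them divides by zero.
    return [1.0 / x for x in xs[:k]]

k = first_failing_prefix(len(xs), evaluate_prefix)
bad_row = k - 1  # zero-based index of the first failing row
print(k, bad_row, xs[bad_row])
```

For an SFrame, evaluate_prefix could wrap a call such as sf.head(k); that is an assumption based on the behaviour described above and is untested here. The search finds only the first bad row, so it has to be repeated after each fix if several rows are broken.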
Regarding "python - How to find the specific row that raises an error when saving in Graphlab SFrame?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/34654901/