Installing Scrapy on Windows 7 x64
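Scrapy is a web-crawling framework written in Python. The installation guides I found online all seemed overly complicated, and many were years out of date, so I recorded my own installation process here.
First install Python from https://www.python.org/downloads/release/python-2710/. Pick the installer that matches your system: 64-bit (Windows x86-64 MSI installer) or 32-bit (Windows x86 MSI installer).
Python 3.6 is now the mainstream release, so installing a Python 3 version is recommended.
Once Python is installed you can install Scrapy. I recommend installing with pip: Scrapy depends on many additional libraries, and pip installs all of them for you, so you don't have to hunt them down one by one.
pip is already present once Python is installed, so nothing extra is needed. Following the method recommended on the Scrapy website, type pip install scrapy at the command prompt (Figure 1) and wait for it to finish.
Figure 1
After installation, run pip list to see the installed libraries (Figure 2). The list is long; pip really is a handy tool.
Figure 2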
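The information that pip list prints can also be read from inside Python. The following is a minimal sketch, assuming Python 3.8 or later (the importlib.metadata module does not exist in the Python 2.7 used elsewhere in this article); the function name installed_packages is made up for illustration.

```python
from importlib import metadata


def installed_packages():
    """Return {distribution name: version} for the current interpreter."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries whose metadata is missing or broken
            packages[name] = dist.version
    return packages


if __name__ == "__main__":
    # Print one "name version" pair per line, sorted case-insensitively.
    for name, version in sorted(installed_packages().items(),
                                key=lambda item: item[0].lower()):
        print(name, version)
```

If Scrapy installed correctly, its name should appear among the keys of the returned dictionary, along with the libraries pip pulled in for it.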
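Now let's try creating a crawler project. Following the documentation, run scrapy startproject tutorial; the project directory is created (Figure 3), so things look fine. To verify that the installation really works, though, we need to run something. When a project is first created, Scrapy suggests trying an example spider (Figure 4). Following that prompt, run scrapy genspider example example.com to create a spider, then run scrapy crawl example to start it.
Figure 3
Figure 4
The result (Figure 5) was an error, apparently caused by a missing win32api module, so I downloaded pywin32 from http://sourceforge.net/projects/pywin32/files/pywin32/build%20219/. When downloading, make sure the build matches your Python version.
Figure 5
With win32api installed, I ran the spider again (Figure 6) and this time it succeeded, so the installation appears to be working.
Figure 6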
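A quick way to confirm that scrapy startproject produced a usable project is to look for a few files it normally creates. The sketch below is illustrative only: the EXPECTED list is an assumption about the usual project template (a top-level scrapy.cfg plus an inner package with settings.py and a spiders directory), not something stated in this article, and it may need adjusting for your Scrapy version.

```python
from pathlib import Path

# Assumed layout of a freshly generated project; adjust for your Scrapy version.
EXPECTED = ("scrapy.cfg", "{name}/settings.py", "{name}/spiders")


def missing_project_files(project_dir, name):
    """Return the expected relative paths that are absent under project_dir."""
    root = Path(project_dir)
    wanted = [entry.format(name=name) for entry in EXPECTED]
    return [rel for rel in wanted if not (root / rel).exists()]


if __name__ == "__main__":
    # "tutorial" is the project name used in the article.
    print(missing_project_files("tutorial", "tutorial"))
```

An empty list means every expected path was found; anything else names what is missing.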
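To sum up: when I first searched for information and saw guides saying to install this library and that library first, I was quite nervous. It turned out not to be very complicated. pip took care of most issues, and the rest were specific problems to investigate one at a time, none of them too hard to solve. One complaint: online tutorials copy each other heavily. A search returns plenty of results, but most are nearly identical and few are truly valuable. It is hard going without an expert to lean on.
One last note: at the time of writing, Scrapy did not yet support Python 3.x, and I used Python 2.7. If you run into inexplicable problems, first check whether you installed the wrong Python version.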
The following is a supplementary article contributed by another user.
Environment
Windows 7 64-bit, Python 2.7.6 64-bit
Installing Python:
C:\Users\Administrator>python
Python 2.7.6 (default, Nov 10 2013, 19:24:24) [MSC v.1500 64 bit (AMD64)] on win32
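Note that the banner ends with "on win32" even for a 64-bit build, so the platform name alone does not tell you which installers to download. The sketch below shows one way to check the interpreter's bitness and build the matching suffix used in installer names such as lxml-3.2.4.win-amd64-py2.7.exe. The function names are made up for illustration.

```python
import struct
import sys


def interpreter_bits():
    """Return 32 or 64, based on the pointer size of this interpreter."""
    return struct.calcsize("P") * 8


def installer_tag():
    """Build a tag such as 'win-amd64-py2.7' or 'win32-py2.7'."""
    arch = "win-amd64" if interpreter_bits() == 64 else "win32"
    return "{0}-py{1}.{2}".format(arch, sys.version_info[0], sys.version_info[1])


if __name__ == "__main__":
    print(interpreter_bits(), installer_tag())
```

The tag only mirrors the naming pattern of the Windows installers listed in this article; it says nothing about whether a given package actually publishes a build for your Python version.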
Installing easy_install
Save ez_setup.py locally, for example on drive D: (if the script is no longer available, see http://www.zzvips.com/article/157401.html).
#!/usr/bin/env python

"""
Setuptools bootstrapping installer.

Maintained at https://github.com/pypa/setuptools/tree/bootstrap.

Run this script to install or upgrade setuptools.

This method is DEPRECATED. Check https://github.com/pypa/setuptools/issues/581 for more details.
"""

import os
import shutil
import sys
import tempfile
import zipfile
import optparse
import subprocess
import platform
import textwrap
import contextlib

from distutils import log

try:
    from urllib.request import urlopen
except ImportError:
    from urllib2 import urlopen

try:
    from site import USER_SITE
except ImportError:
    USER_SITE = None

# 33.1.1 is the last version that supports setuptools self upgrade/installation.
DEFAULT_VERSION = "33.1.1"
DEFAULT_URL = "https://pypi.io/packages/source/s/setuptools/"
DEFAULT_SAVE_DIR = os.curdir
DEFAULT_DEPRECATION_MESSAGE = "ez_setup.py is deprecated and when using it setuptools will be pinned to {0} since it's the last version that supports setuptools self upgrade/installation, check https://github.com/pypa/setuptools/issues/581 for more info; use pip to install setuptools"

MEANINGFUL_INVALID_ZIP_ERR_MSG = 'Maybe {0} is corrupted, delete it and try again.'

log.warn(DEFAULT_DEPRECATION_MESSAGE.format(DEFAULT_VERSION))


def _python_cmd(*args):
    """
    Execute a command.

    Return True if the command succeeded.
    """
    args = (sys.executable,) + args
    return subprocess.call(args) == 0


def _install(archive_filename, install_args=()):
    """Install Setuptools."""
    with archive_context(archive_filename):
        # installing
        log.warn('Installing Setuptools')
        if not _python_cmd('setup.py', 'install', *install_args):
            log.warn('Something went wrong during the installation.')
            log.warn('See the error message above.')
            # exitcode will be 2
            return 2


def _build_egg(egg, archive_filename, to_dir):
    """Build Setuptools egg."""
    with archive_context(archive_filename):
        # building an egg
        log.warn('Building a Setuptools egg in %s', to_dir)
        _python_cmd('setup.py', '-q', 'bdist_egg', '--dist-dir', to_dir)
    # returning the result
    log.warn(egg)
    if not os.path.exists(egg):
        raise IOError('Could not build the egg.')


class ContextualZipFile(zipfile.ZipFile):
    """Supplement ZipFile class to support context manager for Python 2.6."""

    def __enter__(self):
        return self

    def __exit__(self, type, value, traceback):
        self.close()

    def __new__(cls, *args, **kwargs):
        """Construct a ZipFile or ContextualZipFile as appropriate."""
        if hasattr(zipfile.ZipFile, '__exit__'):
            return zipfile.ZipFile(*args, **kwargs)
        return super(ContextualZipFile, cls).__new__(cls)


@contextlib.contextmanager
def archive_context(filename):
    """
    Unzip filename to a temporary directory, set to the cwd.

    The unzipped target is cleaned up after.
    """
    tmpdir = tempfile.mkdtemp()
    log.warn('Extracting in %s', tmpdir)
    old_wd = os.getcwd()
    try:
        os.chdir(tmpdir)
        try:
            with ContextualZipFile(filename) as archive:
                archive.extractall()
        except zipfile.BadZipfile as err:
            if not err.args:
                err.args = ('', )
            err.args = err.args + (
                MEANINGFUL_INVALID_ZIP_ERR_MSG.format(filename),
            )
            raise

        # going in the directory
        subdir = os.path.join(tmpdir, os.listdir(tmpdir)[0])
        os.chdir(subdir)
        log.warn('Now working in %s', subdir)
        yield

    finally:
        os.chdir(old_wd)
        shutil.rmtree(tmpdir)


def _do_download(version, download_base, to_dir, download_delay):
    """Download Setuptools."""
    py_desig = 'py{sys.version_info[0]}.{sys.version_info[1]}'.format(sys=sys)
    tp = 'setuptools-{version}-{py_desig}.egg'
    egg = os.path.join(to_dir, tp.format(**locals()))
    if not os.path.exists(egg):
        archive = download_setuptools(version, download_base,
                                      to_dir, download_delay)
        _build_egg(egg, archive, to_dir)
    sys.path.insert(0, egg)

    # Remove previously-imported pkg_resources if present (see
    # https://bitbucket.org/pypa/setuptools/pull-request/7/ for details).
    if 'pkg_resources' in sys.modules:
        _unload_pkg_resources()

    import setuptools
    setuptools.bootstrap_install_from = egg


def use_setuptools(
        version=DEFAULT_VERSION, download_base=DEFAULT_URL,
        to_dir=DEFAULT_SAVE_DIR, download_delay=15):
    """
    Ensure that a setuptools version is installed.

    Return None. Raise SystemExit if the requested version
    or later cannot be installed.
    """
    to_dir = os.path.abspath(to_dir)

    # prior to importing, capture the module state for
    # representative modules.
    rep_modules = 'pkg_resources', 'setuptools'
    imported = set(sys.modules).intersection(rep_modules)

    try:
        import pkg_resources
        pkg_resources.require("setuptools>=" + version)
        # a suitable version is already installed
        return
    except ImportError:
        # pkg_resources not available; setuptools is not installed; download
        pass
    except pkg_resources.DistributionNotFound:
        # no version of setuptools was found; allow download
        pass
    except pkg_resources.VersionConflict as VC_err:
        if imported:
            _conflict_bail(VC_err, version)

        # otherwise, unload pkg_resources to allow the downloaded version to
        # take precedence.
        del pkg_resources
        _unload_pkg_resources()

    return _do_download(version, download_base, to_dir, download_delay)


def _conflict_bail(VC_err, version):
    """
    Setuptools was imported prior to invocation, so it is
    unsafe to unload it. Bail out.
    """
    conflict_tmpl = textwrap.dedent("""
        The required version of setuptools (>={version}) is not available,
        and can't be installed while this script is running. Please
        install a more recent version first, using
        'easy_install -U setuptools'.

        (Currently using {VC_err.args[0]!r})
        """)
    msg = conflict_tmpl.format(**locals())
    sys.stderr.write(msg)
    sys.exit(2)


def _unload_pkg_resources():
    sys.meta_path = [
        importer
        for importer in sys.meta_path
        if importer.__class__.__module__ != 'pkg_resources.extern'
    ]
    del_modules = [
        name for name in sys.modules
        if name.startswith('pkg_resources')
    ]
    for mod_name in del_modules:
        del sys.modules[mod_name]


def _clean_check(cmd, target):
    """
    Run the command to download target.

    If the command fails, clean up before re-raising the error.
    """
    try:
        subprocess.check_call(cmd)
    except subprocess.CalledProcessError:
        if os.access(target, os.F_OK):
            os.unlink(target)
        raise


def download_file_powershell(url, target):
    """
    Download the file at url to target using Powershell.

    Powershell will validate trust.
    Raise an exception if the command cannot complete.
    """
    target = os.path.abspath(target)
    ps_cmd = (
        "[System.Net.WebRequest]::DefaultWebProxy.Credentials = "
        "[System.Net.CredentialCache]::DefaultCredentials; "
        '(new-object System.Net.WebClient).DownloadFile("%(url)s", "%(target)s")'
        % locals()
    )
    cmd = [
        'powershell',
        '-Command',
        ps_cmd,
    ]
    _clean_check(cmd, target)


def has_powershell():
    """Determine if Powershell is available."""
    if platform.system() != 'Windows':
        return False
    cmd = ['powershell', '-Command', 'echo test']
    with open(os.path.devnull, 'wb') as devnull:
        try:
            subprocess.check_call(cmd, stdout=devnull, stderr=devnull)
        except Exception:
            return False
    return True
download_file_powershell.viable = has_powershell


def download_file_curl(url, target):
    cmd = ['curl', url, '--location', '--silent', '--output', target]
    _clean_check(cmd, target)


def has_curl():
    cmd = ['curl', '--version']
    with open(os.path.devnull, 'wb') as devnull:
        try:
            subprocess.check_call(cmd, stdout=devnull, stderr=devnull)
        except Exception:
            return False
    return True
download_file_curl.viable = has_curl


def download_file_wget(url, target):
    cmd = ['wget', url, '--quiet', '--output-document', target]
    _clean_check(cmd, target)


def has_wget():
    cmd = ['wget', '--version']
    with open(os.path.devnull, 'wb') as devnull:
        try:
            subprocess.check_call(cmd, stdout=devnull, stderr=devnull)
        except Exception:
            return False
    return True
download_file_wget.viable = has_wget


def download_file_insecure(url, target):
    """Use Python to download the file, without connection authentication."""
    src = urlopen(url)
    try:
        # Read all the data in one block.
        data = src.read()
    finally:
        src.close()

    # Write all the data in one block to avoid creating a partial file.
    with open(target, "wb") as dst:
        dst.write(data)
download_file_insecure.viable = lambda: True


def get_best_downloader():
    downloaders = (
        download_file_powershell,
        download_file_curl,
        download_file_wget,
        download_file_insecure,
    )
    viable_downloaders = (dl for dl in downloaders if dl.viable())
    return next(viable_downloaders, None)


def download_setuptools(
        version=DEFAULT_VERSION, download_base=DEFAULT_URL,
        to_dir=DEFAULT_SAVE_DIR, delay=15,
        downloader_factory=get_best_downloader):
    """
    Download setuptools from a specified location and return its filename.

    `version` should be a valid setuptools version number that is available
    as an sdist for download under the `download_base` URL (which should end
    with a '/'). `to_dir` is the directory where the egg will be downloaded.
    `delay` is the number of seconds to pause before an actual download
    attempt.

    ``downloader_factory`` should be a function taking no arguments and
    returning a function for downloading a URL to a target.
    """
    # making sure we use the absolute path
    to_dir = os.path.abspath(to_dir)
    zip_name = "setuptools-%s.zip" % version
    url = download_base + zip_name
    saveto = os.path.join(to_dir, zip_name)
    if not os.path.exists(saveto):  # Avoid repeated downloads
        log.warn("Downloading %s", url)
        downloader = downloader_factory()
        downloader(url, saveto)
    return os.path.realpath(saveto)


def _build_install_args(options):
    """
    Build the arguments to 'python setup.py install' on the setuptools package.

    Returns list of command line arguments.
    """
    return ['--user'] if options.user_install else []


def _parse_args():
    """Parse the command line for options."""
    parser = optparse.OptionParser()
    parser.add_option(
        '--user', dest='user_install', action='store_true', default=False,
        help='install in user site package')
    parser.add_option(
        '--download-base', dest='download_base', metavar="URL",
        default=DEFAULT_URL,
        help='alternative URL from where to download the setuptools package')
    parser.add_option(
        '--insecure', dest='downloader_factory', action='store_const',
        const=lambda: download_file_insecure, default=get_best_downloader,
        help='Use internal, non-validating downloader'
    )
    parser.add_option(
        '--version', help="Specify which version to download",
        default=DEFAULT_VERSION,
    )
    parser.add_option(
        '--to-dir',
        help="Directory to save (and re-use) package",
        default=DEFAULT_SAVE_DIR,
    )
    options, args = parser.parse_args()
    # positional arguments are ignored
    return options


def _download_args(options):
    """Return args for download_setuptools function from cmdline args."""
    return dict(
        version=options.version,
        download_base=options.download_base,
        downloader_factory=options.downloader_factory,
        to_dir=options.to_dir,
    )


def main():
    """Install or upgrade setuptools and EasyInstall."""
    options = _parse_args()
    archive = download_setuptools(**_download_args(options))
    return _install(archive, _build_install_args(options))


if __name__ == '__main__':
    sys.exit(main())
Run the following in cmd to install setuptools:
D:\>python ez_setup.py
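To see what the script will try to fetch before running it, the URL and local file name can be derived the same way its download_setuptools function does. This is a small sketch that copies the two default values from the script above; the helper name archive_location is made up for illustration, and the sketch does not check whether the URL still resolves.

```python
import os

# Defaults copied from the ez_setup.py listing above.
DEFAULT_VERSION = "33.1.1"
DEFAULT_URL = "https://pypi.io/packages/source/s/setuptools/"


def archive_location(version=DEFAULT_VERSION, download_base=DEFAULT_URL,
                     to_dir=os.curdir):
    """Return (download URL, local save path) for a setuptools archive."""
    zip_name = "setuptools-%s.zip" % version
    url = download_base + zip_name
    saveto = os.path.join(os.path.abspath(to_dir), zip_name)
    return url, saveto


if __name__ == "__main__":
    url, saveto = archive_location()
    print(url)
    print(saveto)
```

Because the script skips the download when the zip file already exists at the save path, placing a manually downloaded archive there is one way to work around a failing download.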
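Running the script produces the error "ascii codec can't decode byte 0xe8 in position 0: ordinal not in range(128)", meaning the ASCII codec cannot decode the byte 0xe8. To fix it, open Lib/mimetypes.py under the Python installation directory and add the following after the import urllib line:
reload(sys)
sys.setdefaultencoding('gbk')
This changes the default encoding to GBK. (Some online posts suggest utf8, but that does not work for this script; it has to be gbk.) Run python ez_setup.py again; if installation messages scroll past, setuptools was installed successfully. A Scripts folder now appears under the Python directory, and easy_install is inside it.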
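The error comes from decoding GBK-encoded bytes with the ASCII codec. The sketch below reproduces the same kind of failure in isolation. It is written for Python 3, where sys.setdefaultencoding no longer exists, so it illustrates the cause rather than the fix above. The sample text is a hypothetical stand-in for whatever non-ASCII value the system supplied.

```python
def try_decode(raw, encoding):
    """Return the decoded text, or None if raw is not valid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# Hypothetical non-ASCII value (three Chinese characters), stored as GBK
# bytes the way a Chinese-locale Windows system would typically supply it.
sample = "\u8ba1\u7b97\u673a"
raw = sample.encode("gbk")

if __name__ == "__main__":
    print(try_decode(raw, "ascii"))          # None: bytes >= 0x80 are not ASCII
    print(try_decode(raw, "gbk") == sample)  # True: GBK round-trips the text
```

This is why the choice of codec in the workaround matters: the bytes have to be decoded with the encoding they were actually written in.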
Installing Scrapy's dependencies
Scrapy's dependencies:
Install lxml-3.2.4.win32-py2.7.exe (on 64-bit systems, install lxml-3.2.4.win-amd64-py2.7.exe)
Install pywin32-218.win32-py2.7.exe (on 64-bit systems, install pywin32-218.win-amd64-py2.7.exe)
Install Twisted-13.2.0.win32-py2.7.exe (on 64-bit systems, install Twisted-13.2.0.win-amd64-py2.7.exe)
Install pyOpenSSL-0.13.1.win32-py2.7.exe (on 64-bit systems, install pyOpenSSL-0.13.1.win-amd64-py2.7.exe)
Copy zope.interface-4.0.5-py2.7-win32.egg to the C:\Python27\Scripts directory and run easy_install.exe zope.interface-4.0.5-py2.7-win32.egg
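To verify that the dependencies installed correctly:
Run python in cmd to open the Python console.
Run import lxml; if no error is reported, lxml is installed.
Run import twisted; if no error is reported, Twisted is installed.
Run import OpenSSL (the module name is case-sensitive); if no error is reported, pyOpenSSL is installed.
Run import zope.interface; if no error is reported, zope.interface is installed.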
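The manual import checks above can be run in one go. This is a minimal sketch; the helper name check_modules is made up for illustration, and the module list simply mirrors the dependencies named in this article.

```python
import importlib


def check_modules(names):
    """Map each module name to True if it imports cleanly, else False."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            results[name] = False
    return results


if __name__ == "__main__":
    wanted = ["lxml", "twisted", "OpenSSL", "zope.interface"]
    for name, ok in check_modules(wanted).items():
        print("%-16s %s" % (name, "OK" if ok else "MISSING"))
```

Any module reported as MISSING points to the installer that needs to be run again, usually because its bitness or Python version did not match the interpreter.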
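Installing Scrapy
Method 1: at the console, run easy_install scrapy
Method 2: extract scrapy-0.22.2.tar.gz and run python setup.py install in the extracted directory.
To check that Scrapy installed correctly, run scrapy in the cmd console; if no error is reported, the installation succeeded.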
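The same check can be scripted. The sketch below assumes Python 3.7 or later (for shutil.which and the capture_output argument), which is newer than the Python 2.7 used in this article; the helper names are made up for illustration. It looks for the scrapy executable on PATH and, if found, runs scrapy version.

```python
import shutil
import subprocess


def command_path(name):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


def scrapy_version():
    """Return the output of `scrapy version`, or None if it cannot be run."""
    exe = command_path("scrapy")
    if exe is None:
        return None
    result = subprocess.run([exe, "version"], capture_output=True, text=True)
    return result.stdout.strip() if result.returncode == 0 else None


if __name__ == "__main__":
    print(scrapy_version() or "scrapy not found on PATH")
```

If the command is not found even though installation succeeded, the Python Scripts directory is probably missing from the PATH environment variable.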
That concludes this article; I hope it is a useful reference for anyone who needs it.
Original link: https://blog.csdn.net/apple33556/article/details/46695077