
Python dependencies in Cloud DataFlow: requirements.txt works locally but not on the workers

Reposted. Author: 太空宇宙. Updated: 2023-11-03 21:42:52

I am trying to run my Cloud DataFlow job using a requirements.txt file, as described here:

https://cloud.google.com/dataflow/pipelines/dependencies-python

Rather than building all of OpenCV from source (which takes 20-30 minutes), I can install just the Python library.

On my Compute Engine instance, I can do this with pip:

root@fcfca6a4dad2:/DeepMeerkat# pip install opencv-python
Collecting opencv-python
Downloading opencv_python-3.2.0.7-cp27-cp27mu-manylinux1_x86_64.whl (6.7MB)
100% |################################| 6.7MB 163kB/s
Collecting numpy>=1.11.1 (from opencv-python)
Downloading numpy-1.13.0-cp27-cp27mu-manylinux1_x86_64.whl (16.6MB)
100% |################################| 16.6MB 68kB/s
Installing collected packages: numpy, opencv-python
Found existing installation: numpy 1.8.2
DEPRECATION: Uninstalling a distutils installed project (numpy) has been deprecated and will be removed in a future version. This is due to the fact that uninstalling a distutils project will only partially uninstall the project.
Uninstalling numpy-1.8.2:
Successfully uninstalled numpy-1.8.2
Successfully installed numpy-1.13.0 opencv-python-3.2.0.7

I can bundle it with a few other modules into a requirements file:

root@fcfca6a4dad2:/DeepMeerkat# pip install -r tests/prediction/requirements.txt
Requirement already satisfied: opencv-python in /usr/local/lib/python2.7/dist-packages (from -r tests/prediction/requirements.txt (line 1))
Collecting tensorflow==1.0.1 (from -r tests/prediction/requirements.txt (line 2))
Downloading tensorflow-1.0.1-cp27-cp27mu-manylinux1_x86_64.whl (44.1MB)
100% |################################| 44.1MB 27kB/s
Requirement already satisfied: numpy in /usr/local/lib/python2.7/dist-packages (from -r tests/prediction/requirements.txt (line 3))
Requirement already satisfied: mock>=2.0.0 in /usr/local/lib/python2.7/dist-packages (from tensorflow==1.0.1->-r tests/prediction/requirements.txt (line 2))
Requirement already satisfied: wheel in /usr/lib/python2.7/dist-packages (from tensorflow==1.0.1->-r tests/prediction/requirements.txt (line 2))
Requirement already satisfied: six>=1.10.0 in /usr/local/lib/python2.7/dist-packages (from tensorflow==1.0.1->-r tests/prediction/requirements.txt (line 2))
Requirement already satisfied: protobuf>=3.1.0 in /usr/local/lib/python2.7/dist-packages (from tensorflow==1.0.1->-r tests/prediction/requirements.txt (line 2))
Requirement already satisfied: funcsigs>=1; python_version < "3.3" in /usr/local/lib/python2.7/dist-packages (from mock>=2.0.0->tensorflow==1.0.1->-r tests/prediction/requirements.txt (line 2))
Requirement already satisfied: pbr>=0.11 in /usr/local/lib/python2.7/dist-packages (from mock>=2.0.0->tensorflow==1.0.1->-r tests/prediction/requirements.txt (line 2))
Requirement already satisfied: setuptools in /usr/local/lib/python2.7/dist-packages (from protobuf>=3.1.0->tensorflow==1.0.1->-r tests/prediction/requirements.txt (line 2))
Installing collected packages: tensorflow
Successfully installed tensorflow-1.0.1

However, when I submit it to Cloud DataFlow, it cannot find opencv-python for the workers.

root@fcfca6a4dad2:/DeepMeerkat# python tests/prediction/run.py \
> --runner DataflowRunner \
> --project $PROJECT \
> --staging_location $BUCKET/staging \
> --temp_location $BUCKET/temp \
> --job_name $PROJECT-deepmeerkat \
> --setup_file tests/prediction/setup.py \
> --requirements_file tests/prediction/requirements.txt
No handlers could be found for logger "oauth2client.contrib.multistore_file"
/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/gcsio.py:113: DeprecationWarning: object() takes no parameters
super(GcsIO, cls).__new__(cls, storage_client))
INFO:root:Starting the size estimation of the input
INFO:oauth2client.transport:Attempting refresh to obtain initial access_token
INFO:root:Finished the size estimation of the input at 1 files. Estimation took 0.0855119228363 seconds
INFO:root:Starting the size estimation of the input
INFO:oauth2client.transport:Attempting refresh to obtain initial access_token
INFO:root:Finished the size estimation of the input at 1 files. Estimation took 0.0597159862518 seconds
/usr/local/lib/python2.7/dist-packages/apache_beam/coders/typecoders.py:135: UserWarning: Using fallback coder for typehint: Any.
warnings.warn('Using fallback coder for typehint: %r.' % typehint)
INFO:root:Starting GCS upload to gs://api-project-773889352370-testing/staging/api-project-773889352370-deepmeerkat.1499372970.163850/requirements.txt...
INFO:oauth2client.transport:Attempting refresh to obtain initial access_token
INFO:root:Completed GCS upload to gs://api-project-773889352370-testing/staging/api-project-773889352370-deepmeerkat.1499372970.163850/requirements.txt
INFO:root:Executing command: ['/usr/bin/python', '-m', 'pip', 'install', '--download', '/tmp/dataflow-requirements-cache', '-r', 'tests/prediction/requirements.txt', '--no-binary', ':all:']
DEPRECATION: pip install --download has been deprecated and will be removed in the future. Pip now has a download command that should be used instead.
Collecting opencv-python (from -r tests/prediction/requirements.txt (line 1))
Could not find a version that satisfies the requirement opencv-python (from -r tests/prediction/requirements.txt (line 1)) (from versions: )
No matching distribution found for opencv-python (from -r tests/prediction/requirements.txt (line 1))
Traceback (most recent call last):
File "tests/prediction/run.py", line 22, in <module>
predict.run()
File "/DeepMeerkat/tests/prediction/modules/predict.py", line 32, in run
p.run()
File "/usr/local/lib/python2.7/dist-packages/apache_beam/pipeline.py", line 167, in run
self.to_runner_api(), self.runner, self._options).run(False)
File "/usr/local/lib/python2.7/dist-packages/apache_beam/pipeline.py", line 176, in run
return self.runner.run(self)
File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 252, in run
self.dataflow_client.create_job(self.job), self)
File "/usr/local/lib/python2.7/dist-packages/apache_beam/utils/retry.py", line 168, in wrapper
return fun(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/dataflow/internal/apiclient.py", line 425, in create_job
self.create_job_description(job)
File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/dataflow/internal/apiclient.py", line 448, in create_job_description
job.options, file_copy=self._gcs_file_copy)
File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/dataflow/internal/dependency.py", line 307, in stage_job_resources
setup_options.requirements_file, requirements_cache_path)
File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/dataflow/internal/dependency.py", line 241, in _populate_requirements_cache
processes.check_call(cmd_args)
File "/usr/local/lib/python2.7/dist-packages/apache_beam/utils/processes.py", line 44, in check_call
return subprocess.check_call(*args, **kwargs)
File "/usr/lib/python2.7/subprocess.py", line 540, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python', '-m', 'pip', 'install', '--download', '/tmp/dataflow-requirements-cache', '-r', 'tests/prediction/requirements.txt', '--no-binary', ':all:']' returned non-zero exit status 1

The problem seems to be the --no-binary flag. Running the same thing locally (after uninstalling the packages above):

root@fcfca6a4dad2:/DeepMeerkat# pip install -r tests/prediction/requirements.txt --no-binary :all:
Collecting opencv-python (from -r tests/prediction/requirements.txt (line 1))
Could not find a version that satisfies the requirement opencv-python (from -r tests/prediction/requirements.txt (line 1)) (from versions: )
No matching distribution found for opencv-python (from -r tests/prediction/requirements.txt (line 1))

The --no-binary flag is described as a way to exclude broken wheels? How does that apply in this case?

I can confirm that the module works. Once again:

root@fcfca6a4dad2:/DeepMeerkat# pip install opencv-python
Collecting opencv-python
Using cached opencv_python-3.2.0.7-cp27-cp27mu-manylinux1_x86_64.whl
Requirement already satisfied: numpy>=1.11.1 in /usr/local/lib/python2.7/dist-packages (from opencv-python)
Installing collected packages: opencv-python
Successfully installed opencv-python-3.2.0.7
root@fcfca6a4dad2:/DeepMeerkat# python
Python 2.7.9 (default, Jun 29 2016, 13:08:31)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>>

Best answer

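I think the error you are seeing is actually caused by the workers failing to install the wheel file. As noted on the opencv-python package page, problems with wheel files can cause the package to be reported as not found.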
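One possible reading of the log in the question, offered as an assumption rather than something the answer states: the staging command passes `--no-binary :all:`, which tells pip to ignore wheels and consider only source distributions. If a package is published only as wheels, pip is left with no candidates, which would match the empty `from versions:` list. The minimal Python sketch below illustrates that filtering idea on file names. The helper `has_source_distribution` is hypothetical, the opencv-python file name is taken from the pip log, and the numpy source archive name is illustrative.

```python
def has_source_distribution(filenames):
    """Return True if any file name looks like a source archive, not a wheel."""
    sdist_suffixes = (".tar.gz", ".tar.bz2", ".zip")
    return any(name.endswith(sdist_suffixes) for name in filenames)


# Wheel name copied from the pip log in the question.
opencv_files = ["opencv_python-3.2.0.7-cp27-cp27mu-manylinux1_x86_64.whl"]

# Illustrative only: a package that offers both a wheel and a source archive.
numpy_files = [
    "numpy-1.13.0-cp27-cp27mu-manylinux1_x86_64.whl",
    "numpy-1.13.0.zip",
]

# With wheels excluded, only packages that have a source archive remain installable.
print(has_source_distribution(opencv_files))
print(has_source_distribution(numpy_files))
```

Under this assumption, the wheel-only package is the one that fails when binaries are excluded, while it installs normally with a plain `pip install`.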

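In that case, you can follow the instructions for non-PyPI packages and specify --extra_package <local path to wheel file> instead of listing opencv-python as a requirement. This should cause the wheel file to be staged and installed on each worker.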
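As a rough sketch of that suggestion, the Python helper below assembles the same arguments used in the question's launch command and adds `--extra_package`. The function name `build_launch_args`, the project and bucket values, and the local `wheels/` directory are all hypothetical. It also assumes opencv-python has been removed from requirements.txt so that the staging step no longer tries to download it.

```python
def build_launch_args(project, bucket, wheel_path):
    """Build arguments for run.py that stage a local wheel via --extra_package."""
    return [
        "--runner", "DataflowRunner",
        "--project", project,
        "--staging_location", bucket + "/staging",
        "--temp_location", bucket + "/temp",
        "--job_name", project + "-deepmeerkat",
        "--setup_file", "tests/prediction/setup.py",
        # Assumes requirements.txt no longer lists opencv-python.
        "--requirements_file", "tests/prediction/requirements.txt",
        # The wheel is passed as a local file instead of a PyPI requirement.
        "--extra_package", wheel_path,
    ]


# Hypothetical values for illustration only.
wheel = "wheels/opencv_python-3.2.0.7-cp27-cp27mu-manylinux1_x86_64.whl"
args = build_launch_args("my-project", "gs://my-bucket", wheel)
print("python tests/prediction/run.py " + " ".join(args))
```

The printed command can be compared with the one in the question: the only difference is the trailing `--extra_package` argument pointing at the wheel file.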

Regarding Python dependencies in Cloud DataFlow where requirements.txt works locally but not on the workers, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44958387/
