
python - Best practices for Python packages on an Azure Artifacts feed

Reposted. Author: 行者123. Updated: 2023-12-03 07:00:13

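I have developed a few Python packages and upload them to Azure DevOps Artifacts through a DevOps pipeline. It works well, but the pipeline stores not only my packages but also the dependencies listed in their setup.cfg files!

These are ordinary dependencies (pandas and so on), but is it best practice to keep copies of these libraries on Artifacts? By my logic I would say no... How can I prevent this behavior?

Here are my pipeline and my cfg file:

Pipeline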

trigger:
  tags:
    include:
    - 'v*.*'
  branches:
    include:
    - main
    - dev-release

pool:
  vmImage: 'ubuntu-latest'

stages:
  - stage: 'Stage_Test'
    variables:
    - group: UtilsDev
    jobs:
    - job: 'Job_Test'
      steps:
      - task: UsePythonVersion@0
        inputs:
          versionSpec: '$(pythonVersion)'
        displayName: 'Use Python $(pythonVersion)'

      - script: |
          python -m pip install --upgrade pip
        displayName: 'Upgrade PIP'

      - script: |
          pip install pytest pytest-azurepipelines
        displayName: 'Install test dependencies'

      - script: |
          pytest
        displayName: 'Execution of PyTest'

  - stage: 'Stage_Build'
    variables:
    - group: UtilsDev
    jobs:
    - job: 'Job_Build'
      steps:
      - task: UsePythonVersion@0
        inputs:
          versionSpec: '$(pythonVersion)'
        displayName: 'Use Python $(pythonVersion)'

      - script: |
          python -m pip install --upgrade pip
        displayName: 'Upgrade PIP'

      - script: |
          pip install build wheel
        displayName: 'Install build dependencies'

      - script: |
          python -m build
        displayName: 'Artifact creation'

      - publish: '$(System.DefaultWorkingDirectory)'
        artifact: package

  - stage: 'Stage_Deploy_DEV'
    condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/dev-release'))
    variables:
    - group: UtilsDev
    jobs:
    - deployment: Build_Deploy
      displayName: Build Deploy
      environment: [OMIT]-artifacts-dev
      strategy:
        runOnce:
          deploy:
            steps:
            - download: current
              artifact: package

            - task: UsePythonVersion@0
              inputs:
                versionSpec: '$(pythonVersion)'
              displayName: 'Use Python $(pythonVersion)'

            - script: |
                pip install twine
              displayName: 'Install build dependencies'

            - task: TwineAuthenticate@1
              displayName: 'Twine authentication'
              inputs:
                pythonUploadServiceConnection: 'PythonPackageUploadDEV'

            - script: |
                python -m twine upload --skip-existing --verbose -r $(feedName) --config-file $(PYPIRC_PATH) dist/*
              workingDirectory: '$(Pipeline.Workspace)/package'
              displayName: 'Artifact upload'

  - stage: 'Stage_Deploy_PROD'
    dependsOn: 'Stage_Build'
    condition: and(succeeded(), or(eq(variables['Build.SourceBranch'], 'refs/heads/main'), startsWith(variables['Build.SourceBranch'], 'refs/tags/v')))
    variables:
    - group: UtilsProd
    jobs:
    - job: 'Approval_PROD_Release'
      pool: server
      steps:
      - task: ManualValidation@0
        timeoutInMinutes: 1440 # task times out in 1 day
        inputs:
          notifyUsers: |
            [USER]@[OMIT].com
          instructions: 'Please validate the build configuration and resume'
          onTimeout: 'resume'
    - deployment: Build_Deploy
      displayName: Build Deploy
      environment: [OMIT]-artifacts-prod
      strategy:
        runOnce:
          deploy:
            steps:
            - download: current
              artifact: package

            - task: UsePythonVersion@0
              inputs:
                versionSpec: '$(pythonVersion)'
              displayName: 'Use Python $(pythonVersion)'

            - script: |
                pip install twine
              displayName: 'Install build dependencies'

            - task: TwineAuthenticate@1
              displayName: 'Twine authentication'
              inputs:
                pythonUploadServiceConnection: 'PythonPackageUploadPROD'

            - script: |
                python -m twine upload --skip-existing --verbose -r $(feedName) --config-file $(PYPIRC_PATH) dist/*
              workingDirectory: '$(Pipeline.Workspace)/package'
              displayName: 'Artifact upload'

Setup file

[metadata]
name = [OMIT]_azure
version = 0.2
author = [USER]
author_email = [USER]@[OMIT].com
description = A package containing utilities for interacting with Azure
long_description = file: README.md
long_description_content_type = text/markdown
project_urls =
classifiers =
    Programming Language :: Python :: 3
    License :: OSI Approved :: MIT License
    Operating System :: OS Independent

[options]
package_dir =
    = src
packages = find:
python_requires = >=3.7
install_requires =
    azure-storage-file-datalake>=12.6.0
    pyspark>=3.2.1
    openpyxl>=3.0.9
    pandas>=1.4.2
    pyarrow>=8.0.0
    fsspec>=2022.3.0
    adlfs>=2022.4.0
    [OMIT]-utils>=0.4

[options.packages.find]
where = src

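I noticed that the pipeline shows this behavior only in the production stage (Stage_Deploy_PROD), not in the dev release stage (Stage_Deploy_DEV), and that far more dependencies are stored than the 8 specified in setup.cfg.

Has anyone dealt with this problem?

Thanks in advance!!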

Best answer

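According to this doc, once upstream sources are enabled, Azure Artifacts saves a copy of a package in the feed every time you install that package from the public registry.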
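The caching behavior described above can be pictured with a small toy model. This is only a sketch: the `Feed` class, its methods, and the package name `my-azure-utils` are hypothetical and exist purely for illustration; none of this is an Azure Artifacts API.

```python
class Feed:
    """Toy model of a package feed with an optional upstream source."""

    def __init__(self, upstream, upstream_enabled):
        self.upstream = upstream                  # names available on the public registry
        self.upstream_enabled = upstream_enabled  # whether the feed may fall back to it
        self.stored = set()                       # packages physically saved in the feed

    def publish(self, name):
        """Upload one of your own packages to the feed."""
        self.stored.add(name)

    def install(self, name):
        """Resolve a package through the feed, as a client pointed at it would."""
        if name in self.stored:
            return name
        if self.upstream_enabled and name in self.upstream:
            # The feed keeps its own copy of anything fetched from upstream.
            self.stored.add(name)
            return name
        raise LookupError(f"{name} not found in feed")


public_registry = {"pandas", "pyspark", "py4j"}  # illustrative subset only
feed = Feed(public_registry, upstream_enabled=True)
feed.publish("my-azure-utils")                   # hypothetical private package
feed.install("pandas")                           # served from upstream, then saved
print(sorted(feed.stored))
```

In this model the feed ends up holding `pandas` next to the private package, even though only the private package was ever uploaded. With `upstream_enabled=False`, the same `install("pandas")` call would raise `LookupError` and nothing extra would be stored.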

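One reason the feed contains more packages than setup.cfg lists is that when you download a package, its required dependencies are downloaded along with it. Take PySpark as an example: because it requires Py4J, downloading PySpark also downloads Py4J.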

Here is my test result: when I downloaded only PySpark in the pipeline, Py4J was also downloaded and saved to the feed.
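The effect of transitive dependencies on the package count can be sketched as a simple graph walk. The `DEPENDS_ON` table below is a hand-written, partial illustration (versions and optional extras are ignored), and `resolve` is a hypothetical helper, not how pip resolves dependencies.

```python
from collections import deque

# Illustrative, partial dependency graph; not exhaustive and version-agnostic.
DEPENDS_ON = {
    "pyspark": ["py4j"],
    "pandas": ["numpy", "python-dateutil", "pytz"],
    "python-dateutil": ["six"],
}


def resolve(direct, graph):
    """Return every package pulled in by `direct`, including transitive ones."""
    seen = set()
    queue = deque(direct)
    while queue:
        name = queue.popleft()
        if name in seen:
            continue  # already visited, avoid loops and duplicates
        seen.add(name)
        queue.extend(graph.get(name, []))
    return seen


direct = ["pyspark", "pandas"]
installed = resolve(direct, DEPENDS_ON)
print(len(direct), "declared,", len(installed), "installed")
print("pulled in transitively:", sorted(installed - set(direct)))
```

Even in this small example, two declared requirements expand to seven installed packages, which is why a feed with upstream sources enabled can hold far more than the 8 entries in `install_requires`.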

Regarding "python - Best practices for Python packages on an Azure Artifacts feed", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/72517209/
