python - import-im6.q16: not authorized error 'os' @ error/constitute.c/WriteImage/1037 for a Python web scraper


Reposted by: 太空宇宙 | Updated: 2023-11-03 16:43:50

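I wrote a simple web scraper for a comics website. I run it on Ubuntu (Linux ubuntu 4.18.0-16-generic #17~18.04.1-Ubuntu), but when I execute the script (with permissions set via chmod ug+x) I keep getting a series of errors about importing system libraries, followed by a confusing syntax error: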

import-im6.q16: not authorized `time' @ error/constitute.c/WriteImage/1037.
import-im6.q16: not authorized `os' @ error/constitute.c/WriteImage/1037.
import-im6.q16: not authorized `sys' @ error/constitute.c/WriteImage/1037.
import-im6.q16: not authorized `re' @ error/constitute.c/WriteImage/1037.
import-im6.q16: not authorized `requests' @ error/constitute.c/WriteImage/1037.
from: can't read /var/mail/bs4
./poorlywrittenscraper.py: line 15: DEFAULT_DIR_NAME: command not found
./poorlywrittenscraper.py: line 16: syntax error near unexpected token `('
./poorlywrittenscraper.py: line 16: `COMICS_DIRECTORY = os.path.join(os.getcwd(), DEFAULT_DIR_NAME)'

Interestingly, when I run the same script via python3, it starts, creates the folder, fetches the images, but... doesn't save them. o.O

Any idea what I'm missing here or how to fix this?

The full script code is below:

"""
A simple image downloader for poorlydrawnlines.com/archive
"""
import time
import os
import sys
import re
import concurrent.futures


import requests
from bs4 import BeautifulSoup as bs


DEFAULT_DIR_NAME = "poorly_created_folder"
COMICS_DIRECTORY = os.path.join(os.getcwd(), DEFAULT_DIR_NAME)


LOGO = """
a Python comic(al) scraper for poorlydrawnlines.com
                         __
.-----.-----.-----.----.|  |.--.--.
|  _  |  _  |  _  |   _||  ||  |  |
|   __|_____|_____|__|  |__||___  |
|__|                        |_____|
                __ __   __
.--.--.--.----.|__|  |_|  |_.-----.-----.
|  |  |  |   _||  |   _|   _|  -__|     |
|________|__|  |__|____|____|_____|__|__|
.-----.----.----.---.-.-----.-----.----.
|__ --|  __|   _|  _  |  _  |  -__|   _|
|_____|____|__| |___._|   __|_____|__|
                      |__|
version: 0.4 | author: baduker | https://github.com/baduker
"""


ARCHIVE_URL = "http://www.poorlydrawnlines.com/archive/"
COMIC_PATTERN = re.compile(r'http://www.poorlydrawnlines.com/comic/.+')

def download_comics_menu(comics_found):
    """
    Main download menu, takes number of available comics for download
    """
    print("\nThe scraper has found {} comics.".format(len(comics_found)))
    print("How many comics do you want to download?")
    print("Type 0 to exit.")

    while True:
        try:
            comics_to_download = int(input(">> "))
        except ValueError:
            print("Error: expected a number. Try again.")
            continue
        if comics_to_download > len(comics_found) or comics_to_download < 0:
            print("Error: incorrect number of comics to download. Try again.")
            continue
        elif comics_to_download == 0:
            sys.exit()
        return comics_to_download


def grab_image_src_url(session, url):
    """
    Fetches urls with the comic image source
    """
    response = session.get(url)
    soup = bs(response.text, 'html.parser')
    for i in soup.find_all('p'):
        for img in i.find_all('img', src=True):
            return img['src']


def download_and_save_comic(session, url):
    """
    Downloads and saves the comic image
    """
    file_name = url.split('/')[-1]
    with open(os.path.join(COMICS_DIRECTORY, file_name), "wb") as file:
        response = session.get(url)
        file.write(response.content)


def fetch_comics_from_archive(session):
    """
    Grabs all urls from the poorlydrawnlines.com/archive and parses for only those that link to published comics
    """
    response = session.get(ARCHIVE_URL)
    soup = bs(response.text, 'html.parser')
    # href=True skips anchors without an href, which would otherwise yield None
    comics = [url.get("href") for url in soup.find_all("a", href=True)]
    return [url for url in comics if COMIC_PATTERN.match(url)]


def download_comic(session, url):
    """
    Download progress information
    """
    print("Downloading: {}".format(url))
    url = grab_image_src_url(session, url)
    download_and_save_comic(session, url)


def main():
    """
    Encapsulates and executes all methods in the main function
    """
    print(LOGO)

    session = requests.Session()

    comics = fetch_comics_from_archive(session)
    comics_to_download = download_comics_menu(comics)

    try:
        os.mkdir(DEFAULT_DIR_NAME)
    except OSError as exc:
        sys.exit("Failed to create directory (errno {})".format(exc.errno))

    start = time.time()
    with concurrent.futures.ThreadPoolExecutor() as executor:
        executor.map(lambda url: download_comic(session, url), comics[:comics_to_download])
    # Leaving the with block already waits for all tasks and shuts the pool down.
    end = time.time()
    print("Finished downloading {} comics in {:.2f} sec.".format(comics_to_download, end - start))

if __name__ == "__main__":
    main()

Best answer

I'm pretty sure you're missing a shebang at the start of the file, for example one of:

#!/usr/bin/env python3
#!/usr/bin/env python2
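
Some background on why the missing shebang produces these particular messages: when an executable file does not name its interpreter, the shell typically runs it as a shell script. There, `import` is ImageMagick's screen-capture command (hence `import-im6.q16`) and `from` is a mail utility (hence `/var/mail/bs4`). As a minimal sketch, the check below reads the first line of a file and reports whether it starts with `#!`. The helper names `read_shebang`, `is_directly_runnable`, and `demo`, as well as the two sample file names, are made up for illustration and are not part of the original script.

```python
import os
import stat
import tempfile


def read_shebang(path):
    """Return the interpreter line if the file starts with '#!', else None."""
    with open(path, "rb") as handle:
        first_line = handle.readline()
    if first_line.startswith(b"#!"):
        return first_line[2:].decode("utf-8", errors="replace").strip()
    return None


def is_directly_runnable(path):
    """True when the file has the owner-execute bit and names its interpreter."""
    executable = bool(os.stat(path).st_mode & stat.S_IXUSR)
    return executable and read_shebang(path) is not None


def demo():
    """Build two throwaway scripts (hypothetical names) and compare them."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        samples = {
            "with_shebang.py": "#!/usr/bin/env python3\nimport os\n",
            "without_shebang.py": "import os\n",
        }
        for name, source in samples.items():
            path = os.path.join(tmp, name)
            with open(path, "w") as handle:
                handle.write(source)
            os.chmod(path, 0o755)  # same idea as chmod +x in the question
            results[name] = (read_shebang(path), is_directly_runnable(path))
    return results


if __name__ == "__main__":
    for name, (shebang, runnable) in demo().items():
        print("{}: shebang={!r}, runnable={}".format(name, shebang, runnable))
```

Both sample files are executable, but only the one whose first line is `#!/usr/bin/env python3` tells the system to hand it to Python. Running the script explicitly as `python3 poorlywrittenscraper.py` sidesteps the issue, which matches the behaviour described in the question.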

Regarding python - import-im6.q16: not authorized error 'os' @ error/constitute.c/WriteImage/1037 for a Python web scraper, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55558605/
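
The question also mentions that under python3 the script runs but saves nothing, which the answer does not address. One possible contributing factor, offered here as an assumption rather than a confirmed diagnosis: `executor.map` returns an iterator, and an exception raised inside a worker is only re-raised when its result is read from that iterator. The script never reads it, so any failure inside `download_comic` would go unnoticed. The sketch below uses a hypothetical stand-in, `flaky_download`, to show the difference between ignoring the results and inspecting them.

```python
import concurrent.futures


def flaky_download(url):
    """Hypothetical stand-in for download_comic: fails for one input."""
    if url is None:
        raise ValueError("no image URL found")
    return "saved {}".format(url)


def run_ignoring_results(urls):
    """Mirrors the script: the iterator returned by map() is never consumed."""
    with concurrent.futures.ThreadPoolExecutor() as executor:
        executor.map(flaky_download, urls)
    return "finished without any visible error"


def run_collecting_results(urls):
    """Submits each task and inspects every future, so failures show up."""
    outcomes = []
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = [executor.submit(flaky_download, url) for url in urls]
        for future in futures:
            try:
                outcomes.append(future.result())
            except ValueError as exc:
                outcomes.append("failed: {}".format(exc))
    return outcomes


if __name__ == "__main__":
    sample = ["a.png", None, "b.png"]
    print(run_ignoring_results(sample))
    print(run_collecting_results(sample))
```

If this is what is happening, reading the results (or calling `future.result()` as above) in the scraper should surface the underlying error, for example `grab_image_src_url` returning `None` when no matching `img` tag is found.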
