python - Flask app process request thread error?


Reposted by: 太空狗 | Updated: 2023-10-29 18:09:03

This may be a long shot, but here is the error I am getting:

  File "/home/MY NAME/anaconda/lib/python2.7/SocketServer.py", line 596, in process_request_thread
    self.finish_request(request, client_address)
  File "/home/MY NAME/anaconda/lib/python2.7/SocketServer.py", line 331, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/home/MY NAME/anaconda/lib/python2.7/SocketServer.py", line 654, in __init__
    self.finish()
  File "/home/MY NAME/anaconda/lib/python2.7/SocketServer.py", line 713, in finish
    self.wfile.close()
  File "/home/MY NAME/anaconda/lib/python2.7/socket.py", line 283, in close
    self.flush()
  File "/home/MY NAME/anaconda/lib/python2.7/socket.py", line 307, in flush
    self._sock.sendall(view[write_offset:write_offset+buffer_size])
error: [Errno 32] Broken pipe

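I built a Flask application that takes addresses as input, performs some string formatting and manipulation, and then sends them to Bing Maps for geocoding (via the external geopy module).

I am using this application to clean very large data sets. The application works for inputs of up to roughly 1,500 addresses (one address per line). By that I mean it processes each address, sends it to Bing Maps for geocoding, and returns the result. After about 1,500 addresses, the application becomes unresponsive. If this happens while I am at work, my proxy tells me there is a TCP error. If I am on a non-work computer, the page simply does not load. If I restart the application, it works fine again. As a result, I am forced to run my program in batches of about 1,000 addresses (just to be safe, because I am not yet sure of the exact number at which the program crashes).

Does anyone know what might be causing this?

My first thought was that I was hitting my daily Bing API key limit (30,000 requests), but that cannot be right, because I rarely use more than 15,000 requests per day.

My second thought was that it might be because I am still using the standard (built-in) Flask server to run my application. Would switching to gunicorn or uWSGI solve this?

My third thought was that it might be overloaded by the volume of requests. I tried having the program sleep for 15 seconds or so after the first 1,000 addresses, but that did not fix anything.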
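For reference, the batch-and-pause approach described above might look roughly like the sketch below. The names iter_batches and process_in_batches are hypothetical and do not appear in the original code, and the handle_batch callback stands in for whatever cleans one batch. As noted, pausing between batches did not resolve the problem; the sketch only illustrates the idea.

```python
import time


def iter_batches(lines, batch_size=1000):
    """Yield successive lists of at most batch_size items."""
    for start in range(0, len(lines), batch_size):
        yield lines[start:start + batch_size]


def process_in_batches(lines, handle_batch, batch_size=1000,
                       pause_seconds=15, sleep=time.sleep):
    """Run handle_batch on each batch, pausing between batches.

    handle_batch is a hypothetical callback that returns a list of results
    for one batch. No pause is added after the final batch.
    """
    results = []
    batches = list(iter_batches(lines, batch_size))
    for index, batch in enumerate(batches):
        results.extend(handle_batch(batch))
        if index < len(batches) - 1:
            sleep(pause_seconds)
    return results
```

The sleep function is passed in as a parameter so the pause can be replaced with a stub when experimenting.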

Let me know if anyone needs further clarification.

Here is the backend code for my Flask application. I get the input from this function:

@app.route("/clean", methods=['POST'])
def dothing():
    addresses = request.form['addresses']
    return cleanAddress(addresses)

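Here is the cleanAddress function. It is a bit of a mess right now, with all the if statements that check for specific misspellings in the addresses, but I plan to move most of that code into functions in another file and pass the addresses through those functions to clean things up a bit.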

def cleanAddress(addresses):

    counter = 0

    # nested helper function to fix addresses such as '30 w 60th'
    def check_st(address):
        if 'broadway' in address:
            return address
        has_th_st_nd_rd = re.compile(r'(?P<number>[\d]{1,4}(th|st|nd|rd)\s)(?P<following>.*)')
        has_number = has_th_st_nd_rd.search(address)
        if has_number is not None:
            if re.match(r'(street|st|floor)', has_number.group('following')):   
                return address
            else:
                new_address = re.sub('(?P<number>[\d]{1,4}(st|nd|rd|th)\s)', r'\g<number>street ', address, 1)
                return new_address
        else:
            return address

    addresses = addresses.split('\n')
    cleaned = []
    success = 0
    fail = 0
    cleaned.append('<body bgcolor="#FACC2E"><center><img src="http://goglobal.dhl-usa.com/common/img/dhl-express-logo.png" alt="Smiley face" height="100" width="350"><br><p>')

    cleaned.append('<br><h3>Note: Everything before the first comma is the Old Address. Everything after the first comma is the New Address</h3>')
    cleaned.append('<p><h3>To format the output in Excel, split the columns using "," as the delimiter. </p></h3>')
    cleaned.append('<p><h2><font color="red">Old Address </font> <font color="black">New Address </font></p></h2>')

    for address in addresses:
        dirty = address.strip()
        if ',' in address:
            dirty = dirty.replace(',', '')
        cleaned.append('<font color="red">' + dirty + ', ' + '</font>')

        address = address.lower()
        address = re.sub('[^A-Za-z0-9#]+', ' ', address).lstrip()

        pattern = r"\d+.* +(\d+ .*(" + "|".join(patterns) + "))"
        address = re.sub(pattern, "\\1", address)

        address = check_st(address) 


        if 'one ' in address:
            address = address.replace('one', '1')
        if 'two' in address:
            address = address.replace('two', '2')
        if 'three' in address:
            address = address.replace('three', '3')
        if 'four' in address:
            address = address.replace('four', '4')
        if 'five' in address:
            address = address.replace('five', '5')
        if 'eight' in address:
            address = address.replace('eight', '8')
        if 'nine' in address:
            address = address.replace('nine', '9')
        if 'fith' in address:
            address = address.replace('fith', 'fifth')
        if 'aveneu' in address:
            address = address.replace('aveneu', 'avenue')
        if 'united states of america' in address:
            address = address.replace('united states of america', '')
        if 'ave americas' in address:
            address = address.replace('ave americas', 'avenue of the americas')
        if 'americas avenue' in address:
            address = address.replace('americas avenue', 'avenue of the americas')
        if 'avenue of americas' in address:
            address = address.replace('avenue of americas', 'avenue of the americas')
        if 'avenue of america ' in address:
            address = address.replace('avenue of america ', 'avenue of the americas ')
        if 'ave of the americ' in address:
            address = address.replace('ave of the americ', 'avenue of the americas')
        if 'avenue america' in address:
            address = address.replace('avenue america', 'avenue of the americas')
        if 'americaz' in address:
            address = address.replace('americaz', 'americas')
        if 'ave of america' in address:
            address = address.replace('ave of america', 'avenue of the americas')
        if 'amrica' in address:
            address = address.replace('amrica', 'americas')
        if 'americans' in address:
            address = address.replace('americans', 'americas')
        if 'walk street' in address:
            address = address.replace('walk street', 'wall street')
        if 'northend' in address:
            address = address.replace('northend', 'north end')
        if 'inth' in address:
            address = address.replace('inth', 'ninth')
        if 'aprk' in address:
            address = address.replace('aprk', 'park')
        if 'eleven' in address:
            address = address.replace('eleven', '11')
        if ' av ' in address:
            address = address.replace(' av ', ' avenue ')
        if 'avnue' in address:
            address = address.replace('avnue', 'avenue')
        if 'ofthe americas' in address:
            address = address.replace('ofthe americas', 'of the americas')
        if 'aj the' in address:
            address = address.replace('aj the', 'of the')
        if 'fifht' in address:
            address = address.replace('fifht', 'fifth')
        if 'w46' in address:
            address = address.replace('w46', 'w 46')
        if 'w42' in address:
            address = address.replace('w42', 'w 42')
        if '95st' in address:
            address = address.replace('95st', '95th st')
        if 'e61 st' in address:
            address = address.replace('e61 st', 'e 61st')
        if 'driver information' in address:
            address = address.replace('driver information', '')
        if 'e87' in address:
            address = address.replace('e87', 'e 87')
        if 'thrd avenus' in address:
            address = address.replace('thrd avenus', 'third avenue')
        if '3r ' in address:
            address = address.replace('3r ', '3rd ')
        if 'st ates' in address:
            address = address.replace('st ates', '')
        if 'east52nd' in address:
            address = address.replace('east52nd', 'east 52nd')
        if 'authority to leave' in address:
            address = address.replace('authority to leave', '')
        if 'sreet' in address:
            address = address.replace('sreet', 'street')
        if 'w47' in address:
            address = address.replace('w47', 'w 47')
        if 'signature required' in address:
            address = address.replace('signature required', '')
        if 'direct' in address:
            address = address.replace('direct', '')
        if 'streetapr' in address:
            address = address.replace('streetapr', 'street')
        if 'steet' in address:
            address = address.replace('steet', 'street')
        if 'w39' in address:
            address = address.replace('w39', 'w 39')
        if 'ave of new york' in address:
            address = address.replace('ave of new york', 'avenue of the americas')
        if 'avenue of new york' in address:
            address = address.replace('avenue of new york', 'avenue of the americas')
        if 'brodway' in address:
            address = address.replace('brodway', 'broadway')
        if 'w 31 ' in address:
            address = address.replace('w 31 ', 'w 31th ')
        if 'w 34 ' in address:
            address = address.replace('w 34 ', 'w 34th ')
        if 'w38' in address:
            address = address.replace('w38', 'w 38')
        if 'broadeay' in address:
            address = address.replace('broadeay', 'broadway')
        if 'w37' in address:
            address = address.replace('w37', 'w 37')
        if '35street' in address:
            address = address.replace('35street', '35th street')
        if 'eighth avenue' in address:
            address = address.replace('eighth avenue', '8th avenue')
        if 'west 33' in address:
            address = address.replace('west 33', 'west 33rd')
        if '34t ' in address:
            address = address.replace('34t ', '34th ')
        if 'street ave' in address:
            address = address.replace('street ave', 'ave')
        if 'avenue of york' in address:
            address = address.replace('avenue of york', 'avenue of the americas')
        if 'avenue aj new york' in address:
            address = address.replace('avenue aj new york', 'avenue of the americas')
        if 'avenue ofthe new york' in address:
            address = address.replace('avenue ofthe new york', 'avenue of the americas')
        if 'e4' in address:
            address = address.replace('e4', 'e 4')
        if 'avenue of nueva york' in address:
            address = address.replace('avenue of nueva york', 'avenue of the americas')
        if 'avenue of new york' in address:
            address = address.replace('avenue of new york', 'avenue of the americas')
        if 'west end new york' in address:
            address = address.replace('west end new york', 'west end avenue')

        #print address    
        address = address.split(' ')
        for pattern in patterns:
            try:
                if address[0].isdigit():
                    continue
                else:
                    location = address.index(pattern) + 1
                    number_location = address[location]
                    #print address[location]
                    #if 'th' in address[location + 1] or 'floor' in address[location + 1] or '#' in address[location]:
                    #    continue
            except (ValueError, IndexError):
                continue
            if number_location.isdigit() and len(number_location) <= 4:
                address = [number_location] + address[:location] + address[location+1:]
                break
        address = ' '.join(address)

        if '#' in address:
            address = address.replace('#', '')


        #print (address)


        i = 0
        for char in address:
            if char.isdigit():
                address = address[i:]
                break
            i += 1


        #print (address)

        if 'plz' in address:
            address = address.replace('plz', 'plaza ', 1)
        if 'hstreet' in address:
            address = address.replace('hstreet', 'h street')
        if 'dstreet' in address:
            address = address.replace('dstreet', 'd street')
        if 'hst' in address:
            address = address.replace('hst', 'h st')
        if 'dst' in address:
            address = address.replace('dst', 'd st')
        if 'have' in address:
            address = address.replace('have', 'h ave')
        if 'dave' in address:
            address = address.replace('dave', 'd ave')
        if 'havenue' in address:
            address = address.replace('havenue', 'h avenue')
        if 'davenue' in address:
            address = address.replace('davenue', 'd avenue')



        #print address

        regex = r'(.*)(' + '|'.join(patterns) + r')(.*)'
        address = re.sub(regex, r'\1\2', address).lstrip() + " nyc"

        print (address)

        if 'americasas st' in address:
            address = address.replace('americasas st', 'americas')

        try:

            clean = geolocator.geocode(address)
            x = clean.address
            address, city, zipcode, country = x.split(",")
            address = address.lower()
            if 'first' in address:
                address = address.replace('first', '1st')
            if 'second' in address:
                address = address.replace('second', '2nd')
            if 'third' in address:
                address = address.replace('third', '3rd')
            if 'fourth' in address:
                address = address.replace('fourth', '4th')
            if 'fifth' in address:
                address = address.replace('fifth', '5th')
            if ' sixth a' in address:
                address = address.replace('ave', '')
                address = address.replace('avenue', '')
                address = address.replace(' sixth', ' avenue of the americas')
            if ' 6th a' in address:
                address = address.replace('ave', '')
                address = address.replace('avenue', '')
                address = address.replace(' 6th', ' avenue of the americas')
            if 'seventh' in address:
                address = address.replace('seventh', '7th')
            if 'fashion' in address:
                address = address.replace('fashion', '7th')
            if 'eighth' in address:
                address = address.replace('eighth', '8th')
            if 'ninth' in address:
                address = address.replace('ninth', '9th')
            if 'tenth' in address:
                address = address.replace('tenth', '10th')
            if 'eleventh' in address:
                address = address.replace('eleventh', '11th')


            zipcode = zipcode[3:]
            to_write = str(address) + ", " + str(zipcode.lstrip()) + ", " + str(clean.latitude) + ", " + str(clean.longitude)
            to_find = str(address)

            #print to_write

            # returns 'can not be cleaned' if street address has no numbers
            if any(i.isdigit() for i in str(address)):
                with open('/home/MY NAME/Address_Database.txt', 'a+') as database:
                    if to_find not in database.read():
                        database.write(dirty + '|' + to_write + '\n')
                if 'ncy rd' in address:
                    cleaned.append('<font color="red"> Can not be cleaned </font> <br>')
                    fail += 1
                elif 'nye rd' in address:
                    cleaned.append('<font color="red"> Can not be cleaned </font> <br>')
                    fail += 1
                elif 'nye c' in address:
                    cleaned.append('<font color="red"> Can not be cleaned </font> <br>')
                    fail += 1                    
                else:
                    cleaned.append(to_write + '<br>')
                    success += 1
            else:
                cleaned.append('<font color="red"> Can not be cleaned </font> <br>')
                fail += 1
        except AttributeError:
            cleaned.append('<font color="red"> Can not be cleaned </font> <br>')
            fail += 1
        except ValueError:
            cleaned.append('<font color="red"> Can not be cleaned </font> <br>')
            fail += 1
        except GeocoderTimedOut as e:
            cleaned.append('<font color="red"> Can not be cleaned </font> <br>')
            fail += 1

    total = success + fail
    percent = float(success) / float(total) * 100
    percent = round(percent, 2)
    print percent
    cleaned.append('<br>Accuracy: ' + str(percent) + ' %')
    cleaned.append('</p></center></body>')

    return "\n".join(cleaned)
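The long chain of if/replace pairs above could be moved into a separate module as a lookup table, which is one way to carry out the cleanup the question mentions. The following is only a sketch: REPLACEMENTS and apply_replacements are hypothetical names, and the table holds just a few of the pairs from the code above. Because str.replace leaves the string unchanged when the substring is absent, the `if ... in address` guards are not needed.

```python
# Hypothetical table of (misspelling, correction) pairs taken from the
# function above. Order matters, because later pairs see the output of
# earlier ones.
REPLACEMENTS = [
    ('fith', 'fifth'),
    ('aveneu', 'avenue'),
    ('avnue', 'avenue'),
    ('brodway', 'broadway'),
    ('walk street', 'wall street'),
]


def apply_replacements(address, replacements=REPLACEMENTS):
    """Apply each (wrong, right) substitution to the address in order."""
    for wrong, right in replacements:
        address = address.replace(wrong, right)
    return address
```

For example, apply_replacements('100 brodway') gives '100 broadway'. Keeping the pairs in a list rather than a dict preserves the order in which they are applied.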

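Update: I have switched to running the application with gunicorn, and this fixed the problem when I access the application from my home network. However, I still get the TCP error from my work proxy. I do not get any error message in my console; the browser just shows the TCP error. I can tell the tool is still running in the background, because I have a print statement in the loop that tells me each address is still being geocoded. Could this be because my work network does not like the page taking a long time to load and then just shows the proxy error page?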
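If the proxy really is giving up on a response that takes a long time to start, one thing that might be worth trying is producing the output incrementally instead of building the whole page first. The sketch below is an assumption, not something from the original post: stream_cleaned and the clean_one callback are hypothetical names. Flask can send a generator as a streamed response when it is wrapped in flask.Response, although whether a given proxy tolerates this was not tested here.

```python
def stream_cleaned(addresses, clean_one):
    """Yield one HTML fragment per non-empty input line.

    clean_one is a hypothetical callback that cleans and geocodes a single
    address and returns the text to display for it.
    """
    yield '<p>'
    for line in addresses.split('\n'):
        line = line.strip()
        if not line:
            continue
        yield clean_one(line) + '<br>\n'
    yield '</p>'

# In a Flask view, the generator could be returned as, for example:
#     return Response(stream_cleaned(addresses, clean_one), mimetype='text/html')
# (Response imported from flask; shown as a comment so the sketch has no
# third-party dependency.)
```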

Best answer

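It sounds like you are running out of file handles (the default limit for regular users is 1024). You can check the limit by running grep 'open' /proc/<webapp pid>/limits, and count the currently open file handles with ls -1 /proc/<pid>/fd | wc -l.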
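The same check can be made from inside the Python process itself. This is a minimal sketch, assuming a Linux-like system; fd_usage is a hypothetical helper name. It reports on the process that runs it, so it would have to be called from within the web application (for example, logged once per request) to be useful.

```python
import os
import resource


def fd_usage():
    """Return (open_count, soft_limit, hard_limit) for the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # /proc/self/fd is Linux-specific; /dev/fd is used as a fallback elsewhere.
    fd_dir = '/proc/self/fd' if os.path.isdir('/proc/self/fd') else '/dev/fd'
    open_count = len(os.listdir(fd_dir))
    return open_count, soft, hard


if __name__ == '__main__':
    count, soft, hard = fd_usage()
    print('open fds: %d (soft limit %s, hard limit %s)' % (count, soft, hard))
```

If the open count climbs steadily toward the soft limit as addresses are processed, that would support the file-handle explanation.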

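I think your code is not sending a proper response, which causes connections to stay open and eventually exhausts the open file handles (open sockets are files on POSIX systems).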

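You can confirm the state of the connections with netstat -an | grep <webapp port> when you see the problem. It should show a list of 1k+ IPs and ports along with their states.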

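I would guess they are in the TIME_WAIT state, which would indicate that the client did not close the connection properly and that the kernel will garbage-collect them later.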
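With a thousand or more lines of netstat output, a per-state tally is easier to read than the raw list. The sketch below is an illustration under assumptions: count_states is a hypothetical helper, SAMPLE is made-up output in the Linux netstat -an column layout (not data from the original post), and port 5000 is only an example.

```python
from collections import Counter

# Made-up sample in the Linux `netstat -an` layout:
# proto, recv-q, send-q, local address, foreign address, state.
SAMPLE = """\
tcp        0      0 0.0.0.0:5000            0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:5000          127.0.0.1:51234         TIME_WAIT
tcp        0      0 127.0.0.1:5000          127.0.0.1:51236         CLOSE_WAIT
tcp        0      0 127.0.0.1:5000          127.0.0.1:51238         TIME_WAIT
tcp        0      0 127.0.0.1:8080          127.0.0.1:51240         ESTABLISHED
"""


def count_states(netstat_output, port):
    """Count TCP connection states for sockets whose local port matches."""
    counts = Counter()
    suffix = ':%d' % port
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].startswith('tcp'):
            continue
        if fields[3].endswith(suffix):
            counts[fields[5]] += 1
    return counts
```

Calling count_states(SAMPLE, 5000) tallies only the four lines whose local address ends in :5000. When reading real output, note that CLOSE_WAIT generally means the peer has closed the connection and the local application has not yet closed its end of the socket.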

Try:

from flask import make_response

@app.route("/clean", methods=['POST'])
def dothing():
    addresses = request.form['addresses']
    resp = make_response(cleanAddress(addresses), 200)
    return resp

This post on "python - Flask app process request thread error?" is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/38899790/
