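Please forgive my ignorance; I'm new to this. I searched for this and tried several examples, but most of what I found seems to work in Python 2.7, and I need it to work in Python 3.5. I'm trying to extract just the cities from this list on Wikipedia.
The tag names differ; otherwise I would try using requests, which would actually be ideal, since our list needs to stay current as Wikipedia is updated. Instead, I copied the data and pasted it into a txt document so that I could build a proof of concept and get the project approved. What I ended up with looks like this: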
1. Oklahoma City 1,012,389
2. Tulsa 609,450
3. Norman 110,925
4. Broken Arrow 98,850
5. Lawton (town) 96,867
6. Edmond 81,405
7. Moore 55,081
8. Midwest City 54,371
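I found a few things and tried several different approaches, thinking that if I found the right way to split the file, I could get all the lines that have content. Then I could split those lines again and return the item at index 1 of each one.
This is what I'm trying: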
file = open('cities_oklahoma.txt', 'r')
s = file.readline()
for line in s:
    line_has_txt = line.split() # I have no clue what should be here
    print([line_has_txt.split(' ')[1])
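Am I even close to what I want to do here? Also note that I altered line 5 of the sample to show some of the irregularities that can occur in the data. And, as you can see from line 1, some city names actually contain the word "city", which breaks my theory.
Best answer
If you want the list of cities: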
import requests
from bs4 import BeautifulSoup

r = requests.get("https://en.wikipedia.org/wiki/List_of_towns_and_cities_in_Oklahoma_by_population#Largest_10_cities_by_population")
# Name the parser explicitly so bs4 does not have to guess one.
soup = BeautifulSoup(r.content, "html.parser")
# Print the text of every <p> inside the main content div.
for p in soup.find("div", {"class": "mw-content-ltr"}).find_all("p"):
    print(p.text)
This gives you all the cities along with the heading:
The following list of towns and cities in Oklahoma, shows the incorporated places in the U.S. state of Oklahoma, in order of population according to the 2010 United States Census:[1]
1. Oklahoma City 1,012,389
2. Tulsa 609,450
3. Norman 110,925
4. Broken Arrow 98,850
5. Lawton 96,867
6. Edmond 81,405
7. Moore 55,081
8. Midwest City 54,371
9. Enid 49,379
10. Stillwater 45,688
11. Muskogee 39,223
12. Bartlesville 35,750
13. Shawnee 29,857
14. Owasso 28,915
.......................
359. Greenfield (town) 93
360. Roosevelt (town) 25
361. Cooperton (town) 12
You can skip the heading and the empty strings. You will have to be more careful about what you filter, but this is the general idea:
from pprint import pprint as pp

soup = BeautifulSoup(r.content, "html.parser")
ps = soup.find("div", {"class": "mw-content-ltr"}).find_all("p")
# Skip the first three paragraphs, strip the leading rank, and split off the population.
city_data = dict(p.text.lstrip("0123456789. ").rsplit(None, 1) for p in ps[3:])
pp(city_data)
This gives you:
{'Achille town, Bryan County': '492',
'Ada': '16,810',
'Adair (town)': '790',
'Afton (town)': '1,049',
'Agra (town)': '339',
'Alex (town)': '550',
'Allen (town)': '932',
'Altus': '19,813',
'Alva': '4,945',
'Amber town, Grady County': '419',
'Anadarko': '6,762',
'Antlers': '2,453',
'Apache (town)': '1,444',
'Arapaho (town)': '796',
'Ardmore': '24,283',
'Arkoma (town)': '1,989',
'Arnett (town)': '524',
'Asher town, Pottawatomie County': '393',
'Atoka': '3,107',
'Avant (town)': '320',
'Barnsdall': '1,243',
'Bartlesville': '35,750',
'Beaver (town)': '1,515',
'Beggs': '1,321',
'Bernice (town)': '562',
'Bethany': '19,051',
'Bethel Acres (town)': '2,895',
'Billings (town)': '509',
'Binger (town)': '672',
'Bixby': '20,884',
'Blackwell': '7,092',
'Blair (town)': '818',
'Blanchard': '7,670',
'Boise City': '1,266',
'Bokchito (town)': '632',
'Bokoshe (town)': '512',
'Boley (town)': '1,184',
'Boswell (town)': '709',
'Bowlegs town, Seminole County': '405',
'Bray (town)': '1,209',
'Bristow': '4,222',
'Broken Arrow': '98,850',
'Broken Bow': '4,120',
'Buffalo (town)': '1,299',
'Burns Flat (town)': '2,057',
'Butler (town)': '287',
'Byng (town)': '1,175',
'Cache': '2,796',
'Caddo (town)': '997',
'Calera (town)': '2,164',
'Calumet (town)': '507',
'Canton (town)': '625',
'Canute (town)': '541',
'Carmen (town)': '355',
'Carnegie (town)': '1,723',
'Carney (town)': '647',
'Cashion (town)': '802',
'Catoosa': '7,151',
'Cement (town)': '501',
'Central High (town)': '1,199',
'Chandler': '3,100',
'Chattanooga town, Comanche County': '461',
'Checotah': '3,335',
'Chelsea (town)': '1,964',
'Cherokee': '1,498',
'Cheyenne (town)': '801',
'Chickasha': '16,036',
'Choctaw': '11,146',
'Chouteau (town)': '2,097',
'Claremore': '18,581',
'Clayton (town)': '821',
'Cleveland': '3,251',
'Clinton': '9,033',
'Coalgate': '1,967',
'Colbert (town)': '1,140',
'Colcord (town)': '815',
'Cole (town)': '555',
'Collinsville': '5,606',
'Comanche': '1,663',
'Commerce': '2,473',
'Cooperton (town)': '12',
'Copan (town)': '733',
'Corn (town)': '503',
'Covington (town)': '527',
'Coweta': '9,943',
'Coyle town, Logan County': '325',
'Crescent': '1,411',
'Crowder town, Pittsburg County': '430',
'Cushing': '7,826',
'Custer City (town)': '375',
'Cyril (town)': '1,059',
'Davenport (town)': '814',
'Davidson (town)': '315',
'Davis': '2,683',
'Del City': '21,332',
'Delaware town, Nowata County': '417',
'Depew (town)': '476',
'Dewar (town)': '888',
'Dewey': '3,432',
'Dickson (town)': '1,207',
'Dill City (town)': '562',
'Dover town, Kingfisher County': '464',
'Drummond (town)': '455',
'Drumright': '2,907',
'Duncan': '23,431',
'Durant': '15,856',
'Dustin town, Hughes County': '395',
'Earlsboro (town)': '628',
'East Duke (town)': '424',
'Edmond': '81,405',
'El Reno': '16,749',
'Eldorado town, Jackson County': '446',
'Elgin': '2,156',
'Elk City': '11,693',
'Elmore City (town)': '697',
'Empire City (town)': '955',
'Enid': '49,379',
'Erick': '1,052',
'Eufaula': '2,813',
'Fairfax (town)': '1,380',
'Fairland (town)': '1,057',
'Fairview': '2,579',
'Fanshawe (town)': '419',
'Fletcher (town)': '1,177',
'Forest Park (town)': '998',
'Forgan (town)': '547',
'Fort Cobb (town)': '634',
'Fort Coffee town, Le Flore County': '424',
'Fort Gibson (town)': '4,154',
'Fort Supply (town)': '330',
'Fort Towson (town)': '519',
'Francis (town)': '315',
'Frederick': '3,940',
'Gage (town)': '442',
'Garber': '822',
'Geary': '1,280',
'Geronimo (town)': '1,268',
'Glencoe (town)': '601',
'Glenpool': '10,808',
'Goldsby (town)': '1,801',
'Goodwell (town)': '1,293',
'Gore (town)': '977',
'Grandfield': '1,038',
'Granite (town)': '2,065',
'Greenfield (town)': '93',
'Grove': '6,623',
'Guthrie': '10,191',
'Guymon': '11,442',
'Haileyville': '813',
'Hammon (town)': '568',
'Harrah': '5,095',
'Hartshorne': '2,125',
'Haskell (town)': '2,007',
'Haworth (town)': '297',
'Healdton': '2,788',
'Heavener': '3,414',
'Helena (town)': '1,403',
'Hennessey (town)': '2,131',
'Henryetta': '5,927',
'Hinton (town)': '3,196',
'Hobart': '3,756',
'Holdenville': '5,771',
'Hollis': '2,060',
'Hominy': '3,565',
'Hooker': '1,918',
'Howe (town)': '802',
'Hugo': '5,301',
'Hulbert (town)': '590',
'Hydro (town)': '969',
'Idabel': '7,010',
'Indiahoma (town)': '344',
'Inola (town)': '1,788',
'Jay': '2,448',
'Jenks': '16,924',
'Jennings town, Pawnee County': '363',
'Jones (town)': '2,692',
'Kansas (town)': '802',
'Kaw City city, Kay County': '375',
'Kellyville (town)': '1,150',
'Keota (town)': '564',
'Ketchum Town, Craig County': '442',
'Keyes (town)': '324',
'Kiefer (town)': '1,685',
'Kingfisher': '4,633',
'Kingston (town)': '1,601',
'Kiowa (town)': '731',
'Konawa': '1,298',
'Krebs': '2,053',
'Lahoma (town)': '611',
'Lamont town, Grant County': '417',
'Langley (town)': '819',
'Langston (town)': '1,724',
'Laverne (town)': '1,344',
'Lawton': '96,867',
'Lexington': '2,152',
'Lindsay': '2,840',
'Locust Grove (town)': '1,423',
'Lone Grove': '5,054',
'Lone Wolf town, Kiowa County': '438',
'Luther (town)': '1,221',
'Madill': '3,770',
'Mangum': '3,010',
'Mannford (town)': '3,076',
'Mannsville (town)': '863',
'Marietta': '2,626',
'Marlow': '4,662',
'Maud': '1,048',
'Maysville (town)': '1,232',
'McAlester': '18,383',
'McCurtain (town)': '516',
'McLoud (town)': '4,044',
'Medford': '996',
'Medicine Park (town)': '382',
'Meeker (town)': '1,144',
'Miami': '13,570',
'Midwest City': '54,371',
'Mill Creek (town)': '319',
'Millerton (town)': '320',
'Minco': '1,632',
'Moore': '55,081',
'Mooreland (town)': '1,190',
'Morris': '1,479',
'Morrison (town)': '733',
'Mounds (town)': '1,168',
'Mountain Park town, Kiowa County': '409',
'Mountain View (town)': '795',
'Muldrow (town)': '3,466',
'Muskogee': '39,223',
'Mustang': '17,395',
'New Cordell': '2,915',
'Newcastle': '7,685',
'Newkirk': '2,317',
'Nichols Hills': '3,710',
'Nicoma Park': '2,393',
'Ninnekah (town)': '1,002',
'Noble': '6,481',
'Norman': '110,925',
'North Enid (town)': '860',
'North Miami town, Ottawa County': '374',
'Nowata': '3,731',
'Oakland town, Marshall County': '1,057',
'Oaks (town)': '288',
'Ochelata town, Washington County': '424',
'Oilton': '1,013',
'Okarche (town)': '1,215',
'Okay (town)': '620',
'Okeene (town)': '1,204',
'Okemah': '3,223',
'Oklahoma City': '1,012,389',
'Okmulgee': '12,321',
'Oktaha town, Muskogee County': '390',
'Olustee (town)': '607',
'Oologah (town)': '1,146',
'Owasso': '28,915',
'Paden (town)': '461',
'Panama (town)': '1,413',
'Paoli (town)': '610',
'Pauls Valley': '6,187',
'Pawhuska': '3,584',
'Pawnee': '2,196',
'Perkins': '2,831',
'Perry': '5,126',
'Piedmont': '5,720',
'Pink (town)': '2,058',
'Pocola (town)': '4,056',
'Ponca City': '25,387',
'Pond Creek': '856',
'Porter (town)': '566',
'Porum (town)': '727',
'Poteau': '8,520',
'Prague': '2,386',
'Prue town, Osage County': '465',
'Pryor': '9,539',
'Purcell': '5,884',
'Quapaw (town)': '906',
'Quinton (town)': '1,051',
'Ralston (town)': '330',
'Ramona (town)': '535',
'Randlett (town)': '438',
'Ravia (town)': '528',
'Red Oak (town)': '549',
'Ringling (town)': '1,037',
'Ringwood (town)': '497',
'Ripley town, Payne County': '403',
'Rock Island (town)': '646',
'Roff (town)': '725',
'Roland (town)': '3,169',
'Roosevelt (town)': '25',
'Rush Springs (town)': '1,231',
'Ryan (town)': '816',
'Salina (town)': '1,396',
'Sallisaw': '8,880',
'Sand Springs': '18,906',
'Sapulpa': '20,544',
'Savanna (town)': '686',
'Sayre': '4,375',
'Schulter (town)': '509',
'Seiling': '860',
'Seminole': '7,488',
'Sentinel (town)': '901',
'Shady Point (town)': '1,026',
'Shattuck (town)': '1,356',
'Shawnee': '29,857',
'Shidler': '441',
'Skiatook': '7,397',
'Slaughterville (town)': '4,137',
'Snyder': '1,394',
'Soper (town)': '261',
'South Coffeyville (town)': '785',
'Spavinaw (town)': '437',
'Spencer': '3,912',
'Sperry (town)': '1,206',
'Spiro (town)': '2,164',
'Springer (town)': '700',
'Sterling (town)': '793',
'Stigler': '2,685',
'Stillwater': '45,688',
'Stilwell': '3,949',
'Stonewall (town)': '470',
'Stratford (town)': '1,525',
'Stringtown town, Atoka County': '410',
'Stroud': '2,690',
'Sulphur': '4,929',
'Taft (town)': '250',
'Tahlequah': '15,753',
'Talihina (town)': '1,114',
'Taloga (town)': '299',
'Tecumseh': '6,457',
'Temple (town)': '1,002',
'Terral town, Jefferson County': '382',
'Texhoma (town)': '926',
'Thackerville town, Love County': '445',
'The Village': '8,929',
'Thomas': '1,181',
'Tipton (town)': '847',
'Tishomingo': '3,034',
'Tonkawa': '3,216',
'Tryon town, Lincoln County': '491',
'Tulsa': '609,450',
'Tupelo': '329',
'Tushka town, Atoka County': '312',
'Tuttle': '6,019',
'Tyrone (town)': '762',
'Union City (town)': '1,645',
'Valley Brook (town)': '765',
'Valliant (town)': '754',
'Velma (town)': '620',
'Verden (town)': '530',
'Verdigris (town)': '3,993',
'Vian (town)': '1,466',
'Vici (town)': '699',
'Vinita': '5,743',
'Wagoner': '8,323',
'Wakita town, Grant County': '344',
'Walters': '2,551',
'Wanette town, Pottawatomie County': '350',
'Wapanucka town, Johnston County': '438',
'Warner (town)': '1,641',
'Warr Acres': '10,043',
'Washington (town)': '618',
'Watonga': '5,111',
'Waukomis (town)': '1,286',
'Waurika': '2,064',
'Wayne (town)': '688',
'Waynoka': '927',
'Weatherford': '10,833',
'Webbers Falls (town)': '616',
'Welch (town)': '619',
'Weleetka (town)': '998',
'Wellston (town)': '788',
'West Siloam Springs (town)': '846',
'Westville (town)': '1,639',
'Wetumka': '1,282',
'Wewoka': '3,430',
'Wilburton': '2,843',
'Wilson': '1,724',
'Winchester (town)': '516',
'Wister (town)': '1,102',
'Woodward': '12,051',
'Wright City (town)': '762',
'Wyandotte (town)': '333',
'Wynnewood': '2,212',
'Wynona (town)': '437',
'Yale': '1,227',
'Yukon': '22,709'}
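As a stricter alternative to slicing with ps[3:], each line can be checked against a pattern, and anything that does not look like "rank. name population" is dropped. The following is only a minimal sketch: the helper name parse_city_line, the regular expression, and the sample lines are assumptions made for illustration and are not part of the original answer. It also works on text pasted into a file, as in the question.

```python
import re

# Matches lines such as "5. Lawton (town) 96,867":
# a rank, a name, and a comma-grouped population at the end of the line.
LINE_RE = re.compile(r"^\s*\d+\.\s+(?P<name>.+?)\s+(?P<pop>\d[\d,]*)\s*$")


def parse_city_line(line):
    """Return (name, population) for a well-formed line, else None."""
    match = LINE_RE.match(line)
    if match is None:
        return None  # heading, blank line, or anything unexpected
    # Drop a trailing "(town)" marker; other suffixes are left untouched.
    name = re.sub(r"\s*\(town\)$", "", match.group("name"))
    return name, int(match.group("pop").replace(",", ""))


# Hypothetical sample input, shaped like the data shown in the question.
sample_lines = [
    "The following list of towns and cities in Oklahoma:",
    "",
    "1. Oklahoma City 1,012,389",
    "5. Lawton (town) 96,867",
    "8. Midwest City 54,371",
]
parsed = [parse_city_line(line) for line in sample_lines]
cities = dict(item for item in parsed if item is not None)
print(cities)
```

Names with other suffixes, such as 'Achille town, Bryan County', pass through unchanged here and would need their own rule if only the bare city name is wanted.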
If you plan to analyze the data, you may find pandas useful:
import pandas as pd

city_data = (p.text.lstrip("0123456789. ").rsplit(None, 1) for p in ps[3:])
df = pd.DataFrame(city_data, columns=["City", "Population"])
print(df)
Output:
City Population
0 Oklahoma City 1,012,389
1 Tulsa 609,450
2 Norman 110,925
3 Broken Arrow 98,850
4 Lawton 96,867
5 Edmond 81,405
6 Moore 55,081
7 Midwest City 54,371
8 Enid 49,379
9 Stillwater 45,688
10 Muskogee 39,223
11 Bartlesville 35,750
12 Shawnee 29,857
13 Owasso 28,915
14 Ponca City 25,387
15 Ardmore 24,283
16 Duncan 23,431
17 Yukon 22,709
18 Del City 21,332
19 Bixby 20,884
20 Sapulpa 20,544
21 Altus 19,813
22 Bethany 19,051
23 Sand Springs 18,906
24 Claremore 18,581
25 McAlester 18,383
26 Mustang 17,395
27 Jenks 16,924
28 Ada 16,810
29 El Reno 16,749
.. ... ...
You may want to convert the Population column to int for any calculations:
import locale
locale.setlocale(locale.LC_NUMERIC, '')
df["Population"] = df["Population"].apply(locale.atoi)
print(df["Population"])
0 1012389
1 609450
2 110925
3 98850
4 96867
5 81405
6 55081
7 54371
8 49379
9 45688
10 39223
11 35750
12 29857
..................
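Note that locale.atoi relies on the locale in effect, so the conversion above may behave differently on a system whose default locale uses another grouping character. A possible locale-independent variant, sketched here on a small hand-made frame (the three rows are copied from the output above; the approach itself is an assumption and not part of the original answer), removes the commas as plain text before casting:

```python
import pandas as pd

# Small stand-in for the scraped frame; the values are copied from the output above.
df = pd.DataFrame(
    [["Oklahoma City", "1,012,389"], ["Tulsa", "609,450"], ["Cooperton (town)", "12"]],
    columns=["City", "Population"],
)

# Remove the comma separators as literal text, then convert to integers.
df["Population"] = df["Population"].str.replace(",", "", regex=False).astype(int)
print(df["Population"].tolist())
```

This assumes every value is a well-formed integer with optional commas; astype(int) raises an error on anything else, which can be useful as an early warning about unexpected rows.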
Regarding "python - working with 'unclean' text in Python", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/35142260/