
iphone - HTML parsing fails with accented letters (e.g. é)

Reposted. Author: 行者123. Updated: 2023-12-01 17:02:52

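I'm using this library: http://benreeves.co.uk/objective-c-hmtl-parser/ to parse HTML for a small iPhone app I'm making. I've got the code working so far, but it fails when it encounters an accented character (so far I've only seen this with é). Here is the code I'm using: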

NSError *error = nil;
HTMLParser *parser = [[HTMLParser alloc] initWithContentsOfURL:[NSURL URLWithString:@"http://intranet.westminster.org.uk/almanack/food.asp?nextweek=TRUE"] error:&error];

if (error) {
    NSLog(@"Error: %@", error);
    return nil;
}
HTMLNode *bodyNode = [parser body]; // Find the body tag
NSArray *individualMeals = [bodyNode findChildTags:@"font"];
for (HTMLNode *node in individualMeals) {
    if ([[node getAttributeNamed:@"color"] isEqual:@"green"]) {
        NSLog(@"%@", [node rawContents]);
    }
}

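But it doesn't parse all of the text. It seems to give up once it finds an accented character in the page at that URL. This is what it produces when run: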
2010-10-07 18:40:59.296 Westminster[1011:207] <font color="green"/>
2010-10-07 18:40:59.298 Westminster[1011:207] <font color="green"/>
2010-10-07 18:40:59.305 Westminster[1011:207] <font color="green"/>
2010-10-07 18:40:59.307 Westminster[1011:207] <font color="green"/>
2010-10-07 18:40:59.308 Westminster[1011:207] <font color="green">Sausage &#13;<br/>Bacon&#13;<br/>Hash Brown&#13;<br/>Baked Beans&#13;<br/>Breakfast special&#13;<br/>Three cheese omelets&#13;<br/><br/><br/>Plain Porridge &#13;<br/><br/><br/><br/>Croissants &#13;<br/><br/> Natural Yogurt&#13;<br/>Dried Fruits &#13;<br/>Granola&#13;<br/>Honey</font>
2010-10-07 18:40:59.309 Westminster[1011:207] <font color="green">Mulligatawny &#13;<br/>Black Olive &#13;<br/>RICE&#13;<br/>Roasted med veg in paella rice&#13;<br/>Hot and sticky wings on yellow rice&#13;<br/>Hoi Sin Pork Belly Steaks&#13;<br/>Vegetable Biriyani with a Mild Curry Sauce&#13;<br/>Babycorn Bamboo Shoots and Water Chestnuts &#13;<br/>Stir fried noodles with seaweed &#13;<br/>Lemon Sponge with Orange Sauce&#13;<br/>Vanilla Granola</font>
2010-10-07 18:40:59.310 Westminster[1011:207] <font color="green"/>
2010-10-07 18:40:59.312 Westminster[1011:207] <font color="green">Pea &amp; Ham &#13;<br/><br/>Black Olive &#13;<br/>Roast Chicken with Bread Sauce and Roast Jus&#13;<br/>Warm Salad of Salmon and Crispy Bacon&#13;<br/><br/><br/>Vegetarian Chilli&#13;<br/>With Sour Cream and Braised Rice&#13;<br/>Green Beans&#13;<br/><br/>Bubble &amp; Squeak&#13;<br/><br/>Tiramisu&#13;<br/>3 Cheeses &amp; Biscuits</font>
2010-10-07 18:40:59.313 Westminster[1011:207] <font color="green">Sausage&#13;<br/>Bacon&#13;<br/>Grilled Tomato&#13;<br/>Grilled mushrooms&#13;<br/>Fried Egg&#13;<br/><br/><br/><br/>Plain Porridge&#13;<br/><br/><br/><br/>Bread &#13;<br/><br/>Natural Yogurt&#13;<br/>Dried Fruits &#13;<br/>Granola&#13;<br/>Honey</font>
2010-10-07 18:40:59.317 Westminster[1011:207] <font color="green">Root Vegetable&#13;<br/>Red Pesto &#13;<br/>WRAP&#13;<br/>Chimichanga&#13;&#13;<br/>Mexican fish tortillas&#13;&#13;<br/>Roast Leg of Lamb &#13;<br/>Gnocchi with Roasted Vegetables and Flaked Parmesan &#13;<br/>Broccoli &#13;<br/><br/><br/>Thyme Roasted Potatoes &#13;<br/> Sticky Toffee Pudding and Toffee Sauce&#13;<br/>Banana Bread</font>
2010-10-07 18:40:59.318 Westminster[1011:207] <font color="green"/>
2010-10-07 18:40:59.318 Westminster[1011:207] <font color="green">Tomato with Basil Oil&#13;&#13;<br/>Red Pesto &#13;<br/>Beef Olives&#13;<br/><br/>Lamb with Ginger, Spring onion and Noodles&#13;<br/><br/><br/>Field Mushroom Pies&#13;&#13;<br/>Ratatouille &#13;<br/><br/>Creamed Potatoes&#13;<br/><br/>Lemon Tart&#13;<br/>3 Cheeses &amp; Biscuits</font>
2010-10-07 18:40:59.319 Westminster[1011:207] <font color="green">Sausage &#13;<br/>Bacon&#13;<br/>Baked Beans&#13;<br/>Grilled Tomato&#13;<br/>Breakfast special&#13;<br/>Avocado on toast&#13;<br/><br/>Plain Porridge &#13;<br/><br/><br/>Bread and banana bread&#13;<br/><br/>Natural Yogurt&#13;<br/>Dried Fruits &#13;<br/>Granola&#13;<br/>Honey</font>
2010-10-07 18:40:59.333 Westminster[1011:207] <font color="green">(GREEK)&#13;<br/><br/>FLAT BREADS&#13;<br/>SPINACH, ROCKET AND FETA AND TOASTED SOUR DOUGHS &#13;<br/>SEAFOOD STUFFED PEPPERS&#13;<br/>STIFADO (beef)&#13;<br/><br/>LAMB FRICASSEE&#13;<br/>zucchini pie from Macedonia&#13;&#13;<br/>RICE&#13;<br/><br/>GIGANTIS PLAKI&#13;<br/><br/>ORANGE AND LEMON CAKE TOPPED WITH GREEK YOGURT AND HONEY</font>
2010-10-07 18:40:59.333 Westminster[1011:207] <font color="green"/>
2010-10-07 18:40:59.334 Westminster[1011:207] <font color="green">Roasted Vegetable&#13;<br/>FLAT BREADS&#13;<br/>Pork Steak Served with a Tomato, Tarragon and Mushroom sauce&#13;<br/>Roast beef and homemade horseradish sauce&#13;<br/><br/><br/>Lancashire Cheese Sausages with Onion Gravy&#13;&#13;<br/>Courgettes&#13;<br/><br/>Roast Potatoes&#13;<br/><br/>Mississippi Mud Pie&#13;<br/>3 Cheeses &amp; Biscuits</font>
2010-10-07 18:40:59.343 Westminster[1011:207] <font color="green">Sausage&#13;<br/>Bacon &#13;<br/>Hash Brown&#13;<br/>Grilled mushrooms&#13;<br/>Fried Egg&#13;<br/><br/><br/><br/>Plain Porridge &#13;<br/><br/><br/><br/>Bread &#13;<br/><br/> Natural Yogurt&#13;<br/>Dried Fruits &#13;<br/>Granola&#13;<br/>Honey</font>
2010-10-07 18:40:59.344 Westminster[1011:207] <font color="green">Leek, Blue Cheese and Potato &#13;&#13;<br/>Sunflower Seed&#13;<br/>COUS COUS&#13;<br/>Couscous with apricots, lemon and coriander &#13;<br/><br/>Couscous fried chicken with couscous and spiced tomato sauce &#13;&#13;<br/>Butchers Sausages &#13;<br/>Balsamic Roasted Vegetable Frittata&#13;<br/>Red Cabbage&#13;<br/><br/><br/>Mashed Potatoes&#13;<br/><br/>Jam Roly Poly &#13;<br/>Bakewell Slice</font>
2010-10-07 18:40:59.344 Westminster[1011:207] <font color="green"/>
2010-10-07 18:40:59.345 Westminster[1011:207] <font color="green">Curried Parsnip and Apple &#13;&#13;<br/>Sunflower Seed&#13;<br/>Spiced Sticky chicken pieces&#13;<br/>Mexican Beef Chilli Wraps with Natural Yogurt and Guacamole&#13;<br/><br/><br/>Roasted Teriyaki Tofu Steaks with Glazed Green Vegetables&#13;&#13;<br/>Spiced Aubergine&#13;<br/><br/>Rice and Peas&#13;<br/><br/>Mango Mousse&#13;<br/>3 Cheeses &amp; Biscuits</font>
2010-10-07 18:40:59.351 Westminster[1011:207] <font color="green">Sausage&#13;<br/>Bacon&#13;<br/>Baked Beans&#13;<br/>Grilled Tomato&#13;<br/>Breakfast special&#13;<br/>Muffin bar&#13;<br/><br/>Plain Porridge&#13;<br/><br/><br/><br/>Croissants &#13;<br/><br/>Natural Yogurt&#13;<br/>Dried Fruits &#13;<br/>Granola&#13;<br/>Honey</font>
2010-10-07 18:40:59.352 Westminster[1011:207] <font color="green">Carrot and Chilli &#13;&#13;<br/>Rosemary &#13;<br/>NOODLES&#13;<br/><br/>Crispy tofu&#13;<br/>Lemon chicken&#13;<br/>Fish with Traditional Crispy Batter&#13;<br/>Japanese Vegetable Curry with Rice Noodles and Tofu&#13;&#13;<br/>Garden peas&#13;<br/><br/><br/>Chips &#13;<br/>Viennese Jam Tart and Custard &#13;<br/>Fresh Fruit Salad</font>
2010-10-07 18:40:59.361 Westminster[1011:207] <font color="green"/>
2010-10-07 18:40:59.361 Westminster[1011:207] <font color="green">Three onion, spring, red and white&#13;<br/>Rosemary &#13;<br/>Pepperoni Pizza Topped with Boccaccio&#13;<br/>Bolognaise pasta bake&#13;<br/><br/>Vegetarian Plait&#13;&#13;<br/>Green Cabbage&#13;<br/><br/>Oven Baked Cajun Wedges&#13;<br/><br/>Ice&#13;<br/>Cream Sundae&#13;<br/><br/>3 Cheeses &amp; Biscuits</font>
2010-10-07 18:40:59.362 Westminster[1011:207] <font color="green">Sausage &#13;<br/>Bacon &#13;<br/>Hash Brown&#13;<br/>Grilled Mushrooms&#13;<br/>Poached Eggs&#13;<br/><br/><br/><br/>Plain Porridge &#13;<br/><br/><br/><br/><br/><br/>Natural Yogurt&#13;<br/>Dried Fruits &#13;<br/>Granola&#13;<br/>Honey</font>
2010-10-07 18:40:59.362 Westminster[1011:207] (null)
2010-10-07 18:40:59.363 Westminster[1011:207] <font color="green"/>
2010-10-07 18:40:59.363 Westminster[1011:207] <font color="green"/>

It gives up at the section containing sautéed potatoes and returns nothing from that section or any section after it.

I think this may be because the site doesn't encode its é characters. When I view the page source I see a literal é rather than &eacute;, which is what this site recommends: http://www.w3.org/MarkUp/html3/latin1.html

Thanks for your time. If you know a better way to get the lunch menu from this site, I'd love to hear that too.

Best answer

I had a similar problem. The issue is that libxml2 can only parse UTF-8 encoded documents, so you need to convert the HTML page to UTF-8 first.
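A minimal sketch of that conversion step, under the assumption that the page is served as ISO-8859-1 (Latin-1); the question does not confirm the actual encoding. In Latin-1, é is the single byte 0xE9, which is not valid UTF-8 on its own, while in UTF-8 it is the two bytes 0xC3 0xA9. The function name latin1_to_utf8 is hypothetical and not part of libxml2 or the HTMLParser library. It is plain C, so it should compile unchanged inside an Objective-C file.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: convert ISO-8859-1 (Latin-1) bytes to UTF-8.
 * Assumes the input really is Latin-1; it does not detect the encoding.
 * Returns a malloc'd, NUL-terminated buffer that the caller must free,
 * or NULL if allocation fails. If out_len is not NULL it receives the
 * number of UTF-8 bytes written, excluding the terminating NUL.
 */
char *latin1_to_utf8(const unsigned char *in, size_t in_len, size_t *out_len)
{
    /* Worst case: every input byte expands to two bytes, plus the NUL. */
    char *out = malloc(in_len * 2 + 1);
    size_t i, j = 0;

    if (out == NULL) {
        return NULL;
    }

    for (i = 0; i < in_len; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {
            /* ASCII is identical in Latin-1 and UTF-8. */
            out[j++] = (char)c;
        } else {
            /* Code points U+0080..U+00FF become a two-byte sequence:
               110000xx 10xxxxxx, e.g. 0xE9 (é) -> 0xC3 0xA9. */
            out[j++] = (char)(0xC0 | (c >> 6));
            out[j++] = (char)(0x80 | (c & 0x3F));
        }
    }

    out[j] = '\0';
    if (out_len != NULL) {
        *out_len = j;
    }
    return out;
}
```

The idea would be to download the raw bytes yourself, run them through a conversion like this, and hand the UTF-8 result to the parser instead of letting it fetch the URL directly. In a Cocoa app the same effect can usually be had without custom code by creating an NSString from the downloaded NSData with NSISOLatin1StringEncoding and then parsing that string. Note that if the server actually sends Windows-1252 rather than Latin-1, bytes in the range 0x80 to 0x9F would need a different mapping.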

Regarding "iphone - HTML parsing fails with accented letters (e.g. é)", the original question is on Stack Overflow: https://stackoverflow.com/questions/3884611/
