
java - JSON parse error: Unexpected character (s) at position 226025

Reposted. Author: 行者123. Updated: 2023-11-30 06:40:54

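I have seen similar questions on Stack Overflow, but none of them helped me solve my problem. I tried to work out the cause of the error I am getting and failed, so I am asking for help. Please do not mark this as a duplicate.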

I am parsing a JSON file and get the following error.

Jun 06, 2017 2:06:24 PM edu.virginia.cs.services.FileManager ParseJson
SEVERE: null
Unexpected character (s) at position 226025.
at org.json.simple.parser.Yylex.yylex(Yylex.java:610)
at org.json.simple.parser.JSONParser.nextToken(JSONParser.java:269)
at org.json.simple.parser.JSONParser.parse(JSONParser.java:118)
at org.json.simple.parser.JSONParser.parse(JSONParser.java:92)
at edu.virginia.cs.services.FileManager.ParseJson(FileManager.java:68)
at edu.virginia.cs.main.Processer.main(Processer.java:20)

Exception in thread "main" java.lang.NullPointerException
at edu.virginia.cs.services.FileManager.ParseJson(FileManager.java:76)
at edu.virginia.cs.main.Processer.main(Processer.java:20)

Relevant code:

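try {
    arr = (JSONArray) parser.parse(new FileReader(sourceFile));
} catch (IOException | ParseException ex) {
    // The failure is only logged, so arr stays null afterwards; that matches
    // the NullPointerException reported later in the same method.
    Logger.getLogger(FileManager.class.getName()).log(Level.SEVERE, null, ex);
}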

The file content looks like this:

[
{
"url": "http://www.save-on-crafts.com/",
"title": "Events & Wedding Sale | Save 20-60% | SaveOnCrafts",
"content": {
"p": ["Wedding decorations, party supplies, home d cor & craft supplies at 20-70% off. Save On Crafts brings you classic and trending fashions.", "Save On Crafts has continually evolved to meet the needs of our customers DIY brides, home decorators, party planners, florists, and caterers. Our goal is simple: provide an exciting selection of quality , , and items at the lowest price possible for the customer with discerning taste."],
"div": ["indicates required", "(831) 768-8428", "Take a Peek at our Specials: Save up to 70%!", "Candle Holders", "Flowers & Branches", "Crystal D cor, Chandeliers", "Set the Mood with Candles", "Champagne & Ice Buckets", "Chalkboards", "Eco Confetti", "Wedding Signs", "Sola Flowers", "Natural Wood Slices", "Classic & trending styles without the traditional retail markup.", "(831) 768-8428"],
"a": ["X", "What's New", "SPECIALS", "Wedding Decorations", "Lights | Event Lighting", "Wood Slabs & Tree Slices", "Vases", "Apothecary Jars", "Banners", "Baskets", "Bell Jars, Cloches", "Beverage Bar Supplies", "Bird Cages & Birds", "Botanicals, Lavender, Sola Flowers", "Bottles & Jars", "Branches - Natural", "Buckets & Tubs", "Burlap Fabric, Jute, and Linen", "Cake Stands", "Candles", "Candle Holders", "Candy Buffet", "Chair Sashes, Banners, Signs", "Chalkboards", "Chandeliers", "Charger & Base Plates", "Confetti", "Corsage & Bouquet Supplies", "Craft Supplies", "Crates, Boxes, & Trays", "Crystal Decorations", "Easels & Frames", "Event Decor", "Favors", "Feathers", "Floral Supplies", "Flowers", "Greenery", "Home & Garden Decor", "Lanterns", "Mirrors & Mirror Stands", "Moss Natural & Artificial", "Nautical Decor & Decorations", "Packaging, Gift Wrapping", "Paper Lanterns & Parasols", "Paper Party Decorations", "Party Supplies", "Pots & Planters", "Placecard Holders,Table Numbers, Displays", "Preserved Flowers & Leaves", "Props, Pedestals, Risers", "Ribbon", "Silk Flowers", "Signage", "Shells - Sand", "Shepherds Hooks & Stanchions", "Sola Flowers", "Succulents & Cactus", "Table Runners & Toppers", "Terrariums", "Tote Bags, Welcome Bags", "Trees, Potted Plants", "Vases & Vase Fillers", "Wedding Cake Decorations and Toppers", "Wedding Decorations", "Wedding Signs", "Wedding Themes", "Wedding Trees & Wishing Trees", "Wood Crafts", "Wood Slabs & Tree Slices", "Wreath Making Supplies, Frames, Forms", "Gifts - Holiday Decorations", "Gifts Under $25", "Ideas & Inspiration", "Shopping Cart", "About", "Shipping", "Return Policy", "Contact", "FAQ", "Privacy Policy", "Terms and Conditions", "Read More", "Shipping", "Cart"],
"strong": ["Need Help?", "SUBSCRIBE", "wedding supplies", "party decorations", "home d cor", "Affordable Wedding & Event Decor", "Save 20-70%", "Need Help?"],
"span": ["*", "*", "Live Chat", "Shop Categories", "Customer Service: 7am - 5pm PST (M-F) | (831)768-8428", "Copyright 2017 Save-On-Crafts. All Rights Reserved. Designated trademarks and brands are the property of their respective owners. Use of this website constitutes acceptance of the Save-On-Craftsand Privacy Policy.", "Live Chat"]
}
},
{
"url": "http://www.carsurvey.org/",
"title": "Carsurvey.org - Car Reviews",
"content": {
"p": ["I feel as if this vehicle was custom built for me, love it", "Neat cruiser, comfort first, performance second", "Beast maaaaaaate!", "Best value for the money", "There are reviews on the site", "new reviews and new comments are in the Members section, awaiting approval"],
"td": ["2 days ago", "2 days ago", "3 days ago", "3 days ago", "18 hours ago", "19 hours ago", "19 hours ago", "19 hours ago"],
"a": ["Write a Review", "About", "Members", "Reviews by Region", "Write a Review", "About", "Members", "Reviews by Region", "BMW", "Buick", "Chevrolet", "Chrysler", "Citroen", "Dodge", "Fiat", "Ford", "Honda", "Hyundai", "Jeep", "Kia", "Mazda", "Mercedes-Benz", "Mercury", "Mitsubishi", "Nissan", "Oldsmobile", "Peugeot", "Pontiac", "Renault", "Saturn", "Subaru", "Toyota", "Vauxhall", "Volkswagen", "Volvo", "AC", "Acura", "Alfa Romeo", "Alvis", "AMC", "ARO", "Asia Motors", "Aston Martin", "Asuna", "Audi", "Austin", "Austin Healey", "Autobianchi", "Autocars", "Avanti", "Bajaj", "Bedford", "Bentley", "Birkin", "BMW", "Bombardier", "Bond", "Brennan-Mays", "Bricklin", "Bugatti", "Buick", "Cadillac", "Caterham", "Checker", "Chery", "Chevrolet", "Chrysler", "Citroen", "Commer", "Cord", "Dacia", "Daewoo", "DAF", "Daihatsu", "Datsun", "DeLorean", "DeSoto", "DeTomaso", "Dodge", "Eagle", "Edsel", "Ferrari", "Fiat", "Ford", "Franklin", "Freightliner", "FSO", "Geely", "Geo", "GMC", "Great Wall", "Grinnall", "Hillman", "Holden", "Honda", "HSV", "Humber", "Hummer", "Hyundai", "IHC", "IKA", "Infiniti", "Innocenti", "Inokom", "Iran Khodro", "Iso Rivolta", "Isuzu", "Iveco", "Jaguar", "Jeep", "Jensen", "JiangNan", "Kaiser", "Kia", "Kish Khodro", "Lada", "Laforza", "Lamborghini", "Lancia", "Land Rover", "Lexus", "Leyland", "Leyland DAF", "Lincoln", "Lotus", "Mahindra", "Maple", "Marcos", "Maruti", "Maserati", "Matra", "Maybach", "Mazda", "McLaren", "Mercedes-Benz", "Mercury", "Merkur", "Meson", "Meyers Manx", "MG", "Microcar", "Mitsubishi", "Morgan", "Morris", "Moskvitch", "Nash", "NAZA", "Nissan", "Noble", "Nova", "NSU", "Oldsmobile", "Oltcit", "Opel", "Packard", "Panther", "Perodua", "Peugeot", "Plymouth", "Pontiac", "Porsche", "Premier", "Proton", "Puma", "Pyonghwa Motors", "Quantum", "Qvale", "Ram Trucks", "Rayton Fissore", "Reliant", "Renault", "Riley", "Robert Jankel Design", "Rolls Royce", "Rover - Austin", "SAAB", "Saleen", "Samsung", "Santana", "Sao", "Saturn", "Scion", 
"Seat", "Sebring", "Sebring Vanguard", "Shelby", "Simca", "Singer", "Skoda", "smart", "Spartan", "SsangYong", "Standard", "Sterling", "Studebaker", "Subaru", "Sunbeam", "Suzuki", "Talbot", "Tata", "Tatra", "Tesla", "Tickford", "Toyota", "Trabant", "Triumph", "Troller", "TVR", "Vanden Plas", "Vauxhall", "Venturi", "Volga", "Volkswagen", "Volvo", "Wartburg", "Westfield", "Willys", "Wolseley", "Yugo", "Zagato", "ZAZ", "Zhengzhou Nissan", "Zhonghua", "ZXAUTO", "1997 Lexus LS", "2012 Audi A7", "1985 Dodge D100", "2007 Citroen C5", "More New Car Reviews", "1987 Chrysler New Yorker", "1995 Chevrolet Monte Carlo", "1995 Chevrolet Monte Carlo", "1995 Chevrolet Monte Carlo", "More New Comments", "Advertise on this site", "Privacy Policy"],
"strong": ["110091", "0", "3"],
"h1": ["Car Reviews by Manufacturer"],
"h2": ["Most Popular", "All Manufacturers"],
"h3": ["Newest Car Reviews", "Newest Comments", "Current Status"],
"span": ["Copyright 1997 - 2017 CSDO Media Limited", "|"]
}
},
{
"url": "http://www.hollywood.com/",
"title": "Hollywood.com - Best of Movies, TV, and Celebrities",
"content": {
"div": ["TRENDING NOW", "Hollywood.com Photo Archive", "Hollywood.com Esports", "Hollywood.com Discovery", "MovieTickets.com Discovery", "Wenn Penelope Cruz will always put her all into every role she wins, even if it means transforming herself physically. The Spanish actress has varied...", "Wenn Sean Penn reportedly resolved a dispute with fellow passengers during a recent flight to New York. The Mystic River actor had just boarded the...", "Wenn Rita Ora has hinted in a new interview that she and Cara Delevingne were more than just good friends. The 26-year-old singer and the...", "Wenn Charlie Sheen has stepped out in public with a new girlfriend. The 51-year-old actor showed off his blonde partner, known only as Jools, as...", "Wenn Tom Cruise's insistence on perfecting a zero-gravity stunt for The Mummy caused members of the film's crew to vomit. Tom stars as military operative...", "Wenn The Big Chill star Meg Tilly has made a return to Hollywood after 18 years to play Brad Pitt's wife. The actress stepped away...", "Wenn Rob Kardashian has slammed rumors he's dating reality TV star Mehgan James. A report published by Us Weekly magazine on Thursday (01Jun17) suggested that...", "Wenn Taylor Swift has been pictured with her actor boyfriend Joe Alwyn for the first time. News of the Bad Blood hitmaker's relationship with 26-year-old...", "NBC Ariana Grande has touched down in the U.K. ahead of her benefit concert for victims of the terrorist attack on her gig in...", "Wenn Alec Baldwin helped raise $5.1 million for New Jersey Democrats at an event in Collingswood, New Jersey, on Thursday night (01Jun17). The 30 Rock...", "Wenn Johnny Depp has claimed he was completely unaware his former managers were using his name to take out $40 million in loans. The fight...", "Wenn Carey Mulligan is reportedly expecting her second child. 
The Great Gatsby actress was pictured outside Sexy Fish restaurant in London with her husband Marcus...", "When it was first announced that Scarlett Johansson would play The Major in the wildly popular 'Ghost in the Shell' fans weren't happy, to...", "Billy Bob Thornton and the cast of Bad Santa 2 looked super naughty at AMC Loews Lincoln Square in New York City. Check out...", "Hulu's much anticipated drama The Handmaid's Tale premiered last night. This 10-part series is an adaptation from Margaret Atwood's 1985 novel of the same name, set...", "Julianne Moore and Michelle Williams premiered their new movie Wonderstruck at the 70th Cannes Film Festival. For a complete gallery of pictures, click here.", "Selena Gomez hosted WE Day celebrations at The Forum in California for her fifth year. WE Day is one of the largest Facebook non-profits in...", "Check out the super whimsical cast of NBC's Hairspray Live! before the musical premieres Wednesday, December 7th!", "Wenn / Paramount Pictures Thandie Newton wore a wig she was given on Mission: Impossible 2 to the BAFTAs on Sunday night (12Feb17). The Westworld...", "The Light Between Oceans premiered at the Venice Film Festival and co-stars and real-life lovers Michael Fassbender and Alicia Vikander were all smiles on...", "With the Margot Robbie stepping into the role of Maid Marian, and the currently-filming of Robin Hood: Origins, there's been a resurgence of interest...", "Tom Hanks is Forrest Gump, just like like Richard Gere is Edward Lewis in Pretty Woman; some actors have had such iconic movie roles,...", "Disney These days, Disney is known for pushing the envelope and hiding adult themes and jokes in their films. However, there was a time...", "ABC Television Network Abby, The Deadliest Catch Darby Stanchfield plays Abby Whelan, and she's come a long way to get to D.C. She actually grew up...", "There are many different kinds of family businesses, but one we hardly think about is acting. 
However, there are families that have actors going...", "It's no secret that Hollywood loves its cliches from action heroes who magically avoid every bullet fired at them to fat sitcom husbands who...", "HBO HBO's Silicon Valley just finished its first season. The show features a great cast of comedians, and it's managed to satirize the nerdy masculinity of...", "32.2x", "|", "19.2x", "|", "6.84x", "|", "6.16x", "|", "4.77x", "|", "4.22x", "|", "Powered by Crowdtangle", "1999-2017 HOLLYWOOD.COM, LLC. ALL RIGHTS RESERVED", "| | | |", "MOVIE, TV, AND CELEBRITY DATA PROVIDED BY AND IS THE COPYRIGHT OF"],
"a": ["CLOSE", "Click here - to use the wp menu builder", "Click here - to use the wp menu builder", "SIGN UP FOR OUR NEWSLETTER", "Meg Tilly Returns to Movies after Two Decade Hiatus to Play Brad Pitt's Wife", "Kathy Griffin in Tears at Press Conference", "Rob Kardashian Denies Reports He's Dating Reality Star Mehgan James", "Rita Ora talks 'ambiguous' relationship with Cara Delevingne", "Sean Penn Involved in Dispute During Flight to JFK", "Khloe Kardashian won't identify friend she claims is stealing from her", "Underwear On The Outside At The 'Captain Underpants' Premiere", "Penelope Cruz: 'I don't mind getting ugly for movie roles'", "Charlie Sheen goes public with new girlfriend", "Tom Cruise made The Mummy crew vomit with zero-gravity stunt", "Kathy Griffin in Tears at Press Conference", "Underwear On The Outside At The 'Captain Underpants' Premiere", "Khloe Kardashian won't identify friend she claims is stealing from her", "Underwear On The Outside At The 'Captain Underpants' Premiere", "'Baby Driver' Looks Like The Most Fun Movie In 2nd Trailer", "Go Behind the Voices of 'Captain Underpants: The First Movie'", "Something Is Wrong In the 'Murder on the Orient Express' Trailer", "Nicole Kidman lends her Balenciaga wedding dress to exhibition", "Penelope Cruz: I don t mind getting ugly for movie roles", "Sean Penn Involved in Dispute During Flight to JFK", "Rita Ora talks ambiguous relationship with Cara Delevingne", "Charlie Sheen goes public with new girlfriend", "Tom Cruise made The Mummy crew vomit with zero-gravity stunt", "Meg Tilly Returns to Movies after Two Decade Hiatus to Play Brad Pitt s Wife", "Rob Kardashian Denies Reports He s Dating Reality Star Mehgan James", "Taylor Swift Spotted with New Boyfriend Joe Alwyn for First Time", "Ariana Grande returns to U.K. 
as thousands make false ticket claims for Manchester benefit show", "Alec Baldwin Raises $5 million for Democrats", "Johnny Depp was unaware ex managers were using his name to get loans", "Carey Mulligan is Pregnant", "see more", "RED CARPET", "Travel To Tokyo For The Ghost in the Shell World Premiere", "The Cast Of Bad Santa 2 Spiced Up The Red Carpet At The NYC Premiere", "Hulu s The Handmaid s Tale Premieres", "Julianne Moore and Michelle Williams Premiere Wonderstruck at Cannes", "Selena Gomez, Demi Lovato, and Alicia Keys Celebrate WE Day", "The Cast Of NBC s Hairspray Live! Were Super Whimsical On The Red Carpet", "Thandie Newton wore Mission: Impossible II wig to the BAFTAs", "Michael Fassbender & Alicia Vikander Are Perfection At The Light Between Oceans Premiere", "see more", "DID YOU KNOW?", "All the Actresses Who Have Played Maid Marian", "12 Iconic Movie Roles That Famous Actors Turned Down", "The Original Drawing For Snow White Was Banned By Disney Because It Was Too Sexy!", "Facts You Never Knew About The Cast of Scandal", "11 Actors You Didn t Know Have Famous Grandparents", "15 Celebrity Dads You Didn t Know Have Hot Sons", "The 10 Most Overused Sound Effects in Hollywood", "21 Facts You Don t Know About Silicon Valley", "see more", "Teen Mom: OG Star Ryan Edwards Has Checked into Rehab", "E! 
News", "How To Train Your Dragon 3: Eveything We Know So Far", "moviepilot.com", "Alec Baldwin's Advice to Kathy Griffin on Trump Brouhaha: 'F--- Them All'", "The Wrap", "Alec Baldwin Defends Kathy Griffin in Wake of Trump Decapitated Photo Controversy: 'Ignore Him'", "People", "The Wonder Woman Scene That Pays Tribute To Superman", "CinemaBlend", "14 Of The Most Utterly Bizarre Things On Display At The M tter Museum", "Ranker", "Movies", "TV", "Celebrities", "Best Of/Worst Of", "Where Are They Now?", "Did You Know", "Buzzing", "Quizzes", "Pop Lists", "News", "SSNInsider", "MovieTickets.com", "EsportsHW", "Photo Archive", "About Us", "Contact Us", "Media Kit", "PRIVACY POLICY", "TERMS OF SERVICE", "COPYRIGHT ISSUES", "DISCLOSURE", "REPORT ABUSE", "BASELINE"],
"em": ["Want More?"],
"h1": ["WANT MORE?"],
"i": ["Facebook", "Google+", "Twitter", "YouTube", "Instagram"],
"h2": ["Sign Up For Our Newsletter!", "Sign Up For Our Newsletter!"],
"h3": ["FOLLOW US!", "LIKE US!", "TOPIC", "Category", "partners", "COMPANY", "Be friends with us"],
"time": ["Jun 2, 2017", "Jun 2, 2017", "Jun 2, 2017", "Jun 2, 2017", "Jun 2, 2017", "Jun 2, 2017", "Jun 2, 2017", "Jun 2, 2017", "Jun 2, 2017", "Jun 2, 2017", "Jun 2, 2017", "Jun 2, 2017", "Mar 17, 2017", "Nov 16, 2016", "Apr 26, 2017", "May 18, 2017", "Apr 28, 2017", "Nov 18, 2016", "Feb 14, 2017", "Sep 1, 2016", "Mar 7, 2017", "Aug 15, 2013", "Oct 5, 2016", "Apr 4, 2014", "Apr 22, 2016", "Aug 22, 2014", "Sep 21, 2015", "Jun 11, 2014"],
"span": ["Celebrities", "Movies", "Television", "Showtimes", "Search", "Esports", "Photo Archive", "The Latest", "Video", "Buzzing", "Pop Lists", "Did You Know?", "Where Are They Now?", "Featured", "Take A Sneak Peak At The Movies Coming Out This Week (8/12)", "Kathy Griffin in Tears at Press Conference", "Underwear On The Outside At The Captain Underpants Premiere", "Khloe Kardashian won t identify friend she claims is stealing from her", "Penelope Cruz: I don t mind getting ugly for movie roles", "Partners", "MovieTickets.com", "SSN Insider", "Privacy Policy", "Copyright Notice", "Terms of Use", "Report Abuse", "Videos", "Buzzing", "Red Carpet", "Esports", "Photo Archive", "Newsletter Signup", "Meg Tilly Returns to Movies after Two Decade Hiatus to Play Brad Pitt's Wife", "WENN", "Kathy Griffin in Tears at Press Conference", "WENN", "Rob Kardashian Denies Reports He's Dating Reality Star Mehgan James", "WENN", "Rita Ora talks 'ambiguous' relationship with Cara Delevingne", "WENN", "Sean Penn Involved in Dispute During Flight to JFK", "WENN", "Khloe Kardashian won't identify friend she claims is stealing from her", "WENN", "Underwear On The Outside At The 'Captain Underpants' Premiere", "Michael Chaney", "Penelope Cruz: 'I don't mind getting ugly for movie roles'", "WENN", "Charlie Sheen goes public with new girlfriend", "WENN", "Sign Up for Our Newsletter!", "Follow @hollywood", "THE LATEST", "Hot on Facebook"]
}
}
]

I crawled 500K web pages and stored them in a JSON file. Now I am trying to read it back. The whole file is 2 GB, so I cannot share it in full.

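I understand that the JSON parser hit an unexpected character (s) in the file, but I cannot find which line of the JSON file is wrong. Is there a way to find the offending line in the JSON file?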
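One way to answer the "which line?" question is to translate the reported position into a line and column. The sketch below is only an illustration: the class name `ErrorLocator` and its `locate` methods are hypothetical, and it assumes the position in the message is a 0-based character offset from the start of the input, which is how JSON.simple appears to count but is worth verifying against your data.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of JSON.simple.
public class ErrorLocator {

    /**
     * Reports the 1-based line and column of the character at the given
     * 0-based offset, plus up to contextChars characters on either side.
     */
    public static String locate(Reader in, long position, int contextChars) throws IOException {
        long offset = 0;
        long line = 1;
        long column = 1;
        boolean found = false;
        StringBuilder context = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (offset < position) {
                // Still before the target: keep the line/column counters current.
                if (c == '\n') {
                    line++;
                    column = 1;
                } else {
                    column++;
                }
            } else if (offset == position) {
                found = true;
            }
            if (offset >= position - contextChars) {
                // Inside the context window; show line breaks as spaces.
                context.append(c == '\n' || c == '\r' ? ' ' : (char) c);
            }
            if (offset >= position + contextChars) {
                break; // the whole window has been collected
            }
            offset++;
        }
        if (!found) {
            return "position " + position + " is past the end of the input";
        }
        return "line " + line + ", column " + column + ": " + context;
    }

    /** Convenience overload for in-memory text. */
    public static String locate(String text, long position, int contextChars) {
        try (Reader in = new StringReader(text)) {
            return locate(in, position, contextChars);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Offset 5 is the 'e' on the second line.
        System.out.println(locate("ab\ncdef\ngh", 5, 1));
    }
}
```

For the real file, the same method could be fed `new BufferedReader(new FileReader(sourceFile))` together with the value of `ex.getPosition()` from the caught `ParseException`. The reader must decode the file with the same charset the parser used, because the offset counts characters rather than bytes.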


Edit

The main code that processes the page content is as follows.

for (Element element : elements) {
    String tagName = element.tagName();
    if (Util.isValidTag(tagName)) {
        String textValue = Util.removeNonPrintableChars(element.ownText()).trim().replace("\"", "\'");
        if (!textValue.isEmpty()) {
            if (tagTextMap.containsKey(tagName)) {
                tagTextMap.get(tagName).add(textValue);
            } else {
                ArrayList<String> arr = new ArrayList<>();
                arr.add(textValue);
                tagTextMap.put(tagName, arr);
            }
        }
    }
}

I only strip non-printable characters and replace double quotes with single quotes; that is all.


Update

I found the problematic part of the JSON file.

{
"url": "http://www.kudzu.com/",
"title": "Atlanta roofers, hvac, plumbers, electricians and other businesses - reviews, coupons and cost estimates from your neighbors.",
"content": {
"h2": ["From Our Experts", "Recent Projects", "Recent Articles", "What It Costs", "Review a Business", "What It Costs", "Other Markets"],
"body": ["\"],
"span": ["Area", "Area", "Cost"]
}
}

This part, "body": ["\"], is the root of the problem. I can now see why it causes the failure.
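To make the failure concrete: in `["\"]` the backslash escapes the quote that was meant to close the string, so the string keeps running until the next unescaped quote. The sketch below uses a hypothetical helper, `findClosingQuote`, on a shortened copy of the fragment above. It is a simplified model of how a JSON tokenizer treats backslashes, not JSON.simple's actual lexer.

```java
// Hypothetical demo class; the scanning rule is a simplification of JSON string lexing.
public class UnterminatedStringDemo {

    /** Returns the index of the quote that closes the string opened at openQuote, or -1. */
    public static int findClosingQuote(String json, int openQuote) {
        for (int i = openQuote + 1; i < json.length(); i++) {
            char c = json.charAt(i);
            if (c == '\\') {
                i++; // a backslash escapes the next character, so skip over it
            } else if (c == '"') {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Shortened copy of the broken fragment: "body": ["\"], then "span": ["Area"]
        String fragment = "\"body\": [\"\\\"],\n\"span\": [\"Area\"]";
        int open = fragment.indexOf("[\"") + 1;
        int close = findClosingQuote(fragment, open);
        System.out.println("string token ends at index " + close);
        System.out.println("next character: " + fragment.charAt(close + 1));
    }
}
```

Under this model the string swallows `],` and the line break, and ends at the quote that was supposed to open `"span"`. The next character is then the `s` of `span`, which would be consistent with the `Unexpected character (s)` in the error message.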

Best answer

You appear to have a problem escaping special characters. Here is the list of special characters used in JSON:

  1. \b Backspace (ASCII code 08)
  2. \f Form feed (ASCII code 0C)
  3. \n New line
  4. \r Carriage return
  5. \t Tab
  6. \" Double quote
  7. \\ Backslash character

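So you need to escape these special characters when you dump the JSON. Fortunately, JSON libraries generally provide a way to do this. Since you already seem to be using the JSON.simple toolkit, you can use the JSONObject.escape() method to escape the special characters.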
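As an illustration of what such escaping does, here is a minimal, dependency-free sketch that applies the rules from the list above. The class `JsonEscapeSketch` and its `escapeJson` method are hypothetical names. In the real crawler, the `JSONObject.escape()` call mentioned above would be the natural choice, and this sketch does not claim to reproduce its exact output.

```java
// Hypothetical, self-contained illustration of JSON string escaping.
public class JsonEscapeSketch {

    /** Escapes a raw string so it can sit between double quotes in a JSON document. */
    public static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\b': sb.append("\\b");  break;
                case '\f': sb.append("\\f");  break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters must be written as \\uXXXX.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String raw = "\\"; // the page text was a single backslash
        // prints "body": ["\\"]
        System.out.println("\"body\": [\"" + escapeJson(raw) + "\"]");
    }
}
```

With escaping applied at write time, replacing double quotes with single quotes should no longer be necessary, because a double quote in the page text would be written as `\"` instead.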

Regarding "java - JSON parse error: Unexpected character (s) at position 226025", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44397234/
