
java - JsonParser's parse method slows my code down dramatically

Reposted. Author: 行者123. Updated: 2023-11-30 10:34:49

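I'm working on a project that should extract data from JSON files (containing information about Polish envoys) and do some calculations using only that data.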

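The code runs correctly, but one method slows everything down dramatically. I'm not the best at describing things, so let me just show my Jsonreader class:
Gist link
(The method is used on lines 17, 43, and 50.) The code looks a bit messy, but it works fine except for the fragments that use the jsonparser.parse method. Spending about 2 seconds per envoy is not acceptable. I have to change those lines, but I don't know how. I was thinking about mapping the JSON file to a Map object and then working on that, but I'm not sure whether that is a good option.
(Sorry for my poor grammar.)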

Best answer

How can I check if the problem lies in that getContent method?

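You can verify this indirectly: just check the performance of your service API in the network tab of your web browser's debugger, or measure the time of a simple wget, for example time wget YOUR_URL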

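Like Andreas, I doubt that the parse method is the root of all evil. It actually is not. If you look closely at the gist, you can see that the parse method accepts a delegate reader, and that reader actually uses the underlying input stream "connected" to the remote host. I/O is usually a very time-consuming operation, especially over the network. In addition, establishing an HTTP connection is an expensive thing here. On my machine I got the following average timings: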

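  • making the HTTP request: about 1.50..2.00s for the first request and 0.50..1.00s for consecutive requests;
  • reading the data: ~0.80s (either a dumb read to the end or JSON parsing, it does not really matter because Gson is really very fast; you can also check this with the network debugger in your browser, or with time wget URL if you use a Unix terminal).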
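One way to see where the time goes is to drain the response into memory first and only then hand it to the parser, timing the two steps separately. The following is a minimal sketch and not part of the original answer: `SlowReader` is a hypothetical stand-in that simulates network latency with `Thread.sleep`, and `fakeParse` is a placeholder for the real Gson call, so the example runs without any dependencies.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class ReadVersusParseTiming {

    // Hypothetical stand-in for a network-backed reader: every read() call sleeps first
    static final class SlowReader extends Reader {
        private final Reader delegate;
        private final long delayMillis;

        SlowReader(final Reader delegate, final long delayMillis) {
            this.delegate = delegate;
            this.delayMillis = delayMillis;
        }

        @Override
        public int read(final char[] buffer, final int offset, final int length) throws IOException {
            try {
                Thread.sleep(delayMillis);
            } catch (final InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IOException(ex);
            }
            return delegate.read(buffer, offset, length);
        }

        @Override
        public void close() throws IOException {
            delegate.close();
        }
    }

    // Drains the reader completely, so all I/O waiting happens here and not inside the parser
    static String readFully(final Reader reader) throws IOException {
        final StringBuilder builder = new StringBuilder();
        final char[] buffer = new char[8192];
        int read;
        while ((read = reader.read(buffer)) != -1) {
            builder.append(buffer, 0, read);
        }
        return builder.toString();
    }

    // Placeholder for the real parsing step (assumption: in the real code this would be the Gson call)
    static int fakeParse(final String json) {
        return json.length();
    }

    public static void main(final String... args) throws IOException {
        final String payload = "{\"id\":1}";
        final long readStart = System.nanoTime();
        final String body;
        try (Reader reader = new SlowReader(new StringReader(payload), 50)) {
            body = readFully(reader);
        }
        final long readMillis = (System.nanoTime() - readStart) / 1_000_000;
        final long parseStart = System.nanoTime();
        fakeParse(body);
        final long parseMillis = (System.nanoTime() - parseStart) / 1_000_000;
        System.out.printf("read: %dms, parse: %dms%n", readMillis, parseMillis);
    }
}
```

If the "read" figure dominates in a measurement like this against the real service, swapping the JSON parser is unlikely to help much.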

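Another point suggested by Andreas is to use multiple threads to run independent tasks in parallel. This can speed things up, but unfortunately it won't give you a dramatic improvement, because the service itself does not respond that fast.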

Executing SingleThreadedDemo...
Executing SingleThreadedDemo took 1063935ms = ~17:43
Executing MultiThreadedDemo...
Executing MultiThreadedDemo took 353044ms = ~5:53

Running the demos later gave the following results (about 3 times faster; I don't know the real reason for the earlier slowdown):

Executing SingleThreadedDemo...
Executing SingleThreadedDemo took 382249ms = ~6:22
Executing MultiThreadedDemo...
Executing MultiThreadedDemo took 130502ms = ~2:11
Executing MultiThreadedDemo...
Executing MultiThreadedDemo took 110119ms = ~1:50
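The relative gain from multithreading can be read off the timings above by dividing the single-threaded time by the multi-threaded one. The sketch below does only that arithmetic; the class and method names are made up for illustration, and the millisecond values are copied from the quoted output.

```java
import java.util.Locale;

final class SpeedupCalculator {

    // Ratio of the single-threaded time to the multi-threaded time
    static double speedup(final long singleThreadedMillis, final long multiThreadedMillis) {
        return (double) singleThreadedMillis / multiThreadedMillis;
    }

    public static void main(final String... args) {
        // Timings copied from the two runs quoted above
        System.out.printf(Locale.ROOT, "first run: %.2fx%n", speedup(1_063_935, 353_044));
        System.out.printf(Locale.ROOT, "second run: %.2fx%n", speedup(382_249, 130_502));
        System.out.printf(Locale.ROOT, "second run, last demo: %.2fx%n", speedup(382_249, 110_119));
    }
}
```

All three ratios come out at roughly 3x to 3.5x, which matches the point that threads help but the service response time still dominates.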

AbstractDemo.java

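The class below violates some good OOP design ideas, but to keep the total number of classes from bloating, all of the shared helpers simply live here.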

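import static java.lang.System.currentTimeMillis;
import static java.lang.System.err;
import static java.lang.System.out;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.function.Function;

import com.google.gson.Gson;
import com.google.gson.JsonElement;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;

abstract class AbstractDemo
        implements Callable<List<EnvoyData>> {

    // Gson is thread-safe
    private static final Gson gson = new Gson();

    // JsonParser is thread-safe: https://groups.google.com/forum/#!topic/google-gson/u6hq2OVpszc
    private static final JsonParser jsonParser = new JsonParser();

    interface IPointsAndYearbooksConsumer {

        void acceptPointsAndYearbooks(SerializedDataPoints points, SerializedDataYears yearbooks);

    }

    interface ITripsConsumer {

        void acceptTrips(SerializedDataTrips trips);

    }

    AbstractDemo() {
    }

    protected abstract List<EnvoyData> doCall()
            throws Exception;

    // This implementation measures the time (in milliseconds) taken by each demo call
    @Override
    public final List<EnvoyData> call()
            throws Exception {
        final String name = getClass().getSimpleName();
        final long start = currentTimeMillis();
        try {
            out.printf("Executing %s...\n", name);
            final List<EnvoyData> result = doCall();
            out.printf("Executing %s took %dms\n", name, currentTimeMillis() - start);
            return result;
        } catch ( final Exception ex ) {
            err.printf("Executing %s took %dms\n", name, currentTimeMillis() - start);
            throw ex;
        }
    }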

    // This is a generic method that encapsulates pagination and lets you iterate over the service pages in a for-each manner
    static Iterable<JsonElement> jsonRequestsAt(final URL startUrl, final Function<? super JsonObject, URL> nextLinkExtractor, final JsonParser jsonParser) {
        return () -> new Iterator<JsonElement>() {
            private URL nextUrl = startUrl;

            @Override
            public boolean hasNext() {
                return nextUrl != null;
            }

            @Override
            public JsonElement next() {
                if ( nextUrl == null ) {
                    throw new NoSuchElementException();
                }
                try ( final Reader reader = readFrom(nextUrl) ) {
                    final JsonElement root = jsonParser.parse(reader);
                    nextUrl = nextLinkExtractor.apply(root.getAsJsonObject());
                    return root;
                } catch ( final IOException ex ) {
                    throw new RuntimeException(ex);
                }
            }
        };
    }

    // Just a helper method to iterate over the pages, starting from the first response
    static Iterable<JsonElement> getAfterwords()
            throws MalformedURLException {
        return jsonRequestsAt(
                afterwordsUrl(),
                root -> {
                    try {
                        final JsonElement next = root.get("Links").getAsJsonObject().get("next");
                        return next != null ? new URL(next.getAsString()) : null;
                    } catch ( final MalformedURLException ex ) {
                        throw new RuntimeException(ex);
                    }
                },
                jsonParser
        );
    }

    // Just extract points and yearbooks.
    // You could return a custom data holder class, but this one passes the results to its consumer parameter instead
    static void extractPointsAndYearbooks(final Reader reader, final IPointsAndYearbooksConsumer consumer) {
        final JsonObject expensesJsonObject = jsonParser.parse(reader)
                .getAsJsonObject()
                .get("layers")
                .getAsJsonObject()
                .get("wydatki")
                .getAsJsonObject();
        final SerializedDataPoints points = gson.fromJson(expensesJsonObject.get("punkty").getAsJsonArray(), SerializedDataPoints.class);
        final SerializedDataYears yearbooks = gson.fromJson(expensesJsonObject.get("roczniki").getAsJsonArray(), SerializedDataYears.class);
        consumer.acceptPointsAndYearbooks(points, yearbooks);
    }

    // The same as above but for another type of response
    static void extractTrips(final Reader reader, final ITripsConsumer consumer) {
        final JsonElement tripsJsonElement = jsonParser.parse(reader)
                .getAsJsonObject()
                .get("layers")
                .getAsJsonObject()
                .get("wyjazdy");
        final SerializedDataTrips trips = tripsJsonElement.isJsonArray()
                ? gson.fromJson(tripsJsonElement.getAsJsonArray(), SerializedDataTrips.class)
                : null;
        consumer.acceptTrips(trips);
    }

    // It might be a constant field, but the next methods are dynamic (parameter-dependent), so let them all be similar
    // Checked exceptions are not that evil, and let the call-site decide what to do with them
    static URL afterwordsUrl()
            throws MalformedURLException {
        return new URL("https://api-v3.mojepanstwo.pl/dane/poslowie.json");
    }

    // The same as above
    static URL afterwordsUrl(final int page)
            throws MalformedURLException {
        return new URL("https://api-v3.mojepanstwo.pl/dane/poslowie.json?_type=objects&page=" + page);
    }

    // The same as above
    static URL tripsUrl(final int envoyId)
            throws MalformedURLException {
        return new URL("https://api-v3.mojepanstwo.pl/dane/poslowie/" + envoyId + ".json?layers[]=wyjazdy");
    }

    // The same as above
    static URL expensesUrl(final int envoyId)
            throws MalformedURLException {
        return new URL("https://api-v3.mojepanstwo.pl/dane/poslowie/" + envoyId + ".json?layers[]=wydatki");
    }

    // Since jsonParser is encapsulated
    static JsonElement parseJsonElement(final Reader reader) {
        return jsonParser.parse(reader);
    }

    // A helper method to return a reader for the given URL
    static Reader readFrom(final URL url)
            throws IOException {
        final HttpURLConnection request = (HttpURLConnection) url.openConnection();
        request.connect();
        return new BufferedReader(new InputStreamReader((InputStream) request.getContent()));
    }

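    // Waits for all futures used in the multi-threaded demos
    // Not sure how good this method is since I'm not an expert in concurrent programming unfortunately
    // Running tasks may append new futures while we wait, and an iterator that has already reached
    // the tail of a ConcurrentLinkedQueue does not see them, so keep re-iterating until nothing is left
    static void waitForAllFutures(final Iterable<? extends Future<?>> futures)
            throws ExecutionException, InterruptedException {
        Iterator<? extends Future<?>> iterator = futures.iterator();
        while ( iterator.hasNext() ) {
            while ( iterator.hasNext() ) {
                final Future<?> future = iterator.next();
                future.get();
                iterator.remove();
            }
            iterator = futures.iterator();
        }
    }

}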
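The pagination idea behind jsonRequestsAt can be tried out without any network access. The following is a simplified sketch of the same "follow the next link" iterator: the `Page` type, the string keys, and the in-memory map standing in for the service are all hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.function.Function;

final class PagingSketch {

    // Hypothetical page type: some items plus the key of the next page (null on the last page)
    static final class Page {
        final List<String> items;
        final String next;

        Page(final List<String> items, final String next) {
            this.items = items;
            this.next = next;
        }
    }

    // Follows "next" links lazily: a page is fetched only when the loop asks for it
    static Iterable<Page> pagesFrom(final String startKey, final Function<String, Page> fetch) {
        return () -> new Iterator<Page>() {
            private String nextKey = startKey;

            @Override
            public boolean hasNext() {
                return nextKey != null;
            }

            @Override
            public Page next() {
                if (nextKey == null) {
                    throw new NoSuchElementException();
                }
                final Page page = fetch.apply(nextKey);
                nextKey = page.next;
                return page;
            }
        };
    }

    // Walks all pages in for-each style and gathers their items
    static List<String> collectAll(final String startKey, final Function<String, Page> fetch) {
        final List<String> all = new ArrayList<>();
        for (final Page page : pagesFrom(startKey, fetch)) {
            all.addAll(page.items);
        }
        return all;
    }

    public static void main(final String... args) {
        final Map<String, Page> fakeService = new HashMap<>();
        fakeService.put("page1", new Page(Arrays.asList("a", "b"), "page2"));
        fakeService.put("page2", new Page(Arrays.asList("c"), null));
        System.out.println(collectAll("page1", fakeService::get)); // prints [a, b, c]
    }
}
```

Because each page is only known after the previous one has been fetched, this style of iteration is inherently sequential, which is why page fetching stays on one thread unless the page count is known up front.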

SingleThreadedDemo.java

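The simplest demo. The entire data pull is executed in a single thread, so it tends to be the slowest demo here. It is fully thread-safe, has no fields, and can be declared as a singleton.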

import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

import com.google.gson.JsonArray;
import com.google.gson.JsonElement;

final class SingleThreadedDemo
        extends AbstractDemo {

    private static final Callable<List<EnvoyData>> singleThreadedDemo = new SingleThreadedDemo();

    private SingleThreadedDemo() {
    }

    static Callable<List<EnvoyData>> getSingleThreadedDemo() {
        return singleThreadedDemo;
    }

    @Override
    protected List<EnvoyData> doCall()
            throws IOException {
        final List<EnvoyData> envoys = new ArrayList<>();
        for ( final JsonElement afterwordJsonElement : getAfterwords() ) {
            final JsonArray dataObjectArray = afterwordJsonElement.getAsJsonObject().get("Dataobject").getAsJsonArray();
            for ( final JsonElement dataObjectElement : (Iterable<JsonElement>) dataObjectArray::iterator ) {
                final int envoyId = dataObjectElement.getAsJsonObject().get("id").getAsInt();
                // Two sequential HTTP requests per envoy: expenses first, then trips
                try ( final Reader expensesReader = readFrom(expensesUrl(envoyId)) ) {
                    extractPointsAndYearbooks(expensesReader, (points, yearbooks) -> {
                        // ... consume points and yearbooks here
                    });
                }
                try ( final Reader tripsReader = readFrom(tripsUrl(envoyId)) ) {
                    extractTrips(tripsReader, trips -> {
                        // ... consume trips here
                    });
                }
            }
        }
        return envoys;
    }

}

MultiThreadedDemo.java

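Unfortunately, I'm really weak at Java concurrency, and these multi-threaded demos can probably be improved significantly. This semi-multi-threaded demo uses two approaches:

  • one thread for iterating over the pages;
  • multiple threads for fetching the points, yearbooks, and trips data.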

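Also note that this demo (and the other multi-threaded demo below) is not fail-safe: if anything in a submitted task throws an exception, the executor service's background threads will not be terminated properly. So you might want to make it fail-safe and robust yourself.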
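One possible way to get the fail-safety mentioned above is to shut the pool down in a finally block, so the worker threads are stopped even when a task fails. This is only a sketch under that assumption and not the author's code: `runWithPool` and the `PoolTask` callback are names introduced here for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class SafeExecutorSketch {

    // Hypothetical callback type, introduced only for this sketch
    interface PoolTask<T> {
        T run(ExecutorService executorService) throws Exception;
    }

    // Runs the task with a fresh pool and always shuts the pool down, even if the task throws
    static <T> T runWithPool(final int threads, final PoolTask<T> task) throws Exception {
        final ExecutorService executorService = Executors.newFixedThreadPool(threads);
        try {
            return task.run(executorService);
        } finally {
            // shutdownNow() interrupts running workers and discards tasks that have not started yet
            executorService.shutdownNow();
            executorService.awaitTermination(10, TimeUnit.SECONDS);
        }
    }

    public static void main(final String... args) throws Exception {
        final Integer answer = runWithPool(2, pool -> pool.submit(() -> 21 * 2).get());
        System.out.println(answer); // prints 42
    }
}
```

Whether interrupting in-flight requests with shutdownNow() is acceptable depends on the application; a plain shutdown() followed by awaitTermination is the gentler alternative.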

import static java.util.Collections.synchronizedList;

import java.io.Reader;
import java.net.MalformedURLException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

import com.google.gson.JsonArray;
import com.google.gson.JsonElement;

final class MultiThreadedDemo
        extends AbstractDemo {

    private final ExecutorService executorService;

    private MultiThreadedDemo(final ExecutorService executorService) {
        this.executorService = executorService;
    }

    static Callable<List<EnvoyData>> getMultiThreadedDemo(final ExecutorService executorService) {
        return new MultiThreadedDemo(executorService);
    }

    @Override
    protected List<EnvoyData> doCall()
            throws InterruptedException, ExecutionException, MalformedURLException {
        final List<EnvoyData> envoys = synchronizedList(new ArrayList<>());
        final Collection<Future<?>> futures = new ConcurrentLinkedQueue<>();
        // Pages are still fetched one by one on the calling thread; only per-envoy requests run in the pool
        for ( final JsonElement afterwordJsonElement : getAfterwords() ) {
            final JsonArray dataObjectArray = afterwordJsonElement.getAsJsonObject().get("Dataobject").getAsJsonArray();
            for ( final JsonElement dataObjectElement : (Iterable<JsonElement>) dataObjectArray::iterator ) {
                final int envoyId = dataObjectElement.getAsJsonObject().get("id").getAsJsonPrimitive().getAsInt();
                submitExtractPointsAndYearbooks(futures, envoyId);
                submitExtractTrips(futures, envoyId);
            }
        }
        waitForAllFutures(futures);
        return envoys;
    }

    private void submitExtractPointsAndYearbooks(final Collection<? super Future<?>> futures, final int envoyId) {
        futures.add(executorService.submit(() -> {
            try ( final Reader expensesReader = readFrom(expensesUrl(envoyId)) ) {
                extractPointsAndYearbooks(expensesReader, (points, yearbooks) -> {
                    // ... consume points and yearbooks here
                });
                return null;
            }
        }));
    }

    private void submitExtractTrips(final Collection<? super Future<?>> futures, final int envoyId) {
        futures.add(executorService.submit(() -> {
            try ( final Reader tripsReader = readFrom(tripsUrl(envoyId)) ) {
                extractTrips(tripsReader, trips -> {
                    // ... consume trips here
                });
                return null;
            }
        }));
    }

}

MultiThreadedEstimatedPagesDemo.java

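This one is an enhanced version of the previous demo: it also submits the iteration over the service pages to the executor service as tasks. To do that, the number of pages has to be detected beforehand, and knowing the number of pages lets the https://...poslowie.json?...page=... URLs be processed in parallel. Note that if more than one page is found, the next iteration starts from page 2 rather than requesting the first page again.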

import static java.lang.Math.ceil;
import static java.util.Collections.synchronizedList;

import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

import com.google.gson.JsonArray;
import com.google.gson.JsonElement;
import com.google.gson.JsonIOException;
import com.google.gson.JsonObject;
import com.google.gson.JsonSyntaxException;

final class MultiThreadedEstimatedPagesDemo
        extends AbstractDemo {

    private final ExecutorService executorService;

    private MultiThreadedEstimatedPagesDemo(final ExecutorService executorService) {
        this.executorService = executorService;
    }

    static Callable<List<EnvoyData>> getMultiThreadedEstimatedPagesDemo(final ExecutorService executorService) {
        return new MultiThreadedEstimatedPagesDemo(executorService);
    }

    @Override
    protected List<EnvoyData> doCall()
            throws IOException, JsonIOException, JsonSyntaxException, InterruptedException, ExecutionException {
        final List<EnvoyData> envoys = synchronizedList(new ArrayList<>());
        final JsonObject page1RootJsonObject;
        final int totalPages;
        // The first page is fetched synchronously because it tells us how many pages there are
        try ( final Reader page1Reader = readFrom(afterwordsUrl()) ) {
            page1RootJsonObject = parseJsonElement(page1Reader).getAsJsonObject();
            totalPages = estimateTotalPages(page1RootJsonObject);
        }
        final Collection<Future<?>> futures = new ConcurrentLinkedQueue<>();
        // Page 1 is already in memory, so it is processed without another request
        futures.add(executorService.submit(() -> {
            final JsonArray dataObjectArray = page1RootJsonObject.getAsJsonObject().get("Dataobject").getAsJsonArray();
            for ( final JsonElement dataObjectElement : (Iterable<JsonElement>) dataObjectArray::iterator ) {
                final int envoyId = dataObjectElement.getAsJsonObject().get("id").getAsInt();
                submitExtractPointsAndYearbooks(futures, envoyId);
                submitExtractTrips(futures, envoyId);
            }
            return null;
        }));
        // The remaining pages are requested in parallel, starting from page 2
        for ( int page = 2; page <= totalPages; page++ ) {
            final int finalPage = page;
            futures.add(executorService.submit(() -> {
                try ( final Reader reader = readFrom(afterwordsUrl(finalPage)) ) {
                    final JsonElement afterwordJsonElement = parseJsonElement(reader);
                    final JsonArray dataObjectArray = afterwordJsonElement.getAsJsonObject().get("Dataobject").getAsJsonArray();
                    for ( final JsonElement dataObjectElement : (Iterable<JsonElement>) dataObjectArray::iterator ) {
                        final int envoyId = dataObjectElement.getAsJsonObject().get("id").getAsInt();
                        submitExtractPointsAndYearbooks(futures, envoyId);
                        submitExtractTrips(futures, envoyId);
                    }
                }
                return null;
            }));
        }
        waitForAllFutures(futures);
        return envoys;
    }

    // Assumes that every page except possibly the last one has as many elements as the first page
    private static int estimateTotalPages(final JsonObject rootJsonObject) {
        final int elementsPerPage = rootJsonObject.get("Dataobject").getAsJsonArray().size();
        final int totalElements = rootJsonObject.get("Count").getAsInt();
        return (int) ceil((double) totalElements / elementsPerPage);
    }

    private void submitExtractPointsAndYearbooks(final Collection<? super Future<?>> futures, final int envoyId) {
        futures.add(executorService.submit(() -> {
            try ( final Reader expensesReader = readFrom(expensesUrl(envoyId)) ) {
                extractPointsAndYearbooks(expensesReader, (points, yearbooks) -> {
                    // ... consume points and yearbooks here
                });
                return null;
            }
        }));
    }

    private void submitExtractTrips(final Collection<? super Future<?>> futures, final int envoyId) {
        futures.add(executorService.submit(() -> {
            try ( final Reader tripsReader = readFrom(tripsUrl(envoyId)) ) {
                extractTrips(tripsReader, trips -> {
                    // ... consume trips here
                });
                return null;
            }
        }));
    }

}
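The page estimate used in this demo is a ceiling division of the total element count by the size of the first page. The sketch below isolates that arithmetic using integer math only; the class name and the sample numbers (460 elements, 50 per page) are made up for illustration and are not taken from the real service.

```java
final class PageCountSketch {

    // Ceiling division: a partially filled last page still counts as a page
    static int estimateTotalPages(final int totalElements, final int elementsPerPage) {
        if (elementsPerPage <= 0) {
            // An empty first page says nothing about the page size, so refuse to guess
            throw new IllegalArgumentException("elementsPerPage must be positive");
        }
        return (totalElements + elementsPerPage - 1) / elementsPerPage;
    }

    public static void main(final String... args) {
        // Hypothetical numbers: 460 elements at 50 per page need 10 pages
        final int totalPages = estimateTotalPages(460, 50);
        System.out.println("total pages: " + totalPages);
        // Page 1 is already loaded, so only pages 2..totalPages still have to be requested
        for (int page = 2; page <= totalPages; page++) {
            System.out.println("would request page " + page);
        }
    }
}
```

The explicit check matters because the floating-point version divides by zero silently: with an empty first page and a positive count it would yield Integer.MAX_VALUE pages.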

Test.java

And the demo runner itself:

import static java.lang.Runtime.getRuntime;
import static java.util.concurrent.Executors.newFixedThreadPool;

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;

public final class Test {

    private Test() {
    }

    public static void main(final String... args)
            throws Exception {
        runSingleThreadedDemo();
        runMultiThreadedDemo();
        runMultiThreadedEstimatedPagesDemo();
    }

    private static void runSingleThreadedDemo()
            throws Exception {
        final Callable<?> singleThreadedDemo = SingleThreadedDemo.getSingleThreadedDemo();
        singleThreadedDemo.call();
    }

    private static void runMultiThreadedDemo()
            throws Exception {
        final ExecutorService executorService = newFixedThreadPool(getRuntime().availableProcessors());
        final Callable<?> demo = MultiThreadedDemo.getMultiThreadedDemo(executorService);
        demo.call();
        executorService.shutdown();
    }

    private static void runMultiThreadedEstimatedPagesDemo()
            throws Exception {
        final ExecutorService executorService = newFixedThreadPool(getRuntime().availableProcessors());
        final Callable<?> demo = MultiThreadedEstimatedPagesDemo.getMultiThreadedEstimatedPagesDemo(executorService);
        demo.call();
        executorService.shutdown();
    }

}
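Both multi-threaded runs size the pool with availableProcessors(), which is a natural choice for CPU-bound work. Since the tasks here mostly wait on the network, a larger pool may shorten the total time further, as long as the remote service tolerates the extra concurrent requests. The sketch below illustrates the effect with simulated requests that only sleep; the class name, the pool sizes, and the delays are assumptions made for the illustration, not measurements against the real API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PoolSizeSketch {

    // Runs `tasks` simulated requests (each one just sleeps) on a pool of `threads` and returns the elapsed milliseconds
    static long elapsedMillis(final int threads, final int tasks, final long sleepMillis) throws Exception {
        final ExecutorService executorService = Executors.newFixedThreadPool(threads);
        try {
            final List<Callable<Void>> requests = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                requests.add(() -> {
                    Thread.sleep(sleepMillis); // stands in for waiting on the network
                    return null;
                });
            }
            final long start = System.nanoTime();
            // invokeAll blocks until every task has completed; get() rethrows any task failure
            for (final Future<Void> future : executorService.invokeAll(requests)) {
                future.get();
            }
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            executorService.shutdown();
        }
    }

    public static void main(final String... args) throws Exception {
        System.out.println("2 threads: " + elapsedMillis(2, 8, 100) + "ms");
        System.out.println("8 threads: " + elapsedMillis(8, 8, 100) + "ms");
    }
}
```

With eight sleeping tasks, the two-thread pool needs four rounds while the eight-thread pool needs one, so the second figure should come out clearly smaller.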

Regarding "java - JsonParser's parse method slows my code down dramatically", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41496526/
