
java - How to update the "Practical Graph Analytics with Apache Giraph" examples to run on the current Cloudera Quickstart VM

Reposted. Author: 可可西里. Updated: 2023-11-01 15:52:33

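I am new to Hadoop/Giraph and Java. As part of an assignment, I downloaded the Cloudera Quickstart VM and installed Giraph on it. I am using the book "Practical Graph Analytics with Apache Giraph" by Roman Shaposhnik, Claudio Martella, and Dionysios Logothetis, and I am trying to run the first example from it, on page 111 (Twitter Followership Graph).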

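Edit: Apparently, the examples in the book (published in 2015) depend on a Hadoop version much older than the one shipped with the current (2017) Cloudera Quickstart VM. How can I get the examples to run?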

Original post:

Running the GiraphHelloWorld.java program:

import org.apache.giraph.edge.Edge;
import org.apache.giraph.GiraphRunner;
import org.apache.giraph.graph.BasicComputation;
import org.apache.giraph.graph.Vertex;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.util.ToolRunner;

// Giraph applications are custom classes that typically use
// BasicComputation class for all their defaults... except for
// the compute method that has to be defined

public class GiraphHelloWorld extends
        BasicComputation<IntWritable, IntWritable,
                NullWritable, NullWritable> {
    @Override
    public void compute(Vertex<IntWritable, IntWritable, NullWritable> vertex,
                        Iterable<NullWritable> messages) {
        System.out.print("Hello world from the: " + vertex.getId().toString() + " who is following:");

        // iterating over vertex's neighbors
        for (Edge<IntWritable, NullWritable> e : vertex.getEdges()) {
            System.out.print(" " + e.getTargetVertexId());
        }
        System.out.println("");

        // signaling the end of the current BSP computation for the current vertex
        vertex.voteToHalt();
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new GiraphRunner(), args));
    }
}

I ran the following commands in the terminal to execute the program:

export HADOOP_HOME=/usr/lib/hadoop
export GIRAPH_HOME=/usr/local/giraph
export HADOOP_CONF_DIR=$GIRAPH_HOME/conf
PATH=$HADOOP_HOME/bin:$GIRAPH_HOME/bin:$PATH

giraph target/book-examples-1.0.0-jar-with-dependencies.jar GiraphHelloWorld -vip /home/cloudera/src/main/resources/1 -vif org.apache.giraph.io.formats.IntIntNullTextInputFormat -w 1 -ca giraph.SplitMasterWorker=false,giraph.logLevel=error

This resulted in the following error:

rker=false,giraph.logLevel=error
No lib directory, assuming dev environment
HADOOP_CONF_DIR=/usr/local/giraph/conf
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/cloudera/workspace/first/target/book-examples-1.0.0-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2017-12-08 16:46:24,917 INFO [main] utils.ConfigurationUtils (ConfigurationUtils.java:populateGiraphConfiguration(336)) - No edge input format specified. Ensure your InputFormat does not require one.
2017-12-08 16:46:24,926 INFO [main] utils.ConfigurationUtils (ConfigurationUtils.java:populateGiraphConfiguration(346)) - No vertex output format specified. Ensure your OutputFormat does not require one.
2017-12-08 16:46:24,926 INFO [main] utils.ConfigurationUtils (ConfigurationUtils.java:populateGiraphConfiguration(361)) - No edge output format specified. Ensure your OutputFormat does not require one.
2017-12-08 16:46:24,957 INFO [main] utils.ConfigurationUtils (ConfigurationUtils.java:populateGiraphConfiguration(402)) - Setting custom argument [giraph.SplitMasterWorker] to [false] in GiraphConfiguration
2017-12-08 16:46:24,957 INFO [main] utils.ConfigurationUtils (ConfigurationUtils.java:populateGiraphConfiguration(402)) - Setting custom argument [giraph.logLevel] to [error] in GiraphConfiguration
2017-12-08 16:46:25,329 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1175)) - mapreduce.job.counters.limit is deprecated. Instead, use mapreduce.job.counters.max
2017-12-08 16:46:25,330 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1175)) - mapred.job.map.memory.mb is deprecated. Instead, use mapreduce.map.memory.mb
2017-12-08 16:46:25,330 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1175)) - mapred.job.reduce.memory.mb is deprecated. Instead, use mapreduce.reduce.memory.mb
2017-12-08 16:46:25,330 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1175)) - mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
2017-12-08 16:46:25,332 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1175)) - mapreduce.user.classpath.first is deprecated. Instead, use mapreduce.job.user.classpath.first
2017-12-08 16:46:25,332 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1175)) - mapred.map.max.attempts is deprecated. Instead, use mapreduce.map.maxattempts
2017-12-08 16:46:25,336 INFO [main] job.GiraphJob (GiraphJob.java:run(226)) - run: Since checkpointing is disabled (default), do not allow any task retries (setting mapred.map.max.attempts = 0, old value = 4)
2017-12-08 16:46:25,339 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1175)) - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2017-12-08 16:46:25,401 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1175)) - session.id is deprecated. Instead, use dfs.metrics.session-id
2017-12-08 16:46:25,405 INFO [main] jvm.JvmMetrics (JvmMetrics.java:init(76)) - Initializing JVM Metrics with processName=JobTracker, sessionId=
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at org.apache.giraph.bsp.BspOutputFormat.checkOutputSpecs(BspOutputFormat.java:43)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:270)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:143)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1307)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1304)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1304)
at org.apache.giraph.job.GiraphJob.run(GiraphJob.java:259)
at org.apache.giraph.GiraphRunner.run(GiraphRunner.java:94)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.giraph.GiraphRunner.main(GiraphRunner.java:124)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
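
An IncompatibleClassChangeError of the form "Found interface ..., but class was expected" usually indicates that code was compiled against a library version in which a type was a class, while the runtime classpath provides a version in which it is an interface. To my knowledge, org.apache.hadoop.mapreduce.JobContext changed from a class in Hadoop 1.x to an interface in Hadoop 2.x, which would fit the version mismatch described in the edit above. The following is a minimal diagnostic sketch, not part of the original post; the class name JobContextKindCheck and the method kindOf are hypothetical names chosen for illustration. It reports whether a given type is a class or an interface on whatever classpath it is run with.

```java
public class JobContextKindCheck {

    // Hypothetical helper: reports how a fully qualified type name resolves
    // on the current classpath, without running its static initializers.
    public static String kindOf(String className) {
        try {
            Class<?> type = Class.forName(
                    className, false, JobContextKindCheck.class.getClassLoader());
            return type.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException | LinkageError e) {
            // The type is absent, or one of its dependencies could not be linked.
            return "not loadable";
        }
    }

    public static void main(String[] args) {
        // Default to the type named in the stack trace; another name may be passed as an argument.
        String name = args.length > 0 ? args[0] : "org.apache.hadoop.mapreduce.JobContext";
        System.out.println(name + " -> " + kindOf(name));
    }
}
```

Run with the VM's Hadoop jars on the classpath, this should show which flavor of JobContext the cluster provides. If it prints "interface" while the Giraph build in use expects a class, the Giraph artifact was most likely built for an older Hadoop line.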

Maven pom.xml file:

<?xml version="1.0" encoding="UTF-8"?>
<project>
  <modelVersion>4.0.0</modelVersion>

  <groupId>giraph</groupId>
  <artifactId>book-examples</artifactId>
  <version>1.0.0</version>

  <dependencies>
    <dependency>
      <groupId>org.apache.giraph</groupId>
      <artifactId>giraph-core</artifactId>
      <version>1.1.0</version>
    </dependency>

    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.9.0</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>2.4</version>
        <executions>
          <execution>
            <id>create-jar-bundle</id>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
            <configuration>
              <descriptorRefs>
                <descriptorRef>jar-with-dependencies</descriptorRef>
              </descriptorRefs>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>

  <repositories>
    <repository>
      <id>cloudera</id>
      <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
      <releases>
        <enabled>true</enabled>
      </releases>
      <snapshots>
        <enabled>true</enabled>
      </snapshots>
    </repository>
  </repositories>

</project>

Let me know if anything else is needed. Thanks in advance for your help!

Best answer

The version problem was resolved once I created my own pom file with the dependencies that a Giraph project needs.


<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com</groupId>
  <artifactId>R4.giraphshortestpath</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>

  <name>R4.giraphshortestpath</name>
  <url>http://maven.apache.org</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <repositories>
    <repository>
      <id>cloudera</id>
      <name>cloudera repository</name>
      <url>https://repository.cloudera.com/content/repositories/releases/</url>
    </repository>
  </repositories>

  <dependencies>
    <dependency>
      <groupId>org.apache.giraph</groupId>
      <artifactId>giraph-parent</artifactId>
      <version>1.2.0-hadoop2</version>
      <type>pom</type>
    </dependency>

    <dependency>
      <groupId>org.apache.giraph</groupId>
      <artifactId>giraph-core</artifactId>
      <version>1.2.0-hadoop2</version>
    </dependency>

    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>

    <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>2.6.0-cdh5.12.0</version>
    </dependency>

    <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.6.0-mr1-cdh5.12.0</version>
    </dependency>

  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>2.4</version>
        <executions>
          <execution>
            <id>create-jar-bundle</id>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
            <configuration>
              <descriptorRefs>
                <descriptorRef>jar-with-dependencies</descriptorRef>
              </descriptorRefs>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

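
When a jar-with-dependencies bundle is run on a cluster that ships its own Hadoop jars, it can be unclear which copy of a class actually gets loaded. The sketch below is not from the original answer, and the class name WhichJar and the method locationOf are hypothetical. It uses only standard JDK calls to print the location each named class was loaded from, which may help confirm that the Hadoop and Giraph classes come from the versions declared in the pom.

```java
import java.security.CodeSource;

public class WhichJar {

    // Hypothetical helper: returns the jar or directory a class was loaded from.
    // Classes loaded by the bootstrap loader (for example java.lang.String)
    // have no code source, so a placeholder is returned for them.
    public static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null
                ? "(bootstrap or unknown)"
                : String.valueOf(source.getLocation());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Each argument is a fully qualified class name to look up.
        for (String name : args) {
            System.out.println(name + " <- " + locationOf(Class.forName(name)));
        }
    }
}
```

One way to use it, assuming the bundled jar and the cluster's Hadoop jars are both on the classpath, is to pass org.apache.hadoop.mapreduce.JobContext and org.apache.giraph.bsp.BspOutputFormat as arguments and compare the printed locations with the versions expected from the pom.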

Regarding java - How to update the "Practical Graph Analytics with Apache Giraph" examples to run on the current Cloudera Quickstart VM, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/47724275/
