java - Pig embedded in Java: PigServer in local mode, no error message but MapReduce does not launch (Maven?)

Reposted by: 行者123. Updated: 2023-12-02 21:46:51

I am trying to run my Pig script with PigServer because I need to use "while" and "if" in the script, and Java can provide that control flow.
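As a rough illustration of that idea, Java loops and conditions can build the Pig Latin statements as plain strings before they are handed to PigServer. This is only a sketch: the class `PigStatementBuilder`, the method `buildStatements`, the `dropNullKeys` flag and the aliases `in_0`, `in_1`, `donnees_filtre` are hypothetical names, and file paths are assumed not to contain single quotes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: uses Java control flow to assemble Pig Latin statements. */
final class PigStatementBuilder {

    /**
     * Builds LOAD statements for the given files, a UNION when there are several,
     * an optional FILTER, and a final GROUP on the first column.
     */
    static List<String> buildStatements(List<String> inputFiles, boolean dropNullKeys) {
        if (inputFiles.isEmpty()) {
            throw new IllegalArgumentException("at least one input file is required");
        }
        List<String> statements = new ArrayList<>();
        String current = "donnees_fait";

        if (inputFiles.size() == 1) {
            // "if": a single input is loaded directly under the final alias.
            statements.add(current + " = LOAD '" + inputFiles.get(0) + "' USING PigStorage(';');");
        } else {
            // "while": one LOAD per input file, then a UNION of all aliases.
            List<String> aliases = new ArrayList<>();
            int i = 0;
            while (i < inputFiles.size()) {
                String alias = "in_" + i;
                statements.add(alias + " = LOAD '" + inputFiles.get(i) + "' USING PigStorage(';');");
                aliases.add(alias);
                i++;
            }
            statements.add(current + " = UNION " + String.join(", ", aliases) + ";");
        }

        if (dropNullKeys) {
            statements.add("donnees_filtre = FILTER " + current + " BY $0 IS NOT NULL;");
            current = "donnees_filtre";
        }
        statements.add("donnees_group = GROUP " + current + " BY $0;");
        return statements;
    }
}
```

Each returned string would then be passed to `pigServer.registerQuery(...)` in order, followed by a `pigServer.store(...)` call on the `donnees_group` alias.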

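The difficulty is that my main method runs but nothing happens (apart from the System.out.println output), and I don't know why MapReduce does not start. The program ends without any error.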

I think the problem is in my pom: I suspect I have not declared all the dependencies that are needed.

Here is my pom.xml:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.blablabla</groupId>
  <artifactId>testPigServer</artifactId>
  <version>0.0.1-SNAPSHOT</version>

  <dependencies>

    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.2.0</version>
    </dependency>

    <dependency>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
      <version>1.2.16</version>
    </dependency>

    <dependency>
      <groupId>org.apache.pig</groupId>
      <artifactId>pig</artifactId>
      <version>0.12.1</version>
    </dependency>

    <dependency>
      <groupId>org.antlr</groupId>
      <artifactId>antlr-runtime</artifactId>
      <version>3.4</version>
    </dependency>

  </dependencies>
</project>

Here is my main class:
import java.io.IOException;

import org.apache.pig.ExecType;
import org.apache.pig.PigServer;
import org.apache.pig.backend.executionengine.ExecException;

public class MainPigServer {

    /**
     * Runs the CSV query through a PigServer in local mode.
     *
     * @param args unused
     * @throws ExecException if the PigServer cannot be created
     * @throws IOException   if the PigServer cannot be created
     */
    public static void main(String[] args) throws ExecException, IOException {
        System.out.println("Hello");
        PigServer pigServer = new PigServer(ExecType.LOCAL);
        try {
            String inputFile = "/home/cloudera/jeuxEtudiants/data/parents.csv";
            String outPut = "/home/cloudera/jeuxEtudiants/resultat_PigServer_9";
            queryCSV(pigServer, inputFile, outPut);
            // queryJson(pigServer, inputFile, inputRef, outPut);
        } catch (ExecException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // Always release the PigServer, even if the query failed.
            pigServer.shutdown();
            System.out.println("Finally");
        }
    }

    /** Loads a ';'-separated file, groups it by the first column and stores the result with '|' as separator. */
    public static void queryCSV(PigServer pigServer, String inputFile, String outPut) throws IOException {
        System.out.println("dans queryCSV");
        pigServer.registerQuery("donnees_fait = LOAD '" + inputFile + "' USING PigStorage(';') ;");
        pigServer.registerQuery("donnees_group = GROUP donnees_fait by $0 ;");
        pigServer.store("donnees_group", outPut, "PigStorage('|')");
        System.out.println("fin queryCSV");
    }

    /** Loads the same file with a schema and stores a nested projection using JsonStorage (inputRef is not used). */
    public static void queryJson(PigServer pigServer, String inputFile, String inputRef, String outPut) {
        System.out.println("dans queryJson");
        try {
            pigServer.registerQuery("donnees_fait = LOAD '" + inputFile + "' USING PigStorage(';') AS(id,nom,prenom);");
            pigServer.registerQuery("ligne_finale = FOREACH donnees_fait GENERATE id AS Description, (nom,prenom) AS Test:(nom,prenom);");
            pigServer.store("ligne_finale", outPut, "JsonStorage");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

}

When I run main, I get:
Hello
log4j:WARN No appenders could be found for logger (org.apache.pig.impl.util.PropertiesUtil).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
dans queryCSV
fin queryCSV
Finally

I don't know what is going wrong.
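Since the run prints no error, one way to confirm whether the store step wrote anything is to inspect the output path afterwards. The following is a minimal sketch using only the JDK; the class `PigOutputCheck` and the method `listPartFiles` are hypothetical names, and it assumes the usual Hadoop convention that a successful store creates a directory containing `part-*` files.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical diagnostic: lists the part files a store call is assumed to produce. */
final class PigOutputCheck {

    /** Returns the sorted names of the "part-*" files directly under outputDir (empty if none). */
    static List<String> listPartFiles(Path outputDir) throws IOException {
        List<String> names = new ArrayList<>();
        if (!Files.isDirectory(outputDir)) {
            return names; // the output directory was never created
        }
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(outputDir, "part-*")) {
            for (Path p : stream) {
                names.add(p.getFileName().toString());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the output path used in the question.
        Path out = Paths.get(args.length > 0 ? args[0] : "/home/cloudera/jeuxEtudiants/resultat_PigServer_9");
        List<String> parts = listPartFiles(out);
        System.out.println(parts.isEmpty() ? "No part files under " + out : "Found " + parts);
    }
}
```

An empty result after the program prints "fin queryCSV" would suggest that the job never actually ran, rather than that it ran and wrote its output somewhere else.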

What's more, I tried running a script in grunt and it works.

Thanks for reading.

Angelik

Best answer

In the end, I found the solution:

In the file settings.xml (you can find it at ???/.m2/settings.xml; you may have to create it), put:

<?xml version="1.0" encoding="UTF-8"?>
<settings>
  <profiles>
    <profile>
      <id>standard-extra-repos</id>
      <activation>
        <activeByDefault>true</activeByDefault>
      </activation>
      <repositories>
        <repository>
          <!-- Central Repository -->
          <id>central</id>
          <url>http://repo1.maven.org/maven2/</url>
          <releases>
            <enabled>true</enabled>
          </releases>
          <snapshots>
            <enabled>true</enabled>
          </snapshots>
        </repository>
        <repository>
          <!-- Cloudera Repository -->
          <id>cloudera</id>
          <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
          <releases>
            <enabled>true</enabled>
          </releases>
          <snapshots>
            <enabled>true</enabled>
          </snapshots>
        </repository>
      </repositories>
    </profile>
  </profiles>
</settings>

In the pom:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.businessdecision</groupId>
  <artifactId>testPigServer</artifactId>
  <version>0.0.1-SNAPSHOT</version>

  <repositories>
    <repository>
      <id>cloudera</id>
      <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    </repository>
  </repositories>

  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>2.0.0-cdh4.5.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
      <version>2.0.0-mr1-cdh4.5.0</version>
    </dependency>
    <dependency>
      <groupId>joda-time</groupId>
      <artifactId>joda-time</artifactId>
      <version>2.3</version>
    </dependency>
    <dependency>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
      <version>1.2.17</version>
    </dependency>
    <dependency>
      <groupId>jline</groupId>
      <artifactId>jline</artifactId>
      <version>0.9.5</version>
    </dependency>
    <dependency>
      <groupId>org.antlr</groupId>
      <artifactId>antlr-runtime</artifactId>
      <version>3.5.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.pig</groupId>
      <artifactId>pig</artifactId>
      <version>0.11.0-cdh4.5.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.pig</groupId>
      <artifactId>pigunit</artifactId>
      <version>0.11.0-cdh4.5.0</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.11</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>

I still don't know what the real problem was, but now it works. Maybe I needed more dependencies.

Angelik
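Because the answer leaves open which dependency was actually missing, a quick check of the runtime classpath can help narrow it down. The sketch below uses only the JDK; the class `ClasspathCheck` and the method `findMissing` are hypothetical names, and the class names listed in `main` are merely one representative class per artifact in the pom above, not a complete list of what Pig needs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical diagnostic: reports which expected classes cannot be loaded. */
final class ClasspathCheck {

    /** Returns the class names from the given list that cannot be found on the classpath. */
    static List<String> findMissing(List<String> classNames) {
        List<String> missing = new ArrayList<>();
        for (String name : classNames) {
            try {
                // initialize=false: only locate the class, do not run its static initializers.
                Class.forName(name, false, ClasspathCheck.class.getClassLoader());
            } catch (ClassNotFoundException | LinkageError e) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: one well-known class per dependency declared in the pom.
        List<String> expected = Arrays.asList(
                "org.apache.pig.PigServer",
                "org.apache.hadoop.conf.Configuration",
                "org.antlr.runtime.CharStream",
                "org.joda.time.DateTime",
                "jline.ConsoleReader",
                "org.apache.log4j.Logger");
        System.out.println("Missing classes: " + findMissing(expected));
    }
}
```

Running `main` with the same classpath as the Pig program prints the classes that could not be located, which points at the artifact to add to the pom.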

Regarding "java - Pig embedded in Java: PigServer in local mode, no error message but MapReduce does not launch (Maven?)", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24430998/
