sql-server - Accessing SQL Server from Hadoop

Reposted. Author: 可可西里. Updated: 2023-11-01 14:59:33

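What I'm trying to do: the program is meant to load regional sales data from SQL Server 2008 and run a simple statistical computation on MapReduce to get the total sales for each region. The error I get says the program cannot find the sqljdbc4.jar file, even though the file has been copied to the location specified in the code.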

  1. The code is as follows:

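    // fileName: MRExp.java
    package mrexp;

    import java.io.IOException;

    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.lib.db.DBConfiguration;
    import org.apache.hadoop.mapred.lib.db.DBInputFormat;

    public class MRExp {
        public static void main(String[] args) throws IOException {
            JobConf conf = new JobConf(MRExp.class);
            // Put the JDBC driver JAR (stored on HDFS) on the task classpath.
            DistributedCache.addFileToClassPath(new Path("/userX/sqljdbc4.jar"), conf);

            conf.setMapperClass(MRMapper.class);
            conf.setReducerClass(MRReducer.class);

            // Key/value types emitted by the mapper.
            conf.setMapOutputKeyClass(Text.class);
            conf.setMapOutputValueClass(LongWritable.class);

            // Key/value types written by the reducer.
            conf.setOutputKeyClass(LongWritable.class);
            conf.setOutputValueClass(Text.class);

            // Read input rows from the database; write results to the path in args[0].
            conf.setInputFormat(DBInputFormat.class);
            FileOutputFormat.setOutputPath(conf, new Path(args[0]));

            DBConfiguration.configureDB(
                    conf,
                    "com.microsoft.sqlserver.jdbc.SQLServerDriver",
                    "jdbc:sqlserver://MyDbServerAddr:1433;databaseName=ThisDb;integratedSecurity=true;",
                    "db_userName", "db_Pws");

            DBInputFormat.setInput(conf, InfoUnit.class,
                    "SELECT R_NAME,L_ORDERKEY from dbo.United10MB ;" /* inputQuery */,
                    "SELECT COUNT(L_ORDERKEY) from dbo.United10MB" /* inputCountQuery */);

            try {
                JobClient.runJob(conf);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }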

    // The definitions of MRMapper, MRReducer, and InfoUnit follow. InfoUnit implements Writable and DBWritable.

  2. File locations:

    [root@test MRExp]# pwd
    /root/MRExp
    [root@test MRExp]# ls
    classes hadoop-0.20.2-core.jar MRExp.java sqljdbc4.jar

  3. Then, to compile MRExp.java:

    [root@test MRExp]# javac -classpath hadoop-0.20.2-core.jar -d classes/ MRExp.java
    [root@test MRExp]# jar -cvf MRExp.jar -C classes/ .

I also copied sqljdbc4.jar to HDFS:

    [root@test MRExp]# hadoop dfs -copyFromLocal sqljdbc4.jar /userX

So we have:

    [root@test MRExp]# ls
    classes hadoop-0.20.2-core.jar MRExp.jar MRExp.java sqljdbc4.jar

  4. After the steps above, launch the MapReduce job:

    [root@test MRExp]# hadoop jar MRExp.jar mrexp.MRExp /userX/output

But the program reports:

17:02:50 WARN mapreduce.JobSubmitter: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/10/28 17:02:50 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/root/.staging/job_1350984913454_0009
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: com.microsoft.sqlserver.jdbc.SQLServerDriver
at org.apache.hadoop.mapreduce.lib.db.DBInputFormat.setConf(DBInputFormat.java:165)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:70)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130)
at org.apache.hadoop.mapred.JobConf.getInputFormat(JobConf.java:607)
at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:476)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:468)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:359)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1226)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1223)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1223)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:609)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:604)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:604)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:880)
at mrexp.MRExp.main(MRExp.java:70)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: com.microsoft.sqlserver.jdbc.SQLServerDriver
at org.apache.hadoop.mapreduce.lib.db.DBInputFormat.getConnection(DBInputFormat.java:191)
at org.apache.hadoop.mapreduce.lib.db.DBInputFormat.setConf(DBInputFormat.java:159)
... 25 more
Caused by: java.lang.ClassNotFoundException: com.microsoft.sqlserver.jdbc.SQLServerDriver
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:169)
at org.apache.hadoop.mapreduce.lib.db.DBConfiguration.getConnection(DBConfiguration.java:148)
at org.apache.hadoop.mapreduce.lib.db.DBInputFormat.getConnection(DBInputFormat.java:185)
... 26 more

Best answer

Include the sqljdbc4.jar JAR in the "-libjars" command-line option of the hadoop jar ... command.

See this post from Cloudera for more information.
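One detail worth keeping in mind: "-libjars" is a generic option that Hadoop's GenericOptionsParser consumes (the warning in the log above hints at this), and a driver that implements Tool then receives only the remaining arguments. The driver in the question reads args[0] directly, so if "-libjars" were simply added to the command line, the string "-libjars" itself would be taken as the output path. The sketch below illustrates that split in plain Java. It is a simplified stand-in, not Hadoop's parser, and the names LibJarsArgs and split are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LibJarsArgs {
    /** JAR paths named by -libjars (hypothetical holder, for illustration only). */
    public final List<String> libJars = new ArrayList<>();
    /** Arguments left over for the application itself. */
    public final List<String> remaining = new ArrayList<>();

    /** Separates a "-libjars a.jar,b.jar" pair from the other arguments. */
    public static LibJarsArgs split(String[] args) {
        LibJarsArgs out = new LibJarsArgs();
        for (int i = 0; i < args.length; i++) {
            if ("-libjars".equals(args[i]) && i + 1 < args.length) {
                // The option value is a comma-separated list of JAR paths.
                out.libJars.addAll(Arrays.asList(args[++i].split(",")));
            } else {
                out.remaining.add(args[i]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] cli = {"-libjars", "sqljdbc4.jar", "/userX/output"};
        LibJarsArgs parsed = split(cli);
        // What a driver sees if it reads the raw arguments.
        System.out.println("raw args[0]     = " + cli[0]);
        // What it sees once the generic option has been stripped.
        System.out.println("jars            = " + parsed.libJars);
        System.out.println("output path arg = " + parsed.remaining.get(0));
    }
}
```

With a driver that implements Tool and is started through ToolRunner, the invocation would look like hadoop jar MRExp.jar mrexp.MRExp -libjars sqljdbc4.jar /userX/output, with the generic option placed before the application's own arguments.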

Update:

Run the following:

    [root@test MRExp]# hadoop dfs -ls /userX

Copy the absolute path of sqljdbc4.jar in the file system from the output and put it into the following line:

    DistributedCache.addFileToClassPath(new Path("<Absolute Path>/sqljdbc4.jar"), conf);

This should resolve the problem.
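The stack trace in the question passes through JobSubmitter.writeSplits and Class.forName, which suggests the driver class is being looked up in the JVM that submits the job, not only inside the tasks. A quick way to check what that JVM can see is a small probe like the one below. This is a minimal sketch: the class name DriverCheck and the method isLoadable are hypothetical and not part of the original post.

```java
public class DriverCheck {
    /** Returns true if the named class can be loaded by this JVM. */
    public static boolean isLoadable(String className) {
        try {
            // Class.forName is the same kind of lookup that fails in the stack trace.
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing class, or a class that is present but fails to link or initialize.
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the SQL Server driver class named in the question.
        String driver = args.length > 0
                ? args[0]
                : "com.microsoft.sqlserver.jdbc.SQLServerDriver";
        String verdict = isLoadable(driver) ? "is loadable" : "is NOT loadable";
        System.out.println(driver + " " + verdict);
    }
}
```

Running it with the JAR on the classpath, for example java -cp .:sqljdbc4.jar DriverCheck, should report the driver as loadable. If the same lookup fails when the job is started through the hadoop command, the client-side classpath (commonly extended through the HADOOP_CLASSPATH environment variable) is a likely place to look.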

Regarding sql-server - Accessing SQL Server from Hadoop, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/13251994/
