
azure - Getting the file name as the key in a C# streaming MapReduce job using binary mapper.exe and reducer.exe

Reposted. Author: 可可西里. Updated: 2023-11-01 16:42:24

The following code submits a streaming job to the cluster without any problems.

string statusFolderName = @"/tutorials/wordcountstreaming/status";

var jobcred = new BasicAuthCredential();
jobcred.UserName = "username";
jobcred.Password = "pass";
jobcred.Server = new Uri("https://something.azurehdinsight.net");

// Define the Hadoop streaming MapReduce job
StreamingMapReduceJobCreateParameters myJobDefinition = new StreamingMapReduceJobCreateParameters()
{
    JobName = "my word counting job",
    StatusFolder = statusFolderName,
    Input = "/example/data/gutenberg/davinci.txt",
    Output = "/tutorials/wordcountstreaming/output",
    Reducer = "wc.exe",
    Mapper = "cat.exe"
};

myJobDefinition.Files.Add("/example/apps/wc.exe");
myJobDefinition.Files.Add("/example/apps/cat.exe");

var jobClient = JobSubmissionClientFactory.Connect(jobcred);

// Run the MapReduce job
JobCreationResults mrJobResults = jobClient.CreateStreamingJob(myJobDefinition);

-------------------- Mapper --------------------

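using System;
using System.IO;
using System.Linq;

namespace wc
{
    class wc
    {
        static void Main(string[] args)
        {
            string line;
            var count = 0;

            // Optionally read from a file given as the first argument instead of stdin
            if (args.Length > 0)
            {
                Console.SetIn(new StreamReader(args[0]));
            }

            while ((line = Console.ReadLine()) != null)
            {
                // Count separators as an approximation of the word count
                count += line.Count(cr => (cr == ' ' || cr == '\n'));
            }
            Console.WriteLine(count);
        }
    }
}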

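How can I get the name of the text file as the key? I want the output to show key-value pairs, where the key is the file name and the value is the number of words in that file. I have multiple files.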

Best answer

To use the name of the text file being processed by the mapper as the key, read the following environment variable inside your mapper:

string Key = Environment.GetEnvironmentVariable("map_input_file");

Modify your mapper code as follows:

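using System;
using System.IO;
using System.Linq;

namespace wc
{
    class wc
    {
        static void Main(string[] args)
        {
            string line;
            var count = 0;

            // Optionally read from a file given as the first argument instead of stdin
            if (args.Length > 0)
            {
                Console.SetIn(new StreamReader(args[0]));
            }

            while ((line = Console.ReadLine()) != null)
            {
                // Count separators as an approximation of the word count
                count += line.Count(cr => (cr == ' ' || cr == '\n'));
            }

            // The streaming framework exposes the current input file to map tasks through this variable
            string Key = Environment.GetEnvironmentVariable("map_input_file");

            // Emit "key<TAB>value", the default record format for streaming jobs
            var output = String.Format("{0}\t{1}", Key, count);
            Console.WriteLine(output);
        }
    }
}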
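
The variable lookup above can be made a little more defensive. The sketch below is not part of the original answer: the class name InputFileKey, the "unknown" fallback, and the choice to keep only the last path segment are all assumptions made for illustration. It also tries mapreduce_map_input_file first, on the assumption that newer Hadoop versions expose the property under its renamed form (mapreduce.map.input.file, with dots turned into underscores); verify which name your cluster actually sets.

```csharp
using System;

// Hypothetical helper (not from the original answer): resolves the key for
// the current map task from the environment set by Hadoop streaming.
static class InputFileKey
{
    // Assumed candidate names: the newer property name first, then the older one.
    static readonly string[] Candidates = { "mapreduce_map_input_file", "map_input_file" };

    public static string Resolve(bool fileNameOnly = true)
    {
        foreach (var name in Candidates)
        {
            var value = Environment.GetEnvironmentVariable(name);
            if (!string.IsNullOrEmpty(value))
            {
                return fileNameOnly ? LastSegment(value) : value;
            }
        }
        // Assumed fallback so the mapper still emits a well-formed record
        // when neither variable is set (for example, when run locally).
        return "unknown";
    }

    // Returns the text after the last '/' of a path or URI-style string.
    static string LastSegment(string path)
    {
        var trimmed = path.TrimEnd('/');
        var idx = trimmed.LastIndexOf('/');
        return idx >= 0 ? trimmed.Substring(idx + 1) : trimmed;
    }
}
```

In the mapper, the key could then be obtained with string Key = InputFileKey.Resolve(); instead of reading the variable directly.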

Hope this helps.
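
One point the answer leaves open: the input-file variable is only set for map tasks, and a large file may be split across several of them, so the mapper can emit more than one partial count per file. A reducer that sums the values for each key would combine them. The following is a minimal sketch under those assumptions, not code from the original post; the class name SumReducer is hypothetical, and it relies on the framework delivering records to the reducer sorted by key.

```csharp
using System;
using System.IO;

// Hypothetical reducer logic: sums tab-separated "key<TAB>count" records,
// assuming all records for the same key arrive consecutively.
static class SumReducer
{
    public static void Run(TextReader input, TextWriter output)
    {
        string currentKey = null;
        long total = 0;
        string line;

        while ((line = input.ReadLine()) != null)
        {
            var tab = line.IndexOf('\t');
            if (tab < 0) continue; // skip records without a key/value separator

            var key = line.Substring(0, tab);
            long value;
            if (!long.TryParse(line.Substring(tab + 1), out value)) continue;

            // A new key means the previous key's group is complete: emit it.
            if (currentKey != null && key != currentKey)
            {
                output.WriteLine("{0}\t{1}", currentKey, total);
                total = 0;
            }
            currentKey = key;
            total += value;
        }

        // Emit the final group, if any input was seen.
        if (currentKey != null)
        {
            output.WriteLine("{0}\t{1}", currentKey, total);
        }
    }
}
```

A reducer executable would call SumReducer.Run(Console.In, Console.Out); from its Main method and be registered as the Reducer of the job, with the file-name-emitting program registered as the Mapper.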

Regarding "azure - Getting the file name as the key in a C# streaming MapReduce job using binary mapper.exe and reducer.exe", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/39565852/
