java - Sequence numbers in MapReduce

I have written plain Java code that creates a RowId, but I need to convert it to MapReduce. I am new to MapReduce and would appreciate your help.

The input is a local file.

Example:

Alex 23 M NY

Alex 19 M NJ

Alex 29 M DC

Michael 20 M NY

Michael 24 M DC

A count file is provided as a secondary input.
Example:

Alex 3

Michael 2

Desired Output:
1 Alex 23 M NY

2 Alex 19 M NJ

3 Alex 29 M DC

1 Michael 20 M NY

2 Michael 24 M DC

Here is my Java code:

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class RowId
{
    public static void main(String[] args) throws IOException
    {
        BufferedReader in = null;
        BufferedReader cnt = null;
        BufferedWriter out = null;
        String in_line;
        String cnt_line;
        int frst_row_ind = 1;
        int row_cnt = 0;
        int new_col = 0;

        try {
            in = new BufferedReader(new FileReader("file path in local"));
            File out_file = new File("o/p path in local");
            if (!out_file.exists()) {
                out_file.createNewFile();
            }

            FileWriter fw = new FileWriter(out_file);
            out = new BufferedWriter(fw);

            while ((in_line = in.readLine()) != null)
            {
                String[] splitData = in_line.split("\\t");
                // Re-scan the count file for every input line to find the matching name.
                cnt = new BufferedReader(new FileReader("file path of countFile"));
                while ((cnt_line = cnt.readLine()) != null)
                {
                    String[] splitCount = cnt_line.split("\\t");
                    if (splitCount[0].equalsIgnoreCase(splitData[0]))
                    {
                        // First row of a group: load how many rows the group has.
                        if (frst_row_ind == 1)
                        {
                            row_cnt = Integer.parseInt(splitCount[1]);
                        }
                        new_col++;
                        out.write(String.valueOf(new_col));
                        out.write("\t");

                        for (int i = 0; i < splitData.length; i++)
                        {
                            out.write(splitData[i].trim());
                            if (i != splitData.length - 1)
                            {
                                out.write("\t");
                            }
                        }

                        row_cnt--;
                        out.write("\r\n");
                        // Reset the counter once the last row of the group is written.
                        if (row_cnt == 0)
                        {
                            frst_row_ind = 1;
                            new_col = 0;
                        }
                        else
                        {
                            frst_row_ind = 0;
                        }
                        out.flush();
                        break;
                    }
                }
                // Close the count file before the next input line reopens it.
                cnt.close();
            }
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
        finally
        {
            try {
                if (in != null) in.close();
                if (cnt != null) cnt.close();
                if (out != null) out.close();
            }
            catch (IOException e)
            {
                e.printStackTrace();
            }
        }
    }
}
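
A side note on the code above: if only the per-name sequence number is needed, the count file may not be required, because a running counter per name gives the same numbering in a single pass. The following is a minimal sketch under that assumption, not a drop-in replacement. It assumes whitespace-separated fields with the name in the first field, and the names `RowIdSketch` and `numberRows` are hypothetical (they do not appear in the original post).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RowIdSketch {
    // Prefixes each row with a 1-based sequence number that restarts for every name.
    // Assumption: fields are whitespace-separated and the name is the first field.
    static List<String> numberRows(List<String> rows) {
        Map<String, Integer> seen = new HashMap<>();
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            String trimmed = row.trim();
            // Lower-casing mirrors the equalsIgnoreCase comparison in the question's code.
            String name = trimmed.split("\\s+")[0].toLowerCase();
            // merge() stores 1 on first sight, otherwise adds 1, and returns the new count.
            int seq = seen.merge(name, 1, Integer::sum);
            out.add(seq + "\t" + trimmed);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
                "Alex 23 M NY", "Alex 19 M NJ", "Alex 29 M DC",
                "Michael 20 M NY", "Michael 24 M DC");
        for (String line : numberRows(rows)) {
            System.out.println(line);
        }
    }
}
```

Unlike the original loop, this does not reopen a file for every input line, but it holds one counter per distinct name in memory.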

Please share your thoughts.

Best answer

Below is code you can try. Note: I have not run it myself, but I hope it gives you the desired output.

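import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class StMApper extends Mapper<LongWritable, Text, Text, Text>
{
    Text outkey = new Text();
    Text outvalue = new Text();

    @Override
    public void map(LongWritable key, Text values, Context context)
            throws IOException, InterruptedException
    {
        // Alex 19 M NJ
        String[] cols = values.toString().split(" ");
        // Key on the name so that all rows for one person reach the same reduce call.
        outkey.set(cols[0]);
        outvalue.set(values.toString());
        context.write(outkey, outvalue);
    }
}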

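import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class StReducer extends Reducer<Text, Text, IntWritable, Text>
{
    IntWritable outkey = new IntWritable();
    Text outvalue = new Text();

    // Alex -> {Alex 19 M NJ, Alex 29 M DC, ...}
    @Override
    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException
    {
        // Start at 1 so the numbering matches the desired output.
        int i = 1;
        for (Text val : values)
        {
            outkey.set(i);
            outvalue.set(val);
            context.write(outkey, outvalue);
            i++;
        }
    }
}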
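
To see how the mapper and reducer fit together without a cluster, the flow can be simulated in plain Java. This is only a sketch: `LocalMapReduceSketch`, `mapAndShuffle`, and `reduce` are hypothetical names, a `TreeMap` stands in for Hadoop's shuffle (which hands keys to the reducer in sorted order), and numbering starts at 1 to match the desired output. One assumption matters in practice: the local list keeps rows in input order, whereas Hadoop does not guarantee the order of values within a key unless a secondary sort is configured, so on a real job the numbers may not follow the input order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LocalMapReduceSketch {
    // "Map" plus "shuffle": emit (name, whole line) and group the lines by name.
    static Map<String, List<String>> mapAndShuffle(List<String> lines) {
        // TreeMap keeps keys sorted, standing in for the sorted key order of the shuffle.
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : lines) {
            String name = line.split(" ")[0];
            grouped.computeIfAbsent(name, k -> new ArrayList<>()).add(line);
        }
        return grouped;
    }

    // "Reduce": number the values of each key, restarting at 1 for every key.
    static List<String> reduce(Map<String, List<String>> grouped) {
        List<String> out = new ArrayList<>();
        for (List<String> values : grouped.values()) {
            int i = 1;
            for (String val : values) {
                // Key and value are joined with a tab, as the default text output does.
                out.add(i + "\t" + val);
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Michael 20 M NY", "Alex 23 M NY", "Alex 19 M NJ",
                "Michael 24 M DC", "Alex 29 M DC");
        for (String line : reduce(mapAndShuffle(lines))) {
            System.out.println(line);
        }
    }
}
```

Because the grouping happens by key, the rows for one name do not need to be adjacent in the input, which is the main difference from the sequential version in the question.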

For the original discussion of sequence numbers in MapReduce, see this question on Stack Overflow: https://stackoverflow.com/questions/37290612/
