Here is a sample of my data:

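With the first column at index 0, I want to use MapReduce to compute the total sales for each store from this file. The store name is at index 2 and the revenue is at index 4.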

Here is my mapper code:

public void map(LongWritable key , Text value , Context context) 
throws IOException , InterruptedException 
{ 
    String line = value.toString(); 
    String[] columns = line.split("\t"); 
 
    if(columns.length == 6) 
    { 
        String storeNameString = columns[2]; 
        Text storeName = new Text(storeNameString); 
 
        String storeRevenueString = columns[4]; 
        IntWritable storeRevenue = new IntWritable(Integer.parseInt(storeRevenueString)); 
        context.write(storeName, storeRevenue); 
    }    
} 

Here is my reducer code:

public void reduce(Text key, Iterable<IntWritable> values, Context context) 
        throws IOException , InterruptedException { 
 
    Text storeName = key; 
    int storeSales = 0; 
 
    while(values.iterator().hasNext()) 
    { 
        storeSales += values.iterator().next().get(); 
 
    } 
    context.write(storeName, new IntWritable(storeSales)); 
} 

Here is the code that runs the job:

public class StoreSales extends Configured implements Tool { 
 
public static void main(String[] args) throws Exception { 
    // this main function will call run method defined above. 
    int res = ToolRunner.run(new StoreSales(),args); 
    System.exit(res); 
} 
 
@Override 
public int run(String[] args) throws Exception { 
    // TODO Auto-generated method stub 
    JobConf conf = new JobConf(); 
 
    @SuppressWarnings("unused") 
    Job job = new Job(conf , "Sales Per Store"); 
 
    job.setMapperClass(StoreSalesMapper.class); 
    job.setReducerClass(StoreSalesReducer.class); 
    job.setJarByClass(StoreSales.class); 
 
    job.setOutputKeyClass(Text.class); 
    job.setOutputValueClass(IntWritable.class); 
 
    Path input = new Path(args[0]); 
    Path output = new Path(args[1]); 
 
    FileInputFormat.addInputPath(conf , input); 
    FileOutputFormat.setOutputPath(conf, output); 
 
    JobClient.runJob(conf); 
 
    return 0; 
    } 
 } 

Here is a sample of the result:

Here is the result I am getting:

What am I doing wrong?

Please refer to the following approach:

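There is nothing wrong with your logic. The likely problem is in the driver: it sets the mapper and reducer on a new-API Job, but then submits the separate JobConf through the old-API JobClient.runJob(conf), so those settings are never used. I kept your logic as it is and only changed the driver to use the new MapReduce API throughout: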

Mapper:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class Map extends Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Each input value is one tab-separated line of the sales file.
        String line = value.toString();
        String[] columns = line.split("\\t");

        // Lines that do not have exactly six columns are skipped.
        if (columns.length == 6) {
            // Store name is at index 2, revenue at index 4.
            Text storeName = new Text(columns[2]);
            IntWritable storeRevenue = new IntWritable(Integer.parseInt(columns[4]));
            context.write(storeName, storeRevenue);
        }
    }
}
 
Reducer:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Sum every revenue value emitted for this store.
        int storeSales = 0;
        for (IntWritable value : values) {
            storeSales += value.get();
        }
        context.write(key, new IntWritable(storeSales));
    }
}
 
 
Driver:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Driver {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // On Hadoop 2 and later, Job.getInstance(conf, "Sales Per Store")
        // is the non-deprecated equivalent of this constructor.
        Job job = new Job(conf, "Sales Per Store");

        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setJarByClass(Driver.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // args[0] is the input path and args[1] the output path.
        // Both are set on the job itself, not on a separate JobConf.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Submit the same job that was configured above and wait for it.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
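A side note on the mapper above: the `columns.length == 6` check skips short lines, but `Integer.parseInt` still throws a `NumberFormatException` if the revenue field is not an integer, and an uncaught exception in `map` fails the task. Below is a minimal plain-Java sketch of a stricter parse that could be adapted inside the mapper. `RevenueRecord` and `parse` are hypothetical names used only for illustration; they are not part of Hadoop, and whether bad records should be skipped or should fail the job is an assumption you would need to decide for your data.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not a Hadoop class: validates one tab-separated line.
class RevenueRecord {

    // Returns (store name, revenue) for a well-formed line, or empty otherwise.
    static Optional<Map.Entry<String, Integer>> parse(String line) {
        String[] columns = line.split("\t");
        if (columns.length != 6) {
            return Optional.empty(); // wrong number of columns
        }
        try {
            int revenue = Integer.parseInt(columns[4].trim());
            Map.Entry<String, Integer> entry = new SimpleEntry<>(columns[2], revenue);
            return Optional.of(entry);
        } catch (NumberFormatException e) {
            return Optional.empty(); // revenue field is not an integer
        }
    }

    public static void main(String[] args) {
        // A well-formed line yields a pair; a malformed one yields Optional.empty.
        System.out.println(parse("2012-01-01\t09.00\tsanJose\tclothin\t214\tamex"));
        System.out.println(parse("2012-01-01\t09.00\tsanJose\tclothin\tn/a\tamex"));
    }
}
```

In the mapper, the equivalent would be to call `context.write` only when the parse succeeds.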

Sample input file (columns are tab-separated in the actual file):

2012-01-01 09.00 sanJose clothin 214 amex
2012-01-01 09.00 seattle music 320 master
2012-01-01 09.00 seattle elec 3120 master
2012-01-01 09.00 sanJose perfume 3200 amex

Output file:

cat test123/part-r-00000

sanJose 3414
seattle 3440
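If you want to check the expected totals without a cluster, the same map, group, and sum steps can be sketched in plain Java. This is only a local stand-in under the assumption that the input is tab-separated with six columns. `LocalStoreSales` and `totalsByStore` are hypothetical names and no Hadoop classes are involved.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local stand-in for the job; it does not use Hadoop.
class LocalStoreSales {

    // Mirrors the mapper and reducer: take columns 2 and 4, then sum per store.
    static Map<String, Integer> totalsByStore(List<String> lines) {
        // A TreeMap keeps store names sorted, so the printed order is stable.
        Map<String, Integer> totals = new TreeMap<>();
        for (String line : lines) {
            String[] columns = line.split("\t");
            if (columns.length == 6) {
                totals.merge(columns[2], Integer.parseInt(columns[4]), Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // The four sample lines from above, with tabs between the columns.
        List<String> sample = Arrays.asList(
            "2012-01-01\t09.00\tsanJose\tclothin\t214\tamex",
            "2012-01-01\t09.00\tseattle\tmusic\t320\tmaster",
            "2012-01-01\t09.00\tseattle\telec\t3120\tmaster",
            "2012-01-01\t09.00\tsanJose\tperfume\t3200\tamex");
        totalsByStore(sample).forEach(
            (store, total) -> System.out.println(store + "\t" + total));
    }
}
```

For the four sample lines this gives 214 + 3200 = 3414 for sanJose and 320 + 3120 = 3440 for seattle, which matches the job output.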

