Problem description

I'm new to Hadoop and Java, so please bear with me. I was able to get MapReduce working with a .tsv file, but I can't seem to get it working with a .csv file.

Here is the code:
package question5;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class FreqMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        /*
         * When the file is read in, the first line is the header row, which we do not want.
         * Since the input arrives as key-value pairs keyed by byte offset, we only need to
         * skip key 0, as seen below.
         */
        if (key.get() == 0) {
            return;
        } else {
            /*
             * After skipping the first line, we extract the necessary data to be mapped
             * into our desired key-value structure.
             *
             * In this case, channel_title -> likes
             * channel_title being a Text
             * likes being an IntWritable
             *
             * The data is split at the comma.
             */
            String line = value.toString();
            Text channel_name = new Text(line.split(",")[3]);
            IntWritable likes = new IntWritable(Integer.parseInt(line.split(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)")[8]));
            context.write(channel_name, likes);
        }
    }
}
The problem occurs at the IntWritable line, when I access index 8 of the split array: an IndexOutOfBoundsException is thrown. I tested the regex and it works fine, as shown here: https://regex101.com/r/J3P6xQ/1

Any suggestions would be welcome. Thank you for reading.
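One plausible cause, stated as an assumption since the input file is not shown: if quoted CSV fields can contain embedded line breaks, Hadoop's TextInputFormat still hands the mapper one physical line at a time, so a record broken across lines splits into fewer than 9 fields, and indexing [8] then throws. The regex itself is fine on a complete line, which is why it passes on regex101. A minimal plain-Java sketch (no Hadoop needed, with hypothetical sample rows) showing how a truncated line produces a short array, and the length guard that avoids the exception:

```java
public class CsvSplitDemo {
    // The same quote-aware split pattern used in the mapper.
    static final String CSV_SPLIT = ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    public static void main(String[] args) {
        // A complete record: a quoted title containing a comma is kept as one field.
        String good = "id,2017,\"Some, Title\",ChannelX,cat,time,tags,views,1234,0";
        // A record cut off mid-field, e.g. by an embedded newline inside a quoted title.
        String truncated = "id,2017,\"A title that was broken";

        String[] goodFields = good.split(CSV_SPLIT);
        String[] badFields = truncated.split(CSV_SPLIT);

        System.out.println(goodFields.length); // 10 fields: index 8 is safe
        System.out.println(badFields.length);  // far fewer fields: index 8 would throw

        // Defensive extraction mirroring the mapper: skip malformed rows
        // instead of letting [8] throw IndexOutOfBoundsException.
        if (goodFields.length > 8) {
            System.out.println(Integer.parseInt(goodFields[8]));
        }
        if (badFields.length <= 8) {
            System.out.println("skipped malformed row");
        }
    }
}
```

Note the mapper also splits with a plain "," for the channel name at index 3 but with the quote-aware pattern for index 8; using the quote-aware pattern once, plus a length check, would make both extractions consistent.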