felix.thirdpart
Class XmlInputFormat
java.lang.Object
org.apache.hadoop.mapreduce.InputFormat<K,V>
org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
org.apache.hadoop.mapreduce.lib.input.TextInputFormat
felix.thirdpart.XmlInputFormat
public class XmlInputFormat
- extends org.apache.hadoop.mapreduce.lib.input.TextInputFormat
Reads records that are delimited by a specific begin/end tag.
-- ACK: THIS THIRD-PARTY CLASS WAS NOT WRITTEN BY FELIX'S AUTHORS.
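To illustrate what "delimited by a specific begin/end tag" means, the following is a minimal, Hadoop-free sketch of the idea in plain Java. It is not the code of XmlInputFormat or its XmlRecordReader, which read from an input split rather than from an in-memory string. The class name TagDelimitedRecords and the method extract are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not part of felix.thirdpart or Hadoop.
public class TagDelimitedRecords {

    /**
     * Returns every block of doc that begins with startTag and runs through
     * the next occurrence of endTag. Text outside such blocks is skipped.
     */
    static List<String> extract(String doc, String startTag, String endTag) {
        List<String> records = new ArrayList<>();
        int from = 0;
        while (true) {
            int begin = doc.indexOf(startTag, from);
            if (begin < 0) {
                break; // no further start tag
            }
            int end = doc.indexOf(endTag, begin + startTag.length());
            if (end < 0) {
                break; // start tag without a matching end tag: drop it
            }
            end += endTag.length();
            records.add(doc.substring(begin, end));
            from = end; // continue scanning after this record
        }
        return records;
    }

    public static void main(String[] args) {
        String doc = "<root><page>a</page>\n<junk/>\n<page>b</page></root>";
        for (String record : extract(doc, "<page>", "</page>")) {
            System.out.println(record);
        }
    }
}
```

In this sketch each returned string plays the role of one record value. In an actual job, the begin and end tags would presumably be supplied through the configuration keys START_TAG_KEY and END_TAG_KEY documented below.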
Nested Class Summary |
static class |
XmlInputFormat.XmlRecordReader
Record reader that scans a given XML document and outputs each XML block
delimited by the specified start tag and end tag as one record. |
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat |
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.Counter |
Method Summary |
org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text> |
createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
org.apache.hadoop.mapreduce.TaskAttemptContext context)
Returns an XML record reader for reading the XML document. |
Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat |
addInputPath, addInputPaths, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, getSplits, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize |
Methods inherited from class java.lang.Object |
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
START_TAG_KEY
public static final java.lang.String START_TAG_KEY
- See Also:
- Constant Field Values
END_TAG_KEY
public static final java.lang.String END_TAG_KEY
- See Also:
- Constant Field Values
XmlInputFormat
public XmlInputFormat()
createRecordReader
public org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
org.apache.hadoop.mapreduce.TaskAttemptContext context)
- Returns an XML record reader for reading the XML document.
- Overrides:
createRecordReader
in class org.apache.hadoop.mapreduce.lib.input.TextInputFormat