sorting - How to sort mapper output keys with multiple fields?
I want to sort the mapper output records by their first 2 fields before feeding them to the reducer. Here is how I did it:
hadoop streaming \
  -D mapred.job.name="multi_field_key_sort" \
  -D mapred.job.map.capacity=100 \
  -D mapred.reduce.tasks=1 \
  -D stream.num.map.output.key.fields=2 \
  -D mapred.output.key.comparator.class=org.apache.hadoop.mapred.lib.KeyFieldBasedComparator \
  -D mapred.text.key.comparator.options="-k1,2n" \
  -input "..." \
  -output "..." \
  -mapper "..." \
  -reducer "cat"
But the final results are not sorted by the first 2 fields; they are sorted only by the 1st field. Why? Is something wrong with my Hadoop job conf?
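One possible explanation, offered as an assumption rather than a confirmed diagnosis: KeyFieldBasedComparator options follow the Unix sort -k syntax, and a numeric key that spans fields 1 to 2 ("-k1,2n") may be read as a single number whose parse stops at the tab after field 1. If so, only the 1st field would take part in the comparison. The Python sketch below only models that assumed behaviour on hypothetical sample records; it does not call Hadoop, and the helper names are made up for illustration.

```python
# Hypothetical mapper output: two numeric key fields, then a value (tab-separated).
records = [
    "2\t10\tb",
    "1\t9\ta",
    "1\t10\tc",
    "2\t3\td",
]


def leading_number(text):
    """Read the leading run of digits and ignore everything after it."""
    digits = ""
    for ch in text:
        if not ch.isdigit():
            break
        digits += ch
    return int(digits) if digits else 0


def span_key(line):
    # Assumed model of "-k1,2n": one numeric key over fields 1..2.
    # The numeric parse stops at the tab, so only field 1 is compared.
    fields = line.split("\t")
    return leading_number("\t".join(fields[:2]))


def per_field_key(line):
    # Assumed model of "-k1,1n -k2,2n": two separate numeric keys.
    fields = line.split("\t")
    return (int(fields[0]), int(fields[1]))


by_span = sorted(records, key=span_key)
by_fields = sorted(records, key=per_field_key)

print("one numeric key over fields 1-2:")
for line in by_span:
    print("  " + line)

print("two numeric keys, one per field:")
for line in by_fields:
    print("  " + line)
```

In this model the first ordering leaves the 2nd field unsorted within each value of the 1st field, which matches the symptom described above. If the assumption holds for the real comparator, it may be worth trying mapred.text.key.comparator.options="-k1,1n -k2,2n" so that each field gets its own numeric key.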