Wednesday 15 January 2014

hadoop - Map a Hive partition to a location




I have a Hive external table partitioned by year, month, day, and hour.

partitioned by (`year` int, `month` int, `day` int, `hour` int)
row format serde 'org.openx.data.jsonserde.JsonSerDe'
stored as inputformat 'org.apache.hadoop.mapred.SequenceFileInputFormat'
outputformat 'org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat'
location 'hdfs://path/to/data'

The data exists in directories such as:

2014/05/10/07/00

2014/05/10/07/01

...

2014/05/10/07/22

2014/05/10/07/23

I get results when I select data using the following:

select * from my_table where year=2014 and month="05" and day="07" and hour="03"

But I want to be able to query without quotes and without values starting with zero. The next two examples don't work:

select * from my_table where year=2014 and month=05 and day=07 and hour=03

select * from my_table where year=2014 and month=5 and day=7 and hour=3

How can I support this? (Instead of changing the directories to not have a 0 prefix on single-digit values.)

Thanks,

Guy

Before I go to the answer, note that it will involve changing the directory names to make querying simple for you.

We have a similar kind of structure for our partitions, but instead of using names in the format 2014/05/10/07/22, we use 2014/201405/20140510/07/20140510.22. The partitions are:

partitioned by (years bigint, months bigint, days bigint, hours float)
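To make the naming concrete, here is a minimal sketch (in Python, which is not part of the setup described here) of how an hour-level value such as 20140510.22 could be assembled from separate date parts. The function name `hour_partition_value` is hypothetical, and the value is built as a string so that the zero padding is preserved exactly.

```python
def hour_partition_value(year, month, day, hour):
    """Combine date parts into a yyyymmdd.hh value, zero-padding each part.

    Hypothetical helper for illustration only.
    """
    return f"{year:04d}{month:02d}{day:02d}.{hour:02d}"


# 7 May 2014, 03:00 gives the value used for the hour-level partition
print(hour_partition_value(2014, 5, 7, 3))  # 20140507.03
```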

Now, coming to the advantages of using this:

The query mentioned in the question:

select * from my_table where year=2014 and month=05 and day=07 and hour=03

After the new partitions:

select * from my_table where hr = 20140507.03

Also, other queries on days and months can be run directly without explicitly specifying months and years.
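If renaming the directories is not an option, another possibility is to register each partition explicitly, since Hive's ALTER TABLE ... ADD PARTITION ... LOCATION statement lets a partition with integer values point at any directory. The sketch below is only an illustration under assumptions: it is written in Python, the helper name `add_partition_statement` is hypothetical, and it assumes a year/month/day/hour directory layout under the table location, which may need adjusting to the real layout.

```python
def add_partition_statement(table, base_location, year, month, day, hour):
    """Build a HiveQL statement that maps unpadded integer partition values
    to a zero-padded directory. Hypothetical helper for illustration only.
    """
    # Assumed layout: <base>/<yyyy>/<mm>/<dd>/<hh>, with zero-padded names.
    directory = f"{year:04d}/{month:02d}/{day:02d}/{hour:02d}"
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (year={year}, month={month}, day={day}, hour={hour}) "
        f"LOCATION '{base_location.rstrip('/')}/{directory}'"
    )


# Generate one statement per hour of a single day.
for hour in range(24):
    print(add_partition_statement("my_table", "hdfs://path/to/data", 2014, 5, 7, hour))
```

With partitions registered this way, a query such as month=5 and day=7 would match on the integer partition values while the zero-padded directories stay as they are. The generated statements would still need to be run in Hive for every existing and future directory.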

hadoop hive hql hiveql
