Wednesday, August 2, 2017

Creating a Custom UDF with LLAP

Creating and running temporary functions is discouraged when querying through LLAP for security reasons: many users share the same LLAP instances, so a temporary function can create conflicts. You can still create temporary functions by using add jar together with hive.llap.execution.mode=auto.

With the exclusive LLAP execution mode (hive.llap.execution.mode=only) you will run into a ClassNotFoundException. Setting hive.llap.execution.mode=auto allows some parts of the query (the map tasks) to run in Tez containers.
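As a rough sketch of the temporary-function route described above, the helper below prints the three statements involved. The function name temp_udf_sql, the file /tmp/temp_udf.sql, and the JDBC_URL variable are illustrative placeholders, not part of Hive, and the jar path and class name are borrowed from the steps later in this post.

```shell
#!/bin/sh
# Print the HiveQL for a session-scoped (temporary) UDF on LLAP.
# Arguments: jar path on the HiveServer2 Interactive host, function name, UDF class.
# temp_udf_sql is a hypothetical helper name used only for this sketch.
temp_udf_sql() {
  jar_path=$1
  fn_name=$2
  class_name=$3
  cat <<EOF
set hive.llap.execution.mode=auto;
add jar ${jar_path};
create temporary function ${fn_name} as '${class_name}';
EOF
}

# Example (JDBC_URL is a placeholder for your HSI connection string):
#   temp_udf_sql /tmp/SampleCode.jar CustomLength com.rajkrrsingh.hiveudf.CustomLength > /tmp/temp_udf.sql
#   beeline -u "$JDBC_URL" -f /tmp/temp_udf.sql
```

A temporary function only lives for the session that created it, so any query that uses it has to be appended to the same file (or run in the same beeline session).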

Here are the steps to create a custom permanent function in LLAP (the steps were tested on HDP-260):

  1. Create a jar for the UDF (in this case I am using a simple UDF):
git clone https://github.com/rajkrrsingh/SampleCode
cd SampleCode
mvn clean package
  2. Upload target/SampleCode.jar to the node where HSI (HiveServer2 Interactive) is running (in my case I copied it to the /tmp directory).
  3. Add the jar to HIVE_AUX_JARS_PATH: go to Ambari --> Hive --> Configs --> hive-interactive-env template
      export HIVE_AUX_JARS_PATH=$HIVE_AUX_JARS_PATH:/tmp/SampleCode.jar
  4. Add the jar to the Auxillary JAR list: go to Ambari --> Hive --> Configs --> Auxillary JAR list
      Auxillary JAR list=/tmp/SampleCode.jar
  5. Restart LLAP.
  6. Create the permanent custom function. Connect to HSI using beeline and run:
create function CustomLength as 'com.rajkrrsingh.hiveudf.CustomLength';
describe function CustomLength;
select CustomLength(description) from sample_07 limit 1;
  7. Check where SampleCode.jar was localized:
[root@hdp26 container_e06_1501140901077_0019_01_000002]# pwd
/hadoop/yarn/local/usercache/hive/appcache/application_1501140901077_0019/container_e06_1501140901077_0019_01_000002
[root@hdp26 container_e06_1501140901077_0019_01_000002]# find . -iname sample*
./app/install/lib/SampleCode.jar
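
The localization check can also be run from outside the container directory. The following is a minimal sketch; find_localized_jar is a hypothetical helper name, and it assumes the YARN local directory layout shown in the output above.

```shell
#!/bin/sh
# List every copy of a jar found under a YARN local directory.
# Arguments: directory to search (e.g. the appcache directory), jar file name.
# find_localized_jar is a hypothetical helper name used only for this sketch.
find_localized_jar() {
  root=$1
  jar_name=$2
  # Errors from unreadable directories are discarded.
  find "$root" -name "$jar_name" 2>/dev/null
}

# Example, using the paths from this post:
#   find_localized_jar /hadoop/yarn/local/usercache/hive/appcache SampleCode.jar
```

If the function prints nothing, the jar was not localized under that directory.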
