Connect a standalone Indexima cluster to a kerberized instance

Kerberos configuration on the Indexima cluster

Install the Kerberos client tools. For example, on CentOS:

yum install krb5-workstation

Edit the /etc/krb5.conf file (see the sketch after this list):

  • Modify the default realm (default_realm)
  • Set the admin_server and kdc locations
  • Possible problems
    • Comment out the renew_lifetime parameter
    • Modify default_ccache_name to use the /tmp directory instead of the kernel keyring
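
A minimal sketch of the relevant /etc/krb5.conf sections, assuming a realm named DOMAIN.COM and a KDC host kdc.domain.com (both are placeholders to replace with your own values):

CODE
[libdefaults]
  default_realm = DOMAIN.COM
  # comment out renew_lifetime if ticket renewal causes errors
  # renew_lifetime = 7d
  # store the credential cache in /tmp instead of the kernel keyring
  default_ccache_name = FILE:/tmp/krb5cc_%{uid}

[realms]
  DOMAIN.COM = {
    kdc = kdc.domain.com
    admin_server = kdc.domain.com
  }

[domain_realm]
  .domain.com = DOMAIN.COM
  domain.com = DOMAIN.COM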

HDFS configuration

You will have to choose a user and a keytab to connect to your kerberized cluster. This user needs to be declared as a proxy user in your HDFS configuration.

After modification, you will need to restart your HDFS cluster.

Example with impala as the proxy user (these properties go in core-site.xml):

CODE
<property>
    <name>hadoop.proxyuser.impala.hosts</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.impala.groups</name>
    <value>*</value>
</property>

Galactica configuration modification

jaas.conf

You can create a dedicated keytab for Indexima on your kerberized cluster, or reuse an existing keytab (the impala keytab in the example below).

You need to copy your keytab to each machine of the Indexima cluster.
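
A minimal sketch of that copy, assuming the keytab lives under /etc/security/keytabs/ (as in the jaas.conf below) and that indexima_nodes.txt is a hypothetical file listing your Indexima hostnames, one per line:

CODE
# indexima_nodes.txt is an assumed file; adapt names and paths to your cluster
for node in $(cat indexima_nodes.txt); do
  scp /etc/security/keytabs/impala.keytab ${node}:/etc/security/keytabs/
done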

Create a jaas.conf file:

CODE
com.sun.security.jgss.initiate {
 com.sun.security.auth.module.Krb5LoginModule required
 principal="impala/ipadress@DOMAIN.COM"
 keyTab="/etc/security/keytabs/impala.keytab"
 useKeyTab=true
 storeKey=true
 debug=true;
};

Galactica-env.sh

Add the following line:

CODE
export NODESERVER_JVM_OPTIONS="-Djava.security.auth.login.config=/opt/k/work/indexima/galactica/jaas.conf -Djavax.security.auth.useSubjectCredsOnly=false"

Additional actions to connect to a Kerberized Impala

Execute a manual kinit

Depending on your Impala driver version (for example 2.5.5.1007), you will need to do a manual kinit with the user you chose to connect to your Impala cluster.

kinit -kt ... (specify your user and keytab)
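
For example, reusing the principal and keytab from the jaas.conf above (both are placeholders to adapt to your environment), then checking the ticket with klist:

CODE
kinit -kt /etc/security/keytabs/impala.keytab impala/ip_address@DOMAIN.COM
klist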

Table creation from Impala

CODE
create table from_impala from my_impala_table
IN 'jdbc:impala://impala_server_address:impala_port;AuthMech=1;KrbRealm=XXX.COM;KrbHostFQDN=impala_host_fqdn;KrbServiceName=impala'
(index(id1))

Table load from Impala

You can load data from Impala through JDBC, but it is more efficient to use an HDFS load:

CODE
load data inpath 'hdfs://namenode_address:8020/user/hive/warehouse/xxx' into table from_impala format parquet

Help for debugging purposes

impala-shell installation

This section is not mandatory but may be useful for debugging purposes.

yum install python-pip gcc gcc-c++ cyrus-sasl-devel
pip install impala-shell

Try to connect to the remote Impala instance

kinit -kt ... (your keytab) ...
impala-shell -k

Check that you can browse tables and data from Impala.
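
For example, once the kinit above has succeeded (impala_server_address is a placeholder, and my_impala_table is the table used in the creation example):

CODE
impala-shell -k -i impala_server_address -q "show databases"
impala-shell -k -i impala_server_address -q "select * from my_impala_table limit 10"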

