Loading files (CSV, Parquet...) via command

Indexima supports loading data from the following file formats:

  • CSV
  • JSON
  • PARQUET
  • ORC

Load data directly from HDFS to avoid Hive

You can use the LOAD DATA command to extract data directly from a Hadoop Datanode.

Command

LOAD DATA HDFS

SQL
LOAD DATA INPATH 'hdfs://data_node:8020/apps/hive/warehouse/my_hive_data' INTO TABLE my_table;
COMMIT my_table;

This reads every file located in /apps/hive/warehouse/my_hive_data on the data node, loads the data into the Indexima table named "my_table", and then commits it.

The path given to LOAD DATA INPATH designates a folder, not a file. Every data file inside this folder is imported, so all files must have the same structure and format for the final table to be consistent.
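Because every file in the folder is imported, it can help to check a folder for consistency before loading it. The following is a minimal sketch in Python, not part of Indexima: the function name check_csv_folder is hypothetical, it assumes CSV files in a local folder (an HDFS directory would need an HDFS client instead), and it only compares the number of columns on the first line of each file.

```python
import csv
from pathlib import Path


def check_csv_folder(folder, separator=","):
    """Return a list of problems; an empty list means the folder looks consistent.

    Hypothetical pre-load check: compares the column count of the first
    line of every file in `folder` against the first file found.
    """
    problems = []
    expected_columns = None
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue  # sub-folders are ignored by this sketch
        with path.open(newline="") as handle:
            first_row = next(csv.reader(handle, delimiter=separator), None)
        if first_row is None:
            problems.append(f"{path.name}: file is empty")
            continue
        if expected_columns is None:
            # The first non-empty file sets the reference column count.
            expected_columns = len(first_row)
        elif len(first_row) != expected_columns:
            problems.append(
                f"{path.name}: {len(first_row)} columns, expected {expected_columns}"
            )
    return problems
```

Calling check_csv_folder('/tmp/my_data') before running the load returns an empty list when the first line of every file has the same number of columns. It does not verify column types or the contents of later lines.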

Load partitions from Hive table

You can learn more about the LOAD DATA INPATH commands here.

Load data from files on the local filesystem of the machines running Indexima

Command

LOAD DATA LOCAL

SQL
LOAD DATA LOCAL INPATH '/tmp/my_data' INTO TABLE my_table FORMAT CSV SEPARATOR ',' SKIP 2;
COMMIT my_table;

This will load all data in the files located in the folder /tmp/my_data into the Indexima table default.my_table. The files must be CSV files with a comma separator. The first 2 lines are skipped.
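To see what these options mean for a given file, the described behavior can be imitated locally. This is a rough Python sketch, not Indexima's implementation: preview_rows is a hypothetical helper, the sample text is made up, and it assumes the skipped lines are counted from the top of each file.

```python
import csv


def preview_rows(text, separator=",", skip=0):
    """Drop the first `skip` lines, then split the remaining lines on `separator`."""
    lines = text.splitlines()[skip:]
    return list(csv.reader(lines, delimiter=separator))


# Made-up sample: one comment line and one header line, then two data rows.
sample = "# export header\nid,name\n1,alice\n2,bob\n"
rows = preview_rows(sample, separator=",", skip=2)
print(rows)  # [['1', 'alice'], ['2', 'bob']]
```

Running a preview like this on one file from the folder is a quick way to confirm that the SKIP value removes exactly the non-data lines and that the separator splits the columns as expected.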


More

You can learn more about the LOAD DATA commands here.
