Overview

Within Indexima, the query history can be saved to a CSV file and later loaded back into Indexima for analysis.

Defining the query history storage location

In galactica.conf, set the parameter history.export to the absolute path where you want to store your query history, e.g. history.export = /var/log/indexima/history_csv

We strongly recommend setting history.export to a different path than history.dir (a sub-path is fine), so that the different types of log do not get mixed up.
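For example, a minimal galactica.conf fragment that keeps the two log types apart could look like this (the paths below are placeholders for illustration, not defaults):

# galactica.conf -- example paths, adapt to your installation
history.dir    = /var/log/indexima/history
history.export = /var/log/indexima/history/history_csv

Here history.export points to a sub-path of history.dir, which keeps the exported CSV files separate from the other history logs.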

We also recommend defining a dedicated directory in your warehouse, so that you can import those files into an Indexima table through a LOAD command.

If you want to store your history files in AWS S3, write the full S3 path prefixed with s3a, e.g. history.export = s3a://my-bucket/path/to/query/logs

Exported CSV files are rolled over daily: each day's history is written to a separate file.

Content

Each file contains:

  • All queries sent to Indexima
  • The subqueries of queries that contain them. In that case, the PARENT column is populated
    • To tell subqueries apart from top-level queries, we recommend creating a calculated column.

CSV format

| Column name | Description | Notes |
|---|---|---|
| OPERATION | The SQL operation name | SELECT, LOAD, etc. Note that SELECT queries containing a UNION operator get the "UNION" operation |
| STATEMENT | The SQL statement | |
| SUBMIT | Number of submitted tasks | |
| FINISH | Number of finished tasks | |
| DURATION | Execution duration | In milliseconds |
| TIME | Execution start time | |
| ERROR | Error, if any | |
| TABLE | Table name | |
| IDENTIFIER | Query identifier | |
| KILL | Whether the request was killed | True / False |
| LOADED | Whether an index was loaded during the request | True / False |
| RESULT | Number of results | |
| USER | User name | |
| IP | User IP | |
| INDEXES | Index name | |
| CACHED | Whether the request was found in cache | True / False |
| HYBRID | Hybrid request (disk scanning) | True / False |
| DELEGATED | Whether the request was delegated to an external engine | True / False |
| EXTERNAL_TABLE_NAME | External table name | If delegated |
| EXTERNAL_ERROR | External error, if any | If delegated |
| EXTERNAL_DURATION | Total duration of a delegated query | If delegated |
| EXTERNAL_NB_ROWS_FETCHED | Number of rows fetched externally | If delegated |
| EXTERNAL_QUERY_EXECUTION_DURATION | Duration of a delegated SELECT query, excluding the download of the result set | If delegated |
| LOAD_TOTAL_SIZE | Number of lines loaded | LOAD statements only |
| LOAD_INSERTS | Number of lines inserted | LOAD statements only |
| LOAD_ERRORS | Number of lines in error | LOAD statements only |
| MAX_WORKER_MEMORY | Maximum amount of memory (in bytes) used to compute a query (across all nodes, including the master) | |
| MASTER_MEMORY | Amount of memory (in bytes) used on the master node to compute a query | |
| READ_LINES_FROM_BUCKETS | Number of lines scanned on disk | |
| DISK_USAGE | Amount of data (in bytes) spilled to disk | |
| RESULT_SET_SIZE | Size (in bytes) of the result set sent back | |
| PENDING_TIME | Pending time (in ms) of the query before being executed | Since 1.7.8 |
| PARENT | Identifier of the parent query | Since 1.7.12 |
| SUSPENDED | Total time (in ms) the query spent in suspended (frozen) state | Since 2021.3 |
| BIG_INDEX | True if the query used the bigindex feature, false otherwise | Since 2021.3 |

Note that this history-export mechanism is not an official feature. The file format remains internal to the product and may change without notice.

Importing your logs in Indexima

Creating a Log Table

You can create the log table on your Indexima cluster with the following command:

CREATE TABLE logs_table (
OPERATION string,
STATEMENT string,
SUBMIT int,
FINISH int,
DURATION bigint,
ExecutionTIME timestamp(second),
ERROR string,
XTABLE string,
IDENTIFIER bigint,
KILL string,
LOADED string,
RESULT bigint,
USER string,
IP string,
INDEXES string,
CACHED string,
HYBRID string,
DELEGATED string,
EXTERNAL_TABLE_NAME string,
EXTERNAL_ERROR string,
EXTERNAL_DURATION bigint,
EXTERNAL_NB_ROWS_FETCHED bigint,
EXTERNAL_QUERY_EXECUTION_DURATION bigint,
LOAD_TOTAL_SIZE bigint,
LOAD_INSERTS bigint,
LOAD_ERRORS bigint,
MAX_WORKER_MEMORY bigint,
MASTER_MEMORY bigint,
READ_LINES_FROM_BUCKETS bigint,
DISK_USAGE bigint,
RESULT_SET_SIZE bigint,
PENDING_TIME bigint,
PARENT_ID bigint,
SUSPENDED bigint,
BIG_INDEX boolean,

-- recommended computed fields
issubquerie AS (PARENT_ID > 0),

qschema AS (split(Xtable,"\\.",0)),
qtable AS (split(Xtable,"\\.",1)),

Index(OPERATION,year(ExecutionTIME),month(ExecutionTIME),day(ExecutionTIME),user,AVG(DURATION))
);

Note: You can adapt the schema and the table name to your needs.

Loading Data

Then, you can load data from the CSV files that are saved automatically under history.export:

LOAD DATA INPATH '.../history_csv' INTO TABLE logs_table FORMAT CSV SEPARATOR ',' ;
COMMIT logs_table;

Note: Adapt "logs_table" and "/var/log/indexima/history_csv" to the values that suit your configuration.

If you want to store your history files in AWS S3, use the same s3a path that you set up in the parameter history.export.

LOAD DATA INPATH 's3a://my-bucket/path/to/query/logs' INTO TABLE logs_table FORMAT CSV SEPARATOR ',';
COMMIT logs_table;
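Once loaded, the history can be queried like any other Indexima table. As an illustrative sketch (the column names follow the schema above; adapt them to your own table), the following query aggregates SELECT activity per user and day, matching the index declared in the CREATE TABLE statement:

-- Average duration and query count per user and day, for SELECT statements only
SELECT user,
       year(ExecutionTIME)  AS exec_year,
       month(ExecutionTIME) AS exec_month,
       day(ExecutionTIME)   AS exec_day,
       AVG(DURATION)        AS avg_duration_ms,
       COUNT(*)             AS nb_queries
FROM logs_table
WHERE OPERATION = 'SELECT'
GROUP BY user, year(ExecutionTIME), month(ExecutionTIME), day(ExecutionTIME);

Because the grouping keys and the AVG(DURATION) aggregate mirror the index defined on logs_table, this kind of query can be served directly from that index.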