Indexima Analyzer
What is the Analyzer
To know what Indexes to create, Indexima comes with a tool called the Analyzer. It will help you make the best decision in terms of indexing, cost efficiency, and overall optimization.
Based on the queries hit on tables, Analyzer suggests Hyperindexes to create. As a user you will decide.
Access the Analyzer Page
Click on the tab Analyzer
of your INDEXMA MONITOR to open the Analyzer
and initiate the query analysis.
Enter the following parameters
Required parameters
Parameter | Description |
---|---|
Select a schema | Schema where the fact table to be analyzed is located |
Select Table | Fact (or limited) table to analyze |
Days | Number of past days to analyze in order to find the most optimum index configuration to keep performance objectives |
Hits | Minimum hits of past queries to suggest the creation of a new hyperindex. Specifying the value of 1 indicates that all queries will be analyzed. |
Additional parameters - Part1
Minimize changes
- Yes: Analyzer will suggest hyperindexes based on the existing hyperindexes in order to optimize non already optimized queries
- No: Analyzer won't consider the existing hyperindexes and will suggest hyperindexes that would optimize all queries.
Prejoins
- Yes: If Queries include joins (left or inner) with a dimension table, the analyzer will suggest precomputing the result of the join query
- No: Suggestions would be made using only fields of the fact (or limited) table
Additional parameters - Part2
When selecting more options
, additional parameters can be entered
Index coefficient (%)
The factor that controls the number of indexes
- When the parameter
analyser.default-merge-policy
is equal to COEF (default value), An index coefficient of 0 will try to minimize the number of indexes, whereas of coefficient of 100 will propose the maximum number of indexes. - When the parameter
analyser.default-merge-policy
is equal to MAX_INDEX, the analyzer will propose no more thananalyser.default-max-expected-indexes
indexes. See galactica-conf for parameters description.
Mandatory columns
List of columns to be present in the hyperindex even if no SELECT hits one of those. Those columns will be included in the suggestion list.
High cardinality
refers to columns with values that are very uncommon or unique. High cardinality column values are typically identification numbers, email addresses, or user names.
An example of a data table column with high cardinality would be a USERS table with a column named USER_ID. Each time a new user is created in the USERS table, a new number would be created in the USER_ID column to identify them uniquely.
Some columns presenting potential high cardinality can be suggested by the INDEXIMA ANALYZER in the High cardinality
field.
It is possible to add or remove columns from this field and run again the analysis to verify the impact of such suggested indexes.
Index specialization
It is possible to prequalify the content of an index in order to include or exclude suggestions of an index containing some typical contents such as SELECT clause, expression with high cardinality.
Specialization | Description |
---|---|
Fully specialized | Add always expressions in the suggested index |
Half specialized | Add expressions in the suggested index when they are in WHERE clauses and exclude those in the SELECT clauses |
Not specialized | Only add expressions in the suggested index if expressions contain high cardinality fields |
Number of authorized ALL column (only for External tables)
Define a maximum threshold of the "ALL" columns in HyperIndexes proposals.
This makes it possible to determine if we want to delegate more or fewer requests who fetch raw data.
For example: By authorizing the maximum of ALL columns, you allow the analyzer to make HyperIndexes with more retention on the underlying database.
Launch Analysis
Click on the Start Analysis
button. After a few second, The INDEXIMA ANALYSER displays indexes to be deleted and indexes to be created in the result area
Understand the suggestions
Index to create are highlighted in green and displays the following information:
- Number of hits by analyzed queries
Indexes to be deleted are highlighted in red and displays the following information:
- The amount of RAM per node recovered if this index is deleted.
- The number of entries/nodes similar to the number of rows concerned per node.
Apply Analyzer suggestions
It is possible to apply an action:
- index by index
- or to select a group of index actions in one batch
When you click on the Apply
button, a batch of SQL commands is sent to the INDEXIMA CLUSTER. INDEXIMA ANALYSER displays the status of the submission with all indexes altered. Click on the See Progress Here
button to jump in the INDEXIMA MONITOR Query tab.
All the index actions submitted by the INDEXIMA ANALYSER are translated into SQL queries :
ALTER TABLE ADD INDEX
(...)- or
ALTER TABLE DROP INDEX IndexName
All users can continue to access the INDEXIMA CLUSTER.
See the progress of HyperIndexes creations & deletions
Advanced Actions (1)
Download SQL
The index actions submitted by the INDEXIMA ANALYSER are translated into SQL queries ALTER TABLE.
This button allows you to download the SQL queries into a sql file before applying the suggestions
Edit HyperIndexes
Why edit Hyperindexes
User may want to adjust indexes by removing or adding columns:
Reasons to remove a column from a proposed hyperindex are multiple:
Cardinality very high; All hyperindex are stored in INDEXIMA CLUSTER node RAM. A high cardinality index can produce very large indexes. Run SHOW MEMORY to control the size of the produced index
Complex expressions or clauses; Such expressions can lead to wrong and inaccurate results. It's an indication that the data model needs to be reworked and data must be flattened or the star schema must be de-normalized to simplify JOINs.
How to edit Hyperindexes
User may want to adjust indexes. In order to do so:
- Slide the
Edit Indexes
toggle - Untick create box
- Click on the
Edit
button
Remove columns by clicking on the cross nearby. Once done, click on the button Update your index in expression
Edit HyperIndexes Define an Index as a main index
Under certain circumstances, you may need to change your main index
. This is also the purpose of the advanced mode.
The main index
is the index that queries use to find columns not present in hyperindex in order to fetch the fine data. Most of the time, this data is located on disks, explaining why the performance of the query is not optimum even if you have fast SSD disks.
After a period of use of the data, running several times the INDEXIMA ANALYSER to optimize the queries execution time, you can feel that a new main index can allow you to continue to improve your performance.
Slide the button Switch to Main Index
and clean when necessary columns of your new future main index. Then click on the button Update your index expression
Advanced Actions (2)
Reset Analytics
The Analyzer uses past queries and a sample to make suggestions. Users may want to delete past queries used for building those suggestions.
Reset Analytics for a table
Use the SQL Command DROP ANALYTICS to delete all the analytics related to a table.
Reset Analytics for all tables
Use the delete button in the Indexima Monitor interface to delete all table analytics.