Sometimes, the data in the base tables cannot be used directly and you must use an expression to change or "correct" it. For other use cases, you may want to create a calculation based on one or more fields. Instead of adding this expression for every visual in your BI tool, you can easily create a new calculated field in the dataset, and subsequently, use it in dashboards and visuals.
It is possible, at the creation table or later, to add columns with calculated fields.
Creating columns with Calculated Fields
Let's assume you have the following table nyc_yellow:
Let create 2 columns:
- tripduration_minute = Drop off date/time - Pickup Date/time
- Year_Month = trip pickup DateTime with the format YYYY-MM
ALTER TABLE nyc_yellow ADD COLUMNS (tripduration_minute as round((unix_timestamp(Trip_Dropoff_DateTime)-unix_timestamp(Trip_Pickup_DateTime))/60) ); ALTER TABLE nyc_yellow ADD COLUMNS (Year_Month as date_format(Trip_Pickup_DateTime,"YYYY-MM"));
After adding the 2 columns, you can display them:
The 2 columns
Year_Month are in fact virtual because the results of such columns are calculated on the fly. Genuine data from the database is not altered.
Creating hyperindex with Calculated Fields
You may include the calculated fields into hyperindexes:
ALTER TABLE nyc_yellow ADD INDEX (Year_Month,AVG(tripduration_minute));
As a result, the following query:
select Year_Month,Avg(tripduration_minute) AvgDuration from tmp_doc.nyc_yellow;
can be answered immediately,
by a hyperindex:
Deleting Calculated Fields
A drop command is required to remove columns with calculated fields.
This is the case for our example.
Delete the index:CODE
ALTER TABLE nyc_yellow DROP INDEX nyc_yellow_0_1;
Delete the 2 calculated fields:CODE
ALTER TABLE nyc_yellow DROP COLUMN tripduration_minute ; ALTER TABLE nyc_yellow DROP COLUMN Year_Month ;