Introduction

Sometimes, the data in the base tables cannot be used directly and you must use an expression to change or "correct" it. For other use cases, you may want to create a calculation based on one or more fields. Instead of adding this expression for every visual in your BI tool, you can easily create a new calculated field in the dataset, and subsequently, use it in dashboards and visuals.

It is possible, at the creation table or later, to add columns with calculated fields.

Creating columns with Calculated Fields

Let's assume you have the following table nyc_yellow:

Let create 2 columns:

  • tripduration_minute = Drop off date/time - Pickup Date/time
  • Year_Month = trip pickup DateTime with the format YYYY-MM

Add columns

ALTER TABLE nyc_yellow ADD COLUMNS (tripduration_minute  as  round((unix_timestamp(Trip_Dropoff_DateTime)-unix_timestamp(Trip_Pickup_DateTime))/60) ); 
ALTER TABLE nyc_yellow ADD COLUMNS (Year_Month  as date_format(Trip_Pickup_DateTime,"YYYY-MM"));
SQL

After adding the 2 columns, you can display them:

The 2 columns tripduration_minute  and Year_Month are in fact virtual because the results of such columns are calculated on the fly. Genuine data from the database is not altered.

Creating hyperindex with Calculated Fields

You may include the calculated fields into hyperindexes:

ALTER TABLE nyc_yellow ADD INDEX (Year_Month,AVG(tripduration_minute));
SQL

As a result, the following query:

select Year_Month,Avg(tripduration_minute) AvgDuration from tmp_doc.nyc_yellow;
SQL

can be answered immediately,

by a hyperindex:

Deleting Calculated Fields

A drop command is required to remove columns with calculated fields.

Note that any index involving a calculated field to be deleted must be first deleted.

This is the case for our example.

  1. Delete the index:

    ALTER TABLE nyc_yellow DROP INDEX nyc_yellow_0_1;
    CODE
  2. Delete the 2 calculated fields:

    ALTER TABLE nyc_yellow DROP COLUMN tripduration_minute ; ALTER TABLE nyc_yellow DROP COLUMN Year_Month ;
    CODE