During normal operations, it may become necessary to create back-ups of analytical views, and to migrate them from one environment to another. Currently, this is supported through the command line interface.
ArcViz creates several artifacts, including analytical views (aggregates). You may also create analytical views on the command line. Arcadia stores these objects as tables directly in the HDFS, making the metadata available through the Hive metastore.
The following steps demonstrate how to migrate analytical views.
Log into ArcEngine through the beeline in the source system.
Specify the data source.
[localhost:31050] > use tvdata;
Run show [all] analytical views
to retrieve the list of all analytical views.
[localhost:31050] > show analytical views;
The output that appears is similar to the following:
+--------------------------------------------------------------------------------------------------------+ | name |base | select | state | +--------------------------------------------------------------------------------------------------------+ | tvdata.tv_viewing_data_av_1 | tvdata.tv_viewing_data | SELECT hour((TA_0.ts)) '_c0', TA_0. | FRESH | | | | channel channel, TA_0.program progr | | | | | am, sum(1) '_c3', datestring datest | | | | | ring FROM tvdata.tv_viewing_data TA | | | | | _0 GROUP BY 1, 2, 3, 5 | | | tvdata.tv_viewing_data_av_10 | tvdata.tv_viewing_data | SELECT TA_0.channel channel, TA_0.p | UNUSABLE | | | | rogram program, sum(1) '_c2', dates | | | | | tring datestring FROM tvdata.tv_vie | | | | | wing_data TA_0 GROUP BY 1, 2, 4 | | | tvdata.tv_viewing_data_av_11 | tvdata.tv_viewing_data | SELECT TA_0.channel channel, TA_0.p | FRESH | | | | rogram program, sum(1) '_ c2', date | | | | | string datestring FROM tvdata.tv_vi | | | | | ewing_data TA_0 GROUP BY 1, 2, 4 | | +--------------------------------------------------------------------------------------------------------+
Choose the first item in the list, tvdata.tv_viewing_data_av_1
.
Note the SQL SELECT
statement that defines it:
SELECT hour((TA_0.ts)) '_c0', TA_0.channel channel, TA_0.program program, sum(1) '_c3', datestring datestring FROM tvdata.tv_viewing_data TA_0 GROUP BY 1, 2, 3, 5
Run the show
statement for analytical view tvdata.tv_viewing_data_av_1
, and note the partitioning information.
[localhost:31050] > show create table tv_viewing_data_av_1; +--------------------------------------------------------------+ | result | +--------------------------------------------------------------+ | CREATE TABLE tvdata.tv_viewing_data_av_1 ( _c0 INT COMMENT 'SQL Expression: 'hour((TA_0.ts))'', channel STRING COMMENT 'SQL Expression: 'TA_0.channel'', program STRING COMMENT 'SQL Expression: 'TA_0.program'', _c3 BIGINT COMMENT 'SQL Expression: 'sum(1)'' ) PARTITIONED BY ( datestring STRING COMMENT 'SQL Expression: 'datestring'' ) STORED AS PARQUET LOCATION 'hdfs://larry1:8020/apps/hive/warehouse/tvdata.db/tv_viewing_data_av_1'
On the target system, combine the information from the two previous steps to create a new analytical view by issuing the following CREATE ANALYTICAL VIEW
command:
CREATE ANALYTICAL VIEW tv_viewing_data_av_1 PARTITIONED BY (datestring) STORED AS PARQUET AS (SELECT hour((TA_0.ts)) '_c0', TA_0.channel channel, TA_0.program program, sum(1) '_c3', datestring datestring FROM tvdata.tv_viewing_data TA_0 GROUP BY 1, 2, 3, 5)
Although created, the new analytical view is in UNUSABLE
state. Run the refresh
command to materialize the analytical view.
refresh analytical view tv_viewing_data_av_1