Post Installation Tasks for EMR

After deploying or upgrading Arcadia Enterprise on EMR, perform the following tasks:

[Mandatory] Deploy Hive-site on Core Nodes Running ArcEngine

For complete Arcadia Enterprise functionality, you must deploy hive-site.xml on every core node that runs ArcEngine. The /etc/hive/conf/hive-site.xml file must be present at that same location on each core node in the cluster. EMR deployments do not push client configurations to core nodes, so perform this step after the Arcadia Enterprise deployment completes, and then restart all arcengined services.
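
The copy-and-restart step described above could be scripted from the master node roughly as follows. This is a minimal sketch, not an Arcadia-provided tool: the CORE_NODES host list, the DRY_RUN switch, and the assumption of passwordless SSH with sudo rights on each core node are all illustrative. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: copy hive-site.xml to each core node, then restart arcengined there.
# Assumptions (not from the Arcadia documentation): CORE_NODES is a hypothetical
# space-separated host list, and SSH access with sudo rights is available.
set -eu

HIVE_SITE="/etc/hive/conf/hive-site.xml"
CORE_NODES="${CORE_NODES:-}"   # e.g. "core-1 core-2" (hypothetical host names)
DRY_RUN="${DRY_RUN:-1}"        # 1 = only print the commands; 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

deploy_hive_site() {
  for node in $CORE_NODES; do
    # Stage the file in /tmp, then move it into place with sudo.
    run scp "$HIVE_SITE" "$node:/tmp/hive-site.xml"
    run ssh "$node" "sudo mkdir -p /etc/hive/conf && sudo mv /tmp/hive-site.xml $HIVE_SITE"
    # The restart makes ArcEngine pick up the new configuration.
    run ssh "$node" "sudo service arcengined restart"
  done
}

deploy_hive_site
```

Review the dry-run output first, then rerun with DRY_RUN=0 and CORE_NODES set to your actual core node host names.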

You can perform the following service actions on the core nodes:

service arcengined [status|start|stop|restart]

You can perform the following service actions on the master node:

service catalogd [status|start|stop|restart]
service statestored [status|start|stop|restart]
service arcviz [status|start|stop|restart]
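
To check all three master-node services in one pass, a small loop over the status action may help. The sketch below assumes that `service <name> status` exits with 0 when the service is running, which is the usual init-script convention but is not confirmed here for the Arcadia scripts. The SERVICE_CMD variable and the check_services function are hypothetical names introduced for illustration.

```shell
# Sketch: report the status of each master-node service.
# Assumption: "service <name> status" returns exit code 0 when the service runs.
MASTER_SERVICES="catalogd statestored arcviz"
SERVICE_CMD="${SERVICE_CMD:-service}"   # hypothetical override, useful for testing

check_services() {
  failed=0
  for svc in $MASTER_SERVICES; do
    if "$SERVICE_CMD" "$svc" status >/dev/null 2>&1; then
      echo "$svc: running"
    else
      echo "$svc: NOT running"
      failed=1
    fi
  done
  # Non-zero return value means at least one service needs attention.
  return "$failed"
}

check_services || echo "One or more services need attention."
```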

Back Up Local Settings File

Whenever you manually change the Arcadia service init scripts or the settings_local.py file, back up the changed files to a location outside the EMR cluster. When an EMR cluster is terminated, all data on its systems is lost.
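
One way to keep such a copy outside the cluster is to upload the file to S3 with the AWS CLI. The following is a sketch under stated assumptions: the bucket name, the config/ key prefix, and the path passed to backup_file are placeholders, and DRY_RUN is a hypothetical switch that only prints the upload command by default.

```shell
# Sketch: copy a locally modified file to S3 under a timestamped key.
# Assumptions: the bucket and key prefix are hypothetical; the AWS CLI is
# installed and the node has write access to the bucket.
BACKUP_BUCKET="${BACKUP_BUCKET:-s3://example-arcadia-backups}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command; 0 = upload

backup_file() {
  src="$1"
  # A UTC timestamp keeps earlier backups from being overwritten.
  stamp="$(date -u +%Y%m%dT%H%M%SZ)"
  dest="$BACKUP_BUCKET/config/$(basename "$src").$stamp"
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ aws s3 cp $src $dest"
  else
    aws s3 cp "$src" "$dest"
  fi
}

# Placeholder path: substitute the actual location of settings_local.py.
backup_file /path/to/settings_local.py
```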

Arcadia bootstrap.py deploys cron jobs on the EMR cluster for the following action:

Backing up SQLite database

Due to the ephemeral nature of EMR deployments, the default SQLite ArcViz metastore is periodically dumped and backed up to the installation folder in the S3 installation bucket. You can use this dump to recreate the SQLite database when recovering from an EMR cluster failure, or when switching to an EMR deployment with a newer Arcadia installation during an upgrade.
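
Recreating the database from the dump might look like the sketch below. It assumes the backup is a plain SQL text dump of the kind written by `sqlite3 <db> .dump`; the actual dump format and file names used by the cron job are not stated here, so verify them before relying on this. The restore_sqlite function and both file names in the usage comment are hypothetical.

```shell
# Sketch: rebuild a SQLite database from a SQL text dump.
# Assumption: the dump is plain SQL, as written by "sqlite3 <db> .dump".
restore_sqlite() {
  dump_file="$1"
  target_db="$2"
  # Refuse to replay the dump into an existing database file.
  if [ -e "$target_db" ]; then
    echo "refusing to overwrite existing $target_db" >&2
    return 1
  fi
  sqlite3 "$target_db" < "$dump_file"
}

# Usage (placeholder names; download the dump from the S3 installation bucket first):
#   restore_sqlite ./arcviz_dump.sql ./arcviz_restored.db
```

Stop the arcviz service before swapping a restored database file into place, and start it again afterward.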

Connect to an External ArcViz Metastore

We recommend that you connect to an external ArcViz metastore. When you connect a newer version of Arcadia Enterprise on EMR to an external database that already contains an ArcViz metastore, the metastore is automatically upgraded at startup.