Metadata Upgrade and Migrating Analytical Views

The total time needed to load all metadata depends on the number of databases and tables, and on the number of files that make up those tables. For systems with a large number of tables and files, we have previously observed metadata load times as high as two hours. Arcadia Enterprise Release 4.1.1.0 changed the startup process to improve performance significantly: the metadata for the same tables now loads in under 30 seconds.

To realize this improvement, you must migrate the existing analytical views after upgrading to Arcadia Enterprise Release 4.1.1.0.

Immediately after you upgrade to the current release (4.1.1.0), catalogd automatically starts with the following option:

--load_catalog_in_background=false

If this is your first installation of Arcadia Enterprise, the following steps are not necessary. Otherwise, complete them in order:

  1. Restart catalogd manually with the option set to true, to load the new representation of metadata for all objects and migrate existing analytical views:

    --load_catalog_in_background=true

    To ensure that all the metadata loads before you proceed, monitor catalogd.INFO. When catalogd starts, it logs messages to this file. After metadata loading completes, catalogd does not write to catalogd.INFO again until the system receives a DDL statement.

    After metadata loading completes, the contents of catalogd.INFO should look something like this:

    I0720 19:51:04.566013 18594 HdfsTable.java:339] load block md for tab2_av_stale file 41471208ed374ca8-8e4dccc4bfc0f9ae_885057525_data.0.parq
    ...
    I0720 19:51:04.640791 18603 HdfsTable.java:1041] load table from Hive Metastore: xdb_1.traditional_phones
    ...
    I0720 19:51:04.642539 18616 Table.java:167] Loading column stats for table: tab3_av_invalid
    ...
    I0720 19:51:05.822299 18592 catalog-server.cc:313] Publishing update: TABLE:xdb_1.tab2_av_unusable@815
    ...
    I0720 19:51:05.822319 18592 catalog-server.cc:313] Publishing update: TABLE:xdb_1.h@800
    ...
    I0720 19:51:05.822371 18592 catalog-server.cc:313] Publishing update: CATALOG:aef12bf8b40b4b2c:aaec9d243d25591f@829
    ...
    I0720 20:01:02.110677 18602 catalog-server.cc:229] Catalog Version: 830 Last Catalog Version: 830

    The last two lines often signal that metadata has been loaded successfully.

  2. When catalogd.INFO is quiet, restart catalogd manually with the option set to false. This step ensures faster startup times post-migration:

    --load_catalog_in_background=false
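
The "wait until catalogd.INFO is quiet" check in the steps above can be scripted. The following is a minimal Python sketch, not part of the product. It makes two assumptions of its own: that a "Catalog Version" line with two equal numbers, as in the sample log, indicates the catalog has settled, and that 300 seconds without a write counts as "quiet". Both the function names and the threshold are hypothetical; adjust them for your environment.

```python
import os
import re
import time

# Matches the "Catalog Version" line format shown in the sample log.
VERSION_RE = re.compile(r"Catalog Version: (\d+) Last Catalog Version: (\d+)")


def versions_settled(line):
    """Return True if the line reports equal current and last catalog versions."""
    match = VERSION_RE.search(line)
    return bool(match) and match.group(1) == match.group(2)


def seconds_since_last_write(path, now=None):
    """Return how long ago the file was last modified, in seconds."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(path)


def looks_loaded(path, quiet_seconds=300):
    """Heuristic only: the last version line shows equal versions and the
    log has not been written to for at least quiet_seconds (an assumed
    threshold, not a documented value)."""
    with open(path, "r", errors="replace") as log:
        version_lines = [line for line in log if VERSION_RE.search(line)]
    if not version_lines:
        return False
    return (versions_settled(version_lines[-1])
            and seconds_since_last_write(path) >= quiet_seconds)
```

For example, calling `looks_loaded` with the path to your catalogd.INFO file in a polling loop, and proceeding to step 2 only once it returns True, approximates the manual check. Because the log lines only often signal completion, treat the result as a hint and confirm by inspecting the file.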