Complex Data Types

Arcadia Enterprise understands native complex data type configurations and automatically translates complex structures into intuitive drag-and-drop dataset field configurations. This lets you bypass ETL workflows and avoid unnecessary flattening at query time. It also leverages the native SQL of the underlying engine, which is already optimized for complex types.

Availability Notes:
  • Both Impala and Arcadia connections support the STRUCT, ARRAY, and MAP complex data types.
  • For Hive connections, we limit our support to the STRUCT data type.
Tip. In describing how Arcadia Enterprise supports complex data types, we refer to data and structures adapted from the TPC-H benchmark as a Sample Schema for Complex Types.

Overview of Complex Data Types

By complex data type, we mean one of the following structures:

  • STRUCT

    A group of named fields, where each field can have a different data type, either primitive or complex.

    See STRUCT Data Type.

  • ARRAY

    A list of values that share the same data type.

    See ARRAY Data Type.

  • MAP

    A set of (key,value) pairs.

    See MAP Data Type.

The elements of an ARRAY or a MAP, and the fields of a STRUCT, may be primitive, or they may be other complex types. You can construct elaborate data structures with many levels, such as an ARRAY with STRUCT elements.
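
To make the nesting concrete, here is a minimal Python sketch that models the three complex types with ordinary language constructs: a dataclass stands in for a STRUCT, a list for an ARRAY, and a dict for a MAP. The class names, field names, and values are hypothetical and only loosely inspired by TPC-H style customer and order data; they are not taken from the Sample Schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Order:
    """Stands in for a STRUCT: named fields, each with its own type."""
    order_key: int
    total_price: float
    # A MAP nested inside the STRUCT: a set of (key, value) pairs.
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class Customer:
    name: str
    # An ARRAY with STRUCT elements: every element shares the same type.
    orders: List[Order] = field(default_factory=list)


# Hypothetical sample values, for illustration only.
customer = Customer(
    name="Customer#000000001",
    orders=[
        Order(1, 173665.47, {"priority": "5-LOW"}),
        Order(2, 46929.18, {"priority": "1-URGENT"}),
    ],
)

# Walk down three levels: ARRAY element -> STRUCT field -> MAP value.
first_priority = customer.orders[0].attributes["priority"]

# Aggregate a primitive field across all elements of the ARRAY.
order_total = sum(order.total_price for order in customer.orders)
```

In this sketch, `customer.orders[0].attributes["priority"]` reaches a primitive value through an ARRAY, a STRUCT, and a MAP in turn, which is the kind of multi-level structure described above.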

Appearance of Complex Data Types

Arcadia presents complex types very simply. Users can manipulate the individual component fields of complex types in the same manner as primitive data type fields: place them on the shelves of the visual, use them as a component of an expression or as a filter, and so on. Each level of a complex data type may be expanded to show component details, or collapsed.

Arcadia Enterprise processes complex types as Dimensions, grouping individual elements under the complex type. The components of a complex type cannot be re-assigned to the Measures group even if they logically represent a measurement; however, you can easily use them as measurements on visual shelves, in filters, and so on.

Because we support native SQL processing, you do not have to worry about the complexities of query generation to access these elements.
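
One way to picture how a complex type expands level by level into individual component fields is as a recursive walk over a type description. The sketch below is illustrative only: the `field_paths` function, the dictionary-based type description, and the `item`, `key`, and `value` path segments are assumptions made for this example, and the labels that Arcadia Enterprise actually displays may differ.

```python
def field_paths(name, type_spec):
    """Return a dotted path for every primitive leaf under a field.

    type_spec uses a hypothetical description format:
      "BIGINT"                        a primitive type name
      {"struct": {field: spec, ...}}  a STRUCT
      {"array": spec}                 an ARRAY of elements
      {"map": (key_spec, value_spec)} a MAP
    """
    if isinstance(type_spec, str):
        # Primitive type: this is a leaf field.
        return [name]
    if "struct" in type_spec:
        paths = []
        for field_name, field_spec in type_spec["struct"].items():
            paths.extend(field_paths(f"{name}.{field_name}", field_spec))
        return paths
    if "array" in type_spec:
        # All elements share one type, so describe a single element.
        return field_paths(f"{name}.item", type_spec["array"])
    if "map" in type_spec:
        _key_spec, value_spec = type_spec["map"]
        # Keys are treated as primitive; values may nest further.
        return [f"{name}.key"] + field_paths(f"{name}.value", value_spec)
    raise ValueError(f"unrecognized type description: {type_spec!r}")


# Hypothetical column: an ARRAY of STRUCTs, each holding a nested ARRAY.
orders_spec = {
    "array": {
        "struct": {
            "o_orderkey": "BIGINT",
            "o_lineitems": {
                "array": {"struct": {"l_quantity": "DECIMAL(12,2)"}}
            },
        }
    }
}

paths = field_paths("c_orders", orders_spec)
for path in paths:
    print(path)
```

Each printed path corresponds to one primitive component that could be used on its own, while the intermediate segments correspond to the levels that can be expanded or collapsed.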

Restrictions

Complex data types have the following implementation restrictions:

  • MAP, ARRAY, and STRUCT fields require base tables or partitions that use the Parquet file format.
  • You cannot use fields that contain complex data types as partition keys in a partitioned table.
  • The Compute Statistics operation does not work for complex data types.
  • The column definition for any complex type, including nested types, can have a maximum of 4000 characters.
  • The Import into Dataset feature does not support files that contain complex data types.
  • Analytical Views do not support complex type queries, because complex types require JOIN operations. As a result, visuals that use complex data types do not have valid recommendations. However, if you associate a logical view with the dataset, then the recommendation engine proceeds to generate suggestions, and presents the relevant dataset columns as complex data types.
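
The 4000-character limit on column definitions can be checked before a table is created. The following Python sketch renders a nested type description as a definition string in the `STRUCT<...>`, `ARRAY<...>`, `MAP<...>` notation and compares its length to the limit. The helper names and the description format are hypothetical, and we assume here that the limit applies to the definition text written without extra whitespace; how the limit is actually counted is not specified above.

```python
MAX_DEFINITION_CHARS = 4000  # limit stated in the restrictions above


def render_type(spec):
    """Render a hypothetical type description as a definition string."""
    if isinstance(spec, str):
        return spec  # primitive type name, for example "STRING"
    if "struct" in spec:
        fields = ",".join(
            f"{name}:{render_type(sub)}" for name, sub in spec["struct"].items()
        )
        return f"STRUCT<{fields}>"
    if "array" in spec:
        return f"ARRAY<{render_type(spec['array'])}>"
    if "map" in spec:
        key_spec, value_spec = spec["map"]
        return f"MAP<{render_type(key_spec)},{render_type(value_spec)}>"
    raise ValueError(f"unrecognized type description: {spec!r}")


def fits_definition_limit(spec):
    """True if the rendered definition stays within the character limit."""
    return len(render_type(spec)) <= MAX_DEFINITION_CHARS


# A small nested type stays well within the limit.
small = {"array": {"struct": {"id": "BIGINT", "tags": {"map": ("STRING", "STRING")}}}}

# A STRUCT with 500 hypothetical fields exceeds it.
wide = {"struct": {f"f{i}": "STRING" for i in range(500)}}

print(render_type(small), fits_definition_limit(small))
print(len(render_type(wide)), fits_definition_limit(wide))
```

A check like this is only a pre-flight estimate; the engine that stores the table remains the authority on whether a definition is accepted.
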
Tip. This linked video demonstrates how to access complex data types such as MAP, STRUCT, and ARRAY directly through the Arcadia Data user interface, without any coding.