Creating Joins

This article demonstrates how to create new data joins in the dataset Flight Delays, based on data previously imported into Arcadia from the datafile flights-2014-01.csv. The additional datafiles are airline-id.csv, airport-codes.csv, cancellation-code.csv, and airport-lat-lo ng.csv.

The following steps demonstrate how to make new joins.

  1. On the main navigation bar, click Data.

    The Data view appears, open on the Datasets tab.

  2. Find the dataset in the list of datasets, either by scrolling or by using search, and click on it.

    Dataset side navigation appears, open at Dataset Detail view.

  3. In the side navigation menu, click Data Model.

    Data Model view appears, and shows the name of the only table in the dataset. You may click Show Data to display the data of that table.

    data model, showing sample data for one table
    Data Model for Dataset 'Flight Delays', with a Singe Table and its Sample Data
  4. Click Edit Table Links to edit the data model.

    Editing Table Links
    Editing Table Links
  5. Click the sign on the table representation.

    Adding a Table Join
    Adding a Table Join

    Add Dimension Table modal window appears.

  6. In the Add Dimension Table modal window, make the following selections:

    • In the Database Name selector, choose the data source.

      This value is pre-populated to match the dataset's existing table, but it may be changed.

    • In the Table Name selector, choose the table name airline_id.

      This value is pre-populated to match the dataset's existing table, but it may be changed.

    • Click Select.
    Selecting a table for a join
    Selecting a Table to Join

    Edit Column Connections modal window appears.

  7. In the Edit Column Connections modal window, edit Source Column and/or Target Column values by selecting from the list of possible fields.

    These values are pre-populated by default when there is a natural match between the two tables (like identical field name and value types), but they may be changed.

    • On the left side, select the field AIRLINE_ID. On the right side, select the field code.
    • [Optional] Click Add Join Pair to add another column connection between the same two tables.
    • [Optional] Click icon (minus) next to an existing join definition to remove that column connection between the two tables.
    • [Optional] Click Add Join Expression to add a conditional join between the same two tables.
    • [Optional] Click icon (minus) next to an existing join expression to remove that column connection between the two tables.
    • [Optional] Click sample data to view some data in the Source Column and/or Target Column. Click again to hide sample data.
    • [Optional] Click Clear Fields to clear all already defined joins between the two tables.
    • Click Apply.
    Choosing the join key
    Specifying the Join Key
  8. Repeat the previous three steps for the remaining tables:

    • The table airport_codes has two joins, for source column ORIGIN = target column code, and source column DEST = target column code.
    • The table cancellation_code has a join for source column CANCELLATION_CODE = target column code.
    • The table airport_lat_long has two joins, for source column ORIGIN = target column locationid, and source column DEST = target column locationid.

    Click Save.

    Data model with 4 joins to the primary table
    Data Model with Four Distinct Joins
  9. [Optional] Click icon (link) to make further edits to column connections or to change join type.