Using Arcadia Catalog Tables

The following examples demonstrate how to use the catalog tables:

Counting Current Queries

In this example, we show how to count currently-running queries and their duration, and group them by effective user.

> SELECT effective_user, count(1), sum(duration) 
  FROM arcadia_catalog.inflight_queries 
  GROUP BY effective_user;

The result:

+----------------+----------+---------------+
| effective_user | count(1) | sum(duration) |
+----------------+----------+---------------+
| robert         | 1        | 62.805382     |
| chitti         | 2        | 372.454445    |
| john           | 2        | 133.45839     |
+----------------+----------+---------------+

Counting Recent Queries

In this example, we show how to count queries that ran in the previous four (4) hours, and group them by coordinator node.

Note that depending on the number of queries, it may be more appropriate to run queries against a parquet table that has been populated with this data. In this case, the end_time filtering enables the query to perform better.

> SELECT coordinator, count(1) 
  FROM arcadia_catalog.completed_queries end_time > date_sub(now(), interval 4 hour) 
  GROUP BY coordinator;

The result:

+---------------------+----------+
| coordinator         | count(1) |
+---------------------+----------+
| ubuntu-xenial:22001 | 27       |
| ubuntu-xenial:22000 | 2067     |
| ubuntu-xenial:22002 | 99       |
+---------------------+----------+