GCDO()

An acronym for 'Group Concatenate with Distinct and Order', this aggregate function returns a single string produced by:

  1. sorting values for each row of the result set in alphabetic order
  2. removing duplicates
  3. concatenating the results

The separator is a comma, followed by a space.

Syntax

GCDO([ALL] expression)

Returns: String

Usage Notes and Restrictions

  • The gcdo() function operates on columns with datatype STRING. To use it with a numeric column, cast the column value to STRING. See Example 1.
  • By default, gcdo() returns a single string for the entire result set. To include other columns or values in the result set, or to produce multiple concatenated strings for subsets of rows, include a GROUP BY clause in the query. See Example 2.
  • While concat() and concat_ws() concatenate the values of multiple columns within the same row, gcdo() joins together values from different rows.
  • The DISTINCT operator cannot be used in the argument of gcdo().
  • The gcdo() function does not support the OVER clause, and therefore it cannot be used as an analytic function.

Example: GDCO() with Numeric Value Cast to String

You can call the function directly on a STRING column. To use it with a numeric column, first cast the value to STRING.

[localhost:21000] > create table t1 (x int, s string); 
[localhost:21000] > insert into t1 values 
  (1, "one"), (3, "three"), (2, "two"), (1, "one"), (3, "tres"); 
[localhost:21000] > select gcdo(s) from t1;
Query: select gcdo(s) from t1
+-----------------------+
| gcdo(s)               |
+-----------------------+
| one, three, tres, two |
+-----------------------+
[localhost:21000] > select gcdo(cast(x as string)) from t1;
Query: select gcdo(cast(x as string)) from t1
+-------------------------+
| gcdo(CAST(x AS STRING)) |
+-------------------------+
| 1, 2, 3                 |
+-------------------------+
Fetched 1 row(s) in 0.51s

Example: GDCO() with GROUP BY Clause

Including a GROUP BY clause generates a different concatenated result for each group in the result set. In this example, value 'one' occurs repeatedly in string s, so gcdo() removes the duplicate value. The GROUP BY x expression sees two distinct values for '3', and reports them both, alphabetically ordered.

select x, gcdo(s) from t1 group by x;
Query: select x, gcdo(s) from t1 group by x
+---+-------------+
| x | gcdo(s)     |
+---+-------------+
| 2 | two         |
| 1 | one         |
| 3 | three, tres |
+---+-------------+