It would be nice to have an easy way to perform a SELECT DISTINCT on a source table.
Currently, in order to achieve this, there are three options as given by Thomas D.:
You have 3 options.
1: Create a custom view with the 'select distinct' statement.
2: Define Col1 and Col2 as primary key fields. Then the data cleansing will move all dups into 'errors'.
3: Define an aggregation on a field and group by col1 and col2 by using the standard aggregation feature on the DWH table. Right click table on DWH, Add aggregation, Sum or Count a dummy-field and leave col1 and col2 unaggregated (=will be used as group by clause).
Perhaps you could create table-level transformations or have some way of adding a DISTINCT without having to create views, primary keys, or aggregations.
See attachment for an example.
Distinct Values Example.xlsx