OBIEE - Aggregation rules in Logical SQL with the GROUP BY and BY clause

> Reporting and Data Access Tools > Oracle Business Intelligence (OBIEE/Siebel Analytics) > OBIEE - BI Server (OBIS|nqsserver|nqs) > OBIEE - Logical Sql

1 - About

This section outlines the rules that the BI Server follows with respect to whether or not a logical query contains a GROUP BY clause and, if a GROUP BY clause is specified, what results you should expect from the query.

OBIEE permit to fired an aggregation on two type of measure :

In OBIEE 11g, its possible to isolate different columns from the GROUP BY/DISTINCT operations such as LOB datatype column by using the SPARSE/DENSE lookup operators.

Aggregation should only be performed on measure columns, with the possible exception of the aggregation COUNT and Count Distinct.

Advertising

3 - Aggregation with a Measure Column

3.1 - How OBIEE choose the level of aggregation for the nested aggregate ?

If you perform explicit aggregation (SUM, AVG, MIN, MAX, or COUNT) on a measure column through a SQL request, you are actually asking for an aggregate of an aggregate.

For these nested aggregates, the BI Server calculates the aggregation based on the following rules:

  • A request for a measure column without an aggregate function defined in a SQL statement is always grouped at the level of the non-aggregate columns (grain)in the SELECT list, regardless of whether the query specifies a GROUP BY clause.
  • If there is no GROUP BY clause specified, the nested aggregate is a grand total of each group determined by all of the non-aggregate columns in the SELECT list.
  • If there is a GROUP BY clause specified, the nested aggregation calculates the total for each group as specified in the GROUP BY clause.

3.2 - Example in Answer

For example, consider the following query, where the column “Amount Sold” is defined in the repository as a measure column with a default aggregation rule of SUM (SUM aggregation rule specified in the Aggregation tab of the Logical Column dialog):

SELECT Calendar."Calendar Year" saw_0, 
       Products."Prod Category" saw_1, 
       "Sales Facts"."Amount Sold" saw_2, 
       SUM("Sales Facts"."Amount Sold") saw_3 
FROM SH 
ORDER BY 
       saw_0, saw_1

To create it, select in the subject area the columns :

  • Calendar.“Calendar Year”,
  • Products.“Prod Category”,
  • “Sales Facts”.“Amount Sold” ( referred in this document as “Amount Sold” )

And create a formula column ( referred in this document as sum(Amount Sold) ) :

SUM("Sales Facts"."Amount Sold")

You will obtain :

This query returns results grouped by year and category : it returns one row for each product and year combination.

The sum calculated for each row in the sum(Amount Sold) column is the grand total sum. It is logically the same query as the following:

SELECT Calendar."Calendar Year" saw_0, 
       Products."Prod Category" saw_1, 
       SUM("Sales Facts"."Amount Sold") saw_2, 
       SUM(SUM("Sales Facts"."Amount Sold") BY) saw_3 
FROM SH 
GROUP BY 
       Calendar."Calendar Year" saw_0, 
       Products."Prod Category" saw_1
ORDER BY 
       saw_0, saw_1

If you change the GROUP BY clause to only group by year, then the sum calculated in the “Amount Sold” column is the sum of each product for the year, and the sum calculated in the SUM(“Amount Sold”) column is total sales of all products for the given year, as follows:

SELECT 
   Calendar."Calendar Year" saw_0, 
   Products."Prod Category" saw_1, 
   "Sales Facts"."Amount Sold" saw_2, 
   SUM("Sales Facts"."Amount Sold") saw_3 
FROM SH 
GROUP BY 
   Calendar."Calendar Year"  
ORDER BY saw_0, saw_1

After setting the group by year in the advanced tab :

You will obtain :

In this result set, the sum calculated for each row in the “Amount Sold” column is the sum of all the sales for that product in that year because the measure column is always at the level defined by the nonaggregation columns in the query.

The SUM(“Amount Sold”) is the same for each row corresponding to a given year, and that sum represents the total sales for that year.

Advertising

4 - Aggregation Level based on the Baseline Columns (GROUP BY)

4.1 - How OBIEE choose the level of aggregation ?

If you perform an aggregation on a baseline column through a SQL request, the Analytics Server calculates the aggregation grain at the level based on the following rules:

  • If there is no GROUP BY clause specified, the level of aggregation is grouped by all of the nonaggregate columns in the SELECT list.
  • If there is a GROUP BY clause specified, the level of aggregation is based on the columns specified in the GROUP BY clause.

4.2 - Example in Answer

For example, consider the following query from the Oracle Database - Sample Schemas, where the column year and category name are defined in the repository as a baseline column (no aggregation rules specified in the Logical Column > Aggregation tab).

You choose in Answer three columns :

  • Fiscal Year,
  • Category Name
  • and Amount Sold

Create from the Amount Sold Column, a formula column as :

And you will obtain this SQL in the Advanced Tab :

SELECT Fiscal."Fiscal Year" saw_0, Products."Prod Category" saw_1, SUM("Sales Facts"."Amount Sold") saw_2 
FROM SH 
ORDER BY saw_0, saw_1

Which give this result :

This query returns results grouped by year and category name; that is, it returns one row for each category name and year combination. The sum calculated for each row is the sum of all the sales for that category in that year. It is logically the same query as the following :

(where saw_0 and saw_1 correspond respectively to Fiscal Year and Category)

SELECT 
   Fiscal."Fiscal Year" saw_0, 
   Products."Prod Category" saw_1, 
   SUM("Sales Facts"."Amount Sold") saw_2 
FROM SH 
GROUP BY 
   Fiscal."Fiscal Year" saw_0, 
   Products."Prod Category" saw_1, 
ORDER BY saw_0, saw_1

If you change the GROUP BY clause to only group by year, then the sum calculated is the sum of all products for the year.

In this query result set, the sum of revenue is the same for each row corresponding to a given year, and that sum represents the total sales for that year.

SELECT 
  Fiscal."Fiscal Year" saw_0, 
  Products."Prod Category" saw_1, 
  SUM("Sales Facts"."Amount Sold") saw_2 
FROM SH 
  GROUP BY Fiscal."Fiscal Year"  
  ORDER BY saw_0, saw_1

You can set the GROUP BY clause in the Advanced Tab.

Advertising

5 - Alternative Syntax : the BY clause

When using an aggregate function, you can calculate a specified level of aggregation using BY within the aggregate function. If you do this, you do not need a GROUP BY clause.

Create a formula column with this statement :

SUM("Sales Facts"."Amount Sold" BY Calendar."Calendar Year")

Answer will therefore create this query :

SELECT 
   Calendar."Calendar Year" saw_0, 
   Products."Prod Category" saw_1, 
   "Sales Facts"."Amount Sold" saw_2, 
   SUM("Sales Facts"."Amount Sold" BY Calendar."Calendar Year") saw_3 
FROM 
   SH 
ORDER BY 
   saw_0, saw_1

and it will return the amount sold aggregated by year.

You must have the BY column in the answer, otherwise, you can have unexpected results. In our case, the column year belong to the answer.

6 - Complex aggregation

Complex aggregation gives wrong result case because of post-aggregation of pre-aggregation rules. This is in most cast a behavior by design and not a bug.

Customer is using complex aggregation as well as evaluation IFNULLS in columnformula. BI has set of rules, as said earlier to for post/pre-aggregationsand BI pushes IFNULL post-aggregation; hence we first compute the aggregation value and then apply IFNULL. So the percent you expect is not after applying the NULL evaluation, instead the aggregation is computed first and later evaluated for NULL. Oracle Development recommends push the IFNULL evaluation to the actual logical table source (column). By setting the IFNULL evaluation in the LTS, you will not need to evaluate for IFNULL in the BIEE presentation layer (via formula in column) and should get consistent results.

7 - Grand-Totals

Most Pivot calculations are done in the BI Server memory and are not pushed back to the database (though we can control where it is executed to an extent).

It is very much possible that Pivot tables can generate physical SQL for calculating the Grand-Totals. This is to an extent controlled by the “Report Based Total (when applicable)” setting. Unfortunately we cannot exactly say when BI Server will switch between temp file based sub-totals to another physical SQL when the above setting is turned on. But in most cases, we can control whats happening through a combination of repository design and report level aggregations.

8 - Support

8.1 - nQSError: 42020 - GROUP BY clause is inconsistent

Not good :

SELECT Products."Prod Category", Products."Prod Subcategory", "Sales Facts"."Amount Sold"  
FROM Products, "Sales Facts" 
GROUP BY Products."Prod Category",  "Sales Facts"."Amount Sold"
State: HY000. Code: 10058. [NQODBC] [SQL_STATE: HY000] [nQSError: 10058] A general error has occurred. 
[nQSError: 42020] GROUP BY clause, [Products.Prod Category, Products.Prod Subcategory], is inconsistent 
with the projected columns in the SQL query request. (HY000)
SQL Issued: SELECT Products."Prod Category", Products."Prod Subcategory", "Sales Facts"."Amount Sold" 
FROM Products, "Sales Facts" GROUP BY Products."Prod Category", Products."Prod Subcategory"

When you define a GROUP BY clause on a baseline column, you have to add the aggregation (sum, …) clause to the measure as below

SELECT Products."Prod Category", Products."Prod Subcategory", SUM("Sales Facts"."Amount Sold")  
FROM Products, "Sales Facts" 
GROUP BY Products."Prod Category", Products."Prod Subcategory"

8.2 - Error getting drill information

In general, it means that the logical sql is not good. In the case below, the solution is in the above section.

Error getting drill information: SELECT Fiscal."Fiscal Year" saw_0, Products."Prod Category" saw_1, 
"Sales Facts"."Amount Sold" saw_2 FROM SH GROUP BY Fiscal."Fiscal Year" saw_0

Error Details
Error Codes: YQCO4T56:OPR4ONWY:U9IM8TAC:OI2DL65P
Odbc driver returned an error (SQLExecDirectW).
State: HY000. Code: 10058. [NQODBC] [SQL_STATE: HY000] [nQSError: 10058] A general error has occurred.
[nQSError: 27002] Near <saw_0>: Syntax error [nQSError: 26012] . (HY000)
SQL Issued: {call NQSGetLevelDrillability('SELECT Fiscal."Fiscal Year" saw_0, Products."Prod Category" saw_1,
"Sales Facts"."Amount Sold" saw_2 FROM SH GROUP BY Fiscal."Fiscal Year" saw_0')}

8.3 - Error in Totals

An error in totals might occur (only version 10.1.3.4.1 and 10.1.3.4.2) if both of the following are true:

As a workaround, add the following setting anywhere within the WebConfig element in the instanceconfig.xml file:

<ReportAggregateEnabled>true</ReportAggregateEnabled>.

This change ensures that in the desired REPORT_AGGREGATE aggregation rule is used instead of the server AGGREGATE aggregation rule.

9 - Reference