Oracle Database - Index Scans

About

The Index Scans is a access path used by the query optimizer to produce the best Sql Engine - (Physical|Execution) Plan.

In this method, a row is retrieved by traversing the index, using the indexed column values specified by the statement. An index scan retrieves data from an index based on the value of one or more columns in the index. To perform an index scan, Oracle searches the index for the indexed column values accessed by the statement. If the statement accesses only columns of the index, then Oracle reads the indexed column values directly from the index, rather than from the table.

The index contains :

not only the indexed value,
but also the rowids of rows in the table having that value.

Therefore, if the statement accesses other columns in addition to the indexed columns, then Oracle can find the rows in the table by using either a table access by rowid or a Cluster Scan/Access.

An index scan can be one of the following types:

assessing_io_for_blocks_not_rows
index_unique_scans
index_range_scans
index_range_scans_descending
index_skip_scans
full_scans
fast_full_index_scans
index_joins
bitmap_indexes

Assessing I/O for Blocks, not Rows

Oracle does I/O by blocks. Therefore, the optimizer's decision to use full table scans is influenced by the percentage of blocks accessed, not rows. This is called the index clustering factor. If blocks contain single rows, then rows accessed and blocks accessed are the same.

However, most tables have multiple rows in each block. Consequently, the desired number of rows could be clustered together in a few blocks, or they could be spread out over a larger number of blocks.

Although the index clustering factor is a property of the index, the index clustering factor actually relates to the spread of similar indexed column values within data blocks in the table.

A lower clustering factor indicates that the individual rows are concentrated within fewer blocks in the table. Conversely, a high clustering factor indicates that the individual rows are scattered more randomly across blocks in the table. Therefore, a high clustering factor means that it costs more to use a range scan to fetch rows by rowid, because more blocks in the table need to be visited to return the data.

How the clustering factor can affect cost.

Example Effects of Clustering Factor on Cost

Assume the following situation:

There is a table with 9 rows.
There is a non-unique index on col1 for table.
The c1 column currently stores the values A, B, and C.
The table only has three Oracle blocks.

Case 1: The index clustering factor is low for the rows as they are arranged in the following diagram.

Block 1     Block 2       Block 3 
-------     -------       -------- 
A  A  A     B  B  B       C  C  C

This is because the rows that have the same indexed column values for c1 are located within the same physical blocks in the table. The cost of using a range scan to return all of the rows that have the value A is low, because only one block in the table needs to be read.

Case 2: If the same rows in the table are rearranged so that the index values are scattered across the table blocks (rather than collocated), then the index clustering factor is higher.

Block 1       Block 2        Block 3 
-------       -------        --------
A  B  C       A  B  C        A  B  C

This is because all three blocks in the table must be read in order to retrieve all rows with the value A in col1.

Index Unique Scans

This scan returns, at most, a single rowid. Oracle performs a unique scan if a statement contains a UNIQUE or a PRIMARY KEY constraint that guarantees that only a single row is accessed.

Example :

-----------------------------------------------------------------------------------
| Id  | Operation                     |  Name        | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |              |     3 |   189 |    10  (10)|
|   1 |  NESTED LOOPS                 |              |     3 |   189 |    10  (10)|
|   2 |   NESTED LOOPS                |              |     3 |   141 |     7  (15)|
|*  3 |    TABLE ACCESS FULL          | EMPLOYEES    |     3 |    60 |     4  (25)|
|   4 |    TABLE ACCESS BY INDEX ROWID| JOBS         |    19 |   513 |     2  (50)|
|*  5 |     INDEX UNIQUE SCAN         | JOB_ID_PK    |     1 |       |            |
|   6 |   TABLE ACCESS BY INDEX ROWID | DEPARTMENTS  |    27 |   432 |     2  (50)|
|*  7 |    INDEX UNIQUE SCAN          | DEPT_ID_PK   |     1 |       |            |
-----------------------------------------------------------------------------------

An index scan is performed on the jobs and departments tables, using the job_id_pk and dept_id_pk indexes respectively.

When the Optimizer Uses Index Unique Scans

This access path is used when all columns of a unique (B-tree) index or an index created as a result of a primary key constraint are specified with equality conditions.

Index Unique Scan Hints

In general, you should not need to use a hint to do a unique scan. There might be cases where the table is across a database link and being accessed from a local table, or where the table is small enough for the optimizer to prefer a full table scan.

The hint INDEX(alias index_name) specifies the index to use, but not an access path (range scan or unique scan).

Index Range Scans

An index range scan is a common operation for accessing selective data. It can be bounded (bounded on both sides) or unbounded (on one or both sides). Data is returned in the ascending order of index columns. Multiple rows with identical values are sorted in ascending order by rowid.

If data must be sorted by order, then use the ORDER BY clause, and do not rely on an index. If an index can be used to satisfy an ORDER BY clause, then the optimizer uses this option and avoids a sort.

Example

In this example, the order has been imported from a legacy system, and you are querying the order by the reference used in the legacy system. Assume this reference is the order_date.

SELECT order_status, order_id
  FROM orders
 WHERE order_date = :b1;

---------------------------------------------------------------------------------------
| Id  | Operation                   |  Name              | Rows  | Bytes | Cost (%CPU)|
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                    |     1 |    20 |     3  (34)|
|   1 |  TABLE ACCESS BY INDEX ROWID| ORDERS             |     1 |    20 |     3  (34)|
|*  2 |   INDEX RANGE SCAN          | ORD_ORDER_DATE_IX  |     1 |       |     2  (50)|
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ORDERS"."ORDER_DATE"=:Z)

This should be a highly selective query, and you should see the query using the index on the column to retrieve the desired rows. The data returned is sorted in ascending order by the rowids for the order_date. Because the index column order_date is identical for the selected rows here, the data is sorted by rowid.

When the Optimizer Uses Index Range Scans

The optimizer uses a range scan when it finds one or more leading columns of an index specified in conditions, such as the following:

col1 = :b1
col1 < :b1
col1 > :b1
AND combination of the preceding conditions for leading columns in the index
col1 like 'ASD%' wild-card searches should not be in a leading position otherwise the condition col1 like '%ASD' does not result in a range scan.

Range scans can use unique or non-unique indexes. Range scans avoid sorting when index columns constitute the ORDER BY/GROUP BY clause.

Index Range Scan Hints

A hint might be required if the optimizer chooses some other index or uses a full table scan. The hint INDEX(table_alias index_name) instructs the optimizer to use a specific index. For more information on the INDEX hint, see “Hints for Access Paths”.

Index Range Scans Descending

An index range scan descending is identical to an index range scan, except that the data is returned in descending order. Indexes, by default, are stored in ascending order. Usually, this scan is used when ordering data in a descending order to return the most recent data first, or when seeking a value less than a specified value.

When the Optimizer Uses Index Range Scans Descending

The optimizer uses index range scan descending when an order by descending clause can be satisfied by an index.

Index Range Scan Descending Hints

The hint INDEX_DESC(table_alias index_name) is used for this access path.

Index Skip Scans

Index skip scans improve index scans by nonprefix columns. Often, scanning index blocks is faster than scanning table data blocks.

Skip scanning lets a composite index be split logically into smaller subindexes. In skip scanning, the initial column of the composite index is not specified in the query. In other words, it is skipped.

The number of logical subindexes is determined by the number of distinct values in the initial column. Skip scanning is advantageous if there are few distinct values in the leading column of the composite index and many distinct values in the nonleading key of the index.

Example

Consider, for example, a table employees (sex, employee_id, address) with a composite index on (sex, employee_id). Splitting this composite index would result in two logical subindexes, one for M and one for F.

For this example, suppose you have the following index data:

('F',98)
('F',100)
('F',102)
('F',104)
('M',101)
('M',103)
('M',105)

The index is split logically into the following two subindexes:

The first subindex has the keys with the value F.
The second subindex has the keys with the value M.

The column sex is skipped in the following query:

SELECT * 
   FROM employees 
WHERE employee_id = 101;

A complete scan of the index is not performed, but the subindex with the value F is searched first, followed by a search of the subindex with the value M.

Full Scans

A full index scan eliminates a sort operation, because the data is ordered by the index key. It reads the blocks singly. A full scan is used in any of the following situations:

An ORDER BY clause that meets the following requirements is present in the query:
- All of the columns in the ORDER BY clause must be in the index.
- The order of the columns in the ORDER BY clause must match the order of the leading index columns.

The ORDER BY clause can contain all of the columns in the index or a subset of the columns in the index.

The query requires a sort merge join. A full index scan can be done instead of doing a full table scan followed by a sort if the query meets the following requirements:
- All of the columns referenced in the query must be in the index.
- The order of the columns referenced in the query must match the order of the leading index columns.

The query can contain all of the columns in the index or a subset of the columns in the index.

A GROUP BY clause is present in the query, and the columns in the GROUP BY clause are present in the index. The columns do not need to be in the same order in the index and the GROUP BY clause. The GROUP BY clause can contain all of the columns in the index or a subset of the columns in the index.

Fast Full Index Scans

Fast full index scans are an alternative to a full table scan when the index contains all the columns that are needed for the query, and at least one column in the index key has the NOT NULL constraint. A fast full scan accesses the data in the index itself, without accessing the table. It cannot be used to eliminate a sort operation, because the data is not ordered by the index key. It reads the entire index using multiblock reads, unlike a full index scan, and can be parallelized.

Fast full index scans cannot be performed against bitmap indexes.

A fast full scan is faster than a normal full index scan in that it can :

use multiblock I/O
and be parallelized just like a table scan.

Note: Setting PARALLEL for indexes will not impact the cost calculation.

You can specify fast full index scans with the initialization parameter OPTIMIZER_FEATURES_ENABLE or the INDEX_FFS hint.

Index Joins

An index join is a hash join of several indexes that together contain all the table columns that are referenced in the query. If an index join is used, then no table access is needed, because all the relevant column values can be retrieved from the indexes. An index join cannot be used to eliminate a sort operation.

You can specify an index join with the INDEX_JOIN hint.

Bitmap Indexes

A bitmap join uses a bitmap for key values and a mapping function that converts each bit position to a rowid. Bitmaps can efficiently merge indexes that correspond to several conditions in a WHERE clause, using Boolean operations to resolve AND and OR conditions.