Oracle Database 26ai: How a Missing Histogram Turned a 3-Second Query Into a 4-Minute Crisis Complete Statistics Diagnosis and Fix

On a Wednesday afternoon, a P1 incident was raised. The warehouse dashboard used for tracking active orders was running for over 4 minutes instead of the usual 3 seconds.

There were no application changes. No unusual load. The automated statistics job had completed successfully.

However, the execution plan told a different story.

Oracle was scanning the entire SALES table 220 million rows to return a small subset of data.

The root cause was a missing histogram on a skewed column.

Problem Description

Environment

Oracle Database 26ai
Schema: SCOTT
SALES table: ~220 million rows
ORDERS table: ~50 million rows
Query type: dashboard filtering by ORDER_STATUS and date
Expected runtime: ~3 seconds
Actual runtime: ~4 minutes

The Failing Query

SELECT s.prod_id,
       s.cust_id,
       s.time_id,
       s.channel_id,
       s.amount_sold,
       o.order_status,
       o.order_total
FROM   scott.sales  s
JOIN   scott.orders o ON s.order_id = o.order_id
WHERE  o.order_status IN ('PENDING','PROCESSING')
AND    s.time_id >= ADD_MONTHS(TRUNC(SYSDATE, 'MM'), -1)
ORDER BY s.time_id DESC;

Root Cause Analysis

Oracle assumes uniform data distribution unless a histogram exists.

For ORDER_STATUS:

Distinct values = 7
Optimizer assumption = ~14% per value
Actual data = highly skewed

👉 This caused a massive cardinality misestimate.

Diagnosis Using Oracle Views

Step 1: Identify High-Load SQL

SELECT sql_id,
       executions,
       ROUND(elapsed_time/NULLIF(executions,0)/1e6,2) avg_secs,
       buffer_gets/NULLIF(executions,0) avg_buffer_gets,
       is_reoptimizable
FROM   v$sql
WHERE  parsing_schema_name = 'SCOTT'
AND    executions > 0
ORDER BY elapsed_time DESC
FETCH FIRST 5 ROWS ONLY;

Step 2: Check Execution Plan

SELECT *
FROM   TABLE(
  DBMS_XPLAN.DISPLAY_CURSOR(
    'your_sql_id',
    NULL,
    'ALLSTATS LAST +PREDICATE +NOTE'
  )
);

Key Finding

E-Rows: Millions
A-Rows: Thousands

👉 Optimizer estimate was completely wrong

Step 3: Check Column Statistics

SELECT column_name,
       num_distinct,
       density,
       histogram
FROM   dba_tab_col_statistics
WHERE  owner='SCOTT'
AND    table_name='ORDERS'
AND    column_name='ORDER_STATUS';

👉 Histogram = NONE

Step 4: Check Data Distribution

SELECT order_status,
       COUNT(*) cnt
FROM   scott.orders
GROUP  BY order_status
ORDER  BY cnt DESC;

👉 Reveals heavy skew

Step 5: Check Index Clustering Factor

SELECT index_name,
       clustering_factor
FROM   dba_indexes
WHERE  owner='SCOTT'
AND    table_name='ORDERS';

👉 High clustering factor means the index becomes less efficient for large result sets, but it can still be highly efficient for selective queries.

Solution

Fix 1: Create Histogram

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'SCOTT',
    tabname    => 'ORDERS',
    method_opt => 'FOR COLUMNS ORDER_STATUS SIZE 7'
  );
END;
/

Fix 2: Use AUTO Statistics Strategy

BEGIN
  DBMS_STATS.SET_TABLE_PREFS(
    ownname => 'SCOTT',
    tabname => 'ORDERS',
    pname   => 'METHOD_OPT',
    pvalue  => 'FOR ALL COLUMNS SIZE AUTO'
  );
END;
/

Fix 3: Re-execute SQL

👉 A fresh execution will generate a new optimal plan after statistics change.

👉 Avoid shared pool purge in production unless under controlled DBA procedures.

Validation

SELECT *
FROM   TABLE(
  DBMS_XPLAN.DISPLAY_CURSOR(
    'your_sql_id',
    NULL,
    'ALLSTATS LAST'
  )
);

Expected Plan

INDEX RANGE SCAN
TABLE ACCESS BY INDEX ROWID (or LOCAL INDEX ROWID if partitioned)
No full table scan

Performance Improvement

Metric	Before	After
Execution Time	4 minutes	~2 seconds
Logical Reads	Millions	Thousands
Disk Reads	High	Near zero
Access Path	Full Scan	Index Scan

Understanding Histograms

Type	When Used
Frequency	Low NDV columns
Top Frequency	Dominant values
Hybrid	Large NDV
None	Uniform assumption

👉 For skewed columns, histograms are critical

Key DBA Lessons

Always check E-Rows vs A-Rows
Avoid disabling histograms globally
Use AUTO statistics strategy
Monitor IS_REOPTIMIZABLE in V$SQL

Conclusion

A single missing histogram caused a massive performance regression.

The optimizer was not wrong—it was misinformed.

By correcting statistics:

Cardinality became accurate
Execution plan improved
Performance restored

👉 SQL tuning is about understanding optimizer inputs, not guessing.