Identifying Top Wait Events in SQL Server
Introduction
Identifying Top Wait Events
Actionable Insights
Conclusion
Introduction
In the world of managing databases, making them run faster and more efficiently is a never-ending goal. For people who manage SQL Server systems,
known as database administrators (DBAs), understanding and handling wait events is incredibly important. Wait events are basically periods when a
session in SQL Server is doing nothing, just waiting for something specific to happen before it can get back to work.
Understanding these wait events is key to figuring out where the problems are in SQL Server and how to make things better. That's why learning how
to use SQL queries to dig into these wait events is really valuable. In this blog post, we'll talk about how to do just that—how to use SQL queries
to find out important stuff about the top wait events.
Identifying Top Wait Events
Now, let's interpret the key components of the query results:
- Wait Type: This column identifies the type of wait event, such as I/O, CPU, or network-related waits.
- Wait Time: Indicates the total time spent waiting for the specified event to occur.
- Signal Wait Time:Represents the time spent waiting for the CPU after the event has occurred.
- Percentage of Signal Waits:Provides insights into the proportion of signal waits relative to total waits. A higher percentage may indicate CPU contention issues.
- Number of Waits: Indicates the frequency of occurrence for each wait event.
- Average Time per Wait:Calculated by dividing total wait time by the number of waits, offering insights into the average duration of each wait event
with wait_stats as
(
select
server_utc_start_time,
sample_utc_time,
x.sample_time,
x.wait_type,
sum(x.sum_wait_time_ms) as wait_time_ms,
sum(x.sum_signal_wait_time_ms) as signal_wait_time_ms,
sum(x.sum_waiting_tasks) as waiting_tasks
from (
select
(select dateadd(hh, datediff(hh, getdate(), getutcdate()), sqlserver_start_time) from sys.dm_os_sys_info) as server_utc_start_time,
getutcdate() as sample_utc_time,
sysdatetimeoffset() as sample_time,
owt.wait_type,
sum(owt.wait_duration_ms) over (partition by owt.wait_type, owt.session_id) as sum_wait_time_ms,
0 as sum_signal_wait_time_ms,
0 as sum_waiting_tasks
from
sys.dm_os_waiting_tasks owt
where
owt.session_id > 50
and
owt.wait_duration_ms >= 0
union all
select
(select dateadd(hh, datediff(hh, getdate(), getutcdate()), sqlserver_start_time) from sys.dm_os_sys_info) as server_utc_start_time,
getutcdate() as sample_utc_time,
sysdatetimeoffset() as sample_time,
os.wait_type,
sum(os.wait_time_ms) over (partition by os.wait_type) as sum_wait_time_ms,
sum(os.signal_wait_time_ms) over (partition by os.wait_type) as sum_signal_wait_time_ms,
sum(os.waiting_tasks_count) over (partition by os.wait_type) as sum_waiting_tasks
from
sys.dm_os_wait_stats os
) x
group by
server_utc_start_time,
sample_utc_time,
x.sample_time,
x.wait_type
)
select
'WAIT_STATS' as wait_stats,
waits.server_utc_start_time as server_utc_startup_time,
waits.sample_utc_time as sample_utc_time,
waits.sample_time as sample_time,
cast(datediff(mi, waits.server_utc_start_time, waits.sample_time) / 60.0 as decimal(18,1)) as hours_sample,
waits.wait_type as wait_type,
cast(c.wait_time_seconds / 60.0 / 60 as decimal(18,1)) as wait_time_hours,
cast(waits.wait_time_ms / 1000.0 / cores.cpu_count / datediff(ss, waits.server_utc_start_time, waits.sample_time) as decimal(18,1)) AS per_core_per_hour,
cast(c.signal_wait_time_seconds / 60.0 / 60 as decimal(18,1)) as signal_wait_time_hours,
case when c.wait_time_seconds > 0 then
cast(100.*(c.signal_wait_time_seconds / c.wait_time_seconds) as numeric(4,1))
else
0
end as percent_signal_waits,
waits.waiting_tasks as number_of_waits,
case when waits.waiting_tasks > 0 then
cast(waits.wait_time_ms / (1.0 * (waits.waiting_tasks)) as numeric(12,1))
else
0
end as avg_ms_per_wait
from
wait_stats waits
cross apply
(
select sum(1) as cpu_count
from sys.dm_os_schedulers
where status = 'VISIBLE ONLINE'
and is_online = 1
) as cores
cross apply
(
select
cast(waits.wait_time_ms/1000. as numeric(12,1)) as wait_time_seconds,
cast(waits.signal_wait_time_ms/1000. as numeric(12,1)) as signal_wait_time_seconds
) as c
where
waits.waiting_tasks > 0
and
waits.wait_time_ms > 0
and
waits.wait_type not in (
'BROKER_EVENTHANDLER', 'BROKER_RECEIVE_WAITFOR', 'BROKER_TASK_STOP',
'BROKER_TO_FLUSH', 'BROKER_TRANSMITTER', 'CHECKPOINT_QUEUE',
'CLR_AUTO_EVENT', 'CLR_MANUAL_EVENT', 'CLR_SEMAPHORE',
'DBMIRROR_DBM_EVENT', 'DBMIRROR_DBM_MUTEX', 'DBMIRROR_EVENTS_QUEUE',
'DBMIRROR_WORKER_QUEUE', 'DBMIRRORING_CMD', 'DIRTY_PAGE_POLL',
'DISPATCHER_QUEUE_SEMAPHORE', 'FT_IFTS_SCHEDULER_IDLE_WAIT',
'FT_IFTSHC_MUTEX', 'FT_IFTSISM_MUTEX', 'HADR_CLUSAPI_CALL',
'HADR_FABRIC_CALLBACK', 'HADR_FILESTREAM_IOMGR_IOCOMPLETION',
'HADR_LOGCAPTURE_WAIT', 'HADR_NOTIFICATION_DEQUEUE',
'HADR_TIMER_TASK', 'HADR_WORK_QUEUE', 'LAZYWRITER_SLEEP',
'LOGMGR_QUEUE', 'ONDEMAND_TASK_QUEUE', 'PARALLEL_REDO_DRAIN_WORKER',
'PARALLEL_REDO_LOG_CACHE', 'PARALLEL_REDO_TRAN_LIST',
'PARALLEL_REDO_TRAN_TURN', 'PARALLEL_REDO_WORKER_SYNC',
'PARALLEL_REDO_WORKER_WAIT_WORK', 'PREEMPTIVE_HADR_LEASE_MECHANISM',
'PREEMPTIVE_SP_SERVER_DIAGNOSTICS', 'PREEMPTIVE_XE_DISPATCHER',
'QDS_ASYNC_QUEUE', 'QDS_CLEANUP_STALE_QUERIES_TASK_MAIN_LOOP_SLEEP',
'QDS_PERSIST_TASK_MAIN_LOOP_SLEEP', 'QDS_SHUTDOWN_QUEUE',
'REDO_THREAD_PENDING_WORK', 'REQUEST_FOR_DEADLOCK_SEARCH',
'SLEEP_SYSTEMTASK', 'SLEEP_TASK', 'SOS_WORK_DISPATCHER',
'SP_SERVER_DIAGNOSTICS_SLEEP', 'SQLTRACE_BUFFER_FLUSH',
'SQLTRACE_INCREMENTAL_FLUSH_SLEEP', 'UCS_SESSION_REGISTRATION',
'WAIT_XTP_OFFLINE_CKPT_NEW_LOG', 'WAITFOR', 'XE_DISPATCHER_WAIT',
'XE_LIVE_TARGET_TVF', 'XE_TIMER_EVENT'
)
order by
wait_time_seconds desc;
Actionable Insights
Armed with these insights, DBAs can take targeted actions to optimize SQL Server performance. For instance:
- Identifying Bottlenecks: By focusing on wait events with the highest wait times, DBAs can pinpoint performance bottlenecks and
prioritize optimization efforts accordingly.
- Tuning Resource Allocation: Understanding the distribution of signal waits versus total waits can help DBAs fine-tune resource
allocation, such as CPU resources, to alleviate contention issues.
- Optimizing Queries: Analyzing wait events associated with specific queries can guide query optimization strategies, leading to
improved overall performance.
Conclusion
In the complex landscape of SQL Server performance tuning, analyzing wait events serves as a cornerstone for identifying and addressing performance
bottlenecks. By leveraging SQL queries to extract and interpret wait event statistics, DBAs can gain valuable insights into system behavior and
take proactive measures to enhance performance, ensuring optimal responsiveness and reliability for SQL Server environments.
Related content