SQLSERVER排查CPU占用高的情况
SQLSERVER排查CPU占用高的情况今天中午,有朋友叫我帮他看一下数据库,操作系统是Windows2008R2,数据库是SQL2008R2 64位 64G内存,16核CPU 硬件配置还是比较高的,他说服务器运行的是金蝶K3软件,数据库实例里有多个数据库 ? 现象 他说是这几天才出现的,而且在每天的某一个时间段才会出现CPU占用高的情况 内存占用不太高,只占用了30个G CPU占用100% 排查方向 ? 一般排查都是用下面的脚本,一般会用到三个视图sys.sysprocesses,dm_exec_sessions,dm_exec_requests 1 USE master 2 GO 3 --如果要指定数据库就把注释去掉 4 SELECT * FROM sys.[sysprocesses] WHERE [spid]>50 --AND DB_NAME([dbid])='gposdb' 5 SELECT COUNT(*) FROM [sys].[dm_exec_sessions] WHERE [session_id]>50 看一下当前的数据库用户连接有多少 然后使用下面语句看一下各项指标是否正常,是否有阻塞,这个语句选取了前10个最耗CPU时间的会话 1 SELECT TOP 10 2 [session_id], 3 [request_id], 4 [start_time] AS '开始时间', 5 [status] AS '状态', 6 [command] AS '命令', 7 dest.[text] AS 'sql语句', 8 DB_NAME([database_id]) AS '数据库名', 9 [blocking_session_id] AS '正在阻塞其他会话的会话ID',10 [wait_type] AS '等待资源类型',11 [wait_time] AS '等待时间',12 [wait_resource] AS '等待的资源',13 [reads] AS '物理读次数',14 [writes] AS '写次数',15 [logical_reads] AS '逻辑读次数',16 [row_count] AS '返回结果行数' 17 FROM sys.[dm_exec_requests] AS der 18 CROSS APPLY 19 sys.[dm_exec_sql_text](der.[sql_handle]) AS dest 20 WHERE [session_id]>50 AND DB_NAME(der.[database_id])='gposdb' 21 ORDER BY [cpu_time] DESC
1 --在SSMS里选择以文本格式显示结果 2 SELECT TOP 10 3 dest.[text] AS 'sql语句' 4 FROM sys.[dm_exec_requests] AS der 5 CROSS APPLY 6 sys.[dm_exec_sql_text](der.[sql_handle]) AS dest 7 WHERE [session_id]>50 8 ORDER BY [cpu_time] DESC 模拟了一些耗CPU时间的动作 ? 还有查看CPU数和user scheduler数和最大工作线程数,检查worker是否用完也可以排查CPU占用情况 1 --查看CPU数和user scheduler数目 2 SELECT cpu_count,scheduler_count FROM sys.dm_os_sys_info 3 --查看最大工作线程数 4 SELECT max_workers_count FROM sys.dm_os_sys_info 查看机器上的所有schedulers包括user 和system
1 SELECT 2 scheduler_address,3 scheduler_id,4 cpu_id,5 status,6 current_tasks_count,7 current_workers_count,active_workers_count 8 FROM sys.dm_os_schedulers ? 如果SQLSERVER存在要等待的资源,那么执行下面语句就会显示出会话中有多少个worker在等待 结合[sys].[dm_os_wait_stats]视图,如果当前SQLSERVER里面没有任何等待资源,那么下面的SQL语句不会显示任何结果 1 SELECT TOP 10 2 [session_id], 3 [request_id], 4 [start_time] AS '开始时间', 5 [status] AS '状态', 6 [command] AS '命令', 7 dest.[text] AS 'sql语句', 8 DB_NAME([database_id]) AS '数据库名', 9 [blocking_session_id] AS '正在阻塞其他会话的会话ID',10 der.[wait_type] AS '等待资源类型',11 [wait_time] AS '等待时间',12 [wait_resource] AS '等待的资源',13 [dows].[waiting_tasks_count] AS '当前正在进行等待的任务数',14 [reads] AS '物理读次数',15 [writes] AS '写次数',16 [logical_reads] AS '逻辑读次数',17 [row_count] AS '返回结果行数' 18 FROM sys.[dm_exec_requests] AS der 19 INNER JOIN [sys].[dm_os_wait_stats] AS dows 20 ON der.[wait_type]=[dows].[wait_type] 21 CROSS APPLY 22 sys.[dm_exec_sql_text](der.[sql_handle]) AS dest 23 WHERE [session_id]>50 24 ORDER BY [cpu_time] DESC 比如我当前执行了查询SalesOrderDetail_test表100次,由于表数据非常多,所以SSMS需要把SQLSERVER执行的结果慢慢的取走, 造成了ASYNC_NETWORK_IO等待 1 USE [AdventureWorks] 2 GO 3 SELECT * FROM dbo.[SalesOrderDetail_test] 4 GO 100 ? 问题源头 经过排查和这几天的观察情况,确定是某些表缺失索引导致,现在在这些表上增加了索引,问题解决 1 select * from t_AccessControl --权限控制表权限控制 2 select * from t_GroupAccess --用户组权限表用户组权限 3 select * from t_GroupAccessType --用户组权限类表用户组权限类 4 select * from t_ObjectAccess --对象权限表对象权限 5 select * from t_ObjectAccessType --对象权限类型表对象权限类型 6 select * from t_ObjectType --对象类型表对象类型 查询CPU占用高的语句 1 SELECT TOP 10 2 total_worker_time/execution_count AS avg_cpu_cost,plan_handle, 3 execution_count, 4 (SELECT SUBSTRING(text,statement_start_offset/2 + 1, 5 (CASE WHEN statement_end_offset = -1 6 THEN LEN(CONVERT(nvarchar(max),text)) * 2 7 ELSE statement_end_offset 8 END - statement_start_offset)/2) 9 FROM sys.dm_exec_sql_text(sql_handle)) AS query_text 10 FROM sys.dm_exec_query_stats 11 ORDER BY [avg_cpu_cost] DESC 查询缺失索引 1 SELECT 2 DatabaseName = DB_NAME(database_id) 3,[Number Indexes Missing] = count(*) 4 FROM sys.dm_db_missing_index_details 5 GROUP BY DB_NAME(database_id) 6 ORDER BY 2 DESC; 1 SELECT TOP 10 2 [Total Cost] = ROUND(avg_total_user_cost * avg_user_impact * (user_seeks + user_scans),0) 3 ,avg_user_impact 4,TableName = statement 5,[EqualityUsage] = equality_columns 6,[InequalityUsage] = inequality_columns 7,[Include Cloumns] = included_columns 8 FROM sys.dm_db_missing_index_groups g 9 INNER JOIN sys.dm_db_missing_index_group_stats s 10 ON s.group_handle = g.index_group_handle 11 INNER JOIN sys.dm_db_missing_index_details d 12 ON d.index_handle = g.index_handle 13 ORDER BY [Total Cost] DESC; 定位问题后,新建非聚集索引 1 CREATE NONCLUSTERED INDEX IX_t_AccessControl_F4 ON dbo.t_AccessControl 2 ( 3 FObjectType 4 )include([FUserID],[FAccessType],[FAccessMask]) WITH( STATISTICS_NORECOMPUTE = OFF,IGNORE_DUP_KEY = OFF,ALLOW_ROW_LOCKS = ON,ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] 5 GO 6 7 drop index IX_t_AccessControl_F4 on t_AccessControl ?CPU占用恢复正常 跟踪模板和跟踪文件下载,请使用SQL2008R2 版本:files.cnblogs.com/lyhabc/跟踪模板和trace.rar ? 总结 从多次历史经验来看,如果CPU负载持续很高,但内存和IO都还好的话,这种情况下,首先想到的一定是索引问题,十有八九错不了。 注意文章开头贴出的客户机器负载情况图 ? http://www.cnblogs.com/lyhabc/archive/2013/06/12/3133273.html (编辑:李大同) 【声明】本站内容均来自网络,其相关言论仅代表作者个人观点,不代表本站立场。若无意侵犯到您的权利,请及时与联系站长删除相关内容! |