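Because this performance-tuning project involved many components, ideally everything in the production environment would be monitored while keeping overhead as low as possible. We therefore monitored two kinds of servers, the application servers and the database servers, using nmon as the baseline data source and additionally collecting JVM metrics plus database alert logs and snapshots.
Monitoring is only a means to an end: the basis of performance optimization is the regular, meaningful data extracted from the mass of monitoring logs. Once the data is available, the next step is to analyze it. This article first describes the data that needs to be collected, drawing on my experience from the project.
Base environment: two database servers configured as a database cluster.
The project mainly uses tongweb (the legacy system runs a very old version). The monitoring output looks like this: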
...
"2018-01-11T02:25:55.663+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnCreated","10",
"2018-01-11T02:25:55.663+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnAcquired","111292",
"2018-01-11T02:25:55.663+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnNotSuccessfullyMatched","0",
"2018-01-11T02:26:25.670+0800","com.tongtech.tongweb:type=jvm,category=monitor,server=server","UpTime","222520621",
"2018-01-11T02:26:25.670+0800","com.tongtech.tongweb:type=jvm,category=monitor,server=server","HeapSize","2143485952",
"2018-01-11T02:26:25.671+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnUsed","0",
"2018-01-11T02:26:25.671+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnSuccessfullyMatched","0",
"2018-01-11T02:26:25.671+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","WaitQueueLength","0",
"2018-01-11T02:26:25.671+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnDestroyed","0",
"2018-01-11T02:26:25.671+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","ConnRequestWaitTime","4",
"2018-01-11T02:26:25.672+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnFailedValidation","0",
"2018-01-11T02:26:25.672+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnReleased","111292",
"2018-01-11T02:26:25.672+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnFree","10",
...
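The tongweb monitoring data provides the connection pool state and related information. Our approach was to use an Excel macro to convert the log contents into readable data and then chart it. The details will be covered separately.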
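For readers who prefer a script to a spreadsheet, the same conversion can be sketched in Python. This is not the Excel macro used in the project, only a minimal illustration that assumes each record has the four quoted fields shown in the excerpt above (timestamp, MBean name, attribute, value) followed by a trailing comma. The function name `parse_monitor_log` is made up for this example.

```python
import csv
import io
from collections import defaultdict

# A few lines copied from the excerpt above (the pool name is masked as *** in the source).
SAMPLE = '''"2018-01-11T02:25:55.663+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnCreated","10",
"2018-01-11T02:26:25.670+0800","com.tongtech.tongweb:type=jvm,category=monitor,server=server","HeapSize","2143485952",
"2018-01-11T02:26:25.671+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnUsed","0",
"2018-01-11T02:26:25.672+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnFree","10",
'''


def parse_monitor_log(text):
    """Return {attribute: [(timestamp, value), ...]} for numeric attributes."""
    series = defaultdict(list)
    for row in csv.reader(io.StringIO(text)):
        # The csv module keeps the commas inside the quoted MBean name intact;
        # the comma at the end of each line only adds an empty fifth field.
        if len(row) < 4:
            continue  # "..." markers, blank lines, or truncated records
        timestamp, _mbean, attribute, value = row[:4]
        try:
            series[attribute].append((timestamp, int(value)))
        except ValueError:
            continue  # non-numeric values are ignored in this sketch
    return dict(series)


metrics = parse_monitor_log(SAMPLE)
for name, points in sorted(metrics.items()):
    print(name, points)
```

Each attribute becomes its own time series, which can then be charted with any plotting tool or exported back to a spreadsheet.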
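JVM thread monitoring notes
Monitoring the tongweb JVM gives a first indication of when load peaks and whether the connection pool is full. It also helps determine whether the bottleneck during connection peaks lies in the application. This matters a great deal for later analysis, because the main performance problems can be classified early and unnecessary work avoided.
An Internet RFC describes netstat as a program that accesses network connection state and related information in the kernel; it can report on TCP connections, TCP and UDP listeners, and protocol memory management.
The following is an excerpt from the logs collected in the project: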
...
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:2049 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:139 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:427 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:427 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:58862 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:2544 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:21 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:631 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:445 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:669 0.0.0.0:* LISTEN
...
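To turn netstat output like the excerpt above into numbers that can be tracked over time, one option is to count connections per state. The sketch below is only an illustration: it assumes the six-column TCP layout shown in the excerpt, and the helper names `count_tcp_states` and `listening_ports` are invented for this example.

```python
from collections import Counter

# Header plus three rows copied from the excerpt above.
SAMPLE = """Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:2049 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:139 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN
"""


def tcp_rows(text):
    """Yield the whitespace-split fields of each TCP row."""
    for line in text.splitlines():
        fields = line.split()
        # TCP rows: proto, recv-q, send-q, local address, foreign address, state
        if len(fields) == 6 and fields[0].startswith("tcp"):
            yield fields


def count_tcp_states(text):
    """Count TCP sockets per state (LISTEN, ESTABLISHED, TIME_WAIT, ...)."""
    return Counter(fields[5] for fields in tcp_rows(text))


def listening_ports(text):
    """Return the sorted set of local ports in LISTEN state."""
    ports = {int(f[3].rsplit(":", 1)[1]) for f in tcp_rows(text) if f[5] == "LISTEN"}
    return sorted(ports)


state_counts = count_tcp_states(SAMPLE)
ports = listening_ports(SAMPLE)
print(state_counts, ports)
```

Running this on snapshots taken at regular intervals shows how the number of established or waiting connections changes around peak times.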
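As the main analysis tool for this optimization effort, nmon plays an especially important role. The following description comes from its wiki entry and is worth reading when you have time: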
nmon collects the following operating system statistics:
CPU and CPU threads Utilisation
CPU frequency for servers or virtual machines that can alter their clock rate
GPU stats including utilisation, MHz and temperatures
Physical and Virtual Memory use
Disk read & write and transfers
Disk Groups decided by the user
Swap and Paging
Network read & write and transfers
Local File-systems
Network File-system (NFS)
Top Processes by CPU use, Memory size and I/O rates
Kernel stats including Run Queue, context-switch, fork, Load Average & Uptime
Large and Huge memory pages
Virtual Machine stats (depending on the hardware) - useful for Linux running KVM to host virtual machines
Resources in the Server and virtual machine
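In summary, nmon works more like a series of snapshots of system resource usage. Combined with an nmon analysis tool, it gives a clear picture of every system metric.
Download the analysis tool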
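As a rough illustration of what an analysis tool does with an nmon capture, the sketch below extracts a CPU time series. The sample lines are hypothetical and not taken from the project; they assume the usual nmon capture layout, in which `ZZZZ` lines carry the timestamp of snapshot `Tnnnn` and `CPU_ALL` lines carry User%, Sys%, Wait% and Idle% for that snapshot. The function name `cpu_busy_series` is invented for this example.

```python
import csv
import io

# Hypothetical sample in the assumed nmon capture layout (not project data).
SAMPLE = """CPU_ALL,CPU Total host1,User%,Sys%,Wait%,Idle%,Busy,CPUs
ZZZZ,T0001,02:25:55,11-JAN-2018
CPU_ALL,T0001,10.0,5.0,1.0,84.0,,4
ZZZZ,T0002,02:26:25,11-JAN-2018
CPU_ALL,T0002,30.0,10.0,2.0,58.0,,4
"""


def is_snapshot_id(value):
    """True for snapshot identifiers such as T0001."""
    return value[:1] == "T" and value[1:].isdigit()


def cpu_busy_series(text):
    """Return [(timestamp, user% + sys%), ...] in snapshot order."""
    times = {}
    series = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 4:
            continue
        if row[0] == "ZZZZ":
            # ZZZZ,<snapshot id>,<time>,<date>
            times[row[1]] = f"{row[3]} {row[2]}"
        elif row[0] == "CPU_ALL" and is_snapshot_id(row[1]):
            # The header row has a description in column 2, so it is skipped.
            busy = float(row[2]) + float(row[3])
            series.append((times.get(row[1], row[1]), busy))
    return series


series = cpu_busy_series(SAMPLE)
peak = max(series, key=lambda point: point[1])
print(series, peak)
```

The same pattern applies to other sections of the capture, since each section uses its own tag in the first column.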
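Understanding the database alert log is another key part of knowing the current performance state.
A sample log entry follows. When an error appears, it can be analyzed and resolved case by case.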
2018-01-11-00.36.36.090562+480 I13363168A459 LEVEL: Error
PID : 2228842 TID : 142490 PROC : db2sysc
INSTANCE: db2 NODE : 000 DB : TRADE
EDUID : 142490 EDUNAME: db2agent (**) 0
FUNCTION: DB2 UDB, Query Gateway, sqlqg_fedstp_hook, probe:40
MESSAGE : Unexpected error returned from outer RC=
DATA #1 : Hexdump, 4 bytes
0x07000007053F28D0 : 8126 0012 .&..
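When the alert log is large, picking out the error-level entries by script saves time. The following is a minimal sketch that assumes every record starts with a timestamp line containing `LEVEL:` as in the entry above; `parse_db2diag` is a name made up for this example, and only the `FUNCTION` and `MESSAGE` fields are extracted.

```python
import re

# The log entry shown above, used as sample input.
SAMPLE = """2018-01-11-00.36.36.090562+480 I13363168A459 LEVEL: Error
PID : 2228842 TID : 142490 PROC : db2sysc
INSTANCE: db2 NODE : 000 DB : TRADE
EDUID : 142490 EDUNAME: db2agent (**) 0
FUNCTION: DB2 UDB, Query Gateway, sqlqg_fedstp_hook, probe:40
MESSAGE : Unexpected error returned from outer RC=
DATA #1 : Hexdump, 4 bytes
0x07000007053F28D0 : 8126 0012 .&..
"""

# Record header: timestamp, timezone offset, record id, then the level.
HEADER = re.compile(
    r"^(\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6})[+-]\d+\s+\S+\s+LEVEL:\s*(\w+)"
)


def parse_db2diag(text):
    """Split the log into records with timestamp, level, function, message."""
    records = []
    current = None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = {"timestamp": match.group(1), "level": match.group(2)}
            records.append(current)
        elif current is not None:
            for key in ("FUNCTION", "MESSAGE"):
                if line.startswith(key):
                    current[key.lower()] = line.split(":", 1)[1].strip()
    return records


records = parse_db2diag(SAMPLE)
errors = [r for r in records if r["level"] in ("Error", "Severe")]
for entry in errors:
    print(entry["timestamp"], entry.get("function"), entry.get("message"))
```

Grouping the filtered records by `function` is one simple way to see which component reports errors most often.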
Database snapshots serve as the main basis for analysis. A snapshot shows how the database spends its time, as in the following excerpt:
...
Number of automatic storage paths = 1
Automatic storage path = /db2data
Node number = 0
State = In Use
File system ID = 9223372079804448776
Storage path free space (bytes) = 69730709504
File system used space (bytes) = 139648946176
File system total space (bytes) = 209379655680
...
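Snapshot output is a list of `name = value` lines, so individual figures are easy to extract and compare between snapshots. The sketch below is an illustration only, using the storage path lines from the excerpt above; `parse_snapshot_fields` is a name invented for this example.

```python
# Storage path lines copied from the snapshot excerpt above.
SAMPLE = """Number of automatic storage paths = 1
Automatic storage path = /db2data
Node number = 0
State = In Use
File system ID = 9223372079804448776
Storage path free space (bytes) = 69730709504
File system used space (bytes) = 139648946176
File system total space (bytes) = 209379655680
"""


def parse_snapshot_fields(text):
    """Return {field name: raw string value} for every 'name = value' line."""
    fields = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # headings, blank lines, "..." markers
        key, value = line.split("=", 1)
        fields[key.strip()] = value.strip()
    return fields


fields = parse_snapshot_fields(SAMPLE)
used = int(fields["File system used space (bytes)"])
free = int(fields["Storage path free space (bytes)"])
total = int(fields["File system total space (bytes)"])
used_pct = 100.0 * used / total
print(f"{fields['Automatic storage path']}: {used_pct:.1f}% used")
```

In this excerpt the used and free figures add up exactly to the total, which is a quick sanity check worth automating when many snapshots are compared.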
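This article only lists the analysis methods; I will write up the concrete procedures gradually as time allows.
Making good use of tools matters, but performance tuning is more than that: it has to proceed one careful step at a time, with preparation for a long campaign.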