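1. Add a process status check that restarts OSWatcher automatically on failure
    OSWatcher (osw) is not a daemon, so it is easy to forget to start it again after downtime maintenance, and later problem analysis then lacks usable data. Scheduling a check script through crontab avoids losing collected data after a host reboot.
[oracle@rac11g1 oswbb]$ cat oswcheck.sh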
#!/bin/sh
######################################################################
# Copyright (c) 2016 by Ducw
# oswcheck.sh
# This program checks whether OSWatcher is running and starts it if not.
# oswcheck crontab config:
# 1 * * * * /oracle/oswbb/oswcheck.sh > /oracle/oswbb/oswcheck.log 2>/oracle/oswbb/oswcheck.err
######################################################################
# Count OSWatcher.sh processes, excluding the grep process itself
OSWRUNFLAG=`ps -ef | grep OSWatcher.sh | grep -v grep | wc -l`
CHECKDATA=`date "+%Y-%m-%d %H:%M:%S"`
# Use the POSIX [ ] test because the shebang is /bin/sh;
# -ge 1 also treats more than one matching process as "running"
if [ ${OSWRUNFLAG} -ge 1 ]; then
  echo "================================================================================="
  echo "OSWatcher is running at ${CHECKDATA}"
  echo "================================================================================="
  echo ""
else
  echo "================================================================================="
  echo "OSWatcher is not running at ${CHECKDATA}"
  echo "================================================================================="
  echo ""
  echo "Begin to start startOSWbb.sh"
  # Stop here if the OSWatcher directory is missing
  cd /oracle/oswbb || exit 1
  nohup ./startOSWbb.sh 10 72 &
fi
Configure crontab:
[oracle@rac11g1 oswbb]$ crontab -l
1 * * * * /oracle/oswbb/oswcheck.sh > /oracle/oswbb/oswcheck.log 2>/oracle/oswbb/oswcheck.err
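The running/not-running decision in oswcheck.sh can also be separated from the live `ps` call so that it can be tried offline. The following is only a sketch: the function names `count_osw` and `needs_restart` and the sample process lines are made up for illustration and are not part of OSWatcher.

```shell
#!/bin/sh
# count_osw reads `ps -ef`-style text on stdin and prints how many
# lines mention OSWatcher.sh (hypothetical helper, not part of oswbb).
count_osw() {
    # The [O] bracket keeps a live grep process from matching itself,
    # and the escaped dot avoids matching unrelated names.
    grep -c '[O]SWatcher\.sh' || true
}

# needs_restart succeeds (exit status 0) when no OSWatcher process was counted.
needs_restart() {
    [ "$1" -eq 0 ]
}

# Demonstration with captured text instead of a live `ps -ef` (invented sample).
sample='oracle 4242 1 0 08:00 ? 00:00:01 /bin/sh ./OSWatcher.sh 10 72
oracle 4300 4242 0 08:00 ? 00:00:00 /bin/sh ./OSWatcherFM.sh 72'

n=$(printf '%s\n' "$sample" | count_osw)
if needs_restart "$n"; then
    echo "restart needed"
else
    echo "OSWatcher running ($n process(es))"
fi
```

On a real host the input would come from `ps -ef | count_osw`, and the restart branch would call startOSWbb.sh as in the script above.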
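2. Limit the process information written by ps
    By default, information for every process is written. On a database host this produces a lot of data, and uploading huge files every time an SR is opened becomes a problem in itself. Writing only the top 100 processes by CPU usage, or the top 100 by memory usage, makes collection and upload much faster. The trade-off is that some useful information may be lost in certain cases.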
[oracle@rac11g1 oswbb]$ cat psmemsub.sh
      HP-UX|HI-UX)
        # Header line; column order matches the data line below
        UNIX95=1 ps -e -o user,pid,pcpu,ppid,pri,cpu,vsz,sz,wchan,state,etime,args | head -1 >> $1
        # Top 100 processes by CPU usage (column 3 is pcpu)
        UNIX95=1 ps -e -o user,pid,pcpu,ppid,pri,cpu,vsz,sz,wchan,state,etime,args | sort -nr -k 3 | head -100 >> $1
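The "keep the header, then the top N rows by one column" idea can be sketched as a small filter. This is an illustration only: `top_n_by_col` is a hypothetical helper, and the sample rows are invented rather than taken from a real host.

```shell
#!/bin/sh
# top_n_by_col COL N
# Reads ps-style text on stdin, prints the first line (header) unchanged,
# then the N rows with the largest numeric value in column COL.
top_n_by_col() {
    col=$1
    n=$2
    # Consume only the header line; sort reads the remaining lines.
    IFS= read -r header || return 0
    printf '%s\n' "$header"
    # LC_ALL=C so that "." is always treated as the decimal point.
    LC_ALL=C sort -k "${col},${col}nr" | head -n "$n"
}

# Demonstration with invented sample rows: keep the 2 busiest processes.
printf '%s\n' \
    'USER PID %CPU COMMAND' \
    'oracle 101 1.5 ora_pmon' \
    'oracle 102 12.0 ora_dbw0' \
    'root 1 0.0 init' | top_n_by_col 3 2
```

Sorting on column 3 matches the pcpu position used in the modified ps line; for a memory-based ranking the column number would have to point at vsz or sz instead.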
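3. Add monitoring of the heartbeat (private interconnect) NIC
    This information comes from MOS. Monitoring the heartbeat NIC is especially important in a RAC environment.
    To set up the private network communication check:
a) Copy Exampleprivate.net to private.net in the same directory.
b) In private.net, find the section for your platform and replace private_nodename1 and private_nodename2 below with the actual private IP addresses or host names:
traceroute -r -F private_nodename1
traceroute -r -F private_nodename2
c) Delete the sections for the other platforms from private.net.
d) Do NOT delete the following line:
rm locks/lock.file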
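After editing private.net, the two easiest mistakes to make in the steps above (leaving a placeholder in place, or deleting the lock-file line) can be checked mechanically. The sketch below assumes nothing about the rest of the file; `check_private_net` is a hypothetical helper, and the host names rac11g1-priv and rac11g2-priv are invented examples.

```shell
#!/bin/sh
# check_private_net FILE
# Prints "ok" and succeeds when FILE has no private_nodenameN placeholder
# left and still contains the "rm locks/lock.file" line.
check_private_net() {
    f=$1
    if grep -q 'private_nodename' "$f"; then
        echo "placeholder private_nodenameN still present"
        return 1
    fi
    if ! grep -q '^[[:space:]]*rm locks/lock\.file' "$f"; then
        echo "missing line: rm locks/lock.file"
        return 1
    fi
    echo "ok"
}

# Demonstration on a temporary file (host names are made up).
tmp=$(mktemp)
printf '%s\n' \
    'traceroute -r -F rac11g1-priv' \
    'traceroute -r -F rac11g2-priv' \
    'rm locks/lock.file' > "$tmp"
check_private_net "$tmp"
rm -f "$tmp"
```

On a real installation the function would be pointed at the edited private.net in the OSWatcher directory.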
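4. Quickly analyze CPU/memory usage over a period of time
    The Linux paste command joins the records of several files line by line and prints them together. The oswtop data has the form shown below: each snapshot contains a timestamp line and a CPU usage line. We can therefore extract these two kinds of lines, paste them into one file, and sort on a specific field, which quickly pinpoints the times of high load. The Java graphical tool can do this too, but a custom script is more convenient and is not constrained by the environment.
    The same method also works for memory and other data.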
zzz ***Wed Jan 27 08:53:08 CST 2016
top - 08:53:10 up 9 min,  3 users,  load average: 1.22, 2.01, 1.27
Tasks: 259 total,   2 running, 257 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4050948k total,  3643476k used,   407472k free,   217272k buffers
Swap:  4095992k total,        0k used,  4095992k free,  1929572k cached
[root@rac11g1 oswtop]# cat rac11g1_top_16.01.27.0800.dat | grep ^zzz > top_time.txt
[root@rac11g1 oswtop]# cat rac11g1_top_16.01.27.0800.dat | grep ^Cpu > top_cpu.txt
[root@rac11g1 oswtop]# paste -d ' ' top_time.txt top_cpu.txt > result.txt
[root@rac11g1 oswtop]# cat result.txt
zzz ***Wed Jan 27 08:49:08 CST 2016 Cpu(s):  3.0%us,  1.0%sy,  0.0%ni, 94.1%id,  2.0%wa,  0.0%hi,  0.0%si,  0.0%st
zzz ***Wed Jan 27 08:49:38 CST 2016 Cpu(s):  1.0%us,  1.0%sy,  0.0%ni, 98.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
zzz ***Wed Jan 27 08:50:08 CST 2016 Cpu(s):  2.0%us,  0.0%sy,  0.0%ni, 96.1%id,  0.0%wa,  1.0%hi,  1.0%si,  0.0%st
zzz ***Wed Jan 27 08:50:38 CST 2016 Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
zzz ***Wed Jan 27 08:51:08 CST 2016 Cpu(s):  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
zzz ***Wed Jan 27 08:51:38 CST 2016 Cpu(s): 14.0%us,  1.0%sy,  0.0%ni, 85.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
zzz ***Wed Jan 27 08:52:08 CST 2016 Cpu(s):  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
zzz ***Wed Jan 27 08:52:38 CST 2016 Cpu(s):  1.0%us,  1.9%sy,  0.0%ni, 78.6%id, 17.5%wa,  1.0%hi,  0.0%si,  0.0%st
zzz ***Wed Jan 27 08:53:08 CST 2016 Cpu(s):  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
zzz ***Wed Jan 27 08:53:38 CST 2016 Cpu(s):  2.0%us,  2.0%sy,  0.0%ni, 95.1%id,  0.0%wa,  1.0%hi,  0.0%si,  0.0%st
zzz ***Wed Jan 27 08:54:09 CST 2016 Cpu(s):  1.0%us,  0.0%sy,  0.0%ni, 98.0%id,  0.0%wa,  0.0%hi,  1.0%si,  0.0%st
zzz ***Wed Jan 27 08:54:39 CST 2016 Cpu(s):  1.0%us,  1.0%sy,  0.0%ni, 98.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
zzz ***Wed Jan 27 08:55:09 CST 2016 Cpu(s):  2.0%us,  2.0%sy,  0.0%ni, 96.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
zzz ***Wed Jan 27 08:55:39 CST 2016 Cpu(s):  1.0%us,  2.0%sy,  0.0%ni, 96.1%id,  0.0%wa,  1.0%hi,  0.0%si,  0.0%st
zzz ***Wed Jan 27 08:56:09 CST 2016 Cpu(s):  5.9%us,  2.0%sy,  0.0%ni, 87.3%id,  4.9%wa,  0.0%hi,  0.0%si,  0.0%st
[root@rac11g1 oswtop]# sort -n -k 9 result.txt    # output sorted ascending by field 9 (%us)
......
zzz ***Wed Jan 27 08:55:39 CST 2016 Cpu(s):  1.0%us,  2.0%sy,  0.0%ni, 96.1%id,  0.0%wa,  1.0%hi,  0.0%si,  0.0%st
zzz ***Wed Jan 27 08:50:08 CST 2016 Cpu(s):  2.0%us,  0.0%sy,  0.0%ni, 96.1%id,  0.0%wa,  1.0%hi,  1.0%si,  0.0%st
zzz ***Wed Jan 27 08:53:38 CST 2016 Cpu(s):  2.0%us,  2.0%sy,  0.0%ni, 95.1%id,  0.0%wa,  1.0%hi,  0.0%si,  0.0%st
zzz ***Wed Jan 27 08:55:09 CST 2016 Cpu(s):  2.0%us,  2.0%sy,  0.0%ni, 96.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
zzz ***Wed Jan 27 08:49:08 CST 2016 Cpu(s):  3.0%us,  1.0%sy,  0.0%ni, 94.1%id,  2.0%wa,  0.0%hi,  0.0%si,  0.0%st
zzz ***Wed Jan 27 08:56:09 CST 2016 Cpu(s):  5.9%us,  2.0%sy,  0.0%ni, 87.3%id,  4.9%wa,  0.0%hi,  0.0%si,  0.0%st
zzz ***Wed Jan 27 08:51:38 CST 2016 Cpu(s): 14.0%us,  1.0%sy,  0.0%ni, 85.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
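    Because the file is written in chronological order, each timestamp lines up with its corresponding resource usage line. The rest is up to you, and you can certainly write a more elegant shell script.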
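The grep/paste/sort steps above rely on the timestamp file and the CPU file having exactly the same number of lines. As one possible alternative, a single awk pass can pair each `zzz` line with the next `Cpu` line it sees, so a snapshot that lacks a CPU line is skipped instead of shifting every later pair. This is only a sketch: `join_time_cpu` is a hypothetical name, and the demonstration input is trimmed from the sample data shown earlier.

```shell
#!/bin/sh
# join_time_cpu
# Reads an oswtop .dat file on stdin and prints one line per snapshot:
# the "zzz" timestamp line followed by the matching "Cpu(s):" line.
join_time_cpu() {
    awk '
        /^zzz/             { ts = $0; next }           # remember the latest timestamp
        /^Cpu/ && ts != "" { print ts " " $0; ts = "" } # pair it with the next Cpu line
    '
}

# Demonstration on a trimmed sample; highest %us (field 9) first.
printf '%s\n' \
    'zzz ***Wed Jan 27 08:51:38 CST 2016' \
    'top - 08:51:40 up 8 min,  3 users,  load average: 1.22, 2.01, 1.27' \
    'Cpu(s): 14.0%us,  1.0%sy,  0.0%ni, 85.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st' \
    'zzz ***Wed Jan 27 08:52:08 CST 2016' \
    'Cpu(s):  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st' |
    join_time_cpu | LC_ALL=C sort -n -r -k 9
```

With a real file the call would look like `join_time_cpu < rac11g1_top_16.01.27.0800.dat | sort -n -k 9`. Changing the `/^Cpu/` pattern to `/^Mem/` gives the memory variant, with the sort field adjusted accordingly.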