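How to Collect Logs on EMC Isilon/PowerScale
Straight to the point: how to collect logs. When you hit a complex problem that needs root-cause analysis, such as a performance issue or an unexpected node reboot, always collect a complete set of logs. Once you have the logs, you can add vx: StorageExpert (WeChat) for a detailed interpretation.
The log package is usually large, around 1 GB, and larger still on clusters with many nodes. The command line is the more flexible route: you can choose which logs to collect, so you do not have to gather one huge package every time.
There are two methods: the command line and the web interface.
For more advanced use, go with the command line. It is highly configurable, so you collect exactly what you need.
First, the option list of the command: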
Usage: isi_gather_info [OPTION]...
Options:
-h Print this message and exit.
-v Print version info and exit.
X200-1# isi_gather_info -v
Unlocking gather-status
Gather-status unlocked
isi_gather_info version: $Revision: 430852 $
-u USER Login as USER instead of default root.
-p PASSWORD Use PASSWORD.
-i Include only the listed utility. See also the -l option for a list of utilities to include.
The special value "all" may be used to include every known utility.
--incremental Gather only logs changed since last log upload
-l List utilities and groups which can be included (see -i and --group) // shows which utilities and groups are available
X200-1# isi_gather_info -l
Unlocking gather-status
Gather-status unlocked
Known utilities (included with the -i option):
df_all
history
ib_subnet_master_local
ifsvar_modules_fsa
ifsvar_modules_stats
iostat
isi_alerts_local
isi_bug_info
isi_client_stats
isi_flexprotect
isi_flexprotect_local
isi_hw_status_all
isi_ib_bug_info_local
isi_nfs_exports_local
isi_radish_local
isi_services
isi_status_local
isi_sync_job_local
isi_sync_policy_local
isi_tape_local
isi_testjournal
netstat_listen_queue
netstat_routes
netstat_sockets
uptime_local
varlog_messages
varlog_recent
Known groups (included with the --group option):
abr
application
auth
cluster
fs
ib
logs
messages
network
node
protocol
storage
usage
-f FILENAME Gather FILENAME from each node.
-n NODES Gather information only from the given nodes. // collect logs only from the specified nodes
Must be a list or range of LNNs (what you see in "isi stat" output). For example: '1,4-10,12,14'. If no nodes are explicitly listed, the whole array is used. Note that nodes are automatically excluded if they are down.
--local-only Do not gather any information from other nodes.
--skip-node-check Skip check for node availability.
-s gather_script Run gather_script on every node.
-S gather_expr Run gather_expr on every node.
-1 gather_expr Run gather_expr on the local node.
-a analysis_script Run analysis_script on results.
-A analysis_expr Run analysis_expr on every node.
-t TARFILE Save all results to TARFILE instead of default tar file.
-x exclude_tool Excludes the specified tool from being gathered from each node.
-I Save results to IFS (default)
-L Save all results to local storage: /var/crash/support/
-T TEMPDIR Save all results to TEMPDIR instead of default dir. (overrides -L and -I)
--tardir <dir> Place the final package directly into <dir>.
--symlinkdir <dir> Create a symlink to the final package in <dir>.
--varlog_recent Gather all logs in /var/log, but not the compressed and rotated old logs (Default is for all logs).
--varlog_all Gather all logs in /var/log, including compressed and rotated old logs (Default).
--nologs Do not gather normal required minimal logs.
--group <name> Add a specific group of utilities to the tarball.
Configuration:
--noconfig Use built-in defaults and bypass the configuration file.
--save-only Save the CLI specified configuration to file and exit.
--save Save the CLI specified configuration to file and run.
Upload Options:
--upload Upload logs to isilon automatically. (Default)
--noupload Do not upload logs to isilon automatically.
--re-upload FILE Re-uploads the specified FILE.
--verify-upload Creates a tar file and uploads to test connectivity.
HTTP Upload Options:
--http Attempt HTTP upload. (Default)
--nohttp Do not attempt HTTP upload.
--http-host HOST Specifies alternate HTTP site for upload.
--http-path DIR Specifies alternate HTTP upload directory.
--http-proxy HOST Specifies Proxy server to use.
--http-proxy-port PORT Specifies Proxy port to use.
FTP Upload Options:
--ftp Attempt FTP upload. (Default)
--noftp Do not attempt FTP upload.
--ftp-user USER Specify alternate user for FTP. (Default: anonymous)
--ftp-pass PASS Specify alternate password for FTP.
--ftp-host HOST Specifies alternate FTP site for upload.
--ftp-path DIR Specifies alternate FTP upload directory.
--ftp-port PORT Specifies alternate FTP port for upload.
--ftp-proxy HOST Specifies Proxy server to use.
--ftp-proxy-port PORT Specifies Proxy port to use.
--ftp-mode MODE Mode of FTP file transfer (default: attempt both) valid: both, active, passive
ESRS Upload Options:
--esrs Attempt ESRS upload.
SMTP Upload Options:
--email Attempt SMTP upload. (If set, SMTP is tried first.)
--noemail Do not attempt SMTP upload. (Default)
--email-addresses Specify email addresses. (separated by comma)
--email-from Specify sender email address.
--email-subject Specify alternative email subject.
--email-body Specify alternative email text shown on head of body.
--skip-size-check Do not check size of gathered file.
Multiple instances of -i, -f, -s, -S, -1 are allowed.
gather_expr, analysis_expr can be quoted.
Default temporary directory is /ifs/data/Isilon_Support/
(change with -L or -T)
Collecting logs from the command line with isi_gather_info
- Node level: health, status, events, and error conditions
- Cluster level: OS, file system, and the entire cluster
Information contained in the logs
- General logs (e.g. messages)
- Process specific logs (e.g. SMB, CELOG, alert, drive_history, etc)
Log entry types
- Transaction log data
- Information log data
- Error log data
- System state log data
- Change log data
- Configuration log data
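Use df -h to check space usage on each partition. Keep /var/log below 95%; above that, the node reboots repeatedly (forced reboots every 30 seconds).
- The / root partition must not exceed 97%
- /ifs must not exceed 95%
- /var must not exceed 90%
- /var/crash must not exceed 90%
- The size of /var/log depends on the OneFS version: 500 MB on some, 2 GB on others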
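The thresholds above can be checked with a small script instead of reading df output by eye. This is a minimal sketch, not an official tool: the function names check_usage, used_pct_of and check_all are made up for illustration, and the script warns as soon as usage reaches the limit, which is a conservative reading of the numbers quoted above.

```shell
#!/bin/sh
# Compare a partition's used percentage against a limit.
# Prints "OK" or "WARN" with the details; returns non-zero on WARN.
# Assumption: warn once usage reaches the limit (conservative reading).
check_usage() {
    mount_point=$1   # e.g. /var
    used_pct=$2      # integer, without the % sign
    limit_pct=$3     # threshold taken from the list above
    if [ "$used_pct" -ge "$limit_pct" ]; then
        echo "WARN $mount_point ${used_pct}% (limit ${limit_pct}%)"
        return 1
    fi
    echo "OK $mount_point ${used_pct}% (limit ${limit_pct}%)"
}

# Read the used percentage of a path from `df -P` (POSIX output format).
used_pct_of() {
    df -P "$1" 2>/dev/null | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# Walk the thresholds quoted above; skip paths that do not exist on this host.
check_all() {
    for pair in /:97 /ifs:95 /var:90 /var/crash:90 /var/log:95; do
        mp=${pair%%:*}
        limit=${pair##*:}
        [ -d "$mp" ] || continue
        pct=$(used_pct_of "$mp")
        if [ -n "$pct" ]; then
            check_usage "$mp" "$pct" "$limit" || true
        fi
    done
}
```

Run check_all on a node to print one line per partition; any WARN line means that partition needs cleaning up before a large gather is started.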
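Logs are stored in the /var/log directory of each node, which contains many subdirectories.
- Use ls -F to list all files and directories
- Use find with -name to locate specific logs
find /var/log -name "*celog.log" -print
If a node's logs show notifications, that node is the CELOG master node.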
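The find command above can be wrapped so that the directory and the name pattern are parameters. A minimal sketch; the function name find_logs is made up for illustration, and the defaults simply repeat the example above.

```shell
#!/bin/sh
# List log files under a directory whose names match a glob pattern.
# Both arguments are optional; the defaults mirror the find example above.
find_logs() {
    log_dir=${1:-/var/log}
    pattern=${2:-*celog.log}
    # Quote the pattern so the shell passes it to find unexpanded.
    find "$log_dir" -type f -name "$pattern" -print
}
```

For example, find_logs /var/log "messages*" lists the current messages file together with any rotated copies whose names start with "messages".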
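Log detail levels:
- logging or warning (default): minimal detail, minimal impact on the cluster
- error: more detail, some impact on the cluster
- verbose: noticeably more detail, significant impact on cluster performance
- trace and debug: intended for engineering use and normally left off
The higher the log level, the more detailed and complex the logs become; analysis takes longer, and cluster performance is affected as well.
You can gather logs through either the web interface or the CLI. The package is saved to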
/ifs/data/Isilon_Support/pkg/IsilonLogs-mycluster-<timestamp>.tgz
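After several gathers the pkg directory holds more than one package, so it helps to pick out the newest one by modification time. A minimal sketch; latest_package is a made-up name, and the default directory is the one given above.

```shell
#!/bin/sh
# Print the most recently modified gather package in a directory, if any.
latest_package() {
    pkg_dir=${1:-/ifs/data/Isilon_Support/pkg}
    # ls -t sorts newest first; the error is silenced when nothing matches.
    ls -t "$pkg_dir"/IsilonLogs-*.tgz 2>/dev/null | head -n 1
}
```

The function prints nothing when the directory contains no package, so a caller can test for an empty result before copying the file off the cluster.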
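How do you get the logs?
Log in as root to the command line of any node and enter the following command:
# isi_gather_info
A long run of dots on the screen means collection is in progress; keep this window open and do not close it. When a file name appears on the screen, collection is complete. The generated package is saved on the Isilon under: /ifs/data/Isilon_Support/pkg/
File name: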
IsilonLogs-<clustername>-<timestamp>.tgz
By default, all logs are tarred into a single package. You can also gather only a specific utility log or group.
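To show how the selective options fit together, the sketch below assembles an isi_gather_info command line from the -n, --group and -i options listed in the help text earlier and only prints it, so it can be reviewed before being run on a node. The helper name build_gather_cmd is made up for illustration. Adding --noupload by default is a choice of this sketch (the package stays on the cluster), and whether --group and -i can be combined in one run is an assumption to verify on your OneFS version.

```shell
#!/bin/sh
# Build an isi_gather_info command line from node, group and utility choices.
# Nothing is executed here: the command is only printed for review.
build_gather_cmd() {
    nodes=$1       # LNN list for -n, e.g. "1,4-10"; empty means all nodes
    group=$2       # one name for --group, e.g. "network"; may be empty
    utilities=$3   # space-separated names for -i; may be empty
    cmd="isi_gather_info --noupload"
    if [ -n "$nodes" ]; then cmd="$cmd -n $nodes"; fi
    if [ -n "$group" ]; then cmd="$cmd --group $group"; fi
    # The help text allows multiple -i instances, one per utility.
    for u in $utilities; do cmd="$cmd -i $u"; done
    echo "$cmd"
}
```

For example, build_gather_cmd "1,4-10" "network" "" prints a command that gathers only the network group from nodes 1 and 4 through 10, which keeps the package far smaller than a full gather.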
Important logs:
Nodename/varlog.tar/log/messages //general activity log
Nodename/varlog.tar/log/vsftpd.log //FTP daemon log
Local/isi_quota_verbose // Cluster wide quota log
Nodename/sockstate //network socket status
Nodename/isi_sync_policy //SyncIQ policy
Nodename/isi_alerts //node alerts
Nodename/isi_radish //node drive health
Nodename/isi_devices //node hardware record
Local/ifsvar_modules.tar/modules/jobengine/status.gc //cluster wide job status
Nodename/varlog.tar/log/isi_migrate.log //Migration activity log
Nodename/isi_status //node status summary
Nodename/isi_checkjournal //node journal status
Nodename/isi_ib_bug_info //node infiniband link status
Nodename/ifconfig //node network link status
Nodename/top //node busiest processes
Nodename/fstat //node open file handles
Nodename/varlog.tar/log/daily.log //node daily script output
Nodename/varlog.tar/log/vmlog //node process internals summary
Nodename/isi_hw_status //node hardware status summary
Nodename/varcrash_ls //node file list for /var/crash
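Several of the paths above sit inside a nested archive (varlog.tar inside the .tgz), so reading one file takes two tar steps. The sketch below lists the package members that match a pattern and prints a file from a nested tar without unpacking the whole package. The function names are made up for illustration, and the exact member paths inside a real package are an assumption: run the listing first and copy the paths from its output.

```shell
#!/bin/sh
# List the members of a gather package whose path matches a pattern.
list_in_package() {
    package=$1    # path to an IsilonLogs-*.tgz file
    pattern=$2    # extended regex matched against member paths
    tar -tzf "$package" | grep -E "$pattern"
}

# Print one file stored inside a tar archive that is itself a package member.
read_nested() {
    package=$1      # path to the .tgz package
    inner_tar=$2    # member path of the nested archive, as printed by the listing
    inner_file=$3   # path inside the nested archive, e.g. log/messages
    # -O writes the extracted member to stdout instead of to disk.
    tar -xzOf "$package" "$inner_tar" | tar -xOf - "$inner_file"
}
```

A typical sequence is to call list_in_package with the pattern 'varlog\.tar$' to find the nested archive of one node, then pass that path to read_nested together with log/messages and pipe the result into less or grep.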
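Collecting logs from the web interface
The web interface differs somewhat between OneFS versions, but the gather function is generally found under Diagnostics, so look for it there.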
OneFS web administration interface
OneFS 7.x - 8.0.X
- Log in to the OneFS web administration interface.
- Click Help > Diagnostics.
- Click Start Gather.
OneFS 8.1.X
- Log in to the OneFS web administration interface.
- Click Cluster Management > Diagnostics.
- Click Start Gather.