
EMC Isilon/PowerScale: How to Gather Logs

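Getting straight to the point: for complex problems whose root cause has to be found, such as performance issues or unexpected node reboots, always gather a complete set of logs. With the logs in hand, you can add StorageExpert on WeChat (vx) for a detailed interpretation.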

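The full log package is usually large, around 1 GB, and larger still on clusters with many nodes. The command line is more flexible: it can gather selected logs, so you do not have to collect one huge package every time.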

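There are two methods.

For more advanced use, run the command. It is very flexible and configurable, so you can gather exactly what you need.

Here is the command's option list first: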

Usage: isi_gather_info [OPTION]...

Options:
  -h                  Print this message and exit.
  -v                  Print version info and exit.
    X200-1# isi_gather_info -v
    Unlocking gather-status
    Gather-status unlocked
    isi_gather_info version: $Revision: 430852 $
  -u USER             Login as USER instead of default root.
  -p PASSWORD         Use PASSWORD.
  -i                  Include only the listed utility.  See also the -l option for a list of utilities to include.
                      The special value "all" may be used to include every known utility.
  --incremental       Gather only logs changed since last log upload
  -l                  List utilities and groups which can be included (see -i and --group).  // shows which utilities and groups are available
        X200-1# isi_gather_info -l
        Unlocking gather-status
        Gather-status unlocked        
        Known utilities (included with the -i option):
        
          df_all
          history
          ib_subnet_master_local
          ifsvar_modules_fsa
          ifsvar_modules_stats
          iostat
          isi_alerts_local
          isi_bug_info
          isi_client_stats
          isi_flexprotect
          isi_flexprotect_local
          isi_hw_status_all
          isi_ib_bug_info_local
          isi_nfs_exports_local
          isi_radish_local
          isi_services
          isi_status_local
          isi_sync_job_local
          isi_sync_policy_local
          isi_tape_local
          isi_testjournal
          netstat_listen_queue
          netstat_routes
          netstat_sockets
          uptime_local
          varlog_messages
          varlog_recent
                
        Known groups (included with the --group option):
          abr
          application
          auth
          cluster
          fs
          ib
          logs
          messages
          network
          node
          protocol
          storage
          usage
        

  -f FILENAME         Gather FILENAME from each node.
  -n NODES            Gather information only from the given nodes.  // gathers logs from the specified nodes only
                      Must be a list or range of LNNs (what you see in "isi stat" output). For example: '1,4-10,12,14'. If no nodes are explicitly listed, the whole array is used.  Note that nodes are automatically excluded if they are down.
  --local-only        Do not gather any information from other nodes.
  --skip-node-check   Skip check for node availability.
  -s gather_script    Run gather_script on every node.
  -S gather_expr      Run gather_expr on every node.
  -1 gather_expr      Run gather_expr on the local node.
  -a analysis_script  Run analysis_script on results.
  -A analysis_expr      Run analysis_expr on every node.
  -t TARFILE          Save all results to TARFILE instead of default tar file.
  -x exclude_tool     Excludes the specified tool from being gathered from each node.
  -I                  Save results to IFS (default)
  -L                  Save all results to local storage: /var/crash/support/
  -T TEMPDIR          Save all results to TEMPDIR instead of default dir.   (overrides -L and -I)
  --tardir <dir>      Place the final package directly into <dir>.
  --symlinkdir <dir>  Create a symlink to the final package in <dir>.
  --varlog_recent     Gather all logs in /var/log, but not the compressed and rotated old logs (Default is for all logs).
  --varlog_all        Gather all logs in /var/log, including compressed and rotated old logs (Default).
  --nologs         Do not gather normal required minimal logs.
  --group <name>      Add a specific group of utilities to the tarball.

Configuration:
  --noconfig         Use built-in defaults and bypass the configuration file.
  --save-only        Save the CLI specified configuration to file and exit.
  --save             Save the CLI specified configuration to file and run.

Upload Options:
  --upload            Upload logs to isilon automatically. (Default)
  --noupload          Do not upload logs to isilon automatically.
  --re-upload FILE    Re-uploads the specified FILE.
  --verify-upload     Creates a tar file and uploads to test connectivity.

  HTTP Upload Options:
    --http                 Attempt HTTP upload. (Default)
    --nohttp               Do not attempt HTTP upload.
    --http-host HOST       Specifies alternate HTTP site for upload.
    --http-path DIR        Specifies alternate HTTP upload directory.
    --http-proxy HOST      Specifies Proxy server to use.
    --http-proxy-port PORT Specifies Proxy port to use.

  FTP Upload Options:
    --ftp                 Attempt FTP upload. (Default)
    --noftp               Do not attempt FTP upload.
    --ftp-user USER       Specify alternate user for FTP.  (Default: anonymous)
    --ftp-pass PASS       Specify alternate password for FTP.
    --ftp-host HOST       Specifies alternate FTP site for upload.
    --ftp-path DIR        Specifies alternate FTP upload directory.
    --ftp-port PORT       Specifies alternate FTP port for upload.
    --ftp-proxy HOST      Specifies Proxy server to use.
    --ftp-proxy-port PORT Specifies Proxy port to use.
    --ftp-mode MODE       Mode of FTP file transfer (default: attempt both)   valid: both, active, passive

  ESRS Upload Options:
    --esrs                Attempt ESRS upload.

  SMTP Upload Options:
    --email             Attempt SMTP upload. (If set, SMTP is tried first.)
    --noemail           Do not attempt SMTP upload. (Default)
    --email-addresses   Specify email addresses. (separated by comma)
    --email-from        Specify sender email address.
    --email-subject     Specify alternative email subject.
    --email-body        Specify alternative email text shown on head of body.
    --skip-size-check   Do no check size of gathered file.

  Multiple instances of -i, -f, -s, -S, -1 are allowed.
  gather_expr, analysis_expr can be quoted.
  Default temporary directory is /ifs/data/Isilon_Support/
  (change with -L or -T)
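As a rough sketch of how the options above combine, the wrapper below assembles a targeted gather command and prints it instead of running it. The function name `build_gather_cmd` and its argument order are made up for illustration; only the flags (`-n`, `-i`, `--noupload`) come from the usage text above, and `--noupload` is added on the assumption that the package will be copied off the cluster manually.

```shell
#!/bin/sh
# Hypothetical helper: prints an isi_gather_info command line (dry run).
# Usage: build_gather_cmd NODES UTILITY
#   NODES    LNN list or range such as '1,4-10' (empty = whole cluster)
#   UTILITY  a utility name from `isi_gather_info -l` (empty = default set)
build_gather_cmd() {
    nodes=$1
    utility=$2
    cmd="isi_gather_info --noupload"
    if [ -n "$nodes" ]; then
        cmd="$cmd -n $nodes"
    fi
    if [ -n "$utility" ]; then
        cmd="$cmd -i $utility"
    fi
    printf '%s\n' "$cmd"
}

# Print the command for nodes 1 and 4-10, restricted to varlog_messages.
build_gather_cmd "1,4-10" "varlog_messages"
```

Printing the command first makes it easy to review the option set before starting a gather that may take a long time on a large cluster.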

Gathering logs from the command line with isi_gather_info

  • Node level: health, status, events, and error conditions
  • Cluster level: the OS, the file system, and the entire cluster

Information contained in the logs

  • General logs (e.g. messages)
  • Process specific logs (e.g. SMB, CELOG, alert, drive_history, etc)

Log entry

  • Transaction log data
  • Information log data
  • Error log data
  • System state log data
  • Change log data
  • Configuration log data

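Use df -h to check how full each partition is. Keep /var/log below 95%; above that, the node reboots repeatedly (forced reboots every 30 seconds).

  • / (root partition) must not exceed 97%
  • /ifs must not exceed 95%
  • /var must not exceed 90%
  • /var/crash must not exceed 90%
  • The size of /var/log depends on the OneFS version: 500 MB on some, 2 GB on others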
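The limits above can be checked with a small filter over the df output. This is only a sketch: `check_partitions` is a made-up name, the thresholds are copied from the list above, and it assumes the usual six-column df layout with the capacity in column 5 and the mount point in column 6.

```shell
#!/bin/sh
# Hypothetical helper: reads df output on stdin and prints every mount
# point whose usage exceeds the limit listed in the article.
check_partitions() {
    awk '
        BEGIN {
            limit["/"] = 97
            limit["/ifs"] = 95
            limit["/var"] = 90
            limit["/var/crash"] = 90
        }
        # Skip the header line; only look at mount points with a known limit.
        NR > 1 && ($6 in limit) {
            used = $5
            sub(/%/, "", used)
            if (used + 0 > limit[$6]) {
                printf "%s %d%% (limit %d%%)\n", $6, used, limit[$6]
            }
        }
    '
}

# Check the local node; no output means nothing is over its limit.
df -h | check_partitions
```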

Logs are stored in /var/log on each node, and that directory contains many subdirectories.

  • ls -F lists all files and directories
  • find with -name locates the relevant logs

find /var/log -name "*celog.log" -print

If a node has notifications, that node is the CELOG master node.

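Log detail levels:

  • logging or warning (default): minimal detail, minimal impact on the cluster
  • error: more detail, some impact on the cluster
  • verbose: markedly more detail, significant impact on cluster performance
  • trace and debug: intended for engineering and normally left off

The higher the log level, the more detailed and complex the logs, the longer the analysis takes, and the greater the impact on cluster performance.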

Logs can be gathered through either the web interface or the CLI. The package is saved at:

/ifs/data/Isilon_Support/pkg/IsilonLogs-mycluster-<timestamp>.tgz

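How do you get the logs?

Log in as root to the command line of any node and enter the following command:

               # isi_gather_info

A long run of dots means the gather is in progress; keep the window open and do not close it. When a file name appears on screen, the gather is complete. The resulting package is saved on the Isilon under /ifs/data/Isilon_Support/pkg/

               File name:

               IsilonLogs-<clustername>-<timestamp>.tgz

By default, all logs are tarred into a single package. You can also specify a particular utility or group.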
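Once the gather finishes, the newest package can be picked out of the package directory by modification time. The helper name `newest_gather_pkg` is hypothetical; the directory and the IsilonLogs-*.tgz naming are the ones given above.

```shell
#!/bin/sh
# Hypothetical helper: prints the most recently modified IsilonLogs-*.tgz
# in the given directory (default: the package path named in the article).
newest_gather_pkg() {
    dir=${1:-/ifs/data/Isilon_Support/pkg}
    # ls -t sorts by modification time, newest first.
    ls -t "$dir"/IsilonLogs-*.tgz 2>/dev/null | head -n 1
}

pkg=$(newest_gather_pkg)
if [ -n "$pkg" ]; then
    echo "Latest package: $pkg"
else
    echo "No gather package found"
fi
```

From there the file can be copied off the cluster with whatever transfer method is available in your environment, for example scp from a workstation, assuming SSH access to the node is permitted.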

Important logs:

Nodename/varlog.tar/log/messages                        // general activity log
Nodename/varlog.tar/log/vsftpd.log                      // FTP daemon log
Local/isi_quota_verbose                                 // cluster-wide quota log
Nodename/sockstate                                      // network socket status
Nodename/isi_sync_policy                                // SyncIQ policy
Nodename/isi_alerts                                     // node alerts
Nodename/isi_radish                                     // node drive health
Nodename/isi_devices                                    // node hardware record
Local/ifsvar_modules.tar/modules/jobengine/status.gc    // cluster-wide job status
Nodename/varlog.tar/log/isi_migrate.log                 // migration activity log
Nodename/isi_status                                     // node status summary
Nodename/isi_checkjournal                               // node journal status
Nodename/isi_ib_bug_info                                // node InfiniBand link status
Nodename/ifconfig                                       // node network link status
Nodename/top                                            // node busiest processes
Nodename/fstat                                          // node open file handles
Nodename/varlog.tar/log/daily.log                       // node daily script output
Nodename/varlog.tar/log/vmlog                           // node process internals summary
Nodename/isi_hw_status                                  // node hardware status summary
Nodename/varcrash_ls                                    // node file list for /var/crash
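To find one of the files above without unpacking the whole package, the archive's member list can be filtered by name. This is a minimal sketch: `list_pkg_members` is a made-up helper, and the exact directory prefix inside a real package may differ from the paths shown above.

```shell
#!/bin/sh
# Hypothetical helper: lists members of a gather package whose path
# contains the given fixed string, for example "isi_status".
# Usage: list_pkg_members PACKAGE.tgz STRING
list_pkg_members() {
    pkg=$1
    pattern=$2
    # tar -tzf lists the archive contents without extracting anything.
    tar -tzf "$pkg" | grep -F -- "$pattern"
}

# Example call (the file name follows the pattern given in the article):
#   list_pkg_members IsilonLogs-mycluster-<timestamp>.tgz isi_status
```

Entries under varlog.tar and ifsvar_modules.tar are nested archives, so those have to be extracted from the package first and then listed the same way.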

Gathering through the web interface

The web interface differs somewhat between OneFS versions, but the option is generally found under Diagnostics.

OneFS web administration interface

OneFS 7.x - 8.0.X

  1. Log in to the OneFS web administration interface.
  2. Click Help > Diagnostics.
  3. Click Start Gather.

OneFS 8.1.X

  1. Log in to the OneFS web administration interface.
  2. Click Cluster Management > Diagnostics.
  3. Click Start Gather. 
