Dec 10 11:45:56 clickhouse kernel: [ 3541.839637] print_req_error: I/O error, dev vda, sector 6287898104
Dec 10 11:45:56 clickhouse rasdaemon[969]: rasdaemon: diskerror_eventstore: 0xaaada9e69ff8
// Because of the I/O error, the EXT4 filesystem aborted its journal. The journal is the key mechanism EXT4 uses to guarantee filesystem consistency, so an aborted journal means the filesystem may be in an inconsistent state.
Dec 10 11:45:56 clickhouse kernel: [ 3541.839790] Aborting journal on device vda1-8.
Dec 10 11:45:56 clickhouse rasdaemon[969]: rasdaemon: register inserted at db
Dec 10 11:45:56 clickhouse rasdaemon[969]: <idle>-0 [001] 0.000354: block_rq_complete: 2024-12-10 11:45:56 +0800 253:0 WS () 6287898104 + 8 [I/O error]
// Further confirms that the journal was aborted.
Dec 10 11:45:56 clickhouse kernel: [ 3541.916239] EXT4-fs error (device vda1): ext4_journal_check_start:61: comm ext4lazyinit: Detected aborted journal
// To prevent further data corruption, the kernel remounted the filesystem /dev/vda1 read-only.
Dec 10 11:45:56 clickhouse kernel: [ 3541.919341] EXT4-fs (): Remounting filesystem read-only
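The failing request in the log above is reported as sector 6287898104 on dev vda, and the rasdaemon trace line shows it as `6287898104 + 8` (start sector plus sector count). As a rough sketch, the snippet below converts that range to a byte offset and length on the whole disk. It assumes the kernel block layer's usual 512-byte sector unit; `SECTOR_SIZE` and `sector_range_to_bytes` are illustrative names, not part of any tool.

```python
# Assumption: block-layer sector numbers in these messages are in 512-byte units.
SECTOR_SIZE = 512


def sector_range_to_bytes(start_sector, sector_count):
    """Return (byte_offset, byte_length) for a sector range on the device."""
    return start_sector * SECTOR_SIZE, sector_count * SECTOR_SIZE


# Values taken from the log: "sector 6287898104" and "6287898104 + 8".
offset, length = sector_range_to_bytes(6287898104, 8)
print(f"byte offset on vda: {offset}")
print(f"request length:     {length} bytes")
```

Note that the sector is relative to the whole device vda, not to the partition vda1, so the partition's start offset would have to be subtracted before mapping it to a filesystem block.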
At three points in time (Dec 11 10:06:01, Dec 11 10:09:49, and Dec 11 10:19:34), attempts to remount the filesystem failed.
// Each attempt to remount the filesystem /dev/vda1 read-write failed because the underlying problem was still present; the kernel logged each remount as "Abort forced by user". This shows that the problem did not recover on its own.
Dec 11 10:06:01 clickhouse kernel: [83947.454615] EXT4-fs error (device vda1): ext4_remount:5643: comm mount: Abort forced by user
Dec 11 10:09:49 clickhouse kernel: [84175.222579] EXT4-fs error (device vda1): ext4_remount:5643: comm mount: Abort forced by user
Dec 11 10:19:34 clickhouse kernel: [84759.838210] EXT4-fs error (device vda1): ext4_remount:5643: comm mount: Abort forced by user
// EXT4 recorded 4 errors since the last filesystem check (fsck).
Dec 12 10:36:57 clickhouse kernel: [ 310.239571] EXT4-fs (vda1): error count since last fsck: 4
// "initial error at time ..." and "last error at time ...": these record the timestamps of the first and the most recent error, and they agree with the earlier log entries.
Dec 12 10:36:57 clickhouse kernel: [ 310.239603] EXT4-fs (vda1): initial error at time 1733802356: ext4_journal_check_start:61
Dec 12 10:36:57 clickhouse kernel: [ 310.239609] EXT4-fs (vda1): last error at time 1733883574: ext4_remount:5643
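The "initial error" and "last error" values are Unix epoch seconds, so they can be checked against the syslog timestamps directly. The sketch below converts both, assuming the host logs in UTC+8 (the `+0800` offset printed in the rasdaemon `block_rq_complete` line); `LOG_TZ` and `epoch_to_log_time` are illustrative names.

```python
from datetime import datetime, timedelta, timezone

# Assumption: syslog timestamps are in UTC+8, as the "+0800" in the rasdaemon line suggests.
LOG_TZ = timezone(timedelta(hours=8))


def epoch_to_log_time(epoch_seconds):
    """Format epoch seconds in the syslog style used above, e.g. 'Dec 10 11:45:56'."""
    return datetime.fromtimestamp(epoch_seconds, tz=LOG_TZ).strftime("%b %d %H:%M:%S")


print(epoch_to_log_time(1733802356))  # initial error -> Dec 10 11:45:56
print(epoch_to_log_time(1733883574))  # last error    -> Dec 11 10:19:34
```

The first value matches the "Detected aborted journal" entry on Dec 10, and the second matches the third failed remount on Dec 11.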