
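Adding a dynamic library to a Mach-O file with optool

With a basic understanding of the Mach-O format in place, this post looks at how optool adds a dynamic library (among other features) to a Mach-O file.

optool source code

Environment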

macOS 13.4 (22F66)
Xcode 14.3.1

0x0 Build

Download the source

$ git clone --recurse-submodules https://github.com/alexzielenski/optool.git

Change the Deployment Target, for example to 11.0. Without this change the build fails with the following error:

2025-04-20 11.16.16.png

Alternatively, copy libarclite_macosx.a from an older Xcode version into the corresponding directory.

After building with ⌘ + B, locate the compiled binary in Derived Data.

2025-04-20 11.20.31.png

libArgumentParser-Static.a is a static library that parses command-line arguments. It is linked into the optool binary, so it can be ignored here.

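0x1 From the command to the source

Injecting a dynamic library

# optool install -c <command> -p <payload> -t <target>
$ optool install -c load -p <path to the dylib> -t <Mach-O file>

Build a very simple app and open it in MachOView: the LC_LOAD_DYLIB entries under Load Commands load system libraries such as Foundation, libobjc.A.dylib ... UIKit.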

2025-04-20 11.35.29.png

Now add LearnOptoolFramework:

./optool install -c load -p LearnOptoolFramework.framework/LearnOptoolFramework -t LearnOptool_patch_cmd

2025-04-20 11.37.10.png

Opening the binary in MachOView again shows that Load Commands now contains an entry for LearnOptoolFramework.

Snip20250420_1.png

Source walkthrough

Open the Xcode project. The code structure is simple.

Snip20250420_2.png

To debug the process of adding a dynamic library, pass the command-line arguments as launch arguments.

2025-04-20 17.54.16.png

Start from main. The first part parses the arguments.

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        BOOL showHelp = NO;

        // Flags
        XPMArgumentSignature *weak = [XPMArgumentSignature argumentSignatureWithFormat:@"[-w --weak]"];
        XPMArgumentSignature *resign = [XPMArgumentSignature argumentSignatureWithFormat:@"[--resign]"];
        XPMArgumentSignature *target = [XPMArgumentSignature argumentSignatureWithFormat:@"[-t --target]={1,1}"];
        ...
        XPMArgumentPackage *package = [[NSProcessInfo processInfo] xpmargs_parseArgumentsWithSignatures:@[resign, command, strip, restore, install, uninstall, output, backup, aslr, help, unrestrict, rename]];
        NSString *targetPath = [package firstObjectForSignature:target];
        ...

The part above sets up the rules for parsing the command line, expressed as format strings such as [-t --target]={1,1}.

XPMArgumentSignature *target = [XPMArgumentSignature argumentSignatureWithFormat:@"[-t --target]={1,1}"];

Stepping into it:

+ (id)argumentSignatureWithFormat:(NSString *)format, ...
{
    va_list args;
    va_start(args, format);
    XPMArgumentSignature * signature = [XPMArgumentSignature argumentSignatureWithFormat:format arguments:args];
    va_end(args);
    return signature;
}

This uses variadic arguments.

Variadic arguments

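  1. va_list args: declares a cursor over an argument list whose length is not fixed.
  2. va_start(args, format): format is the last fixed parameter before the first optional argument; va_start points the cursor at the first optional argument.
  3. va_arg(args, type): returns the argument the cursor currently points at, interpreted as type, and advances the cursor to the next argument.
  4. va_end(args): cleans up the argument list and invalidates the cursor.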
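The four steps can be tried in plain C. The sketch below is illustrative only: sum_ints is a hypothetical helper that does not appear in optool, and it assumes the caller passes exactly count int values.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper (not part of optool): sums `count` int arguments.
 * `count` plays the role of `format` above: the last fixed parameter. */
int sum_ints(int count, ...) {
    assert(count >= 0);
    va_list args;                      /* 1. declare the cursor */
    va_start(args, count);             /* 2. point it at the first optional argument */
    int total = 0;
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);    /* 3. read one int and advance */
    }
    va_end(args);                      /* 4. clean up the cursor */
    return total;
}
```

Calling sum_ints(3, 1, 2, 3) walks the three optional arguments in order. Unlike this sketch, the Objective-C method hands its va_list on to a second method instead of reading it directly.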

Following the flow, the main work is:

xpmargs_ScanFormatCtorHead(scanner, foundSwitches, foundAliases, &foundRange, &didFindRange);
...
for (NSString * s in [enclosedString componentsSeparatedByString:@" "]) {
    if ([s hasPrefix:@"--"]) {
        [switches addObject:[s substringFromIndex:2]];
    } else if ([s hasPrefix:@"-"]) {
        [switches addObject:[s substringFromIndex:1]];
    } else {
        [aliases addObject:s];
    }
}
xpmargs_ScanFormatCtorTail(scanner, valueRange, didFindRange);

The whole job is to recognize a format such as [-t --target]={1,1} so that arguments passed on the command line can be matched against it.
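The prefix test in that loop can be mirrored in C. This is a sketch of the same classification rule only; token_kind and classify_token are names invented here, and the real parser additionally handles the ={1,1} value range, which is not modeled.

```c
#include <assert.h>
#include <string.h>

typedef enum { TOKEN_SWITCH, TOKEN_ALIAS } token_kind;

/* Hypothetical helper: classifies one space-separated token taken from
 * inside the [...] part of a format string. On return, *name points at
 * the token with its leading dashes removed. */
token_kind classify_token(const char *token, const char **name) {
    assert(token != NULL && name != NULL);
    if (strncmp(token, "--", 2) == 0) {   /* long switch, e.g. --target */
        *name = token + 2;
        return TOKEN_SWITCH;
    }
    if (token[0] == '-') {                /* short switch, e.g. -t */
        *name = token + 1;
        return TOKEN_SWITCH;
    }
    *name = token;                        /* anything else is an alias */
    return TOKEN_ALIAS;
}
```

The order of the checks matters: testing "--" before "-" keeps --target from being read as a short switch named -target.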

Command-line arguments

The arguments passed to this run are available through the arguments property of NSProcessInfo:

@property (readonly, copy) NSArray<NSString *> *arguments;

Printed, it looks roughly like this:

<__NSFrozenArrayM 0x600000c00c00>(
xxx/optool,
install,
-c,
load,
-p,
LearnOptoolFramework.framework/LearnOptoolFramework,
-t,
LearnOptool_patch_cmd
)

targetPath is then extracted according to the rules:

NSString *targetPath = [package firstObjectForSignature:target];

2025-04-20 15.28.31.png

This is the value passed with -t.
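As a rough mental model of that lookup, a heavily simplified C version is sketched below. value_for_flag is a hypothetical function; the real XPMArgumentPackage matching is more general (value counts, aliases, grouped switches) and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the argument that follows the first
 * occurrence of short_flag or long_flag, or NULL if there is none.
 * Index 0 (the program path) is skipped. */
const char *value_for_flag(int count, const char *const args[],
                           const char *short_flag, const char *long_flag) {
    assert(args != NULL && short_flag != NULL && long_flag != NULL);
    for (int i = 1; i + 1 < count; i++) {
        if (strcmp(args[i], short_flag) == 0 || strcmp(args[i], long_flag) == 0) {
            return args[i + 1];
        }
    }
    return NULL;
}
```

Applied to the argument array printed earlier, looking up "-t" / "--target" yields LearnOptool_patch_cmd.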

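As shown below, the tool supports backups, so it next computes the backup file path, which is the target path with _backup appended. The ({ ... }) block is a GNU C statement expression rather than an anonymous function: its last expression becomes the value of the whole block. Backups are not used in this walkthrough, so this part can be skipped.

...
NSBundle *bundle = [NSBundle bundleWithPath:targetPath];
NSString *executablePath = [[bundle.executablePath ?: targetPath stringByExpandingTildeInPath] stringByResolvingSymlinksInPath];
NSString *backupPath = ({
    NSString *bkp = [executablePath stringByAppendingString:@"_backup"];
    if (bundle) {
        NSString *vers = [bundle objectForInfoDictionaryKey:(NSString *)kCFBundleVersionKey];
        if (vers)
            bkp = [bkp stringByAppendingPathExtension:vers];
    }
    bkp;
});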
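The ({ ... }) form used for backupPath is the GNU statement-expression extension, supported by GCC and Clang. A small C sketch of the same pattern follows; make_backup_path is a hypothetical function, and the "." separator is an assumption about how stringByAppendingPathExtension: joins the version.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes "<path>_backup" or "<path>_backup.<version>"
 * into out and returns the number of characters snprintf reports.
 * Requires a compiler with GNU statement expressions (GCC, Clang). */
int make_backup_path(char *out, size_t cap, const char *path, const char *version) {
    assert(out != NULL && cap > 0 && path != NULL);
    int written = ({
        int n = snprintf(out, cap, "%s_backup", path);
        if (version != NULL && n >= 0 && (size_t)n < cap) {
            n += snprintf(out + n, cap - (size_t)n, ".%s", version);
        }
        n;   /* the last expression is the value of the whole ({ ... }) */
    });
    return written;
}
```

The benefit is scoping: temporaries such as n (or bkp and vers in optool) stay local to the block, and only the final value escapes.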

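Next comes some guard code, which is skipped here as well. The file at executablePath is then read into an NSData object and a mutable copy is made; all later operations work on binary.

...
NSData *originalData = [NSData dataWithContentsOfFile:executablePath];
NSMutableData *binary = originalData.mutableCopy;
...
// sanity check
if (!binary)
    return OPErrorRead;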

A custom thin_header struct is then defined:

// we pass around this header which includes some extra information
// and a 32-bit header which we used for both 32-bit and 64-bit files
// since the 64-bit just adds an extra field to the end which we don't need
struct thin_header {
    uint32_t offset;
    uint32_t size;
    struct mach_header header;
};

struct thin_header headers[4];
uint32_t numHeaders = 0;
headersFromBinary(headers, binary, &numHeaders);

if (numHeaders == 0) {
    LOG("No compatible architecture found");
    return OPErrorIncompatibleBinary;
}
...

The function to look at is headersFromBinary:

// This logic checks the magic number of the Mach-O file to decide whether it is
// a FAT file or a single-architecture file, then fills in the custom header structs
struct thin_header *headersFromBinary(struct thin_header *headers, NSData *binary, uint32_t *amount) {
    // In a MachO/FAT binary the first 4 bytes is a magic number
    // which gives details about the type of binary it is
    // CIGAM and co. mean the target binary has a byte order
    // in reverse relation to the host machine so we have to swap the bytes
    uint32_t magic = [binary intAtOffset:0];
    bool shouldSwap = magic == MH_CIGAM || magic == MH_CIGAM_64 || magic == FAT_CIGAM;
#define SWAP(NUM) shouldSwap ? CFSwapInt32(NUM) : NUM

    uint32_t numArchs = 0;

    // a FAT file is basically a collection of thin MachO binaries
    if (magic == FAT_CIGAM || magic == FAT_MAGIC) {
        LOG("Found FAT Header");

        // WE GOT A FAT ONE
        struct fat_header fat = *(struct fat_header *)binary.bytes;
        fat.nfat_arch = SWAP(fat.nfat_arch);
        int offset = sizeof(struct fat_header);

        // Loop through the architectures within the FAT binary to find
        // a thin macho header that we can work with (x86 or x86_64)
        for (int i = 0; i < fat.nfat_arch; i++) {
            struct fat_arch arch;
            arch = *(struct fat_arch *)([binary bytes] + offset);
            arch.cputype = SWAP(arch.cputype);
            arch.offset = SWAP(arch.offset);

            struct thin_header macho = headerAtOffset(binary, arch.offset);
            if (macho.size > 0) {
                LOG("Found thin header...");
                headers[numArchs] = macho;
                numArchs++;
            }

            offset += sizeof(struct fat_arch);
        }
    // The binary is thin, meaning it contains only one architecture
    } else if (magic == MH_MAGIC || magic == MH_MAGIC_64) {
        // The single-architecture sample binary takes this branch; fill in the thin_header
        struct thin_header macho = headerAtOffset(binary, 0);
        if (macho.size > 0) {
            LOG("Found thin header...");
            numArchs++;
            headers[0] = macho;
        }
    } else {
        LOG("No headers found.");
    }

    *amount = numArchs;
    return headers;
}
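The magic-number dispatch can be isolated into a few lines of portable C. In the sketch below the constants repeat the values from <mach-o/loader.h> and <mach-o/fat.h> so that it builds without the macOS SDK; binary_kind, classify_magic and swap32 are names invented for this illustration. Note one deliberate difference: the sketch also classifies byte-swapped thin headers, whereas the thin branch of headersFromBinary only accepts MH_MAGIC and MH_MAGIC_64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in <mach-o/loader.h> and <mach-o/fat.h> */
#define MH_MAGIC    0xfeedfaceu
#define MH_CIGAM    0xcefaedfeu
#define MH_MAGIC_64 0xfeedfacfu
#define MH_CIGAM_64 0xcffaedfeu
#define FAT_MAGIC   0xcafebabeu
#define FAT_CIGAM   0xbebafecau

typedef enum { KIND_UNKNOWN, KIND_THIN_32, KIND_THIN_64, KIND_FAT } binary_kind;

/* Reverses the byte order of a 32-bit value (what CFSwapInt32 does). */
uint32_t swap32(uint32_t v) {
    return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
           ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/* Hypothetical helper: classifies the first 4 bytes of a file and reports
 * whether multi-byte header fields need swapping on this host. */
binary_kind classify_magic(uint32_t magic, int *needs_swap) {
    assert(needs_swap != NULL);
    *needs_swap = (magic == MH_CIGAM || magic == MH_CIGAM_64 || magic == FAT_CIGAM);
    switch (magic) {
        case FAT_MAGIC:   case FAT_CIGAM:   return KIND_FAT;
        case MH_MAGIC:    case MH_CIGAM:    return KIND_THIN_32;
        case MH_MAGIC_64: case MH_CIGAM_64: return KIND_THIN_64;
        default:                            return KIND_UNKNOWN;
    }
}
```

FAT headers are stored big-endian, so on a little-endian host a FAT file reads as FAT_CIGAM and fields such as nfat_arch and each arch.offset must go through swap32 before use.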

headerAtOffset fills in a thin_header:

struct thin_header headerAtOffset(NSData *binary, uint32_t offset) {
    struct thin_header macho;
    macho.offset = offset;
    macho.header = *(struct mach_header *)(binary.bytes + offset);
    if (macho.header.magic == MH_MAGIC || macho.header.magic == MH_CIGAM) {
        macho.size = sizeof(struct mach_header);
    } else {
        macho.size = sizeof(struct mach_header_64);
    }
    if (macho.header.cputype != CPU_TYPE_X86_64 && macho.header.cputype != CPU_TYPE_I386 &&
        macho.header.cputype != CPU_TYPE_ARM && macho.header.cputype != CPU_TYPE_ARM64) {
        macho.size = 0;
    }

    return macho;
}

With this preparation done, modification of the binary can begin.

The code iterates over the thin_header array prepared above; each entry contains a mach_header struct:

for (uint32_t i = 0; i < numHeaders; i++) {
    struct thin_header macho = headers[i];
    ...
    else if ([package booleanValueForSignature:install]) {
        NSString *lc = [package firstObjectForSignature:command];
        uint32_t command = LC_LOAD_DYLIB;
        if (lc)
            command = COMMAND(lc);
        if (command == -1) {
            LOG("Invalid load command.");
            return OPErrorInvalidLoadCommand;
        }

        if (insertLoadEntryIntoBinary(dylibPath, binary, macho, command)) {
            LOG("Successfully inserted a %s command for %s", LC(command), CPU(macho.header.cputype));
        } else {
            LOG("Failed to insert a %s command for %s", LC(command), CPU(macho.header.cputype));
            return OPErrorInsertFailure;
        }

According to the COMMAND macro below, passing load on the command line selects LC_LOAD_DYLIB, the command that loads a dynamic library. The path of the dylib to load, the binary data, the Mach-O header information, and the load command type are all passed to insertLoadEntryIntoBinary.

#define COMMAND(str) ({ \
    uint32_t cmd = -1; \
    if ([str isEqualToString: @"reexport"]) \
        cmd = LC_REEXPORT_DYLIB; \
    else if ([str isEqualToString: @"weak"]) \
        cmd = LC_LOAD_WEAK_DYLIB; \
    else if ([str isEqualToString: @"upward"]) \
        cmd = LC_LOAD_UPWARD_DYLIB; \
    else if ([str isEqualToString: @"load"]) \
        cmd = LC_LOAD_DYLIB; \
    cmd; \
})
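The same name-to-command mapping, written as an ordinary C function for illustration. The LC_* values repeat those in <mach-o/loader.h>; command_from_name is a hypothetical name. Returning UINT32_MAX corresponds to the macro's uint32_t cmd = -1, which is also why the caller's command == -1 comparison works: -1 is converted to the same unsigned value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values as defined in <mach-o/loader.h> */
#define LC_REQ_DYLD          0x80000000u
#define LC_LOAD_DYLIB        0xcu
#define LC_LOAD_WEAK_DYLIB   (0x18u | LC_REQ_DYLD)
#define LC_REEXPORT_DYLIB    (0x1fu | LC_REQ_DYLD)
#define LC_LOAD_UPWARD_DYLIB (0x23u | LC_REQ_DYLD)

/* Hypothetical helper: maps the value of -c to a load command type,
 * or UINT32_MAX when the name is not recognized. */
uint32_t command_from_name(const char *name) {
    assert(name != NULL);
    if (strcmp(name, "reexport") == 0) return LC_REEXPORT_DYLIB;
    if (strcmp(name, "weak") == 0)     return LC_LOAD_WEAK_DYLIB;
    if (strcmp(name, "upward") == 0)   return LC_LOAD_UPWARD_DYLIB;
    if (strcmp(name, "load") == 0)     return LC_LOAD_DYLIB;
    return UINT32_MAX;
}
```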

insertLoadEntryIntoBinary is where the Mach-O data is modified to add the dylib load command:

BOOL insertLoadEntryIntoBinary(NSString *dylibPath, NSMutableData *binary, struct thin_header macho, uint32_t type) {
    // Guard against unsupported command types
    if (type != LC_REEXPORT_DYLIB &&
        type != LC_LOAD_WEAK_DYLIB &&
        type != LC_LOAD_UPWARD_DYLIB &&
        type != LC_LOAD_DYLIB) {
        LOG("Invalid load command type");
        return NO;
    }

    // parse load commands to see if our load command is already there
    uint32_t lastOffset = 0;
    // Has this dylib already been added?
    if (binaryHasLoadCommandForDylib(binary, dylibPath, &lastOffset, macho)) {
        // there already exists a load command for this payload so change the command type
        uint32_t originalType = *(uint32_t *)(binary.bytes + lastOffset);
        if (originalType != type) {
            LOG("A load command already exists for %s. Changing command type from %s to desired %s", dylibPath.UTF8String, LC(originalType), LC(type));
            [binary replaceBytesInRange:NSMakeRange(lastOffset, sizeof(type)) withBytes:&type];
        } else {
            LOG("Load command already exists");
        }

        return YES;
    }

    // create a new load command
    // Not present yet, so build a new dylib load command
    unsigned int length = (unsigned int)sizeof(struct dylib_command) + (unsigned int)dylibPath.length;
    // In my run dylibPath.length = 51, so length = 75 and padding = 5
    unsigned int padding = (8 - (length % 8));

    // check if data we are replacing is null
    // Fetch the bytes just past the existing load commands, which the new command will consume
    NSData *occupant = [binary subdataWithRange:NSMakeRange(macho.header.sizeofcmds + macho.offset + macho.size,
                                                            length + padding)];

    // All operations in optool try to maintain a constant byte size of the executable
    // so we don't want to append new bytes to the binary (that would break the executable
    // since everything is offset-based; we'd have to go in and adjust every offset)
    // So instead take advantage of the huge amount of padding after the load commands
    // A non-zero result means this region already holds content (the code has been reached),
    // which is a problem: a command can only be added after the load commands and before the code
    if (strcmp([occupant bytes], "\0")) {
        NSLog(@"cannot inject payload into %s because there is no room", dylibPath.fileSystemRepresentation);
        return NO;
    }

    LOG("Inserting a %s command for architecture: %s", LC(type), CPU(macho.header.cputype));

    struct dylib_command command;
    struct dylib dylib;
    /*
     * A variable length string in a load command is represented by an lc_str
     * union.  The strings are stored just after the load command structure and
     * the offset is from the start of the load command structure.  The size
     * of the string is reflected in the cmdsize field of the load command.
     * Once again any padded bytes to bring the cmdsize field to a multiple
     * of 4 bytes must be zero.
     * (this is the comment on lc_str)
     */
    // So the offset is the size of the dylib command
    dylib.name.offset = sizeof(struct dylib_command);
    dylib.timestamp = 2; // load commands I've seen use 2 for some reason
    dylib.current_version = 0;
    dylib.compatibility_version = 0;
    command.cmd = type;
    command.dylib = dylib;
    // The command size is the content plus the alignment padding
    command.cmdsize = length + padding;

    unsigned int zeroByte = 0;
    NSMutableData *commandData = [NSMutableData data];
    // The dylib command struct itself
    [commandData appendBytes:&command length:sizeof(struct dylib_command)];
    // The dylib path string
    [commandData appendData:[dylibPath dataUsingEncoding:NSASCIIStringEncoding]];
    // Zero padding
    [commandData appendBytes:&zeroByte length:padding];

    // remove enough null bytes to account of our inserted data
    // Replacing the range with zero bytes of data (length:0) deletes it,
    // so the buffer shrinks by commandData.length here
    [binary replaceBytesInRange:NSMakeRange(macho.offset + macho.header.sizeofcmds + macho.size, commandData.length)
                      withBytes:0
                         length:0];
    // insert the data
    // Insert the new command, restoring the original total size
    [binary replaceBytesInRange:NSMakeRange(lastOffset, 0) withBytes:commandData.bytes length:commandData.length];

    // fix the existing header
    // Update the header metadata
    macho.header.ncmds += 1;
    macho.header.sizeofcmds += command.cmdsize;

    // this is safe to do in 32bit because the 4 bytes after the header are still being put back
    // Write the updated mach header back
    [binary replaceBytesInRange:NSMakeRange(macho.offset, sizeof(macho.header)) withBytes:&macho.header];

    return YES;
}
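Two details of this function are easy to check in isolation: the size arithmetic and the "is there room" test. The C sketch below uses hypothetical helpers (dylib_cmdsize, region_is_zero) and hard-codes 24 as the size of struct dylib_command (cmd, cmdsize, name offset, timestamp, current_version, compatibility_version at 4 bytes each). Because padding is 8 - (length % 8), it is always between 1 and 8, so the path always ends with at least one NUL byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* sizeof(struct dylib_command): six 32-bit fields */
#define DYLIB_COMMAND_SIZE 24u

/* Hypothetical helper: cmdsize optool would use for a path of path_len bytes. */
uint32_t dylib_cmdsize(size_t path_len) {
    assert(path_len <= UINT32_MAX - 32u);
    uint32_t length  = DYLIB_COMMAND_SIZE + (uint32_t)path_len;
    uint32_t padding = 8u - (length % 8u);   /* 1..8, never 0 */
    return length + padding;
}

/* Hypothetical helper: returns 1 only if every byte in the region is zero. */
int region_is_zero(const uint8_t *bytes, size_t len) {
    assert(bytes != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (bytes[i] != 0) return 0;
    }
    return 1;
}
```

For the 51-byte path used in this post, dylib_cmdsize(51) gives 24 + 51 + 5 = 80. The full scan in region_is_zero is stricter than the original check: strcmp against "\0" stops at the first NUL, so it effectively tests only whether the first byte of the region is zero.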

Result

The new entry ends up among the LC_LOAD_DYLIB commands:

2025-04-20 18.12.34.png

The insertLoadEntryIntoBinary function

// The insertion uses lastOffset
[binary replaceBytesInRange:NSMakeRange(lastOffset, 0) withBytes:commandData.bytes length:commandData.length];

// It is computed during the duplicate check earlier in the function
if (binaryHasLoadCommandForDylib(binary, dylibPath, &lastOffset, macho)) {
...
// Only this LC_LOAD_DYLIB case updates loadOffset
case LC_LOAD_DYLIB: {
    struct dylib_command command = *(struct dylib_command *)(binary.bytes + binary.currentOffset);
    char *name = (char *)[[binary subdataWithRange:NSMakeRange(binary.currentOffset + command.dylib.name.offset, command.cmdsize - command.dylib.name.offset)] bytes];
    if ([@(name) isEqualToString:dylib]) {
        *lastOffset = (unsigned int)binary.currentOffset;
        return YES;
    }
    binary.currentOffset += size;
    loadOffset = (unsigned int)binary.currentOffset;
    break;
}
...
// So lastOffset ends up just past the last LC_LOAD_DYLIB command
if (lastOffset != NULL)
    *lastOffset = loadOffset;
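The walk that produces this offset can be sketched over a raw byte buffer in C. Everything here is illustrative: load_command_hdr and end_of_last_load_dylib are invented names, the bounds checks are additions, and falling back to the start offset when no LC_LOAD_DYLIB exists is an assumption of this sketch, not something the excerpt confirms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LC_LOAD_DYLIB 0xcu   /* value from <mach-o/loader.h> */

/* The first two fields shared by every load command. */
typedef struct { uint32_t cmd; uint32_t cmdsize; } load_command_hdr;

/* Hypothetical helper: walks ncmds load commands beginning at `start` and
 * returns the offset just past the last LC_LOAD_DYLIB, i.e. where a new
 * dylib command would be inserted. Returns `start` if none is found. */
size_t end_of_last_load_dylib(const uint8_t *buf, size_t buf_len,
                              size_t start, uint32_t ncmds) {
    assert(buf != NULL);
    size_t offset = start;
    size_t insert_at = start;
    for (uint32_t i = 0; i < ncmds; i++) {
        load_command_hdr lc;
        if (offset + sizeof lc > buf_len) break;          /* truncated header */
        memcpy(&lc, buf + offset, sizeof lc);
        if (lc.cmdsize < sizeof lc || offset + lc.cmdsize > buf_len) break;
        offset += lc.cmdsize;                              /* step over the command */
        if (lc.cmd == LC_LOAD_DYLIB) insert_at = offset;   /* remember its end */
    }
    return insert_at;
}
```

For a real file, start would be macho.offset + macho.size (the first byte after the Mach-O header) and ncmds would come from the header.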

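optool can only insert while free space remains, but the gap between the end of the load commands and the start of the code is still large enough to add a few commands.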

2025-04-20 18.23.08.png

2025-04-20 18.23.14.png

0x2 References

  1. Using va_list, va_start, and va_end
  2. replaceBytes(in:withBytes:length:)
