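Adding a dynamic library to a Mach-O file with optool
With a basic understanding of the Mach-O format in place, let's look at how optool adds features such as dynamic library loading to a Mach-O file.
optool source code
Environment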
macOS 13.4 (22F66)
Xcode 14.3.1
0x0 Build
Download the source
$ git clone --recurse-submodules https://github.com/alexzielenski/optool.git
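Change the Deployment Target, for example to 11.0; without this change the build reports an error.
Alternatively, you can copy libarclite_macos.x from an older Xcode version into the corresponding directory.
After ⌘ + B, find the built binary in Derived Data.
libArgumentParser-Static.a is a static library that mainly parses command-line arguments. It is linked into optool, so it can be ignored.
0x1 From the command to the source code
Injecting a dynamic library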
# optool install -c <command> -p <payload> -t <target>
$ optool install -c load -p <path to the dylib> -t <Mach-O file>
Build a very simple app. MachOView shows that the LC_LOAD_DYLIB entries under Load Commands load system libraries such as Foundation, libobjc.A.dylib … UIKit.
Now append LearnOptoolFramework
./optool install -c load -p LearnOptoolFramework.framework/LearnOptoolFramework -t LearnOptool_patch_cmd
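Checking again with MachOView shows that LearnOptoolFramework has been added to Load Commands.
Source code analysis
Open the Xcode project; the code structure is simple.
Debug the process of adding a dynamic library by passing launch arguments.
Start from the main function; the first part parses the arguments.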
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        BOOL showHelp = NO;

        // Flags
        XPMArgumentSignature *weak = [XPMArgumentSignature argumentSignatureWithFormat:@"[-w --weak]"];
        XPMArgumentSignature *resign = [XPMArgumentSignature argumentSignatureWithFormat:@"[--resign]"];
        XPMArgumentSignature *target = [XPMArgumentSignature argumentSignatureWithFormat:@"[-t --target]={1,1}"];
        ...
        XPMArgumentPackage *package = [[NSProcessInfo processInfo] xpmargs_parseArgumentsWithSignatures:@[resign, command, strip, restore, install, uninstall, output, backup, aslr, help, unrestrict, rename]];
        NSString *targetPath = [package firstObjectForSignature:target];
        ...
The part above sets up the command-line parsing rules, for example a string like [-t --target]={1,1}
XPMArgumentSignature *target = [XPMArgumentSignature argumentSignatureWithFormat:@"[-t --target]={1,1}"];
Step into
+ (id)argumentSignatureWithFormat:(NSString *)format, ...
{
    va_list args;
    va_start(args, format);
    XPMArgumentSignature * signature = [XPMArgumentSignature argumentSignatureWithFormat:format arguments:args];
    va_end(args);
    return signature;
}
Variadic arguments are used here.
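Variadic arguments
- va_list args: declares the object that tracks the position in the variable-length argument list;
- va_start(args, format): format is the last fixed parameter before the first variadic argument; va_start makes args refer to the first variadic argument;
- va_arg(args, type): returns the argument that args currently refers to, as a value of type type, and advances args to the next argument in the list;
- va_end(args): cleans up the argument list and invalidates args.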
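The four macros above can be seen working together in a small C sketch. The function name total_length and its count parameter are hypothetical and are not part of optool or ArgumentParser; the sketch only illustrates the va_start / va_arg / va_end sequence.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: sums the lengths of `count` C strings that are
 * passed as variadic arguments after `count`. */
size_t total_length(int count, ...) {
    va_list args;              /* traversal state for the variadic arguments */
    va_start(args, count);     /* `count` is the last fixed parameter */

    size_t total = 0;
    for (int i = 0; i < count; i++) {
        /* Fetch the next argument as a const char * and advance. */
        const char *s = va_arg(args, const char *);
        total += strlen(s);
    }

    va_end(args);              /* every va_start needs a matching va_end */
    return total;
}
```

Calling total_length(2, "-t", "--target") walks both strings. Note that the callee cannot discover the number or types of the variadic arguments by itself, which is why a count (or a format string, as in argumentSignatureWithFormat:) has to be passed as a fixed parameter.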
Following the flow, the main steps are
xpmargs_ScanFormatCtorHead(scanner, foundSwitches, foundAliases, &foundRange, &didFindRange);
...
for (NSString * s in [enclosedString componentsSeparatedByString:@" "]) {
    if ([s hasPrefix:@"--"]) {
        [switches addObject:[s substringFromIndex:2]];
    } else if ([s hasPrefix:@"-"]) {
        [switches addObject:[s substringFromIndex:1]];
    } else {
        [aliases addObject:s];
    }
}
xpmargs_ScanFormatCtorTail(scanner, valueRange, didFindRange);
The whole job is to recognize [-t --target]={1,1} so that the arguments passed on the command line can later be matched against it.
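The loop over the bracketed part of the format can be approximated in plain C. This is a minimal sketch, assuming the text between the brackets (for example "-t --target") has already been extracted; the type parsed_names, the function split_names, and the fixed-size limits are all hypothetical and do not exist in ArgumentParser.

```c
#include <stdio.h>
#include <string.h>

#define MAX_NAMES 8
#define MAX_NAME_LEN 32

/* Hypothetical container for the names found between the brackets. */
typedef struct {
    char switches[MAX_NAMES][MAX_NAME_LEN];
    int  switch_count;
    char aliases[MAX_NAMES][MAX_NAME_LEN];
    int  alias_count;
} parsed_names;

/* Splits the bracketed part of a signature on spaces. Tokens starting
 * with "--" or "-" are stored as switches (prefix removed); anything
 * else is stored as an alias. */
parsed_names split_names(const char *enclosed) {
    parsed_names out;
    memset(&out, 0, sizeof(out));

    char buf[256];                       /* strtok modifies its input, so copy */
    snprintf(buf, sizeof(buf), "%s", enclosed);

    for (char *tok = strtok(buf, " "); tok != NULL; tok = strtok(NULL, " ")) {
        const char *name = tok;
        int is_switch = 0;
        if (strncmp(tok, "--", 2) == 0) {
            name = tok + 2;
            is_switch = 1;
        } else if (tok[0] == '-') {
            name = tok + 1;
            is_switch = 1;
        }
        if (is_switch && out.switch_count < MAX_NAMES) {
            snprintf(out.switches[out.switch_count++], MAX_NAME_LEN, "%s", name);
        } else if (!is_switch && out.alias_count < MAX_NAMES) {
            snprintf(out.aliases[out.alias_count++], MAX_NAME_LEN, "%s", name);
        }
    }
    return out;
}
```

For "-t --target" this yields the two switch names t and target and no aliases. The real parser also handles the ={1,1} value range through xpmargs_ScanFormatCtorTail, which this sketch leaves out.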
Command-line arguments
The arguments passed for this run are obtained from the arguments property of the NSProcessInfo object:
@property (readonly, copy) NSArray<NSString *> *arguments;
The printed result looks roughly like this
<__NSFrozenArrayM 0x600000c00c00>(
xxx/optool,
install,
-c,
load,
-p,
LearnOptoolFramework.framework/LearnOptoolFramework,
-t,
LearnOptool_patch_cmd
)
targetPath is parsed out according to the rule
NSString *targetPath = [package firstObjectForSignature:target];
which is the value passed with the -t argument.
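Conceptually, resolving targetPath amounts to finding the token that follows -t (or --target) in the argument list. The sketch below shows that idea in C; value_for_switch is a hypothetical helper, and the real XPMArgumentPackage does considerably more (value counts, aliases, validation).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the argument that follows the first
 * occurrence of short_name or long_name, or NULL if the switch is
 * absent or has no value after it. argv[0] (the program path) is skipped. */
const char *value_for_switch(int argc, const char *argv[],
                             const char *short_name, const char *long_name) {
    for (int i = 1; i + 1 < argc; i++) {
        if (strcmp(argv[i], short_name) == 0 ||
            strcmp(argv[i], long_name) == 0) {
            return argv[i + 1];
        }
    }
    return NULL;
}
```

With the argument array printed above, looking up "-t" / "--target" returns LearnOptool_patch_cmd.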
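As shown below, because the tool supports an optional backup, a ({ ... }) statement expression (a GNU C extension) is used to produce the backup file path, which is the target path with _backup appended. It is not used in this walkthrough, so it can be skipped.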
...
NSBundle *bundle = [NSBundle bundleWithPath:targetPath];
NSString *executablePath = [[bundle.executablePath ?: targetPath stringByExpandingTildeInPath] stringByResolvingSymlinksInPath];
NSString *backupPath = ({
    NSString *bkp = [executablePath stringByAppendingString:@"_backup"];
    if (bundle) {
        NSString *vers = [bundle objectForInfoDictionaryKey:(NSString *)kCFBundleVersionKey];
        if (vers)
            bkp = [bkp stringByAppendingPathExtension:vers];
    }
    bkp;
});
Next comes some defensive handling, which is also skipped. Then the file at executablePath is read into an NSData object and a mutable copy is created; all later operations work on binary.
...
NSData *originalData = [NSData dataWithContentsOfFile:executablePath];
NSMutableData *binary = originalData.mutableCopy;
...
// sanity check
if (!binary)
    return OPErrorRead;
Then a custom thin_header struct is defined
// we pass around this header which includes some extra information
// and a 32-bit header which we used for both 32-bit and 64-bit files
// since the 64-bit just adds an extra field to the end which we don't need
struct thin_header {
    uint32_t offset;
    uint32_t size;
    struct mach_header header;
};
struct thin_header headers[4];
uint32_t numHeaders = 0;
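headersFromBinary(headers, binary, &numHeaders);
if (numHeaders == 0) {
    LOG("No compatible architecture found");
    return OPErrorIncompatibleBinary;
}
...
The function to focus on is headersFromBinary
// This logic checks the magic number of the Mach-O file to decide whether it is a FAT file or a single-architecture file, then fills in the custom header structs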
struct thin_header *headersFromBinary(struct thin_header *headers, NSData *binary, uint32_t *amount) {
    // In a MachO/FAT binary the first 4 bytes is a magic number
    // which gives details about the type of binary it is
    // CIGAM and co. mean the target binary has a byte order
    // in reverse relation to the host machine so we have to swap the bytes
    uint32_t magic = [binary intAtOffset:0];
    bool shouldSwap = magic == MH_CIGAM || magic == MH_CIGAM_64 || magic == FAT_CIGAM;
#define SWAP(NUM) shouldSwap ? CFSwapInt32(NUM) : NUM

    uint32_t numArchs = 0;

    // a FAT file is basically a collection of thin MachO binaries
    if (magic == FAT_CIGAM || magic == FAT_MAGIC) {
        LOG("Found FAT Header");

        // WE GOT A FAT ONE
        struct fat_header fat = *(struct fat_header *)binary.bytes;
        fat.nfat_arch = SWAP(fat.nfat_arch);
        int offset = sizeof(struct fat_header);

        // Loop through the architectures within the FAT binary to find
        // a thin macho header that we can work with (x86 or x86_64)
        for (int i = 0; i < fat.nfat_arch; i++) {
            struct fat_arch arch;
            arch = *(struct fat_arch *)([binary bytes] + offset);
            arch.cputype = SWAP(arch.cputype);
            arch.offset = SWAP(arch.offset);

            struct thin_header macho = headerAtOffset(binary, arch.offset);
            if (macho.size > 0) {
                LOG("Found thin header...");
                headers[numArchs] = macho;
                numArchs++;
            }

            offset += sizeof(struct fat_arch);
        }
    // The binary is thin, meaning it contains only one architecture
    } else if (magic == MH_MAGIC || magic == MH_MAGIC_64) {
        // For the test binary used here this is the branch taken; fill in the thin_header
        struct thin_header macho = headerAtOffset(binary, 0);
        if (macho.size > 0) {
            LOG("Found thin header...");
            numArchs++;
            headers[0] = macho;
        }
    } else {
        LOG("No headers found.");
    }

    *amount = numArchs;
    return headers;
}
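The magic-number test at the top of headersFromBinary can be isolated into a small C sketch. The numeric values below are the ones defined in <mach-o/loader.h> and <mach-o/fat.h>, redefined here so the sketch compiles without the macOS SDK; binary_kind, magic_info, and classify_magic are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values as defined in <mach-o/loader.h> and <mach-o/fat.h>. */
#define MH_MAGIC    0xfeedfaceu
#define MH_CIGAM    0xcefaedfeu
#define MH_MAGIC_64 0xfeedfacfu
#define MH_CIGAM_64 0xcffaedfeu
#define FAT_MAGIC   0xcafebabeu
#define FAT_CIGAM   0xbebafecau

typedef enum { KIND_UNKNOWN, KIND_THIN, KIND_FAT } binary_kind;

typedef struct {
    binary_kind kind;   /* thin Mach-O, FAT container, or unrecognized */
    bool should_swap;   /* true when the file's byte order differs from the host's */
} magic_info;

/* Reads the first 4 bytes in host byte order and classifies them. */
magic_info classify_magic(const uint8_t *bytes, size_t len) {
    magic_info info = { KIND_UNKNOWN, false };
    if (len < sizeof(uint32_t)) {
        return info;
    }

    uint32_t magic;
    memcpy(&magic, bytes, sizeof(magic));  /* avoids unaligned access */

    info.should_swap = (magic == MH_CIGAM || magic == MH_CIGAM_64 || magic == FAT_CIGAM);

    if (magic == FAT_MAGIC || magic == FAT_CIGAM) {
        info.kind = KIND_FAT;
    } else if (magic == MH_MAGIC || magic == MH_MAGIC_64 ||
               magic == MH_CIGAM || magic == MH_CIGAM_64) {
        info.kind = KIND_THIN;
    }
    return info;
}
```

One difference from the quoted function is deliberate: the sketch also classifies byte-swapped thin files (MH_CIGAM, MH_CIGAM_64) as thin, whereas the quoted else-if branch only accepts MH_MAGIC and MH_MAGIC_64. FAT headers are stored big-endian, so on a little-endian host a FAT file normally shows up as FAT_CIGAM and needs swapping.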
The headerAtOffset function fills in the thin_header
struct thin_header headerAtOffset(NSData *binary, uint32_t offset) {
    struct thin_header macho;
    macho.offset = offset;
    macho.header = *(struct mach_header *)(binary.bytes + offset);
    if (macho.header.magic == MH_MAGIC || macho.header.magic == MH_CIGAM) {
        macho.size = sizeof(struct mach_header);
    } else {
        macho.size = sizeof(struct mach_header_64);
    }
    if (macho.header.cputype != CPU_TYPE_X86_64 && macho.header.cputype != CPU_TYPE_I386 &&
        macho.header.cputype != CPU_TYPE_ARM && macho.header.cputype != CPU_TYPE_ARM64) {
        macho.size = 0;
    }

    return macho;
}
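The size selection in headerAtOffset comes down to one fact: the 64-bit header is the 32-bit header plus one trailing reserved field, so the sizes are 28 and 32 bytes. A minimal C sketch of the same decision follows; the struct layouts mirror <mach-o/loader.h> but carry a sketch_ prefix, and header_size_for is a hypothetical function name.

```c
#include <stddef.h>
#include <stdint.h>

/* Same field layout as struct mach_header in <mach-o/loader.h>. */
struct sketch_mach_header {
    uint32_t magic;
    int32_t  cputype;
    int32_t  cpusubtype;
    uint32_t filetype;
    uint32_t ncmds;
    uint32_t sizeofcmds;
    uint32_t flags;
};

/* The 64-bit variant only appends a reserved field. */
struct sketch_mach_header_64 {
    struct sketch_mach_header base;
    uint32_t reserved;
};

#define SK_MH_MAGIC        0xfeedfaceu
#define SK_MH_CIGAM        0xcefaedfeu
#define SK_CPU_ARCH_ABI64  0x01000000
#define SK_CPU_TYPE_I386   7
#define SK_CPU_TYPE_X86_64 (SK_CPU_TYPE_I386 | SK_CPU_ARCH_ABI64)
#define SK_CPU_TYPE_ARM    12
#define SK_CPU_TYPE_ARM64  (SK_CPU_TYPE_ARM | SK_CPU_ARCH_ABI64)

/* Returns the header size to skip before the load commands,
 * or 0 when the CPU type is not one of the four supported ones. */
size_t header_size_for(uint32_t magic, int32_t cputype) {
    if (cputype != SK_CPU_TYPE_X86_64 && cputype != SK_CPU_TYPE_I386 &&
        cputype != SK_CPU_TYPE_ARM && cputype != SK_CPU_TYPE_ARM64) {
        return 0;
    }
    if (magic == SK_MH_MAGIC || magic == SK_MH_CIGAM) {
        return sizeof(struct sketch_mach_header);     /* 28 bytes */
    }
    return sizeof(struct sketch_mach_header_64);      /* 32 bytes */
}
```

A size of 0 is how the quoted code signals "skip this architecture" to headersFromBinary, which only keeps headers whose size is greater than 0.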
Once the preparation above is done, the binary can be modified.
Iterate over the thin_header array prepared above; each entry contains a mach_header struct.
for (uint32_t i = 0; i < numHeaders; i++) {
    struct thin_header macho = headers[i];
    ...
    else if ([package booleanValueForSignature:install]) {
        NSString *lc = [package firstObjectForSignature:command];
        uint32_t command = LC_LOAD_DYLIB;
        if (lc)
            command = COMMAND(lc);
        if (command == -1) {
            LOG("Invalid load command.");
            return OPErrorInvalidLoadCommand;
        }

        if (insertLoadEntryIntoBinary(dylibPath, binary, macho, command)) {
            LOG("Successfully inserted a %s command for %s", LC(command), CPU(macho.header.cputype));
        } else {
            LOG("Failed to insert a %s command for %s", LC(command), CPU(macho.header.cputype));
            return OPErrorInsertFailure;
        }
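According to the COMMAND macro below, because load was passed on the command line, the command is LC_LOAD_DYLIB, i.e. a command that loads a dynamic library. The dylib path to load, the binary, the Mach-O header information, and the load command are all passed to the insertLoadEntryIntoBinary function.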
#define COMMAND(str) ({ \
    uint32_t cmd = -1; \
    if ([str isEqualToString: @"reexport"]) \
        cmd = LC_REEXPORT_DYLIB; \
    else if ([str isEqualToString: @"weak"]) \
        cmd = LC_LOAD_WEAK_DYLIB; \
    else if ([str isEqualToString: @"upward"]) \
        cmd = LC_LOAD_UPWARD_DYLIB; \
    else if ([str isEqualToString: @"load"]) \
        cmd = LC_LOAD_DYLIB; \
    cmd; \
})
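The same name-to-command mapping can be written as an ordinary C function. The load command values are those from <mach-o/loader.h>, redefined locally so the sketch is self-contained; command_from_name and INVALID_COMMAND are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* Values as defined in <mach-o/loader.h>. */
#define LC_REQ_DYLD          0x80000000u
#define LC_LOAD_DYLIB        0xcu
#define LC_LOAD_WEAK_DYLIB   (0x18u | LC_REQ_DYLD)
#define LC_REEXPORT_DYLIB    (0x1fu | LC_REQ_DYLD)
#define LC_LOAD_UPWARD_DYLIB (0x23u | LC_REQ_DYLD)

/* Hypothetical sentinel: the macro's `uint32_t cmd = -1` wraps to this value. */
#define INVALID_COMMAND UINT32_MAX

/* Maps the -c argument to a load command constant. */
uint32_t command_from_name(const char *name) {
    if (strcmp(name, "reexport") == 0) return LC_REEXPORT_DYLIB;
    if (strcmp(name, "weak") == 0)     return LC_LOAD_WEAK_DYLIB;
    if (strcmp(name, "upward") == 0)   return LC_LOAD_UPWARD_DYLIB;
    if (strcmp(name, "load") == 0)     return LC_LOAD_DYLIB;
    return INVALID_COMMAND;
}
```

The caller's check `command == -1` works with an unsigned command because -1 is converted to uint32_t before the comparison, giving 0xFFFFFFFF on both sides. Naming the sentinel makes that intent explicit.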
Step into the insertLoadEntryIntoBinary function, which contains the logic for adding a dylib load command to the Mach-O file
BOOL insertLoadEntryIntoBinary(NSString *dylibPath, NSMutableData *binary, struct thin_header macho, uint32_t type) {
    // Guard against unsupported command types
    if (type != LC_REEXPORT_DYLIB &&
        type != LC_LOAD_WEAK_DYLIB &&
        type != LC_LOAD_UPWARD_DYLIB &&
        type != LC_LOAD_DYLIB) {
        LOG("Invalid load command type");
        return NO;
    }
    // parse load commands to see if our load command is already there
    uint32_t lastOffset = 0;
    // Has this dylib already been added?
    if (binaryHasLoadCommandForDylib(binary, dylibPath, &lastOffset, macho)) {
        // there already exists a load command for this payload so change the command type
        uint32_t originalType = *(uint32_t *)(binary.bytes + lastOffset);
        if (originalType != type) {
            LOG("A load command already exists for %s. Changing command type from %s to desired %s", dylibPath.UTF8String, LC(originalType), LC(type));
            [binary replaceBytesInRange:NSMakeRange(lastOffset, sizeof(type)) withBytes:&type];
        } else {
            LOG("Load command already exists");
        }

        return YES;
    }

    // create a new load command
    // Not present yet: build a new dylib load command
    unsigned int length = (unsigned int)sizeof(struct dylib_command) + (unsigned int)dylibPath.length;
    // In my case dylibPath.length = 51, so length = 75 and padding = 5
    unsigned int padding = (8 - (length % 8));

    // check if data we are replacing is null
    // Get the bytes currently sitting where the new command will end up
    NSData *occupant = [binary subdataWithRange:NSMakeRange(macho.header.sizeofcmds + macho.offset + macho.size,
                                                            length + padding)];

    // All operations in optool try to maintain a constant byte size of the executable
    // so we don't want to append new bytes to the binary (that would break the executable
    // since everything is offset-based - we'd have to go in and adjust every offset)
    // So instead take advantage of the huge amount of padding after the load commands
    // If the region is not empty we have run into the code content, which is a problem:
    // a command can only be added after the Load Commands and before the code
    if (strcmp([occupant bytes], "\0")) {
        NSLog(@"cannot inject payload into %s because there is no room", dylibPath.fileSystemRepresentation);
        return NO;
    }

    LOG("Inserting a %s command for architecture: %s", LC(type), CPU(macho.header.cputype));

    struct dylib_command command;
    struct dylib dylib;
    /*
     * A variable length string in a load command is represented by an lc_str
     * union. The strings are stored just after the load command structure and
     * the offset is from the start of the load command structure. The size
     * of the string is reflected in the cmdsize field of the load command.
     * Once again any padded bytes to bring the cmdsize field to a multiple
     * of 4 bytes must be zero.
     * (comment taken from lc_str)
     */
    // So the offset is the size of dylib_command
    dylib.name.offset = sizeof(struct dylib_command);
    dylib.timestamp = 2; // load commands I've seen use 2 for some reason
    dylib.current_version = 0;
    dylib.compatibility_version = 0;
    command.cmd = type;
    command.dylib = dylib;
    // Command size = content + alignment padding
    command.cmdsize = length + padding;

    unsigned int zeroByte = 0;
    NSMutableData *commandData = [NSMutableData data];
    // The dylib_command struct itself
    [commandData appendBytes:&command length:sizeof(struct dylib_command)];
    // The dylib path string
    [commandData appendData:[dylibPath dataUsingEncoding:NSASCIIStringEncoding]];
    // Zero padding
    [commandData appendBytes:&zeroByte length:padding];

    // remove enough null bytes to account of our inserted data
    // Passing 0 bytes with length 0 deletes the bytes in this range,
    // so the insert below leaves the total file size unchanged
    [binary replaceBytesInRange:NSMakeRange(macho.offset + macho.header.sizeofcmds + macho.size, commandData.length)
                      withBytes:0
                         length:0];
    // insert the data
    [binary replaceBytesInRange:NSMakeRange(lastOffset, 0) withBytes:commandData.bytes length:commandData.length];

    // fix the existing header
    // Update the header metadata
    macho.header.ncmds += 1;
    macho.header.sizeofcmds += command.cmdsize;

    // this is safe to do in 32bit because the 4 bytes after the header are still being put back
    // Write the updated mach header back
    [binary replaceBytesInRange:NSMakeRange(macho.offset, sizeof(macho.header)) withBytes:&macho.header];

    return YES;
}
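The size arithmetic in the function above (24-byte command struct + 51-byte path = 75, padded by 5 to 80) can be checked with a small C sketch that also lays out the command bytes. The struct mirrors the field layout of dylib_command from <mach-o/loader.h> in flattened form; sketch_dylib_command, padded_cmdsize, and build_dylib_command are hypothetical names, not optool functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Flattened layout of struct dylib_command: 6 x 4 = 24 bytes. */
struct sketch_dylib_command {
    uint32_t cmd;
    uint32_t cmdsize;
    uint32_t name_offset;            /* dylib.name.offset */
    uint32_t timestamp;
    uint32_t current_version;
    uint32_t compatibility_version;
};

/* Same arithmetic as the quoted code: padding is 8 - (length % 8),
 * which is 8 (not 0) when length is already a multiple of 8. */
uint32_t padded_cmdsize(size_t path_len) {
    uint32_t length = (uint32_t)sizeof(struct sketch_dylib_command) + (uint32_t)path_len;
    uint32_t padding = 8 - (length % 8);
    return length + padding;
}

/* Writes the command struct, the path, and zero padding into `out`.
 * Returns cmdsize, or 0 if `out` is too small. */
uint32_t build_dylib_command(uint32_t type, const char *path,
                             uint8_t *out, size_t cap) {
    size_t path_len = strlen(path);
    uint32_t cmdsize = padded_cmdsize(path_len);
    if (cmdsize > cap) {
        return 0;
    }

    struct sketch_dylib_command cmd;
    cmd.cmd = type;
    cmd.cmdsize = cmdsize;
    cmd.name_offset = (uint32_t)sizeof(cmd);  /* path starts right after the struct */
    cmd.timestamp = 2;
    cmd.current_version = 0;
    cmd.compatibility_version = 0;

    memset(out, 0, cmdsize);                  /* padding bytes stay zero */
    memcpy(out, &cmd, sizeof(cmd));
    memcpy(out + sizeof(cmd), path, path_len);
    return cmdsize;
}
```

Because the padding is never 0, at least one zero byte always follows the path, so the name is always NUL-terminated inside the command.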
Result
In the end the new command is placed together with the other LC_LOAD_DYLIB commands.
The insertLoadEntryIntoBinary function
// The insert position is lastOffset
[binary replaceBytesInRange:NSMakeRange(lastOffset, 0) withBytes:commandData.bytes length:commandData.length];
// lastOffset is computed earlier, during the duplicate check
if (binaryHasLoadCommandForDylib(binary, dylibPath, &lastOffset, macho)) {
...
// Only the LC_LOAD_DYLIB case updates loadOffset
case LC_LOAD_DYLIB: {
    struct dylib_command command = *(struct dylib_command *)(binary.bytes + binary.currentOffset);
    char *name = (char *)[[binary subdataWithRange:NSMakeRange(binary.currentOffset + command.dylib.name.offset, command.cmdsize - command.dylib.name.offset)] bytes];
    if ([@(name) isEqualToString:dylib]) {
        *lastOffset = (unsigned int)binary.currentOffset;
        return YES;
    }
    binary.currentOffset += size;
    loadOffset = (unsigned int)binary.currentOffset;
    break;
}
...
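// So lastOffset ends up at the end of the last LC_LOAD_DYLIB command
if (lastOffset != NULL)
    *lastOffset = loadOffset;
Insertion with optool is limited by the free space available, but the gap between the end of the Load Commands and the start of the code is normally large enough for a few extra commands.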
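The room check in the quoted code uses strcmp against "\0", and strcmp stops at the first NUL byte, so that test effectively inspects only the first byte of the region. A stricter variant checks every byte that the new command would occupy. The following C sketch assumes the caller computes start as macho.offset + macho.size + macho.header.sizeofcmds; region_is_zero is a hypothetical helper, not part of optool.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true only if bytes[start .. start + needed) lies inside the
 * buffer and every byte in it is zero, i.e. the padding after the load
 * commands is large enough to absorb a command of `needed` bytes. */
bool region_is_zero(const uint8_t *bytes, size_t total,
                    size_t start, size_t needed) {
    if (start > total || needed > total - start) {
        return false;                 /* region would run past the end */
    }
    for (size_t i = 0; i < needed; i++) {
        if (bytes[start + i] != 0) {
            return false;             /* hit real content, no room */
        }
    }
    return true;
}
```

For the example in this article, needed would be the 80-byte cmdsize of the new command.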
0x2 References
- Using va_list, va_start, and va_end
- replaceBytes(in:withBytes:length:)