Using ESM: ESMFold and Embeddings
Table of Contents
- Preface
- Part 0: Installation
- Part 1: Using ESMFold
- Part 2: ESM-2 Embeddings
- 1. Model loading and preparation
- 2. Reading the input data
- 3. Extracting residue-level representations
- 4. Generating sequence-level representations (mean pooling)
- 5. Visualizing the self-attention contact map
- 6. Potential issues and suggested improvements
- 7. Summary
- Conclusion
Preface
This post mainly follows the official repository: https://github.com/facebookresearch/esm/tree/main
Part 0: Installation
First clone the repository above, then cd into it and set up the environment:
```bash
conda env create -f environment.yml
conda activate esmfold
pip install "fair-esm[esmfold]"
pip install 'dllogger @ git+https://github.com/NVIDIA/dllogger.git'
pip install 'openfold @ git+https://github.com/aqlaboratory/openfold.git@4b41059694619831a7db195b7e0988fc4ff3a307'
```
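Before going further, it is worth checking that the environment imports cleanly and that a GPU is visible. The snippet below is a minimal sanity-check sketch of my own, not part of the upstream repository:

```python
# Minimal environment sanity check (illustrative sketch, not from the repo).
import torch
import esm

print("torch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("esm package loaded from:", esm.__file__)
```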
Many of the pretrained checkpoints are hosted on servers that can be hard to reach from some networks (a proxy/VPN may be needed), so it is often easier to download them on another machine and then copy them into the cache path.
For example, esm2_t33_650M_UR50D.pt can be downloaded from:
https://dl.fbaipublicfiles.com/fair-esm/models/esm2_t33_650M_UR50D.pt
After downloading it from the URL above, upload the file to your server and move it into the torch hub checkpoint cache:

```bash
mv esm2_t33_650M_UR50D.pt ~/.cache/torch/hub/checkpoints/
```
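With the weights in place, the usual `esm.pretrained` loader should pick up the local file instead of re-downloading it. One caveat, stated as an assumption: for ESM-2 models the loader also expects a small companion file (esm2_t33_650M_UR50D-contact-regression.pt, available from the same URL pattern) in the same directory, so if loading still tries to download, fetch that file the same way. A minimal loading sketch:

```python
# Load ESM-2 from the local torch hub cache (sketch; assumes both
# esm2_t33_650M_UR50D.pt and its -contact-regression.pt companion
# already sit in ~/.cache/torch/hub/checkpoints/).
import esm

model, alphabet = esm.pretrained.esm2_t33_650M_UR50D()
model.eval()  # disable dropout for deterministic embeddings
```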
Part 1: Using ESMFold
Test script (adapted from the repository README):
```python
import torch
import esm

model = esm.pretrained.esmfold_v1()
model = model.eval().cuda()

# Optionally, uncomment to set a chunk size for axial attention. This can help reduce memory.
# Lower sizes will have lower memory requirements at the cost of increased speed.
# model.set_chunk_size(128)

sequence = "MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG"
# Multimer prediction can be done with chains separated by ':'

with torch.no_grad():
    output = model.infer_pdb(sequence)

with open("result.pdb", "w") as f:
    f.write(output)
```
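The upstream README additionally shows how to read the prediction back and check its quality: ESMFold writes per-residue pLDDT into the B-factor column of the PDB file, so the mean B-factor is the mean pLDDT:

```python
import biotite.structure.io as bsio

# Load the predicted structure and average the B-factor column,
# which ESMFold uses to store per-residue pLDDT.
struct = bsio.load_structure("result.pdb", extra_fields=["b_factor"])
print(struct.b_factor.mean())  # ~88.3 for this example sequence, per the README
```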