Building a RAG Knowledge Base with Pgvector + R2R

news 2025/7/11 17:05:15

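Background

R2R is an open-source AI RAG framework written in Python. It integrates closely with the PostgreSQL stack and has modest resource requirements, which matters here because my MacBook Air M1 has only 8 GB of RAM. Both traits make it a good fit for deploying a private, local AI RAG application.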

Resource Recommendations

When running R2R, we recommend:

At least 4 vCPU cores
8+GB of RAM (16GB preferred)
50 GB + 4x raw data size (size of data to be ingested after converting to TXT) of disk space
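
As a quick worked example of the disk-space rule of thumb quoted above, the sketch below applies "50 GB + 4x raw data size". The function name `recommended_disk_gb` and its parameters are hypothetical helpers for illustration, not part of R2R.

```python
def recommended_disk_gb(raw_text_gb: float,
                        base_gb: float = 50.0,
                        multiplier: float = 4.0) -> float:
    """Apply the "50 GB + 4x raw data size" rule of thumb.

    raw_text_gb is the size of the data after conversion to plain text.
    This is a hypothetical helper, not an R2R API.
    """
    if raw_text_gb < 0:
        raise ValueError("raw_text_gb must be non-negative")
    return base_gb + multiplier * raw_text_gb


if __name__ == "__main__":
    # Print the estimate for a few corpus sizes.
    for size in (0.5, 5, 25):
        print(f"{size} GB of text -> {recommended_disk_gb(size):.0f} GB of disk")
```

For a small personal knowledge base the fixed 50 GB term dominates, so the raw data size only starts to matter once the corpus reaches tens of gigabytes.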

Installing R2R Light

First, install the R2R CLI with the additional light dependencies:

hbu@Pauls-MacBook-Air dragon-flight % pip install 'r2r[core,ingestion-bundle]'

The core and ingestion-bundle dependencies, combined with a Postgres database, provide the necessary components to deploy a user-facing R2R application into production.
If you need advanced features like orchestration or parsing with Unstructured.io, then refer to the full installation.

Ref: Configure Postgres database in R2R

Environment Setup

This mainly concerns the vector database.

R2R requires connections to various services. Set up the following environment variables based on your needs:

# Set Postgres+pgvector settings
export R2R_POSTGRES_USER=r2r
export R2R_POSTGRES_PASSWORD="*********"
export R2R_POSTGRES_HOST=127.0.0.1
export R2R_POSTGRES_PORT=5432
export R2R_POSTGRES_DBNAME=hbu
export R2R_PROJECT_NAME="dragon"
export R2R_POSTGRES_MAX_CONNECTIONS=80
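
Before starting the server, it can help to confirm that the Postgres variables are actually set in the current shell. The following is a minimal pre-flight sketch that reads the `R2R_POSTGRES_*` variables and assembles a connection URL you could hand to `psql` or another client. The helper `postgres_dsn` is hypothetical and is not how R2R itself builds its connection.

```python
import os
from urllib.parse import quote

# Variables taken from the export block above.
REQUIRED = (
    "R2R_POSTGRES_USER",
    "R2R_POSTGRES_PASSWORD",
    "R2R_POSTGRES_HOST",
    "R2R_POSTGRES_PORT",
    "R2R_POSTGRES_DBNAME",
)


def postgres_dsn(env=None) -> str:
    """Build a postgresql:// URL from the R2R_POSTGRES_* variables.

    Hypothetical helper for a pre-flight check; raises if anything is unset.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))

    # Percent-encode credentials so characters such as '@' do not break the URL.
    user = quote(env["R2R_POSTGRES_USER"], safe="")
    password = quote(env["R2R_POSTGRES_PASSWORD"], safe="")
    host = env["R2R_POSTGRES_HOST"]
    port = int(env["R2R_POSTGRES_PORT"])  # fails early on a non-numeric port
    dbname = quote(env["R2R_POSTGRES_DBNAME"], safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{dbname}"
```

Calling `postgres_dsn()` with no arguments reads the live environment. Passing a plain dict makes the check easy to try without touching the shell.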

Postgres+pgvector

With R2R you can connect to your own instance of Postgres+pgvector or a remote cloud instance. Refer here for detailed documentation on configuring Postgres inside R2R.

ALTER USER r2r WITH PASSWORD 'password';

R2R Configuration

R2R uses a TOML configuration file for managing settings, which you can read about here. For local setup, we’ll use the default local_llm configuration. This can be customized to your needs by setting up a standalone project.

R2R Configure

Configuration contents

hbu@Pauls-MacBook-Air dragon-flight % cat local_llm.toml

[completion]
provider = "litellm"
concurrent_request_limit = 1

[completion.generation_config]
model = "qwen2:0.5b"
temperature = 0.1
top_p = 1
max_tokens_to_sample = 1_024
stream = false
add_generation_kwargs = { }

[database]
provider = "postgres"
user = "hbu"
password = "hbu"
host = "localhost"
port = "5432"
db_name = "hbu"
your_project_name = "dragon_flight"

[embedding]
provider = "ollama"
base_model = "mxbai-embed-large"
base_dimension = 1_024
batch_size = 32
add_title_as_prefix = true
concurrent_request_limit = 32

[ingestion]
excluded_parsers = [ "mp4" ]

Running R2R

Local LLM

r2r serve --config-path=./ai/R2R/local_llm.toml

Check the service status

http://localhost:7272/v3/health
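
The same check can be scripted rather than opened in a browser. Here is a minimal sketch using only the standard library. It assumes the health URL shown above and treats any HTTP 200 response as healthy, without inspecting the response body, whose exact shape is not covered in this post. The function name `is_healthy` is a hypothetical helper.

```python
import urllib.request


def is_healthy(url: str = "http://localhost:7272/v3/health",
               timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers with HTTP 200, else False.

    Connection errors, timeouts, and HTTP error statuses all count as
    unhealthy (they are OSError subclasses in urllib).
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False
```

Calling `is_healthy()` in a short retry loop is a simple way to wait for `r2r serve` to finish starting before importing documents.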

Prepare the local LLM

hbu@Pauls-MacBook-Air dragon-flight % ollama list
NAME                 	ID          	SIZE  	MODIFIED     
qwen2:1.5b           	f6daf2b25194	934 MB	4 months ago	
deepseek-coder:latest	3ddd2d3fc8d2	776 MB	4 months ago	
qwen2:0.5b           	6f48b936a09f	352 MB	4 months ago	
llama2-chinese:latest	cee11d703eee	3.8 GB	4 months ago	
hbu@Pauls-MacBook-Air dragon-flight % ollama run qwen2:0.5b 
>>> Send a message (/? for help)
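
The configuration refers to two Ollama models: `qwen2:0.5b` for completion and `mxbai-embed-large` for embeddings. The listing above does not show `mxbai-embed-large`, though the listing may simply predate the pull, so it is worth confirming that both are installed. The sketch below compares a list of required models against what Ollama reports through its `/api/tags` endpoint on the default port 11434. The helper names are hypothetical, and a name without a tag is assumed to mean `:latest`.

```python
import json
import urllib.request


def normalize(name: str) -> str:
    """Treat a model name without a tag as '<name>:latest'."""
    return name if ":" in name else f"{name}:latest"


def missing_models(installed: list[str], required: list[str]) -> list[str]:
    """Return the required models that are not in the installed list."""
    have = {normalize(n) for n in installed}
    return [m for m in required if normalize(m) not in have]


def installed_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Ask a running Ollama server which models it has (GET /api/tags)."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        payload = json.load(resp)
    return [m["name"] for m in payload.get("models", [])]


if __name__ == "__main__":
    # Model names taken from the listing and the config in this post.
    listed = ["qwen2:1.5b", "deepseek-coder:latest",
              "qwen2:0.5b", "llama2-chinese:latest"]
    print(missing_models(listed, ["qwen2:0.5b", "mxbai-embed-large"]))
```

Any model reported as missing can be fetched with `ollama pull <name>` before starting R2R.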

Check the Postgres vector database


hbu=# \dn
            List of schemas
       Name        |       Owner       
-------------------+-------------------
 dragon            | hbu
 dragon-flight     | hbu
 public            | pg_database_owner
 r2r_dragon_flight | hbu
(4 rows)

hbu=# \dn+
                                         List of schemas
       Name        |       Owner       |           Access privileges            |      Description       
-------------------+-------------------+----------------------------------------+------------------------
 dragon            | hbu               |                                        | 
 dragon-flight     | hbu               |                                        | 
 public            | pg_database_owner | pg_database_owner=UC/pg_database_owner+| standard public schema
                   |                   | =U/pg_database_owner                   | 
 r2r_dragon_flight | hbu               |                                        | 
(4 rows)

Importing the Sample Data

Import the data

# A problem in the R2R TOML config file caused the data import to fail
hbu@Pauls-MacBook-Air dragon-flight % r2r documents create-samples
/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/pydantic/_internal/_config.py:345: UserWarning: Valid config keys have changed in V2:
* 'fields' has been removed
  warnings.warn(message, UserWarning)
Ingesting file: /var/folders/mh/l6_b7c0x7m3bq41r5m25snqc0000gn/T/tmpglpon8br_pg_essay_3.html
Request failed: [Errno 8] nodename nor servname provided, or not known

## After starting the R2R service with dragon-flight/ai/R2R/alpha_llm.toml, the data import succeeded
hbu@Pauls-MacBook-Air R2R % r2r documents create py/core/examples/data/got.txt
/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/pydantic/_internal/_config.py:345: UserWarning: Valid config keys have changed in V2:
* 'fields' has been removed
  warnings.warn(message, UserWarning)
Processing file 1/1: py/core/examples/data/got.txt
{
  "results": {
    "message": "Document created and ingested successfully.",
    "task_id": null,
    "document_id": "6e134818-b245-571d-8073-6581c0a175d8"
  }
}
----------------------------------------
Time taken: 27.58 seconds
Processed 1 files successfully.
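
A note on the failed first attempt: `[Errno 8] nodename nor servname provided, or not known` is the message Python's `socket.getaddrinfo` produces on macOS when a host name cannot be resolved. One thing worth ruling out, then, is that a host name in the config or environment does not resolve. This is only one possible cause, and the post does not pin down which setting was at fault. The sketch below is a small resolution check with a hypothetical helper name, `can_resolve`.

```python
import socket


def can_resolve(host: str, port: int = 7272) -> bool:
    """Return True if the host name resolves to at least one address.

    socket.gaierror is the exception behind the "[Errno 8] nodename nor
    servname provided, or not known" message on macOS.
    """
    try:
        socket.getaddrinfo(host, port)
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    # Hosts that appear in this post's configuration.
    for host in ("localhost", "127.0.0.1"):
        print(host, can_resolve(host))
```

If a host name from the TOML file or the `R2R_POSTGRES_HOST` variable comes back `False` here, fixing that name is a reasonable first step before retrying the import.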