RWKV-4

RWKV (pronounced RwaKuv) is an RNN-based language model with GPT-level LLM performance, and it can also be trained directly like a GPT transformer (i.e., training is parallelizable).

It combines the best of RNNs and transformers: great performance, fast inference, fast training, low VRAM usage, an "infinite" context length, and free text embeddings. Moreover, it is 100% attention-free and an LFAI project.

Installation and Setup

  • Install the Python rwkv and tokenizer packages
```bash
pip install rwkv tokenizer
```
Recommended VRAM per model size and precision:

| Model | 8bit  | bf16/fp16 | fp32  |
|-------|-------|-----------|-------|
| 14B   | 16GB  | 28GB      | >50GB |
| 7B    | 8GB   | 14GB      | 28GB  |
| 3B    | 2.8GB | 6GB       | 12GB  |
| 1b5   | 1.3GB | 3GB       | 6GB   |

See the rwkv pip page for more information about strategies, including streaming and CUDA support.
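
The `strategy` string selects the device and precision used for the loaded weights. A minimal sketch of a few common choices, assuming you have already downloaded a model and the tokenizer file (the paths below are placeholders):

```python
from langchain_community.llms import RWKV

# The strategy string follows the `rwkv` package's syntax; the paths below are
# placeholders -- point them at your own model and tokenizer files.
model = RWKV(
    model="./models/your-rwkv-model.pth",  # placeholder path
    tokens_path="./rwkv/20B_tokenizer.json",
    strategy="cuda fp16",        # whole model on GPU in fp16 (see VRAM table above)
    # strategy="cuda fp16i8",    # 8-bit quantized weights on GPU, lower VRAM
    # strategy="cpu fp32",       # CPU only, fp32
)
```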

Usage

RWKV

To use the RWKV wrapper, you need to provide the path to the pre-trained model file and the tokenizer's configuration.

```python
from langchain_community.llms import RWKV

# Test the model


def generate_prompt(instruction, input=None):
    """Build a Raven-style instruction prompt."""
    if input:
        return f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

# Instruction:
{instruction}

# Input:
{input}

# Response:
"""
    else:
        return f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.

# Instruction:
{instruction}

# Response:
"""


model = RWKV(
    model="./models/RWKV-4-Raven-3B-v7-Eng-20230404-ctx4096.pth",
    strategy="cpu fp32",
    tokens_path="./rwkv/20B_tokenizer.json",
)
response = model.invoke(generate_prompt("Once upon a time, "))
```
API Reference: RWKV
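
Since RWKV is a standard LangChain LLM, it also composes with other components such as prompt templates. The sketch below is illustrative only (the template text and instruction are made up); it reuses the same model and tokenizer paths as above:

```python
from langchain_core.prompts import PromptTemplate
from langchain_community.llms import RWKV

# Illustrative example: drive the model through a prompt template with LCEL.
prompt = PromptTemplate.from_template(
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "# Instruction:\n{instruction}\n\n# Response:\n"
)

model = RWKV(
    model="./models/RWKV-4-Raven-3B-v7-Eng-20230404-ctx4096.pth",
    strategy="cpu fp32",
    tokens_path="./rwkv/20B_tokenizer.json",
)

chain = prompt | model  # pipe the formatted prompt into the LLM
response = chain.invoke({"instruction": "Summarize what RWKV is in one sentence."})
print(response)
```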
