Tutorial: How to Run Alibaba Qwen's QwQ-32B Effectively

2025-03-08 banq

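Alibaba's Qwen team has released QwQ-32B, a reasoning model whose performance on many benchmarks is comparable to DeepSeek-R1. However, people have been running into infinite generations, heavy repetition, token issues, and fine-tuning issues. We hope this guide helps debug and fix most of them!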

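If you are getting endless repetitions with QwQ-32B, you are not alone! I made a guide to help debug things! I also uploaded dynamic 4-bit quants and other GGUFs! Guide link: https://docs.unsloth.ai/basics/tutorial-how-to-run-qwq-32b-effectively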

Key points:

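Using a repetition penalty to counteract looping can itself cause looping!
The Qwen team confirmed that for long context (128K) you should use YaRN.
When using a repetition penalty, add --samplers "top_k;top_p;min_p;temperature;dry;typ_p;xtc" to stop infinite generations.
Using min_p = 0.1 helps remove low-probability tokens.
Try --repeat-penalty 1.1 --dry-multiplier 0.5 to reduce repetition.
Use --temp 0.6 --top-k 40 --top-p 0.95, as recommended by the Qwen team.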
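
To make the min_p point concrete, here is a minimal sketch of min-p filtering, assuming the usual definition: a token is kept only if its probability is at least min_p times the probability of the most likely token. The probabilities are made up for illustration, and the function is not taken from llama.cpp.

```python
def min_p_filter(probs, min_p=0.1):
    """Keep tokens with probability >= min_p * (highest probability), then renormalize."""
    threshold = min_p * max(probs)
    kept = {i: p for i, p in enumerate(probs) if p >= threshold}
    total = sum(kept.values())
    return {i: p / total for i, p in kept.items()}

# Illustrative probabilities for five candidate tokens (not real model output)
probs = [0.60, 0.25, 0.10, 0.04, 0.01]
filtered = min_p_filter(probs, min_p=0.1)

# The threshold is 0.1 * 0.60 = 0.06, so the two least likely tokens are dropped
print(sorted(filtered))
```

With min_p = 0.1 the cutoff scales with the model's confidence: when the top token is very likely, more of the tail is removed.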

Below are the 9 steps the tutorial walks through (see the guide linked above for full details):

1. Get the latest llama.cpp from GitHub. You can also follow the build instructions below. If you don't have a GPU or only want CPU inference, change -DGGML_CUDA=ON to -DGGML_CUDA=OFF.

apt-get update
apt-get install build-essential cmake curl libcurl4-openssl-dev -y
git clone https://github.com/ggerganov/llama.cpp
cmake llama.cpp -B llama.cpp/build \
    -DBUILD_SHARED_LIBS=ON -DGGML_CUDA=ON -DLLAMA_CURL=ON
cmake --build llama.cpp/build --config Release -j --clean-first --target llama-quantize llama-cli llama-gguf-split
cp llama.cpp/build/bin/llama-* llama.cpp
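
If you script the build, the CUDA switch can be chosen programmatically. The sketch below only assembles the cmake configure command from the step above. The helper name cmake_configure_args is hypothetical, and treating "nvidia-smi is on PATH" as "a CUDA GPU is available" is an assumption, not an official detection method.

```python
import shutil

def cmake_configure_args(use_cuda, source_dir="llama.cpp"):
    """Return the cmake configure command from the build step as an argument list."""
    cuda_flag = "ON" if use_cuda else "OFF"
    return [
        "cmake", source_dir,
        "-B", f"{source_dir}/build",
        "-DBUILD_SHARED_LIBS=ON",
        f"-DGGML_CUDA={cuda_flag}",
        "-DLLAMA_CURL=ON",
    ]

# Assumption: finding nvidia-smi on PATH is treated as "a CUDA GPU is available".
has_gpu = shutil.which("nvidia-smi") is not None
print(" ".join(cmake_configure_args(use_cuda=has_gpu)))
```

The list can be passed to subprocess.run once you are happy with the printed command.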

2. Download the model (after running pip install huggingface_hub hf_transfer). You can choose Q4_K_M or another quantized version, or BF16 for full precision. More versions are available at: https://huggingface.co/unsloth/QwQ-32B-GGUF

# !pip install huggingface_hub hf_transfer
import os
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
from huggingface_hub import snapshot_download
snapshot_download(
    repo_id = "unsloth/QwQ-32B-GGUF",
    local_dir = "unsloth-QwQ-32B-GGUF",
    allow_patterns = ["*Q4_K_M*"], # For Q4_K_M
)
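
The allow_patterns argument takes glob-style patterns. As a rough illustration of how "*Q4_K_M*" narrows the download, the sketch below applies the same kind of matching with Python's fnmatch module. Apart from QwQ-32B-Q4_K_M.gguf, the file names are hypothetical, so check the repository for the real list.

```python
from fnmatch import fnmatchcase

# File names for illustration only; all but the first are hypothetical.
candidate_files = [
    "QwQ-32B-Q4_K_M.gguf",
    "QwQ-32B-Q2_K.gguf",   # hypothetical
    "QwQ-32B-BF16.gguf",   # hypothetical
    "README.md",           # hypothetical
]

def select_files(names, allow_patterns):
    """Return the names matching at least one glob-style pattern."""
    return [n for n in names if any(fnmatchcase(n, pat) for pat in allow_patterns)]

selected = select_files(candidate_files, ["*Q4_K_M*"])
print(selected)
```

To fetch a different quantization, swap the pattern (for example "*BF16*") in the snapshot_download call.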

3. Run Unsloth's Flappy Bird test, which saves the output to Q4_K_M_yes_samplers.txt

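4. Edit --threads 32 for the number of CPU threads, --ctx-size 16384 for the context length, and --n-gpu-layers 99 for how many layers are offloaded to the GPU. Try lowering --n-gpu-layers if your GPU runs out of memory, or remove it if you only have CPU inference.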

5. We use --repeat-penalty 1.1 and --dry-multiplier 0.5, both of which you can adjust.

./llama.cpp/llama-cli \
    --model unsloth-QwQ-32B-GGUF/QwQ-32B-Q4_K_M.gguf \
    --threads 32 \
    --ctx-size 16384 \
    --n-gpu-layers 99 \
    --seed 3407 \
    --prio 2 \
    --temp 0.6 \
    --repeat-penalty 1.1 \
    --dry-multiplier 0.5 \
    --min-p 0.1 \
    --top-k 40 \
    --top-p 0.95 \
    -no-cnv \
    --samplers "top_k;top_p;min_p;temperature;dry;typ_p;xtc" \
    --prompt "<|im_start|>user\nCreate a Flappy Bird game in Python. You must include these things:\n1. You must use pygame.\n2. The background color should be randomly chosen and is a light shade. Start with a light blue color.\n3. Pressing SPACE multiple times will accelerate the bird.\n4. The bird's shape should be randomly chosen as a square, circle or triangle. The color should be randomly chosen as a dark color.\n5. Place on the bottom some land colored as dark brown or yellow chosen randomly.\n6. Make a score shown on the top right side. Increment if you pass pipes and don't hit them.\n7. Make randomly spaced pipes with enough space. Color them randomly as dark green or light brown or a dark gray shade.\n8. When you lose, show the best score. Make the text inside the screen. Pressing q or Esc will quit the game. Restarting is pressing SPACE again.\nThe final game should be inside a markdown section in Python. Check your code for errors and fix them before the final markdown section.<|im_end|>\n<|im_start|>assistant\n<think>\n"  \
        2>&1 | tee Q4_K_M_yes_samplers.txt
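
For repeated experiments it can be easier to assemble the command above in Python. This is a minimal sketch that only builds the argument list from the flags shown in this step. The function name and its keyword arguments are hypothetical, and nothing is executed.

```python
import shlex

SAMPLER_ORDER = "top_k;top_p;min_p;temperature;dry;typ_p;xtc"

def build_llama_cli_args(prompt,
                         model_path="unsloth-QwQ-32B-GGUF/QwQ-32B-Q4_K_M.gguf",
                         threads=32, ctx_size=16384, n_gpu_layers=99,
                         repeat_penalty=1.1, dry_multiplier=0.5,
                         use_sampler_fix=True):
    """Assemble the llama-cli invocation from this tutorial as an argument list."""
    args = [
        "./llama.cpp/llama-cli",
        "--model", model_path,
        "--threads", str(threads),
        "--ctx-size", str(ctx_size),
    ]
    if n_gpu_layers is not None:  # pass None for CPU-only inference
        args += ["--n-gpu-layers", str(n_gpu_layers)]
    args += [
        "--seed", "3407",
        "--prio", "2",
        "--temp", "0.6",
        "--repeat-penalty", str(repeat_penalty),
        "--dry-multiplier", str(dry_multiplier),
        "--min-p", "0.1",
        "--top-k", "40",
        "--top-p", "0.95",
        "-no-cnv",
    ]
    if use_sampler_fix:  # the sampler-order fix discussed in the key points
        args += ["--samplers", SAMPLER_ORDER]
    args += ["--prompt", prompt]
    return args

args = build_llama_cli_args("<|im_start|>user\nHello<|im_end|>\n<|im_start|>assistant\n<think>\n")
print(shlex.join(args))
```

Setting use_sampler_fix=False gives the variant without --samplers, which makes it easy to compare both runs with otherwise identical settings.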

The full input, taken from our 1.58-bit blog post (https://unsloth.ai/blog/deepseekr1-dynamic), is:

<|im_start|>user
Create a Flappy Bird game in Python. You must include these things:
1. You must use pygame.
2. The background color should be randomly chosen and is a light shade. Start with a light blue color.
3. Pressing SPACE multiple times will accelerate the bird.
4. The bird's shape should be randomly chosen as a square, circle or triangle. The color should be randomly chosen as a dark color.
5. Place on the bottom some land colored as dark brown or yellow chosen randomly.
6. Make a score shown on the top right side. Increment if you pass pipes and don't hit them.
7. Make randomly spaced pipes with enough space. Color them randomly as dark green or light brown or a dark gray shade.
8. When you lose, show the best score. Make the text inside the screen. Pressing q or Esc will quit the game. Restarting is pressing SPACE again.
The final game should be inside a markdown section in Python. Check your code for errors and fix them before the final markdown section.<|im_end|>
<|im_start|>assistant
<think>
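
The input above follows a fixed pattern: the user turn is wrapped in <|im_start|> and <|im_end|>, and the assistant turn is opened with <think>. A small sketch of a helper that wraps any user message the same way (the function name build_qwq_prompt is hypothetical):

```python
def build_qwq_prompt(user_message):
    """Wrap a user message in the chat format shown above and open the thinking section."""
    return (
        "<|im_start|>user\n"
        f"{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
        "<think>\n"
    )

prompt = build_qwq_prompt("Create a Flappy Bird game in Python.")
print(prompt)
```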

After removing the thinking section, the beginning and end of the final Python output look like this:

import pygame
import random
import sys

pygame.init()
...  # continues

class Bird:
    def __init__(self):
        ...  # continues

def main():
    best_score = 0
    current_score = 0
    game_over = False
    pipes = []
    first_time = True  # Track first game play

    # Initial setup
    background_color = (173, 216, 230)  # Light blue initially
    land_color = random.choice(land_colors)
    bird = Bird()

    while True:
        for event in pygame.event.get():
            ...  # continues

        if not game_over:
            # Update bird and pipes
            bird.update()
            ...  # continues

        # Drawing
        ...  # continues
        pygame.display.flip()
        clock.tick(60)

if __name__ == "__main__":
    main()
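
Removing the thinking section by hand gets tedious across several runs. The sketch below pulls the last Python code fence out of a saved output. It assumes the model closes its reasoning with a </think> tag and wraps the final game in a Python-tagged fence, as the prompt requests. The helper name is hypothetical.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep this example readable

def extract_final_python(output):
    """Return the last Python-fenced block after the thinking section, or None."""
    # rpartition keeps the whole text when no closing think tag is present
    answer = output.rpartition("</think>")[2]
    pattern = FENCE + r"python\n(.*?)" + FENCE
    blocks = re.findall(pattern, answer, flags=re.DOTALL)
    return blocks[-1] if blocks else None

# Tiny stand-in for the contents of Q4_K_M_yes_samplers.txt
sample = (
    "<think>\nLet me draft: " + FENCE + "python\nx = 0\n" + FENCE + "\n</think>\n"
    "Here is the game:\n" + FENCE + "python\nimport pygame\n" + FENCE + "\n"
)
code = extract_final_python(sample)
print(code)
```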

6. Run it, and we get a working game!

7. Now try the same thing without our fix! Remove --samplers "top_k;top_p;min_p;temperature;dry;typ_p;xtc"; this run saves its output to Q4_K_M_no_samplers.txt

./llama.cpp/llama-cli \
    --model unsloth-QwQ-32B-GGUF/QwQ-32B-Q4_K_M.gguf \
    --threads 32 \
    --ctx-size 16384 \
    --n-gpu-layers 99 \
    --seed 3407 \
    --prio 2 \
    --temp 0.6 \
    --repeat-penalty 1.1 \
    --dry-multiplier 0.5 \
    --min-p 0.1 \
    --top-k 40 \
    --top-p 0.95 \
    -no-cnv \
    --prompt "<|im_start|>user\nCreate a Flappy Bird game in Python. You must include these things:\n1. You must use pygame.\n2. The background color should be randomly chosen and is a light shade. Start with a light blue color.\n3. Pressing SPACE multiple times will accelerate the bird.\n4. The bird's shape should be randomly chosen as a square, circle or triangle. The color should be randomly chosen as a dark color.\n5. Place on the bottom some land colored as dark brown or yellow chosen randomly.\n6. Make a score shown on the top right side. Increment if you pass pipes and don't hit them.\n7. Make randomly spaced pipes with enough space. Color them randomly as dark green or light brown or a dark gray shade.\n8. When you lose, show the best score. Make the text inside the screen. Pressing q or Esc will quit the game. Restarting is pressing SPACE again.\nThe final game should be inside a markdown section in Python. Check your code for errors and fix them before the final markdown section.<|im_end|>\n<|im_start|>assistant\n<think>\n"  \
        2>&1 | tee Q4_K_M_no_samplers.txt

You will get some looping, but the bigger problem is incorrect Python and many other issues. For example, the code below looks correct but is wrong! Line 39, pipes.clear(), fails with NameError: name 'pipes' is not defined. Did you forget to import 'pipes'?

import pygame
import random

pygame.init()

# Constants
WIDTH, HEIGHT = 800, 600
GROUND_HEIGHT = 20
GRAVITY = 0.7
PIPE_SPEED = -3
BIRD_SIZE = 45
MIN_GAP = 130
MAX_GAP = 200
PIPE_COLORS = [(0, 96, 0), (205, 133, 63), (89, 97, 107)]
DARK_BROWN = (94, 72, 4)
YELLOW = (252, 228, 6)

screen = pygame.display.set_mode((WIDTH, HEIGHT))
clock = pygame.time.Clock()

def random_light_color():
    return (
        random.randint(180, 230),
        random.randint(190, 300),
        random.randint(250, 255)
    )

def reset_game():
    global bird_x, bird_y
    global pipes, score
    global background_color, land_color
    global bird_shape, bird_color

    # Bird properties
    bird_x = WIDTH * 0.3
    bird_y = HEIGHT // 2
    bird_vel = -5  # Initial upward thrust

    pipes.clear() ### <<< NameError: name 'pipes' is not defined. Did you forget to import 'pipes'?
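
The failure above can be reproduced in isolation: declaring a name with global does not create it, so calling a method on it before any assignment raises NameError. The sketch below is a stripped-down stand-in for the generated code, with one possible fix. The init_game_state helper is hypothetical.

```python
def reset_game():
    global pipes
    pipes.clear()  # fails if no module-level 'pipes' has been assigned yet

def init_game_state():
    global pipes
    pipes = []  # one possible fix: create the list before it is ever cleared

try:
    reset_game()
    first_error = None
except NameError as exc:
    first_error = str(exc)

print(first_error)

init_game_state()
pipes.append("pipe")
reset_game()  # works now and leaves the list empty
```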

8. With --repeat-penalty 1.5 it gets worse and more obvious: the syntax is outright incorrect.

import pygame
from random import randint  # For generating colors/shapes/positions randomly 
pygame.init()

# Constants:
WIDTH, HEIGHT =456 ,702   #
BACKGROUND_COLOR_LIGHTS=['lightskyblue']
GAP_SIZE=189           #

BIRD_RADIUS=3.  
PIPE_SPEED=- ( )    ? 
class Game():
def __init__(self):
        self.screen_size=( )

def reset_game_vars():
    global current_scor e
   # set to zero and other initial states.

# Main game loop:
while running :
     for event in pygame.event.get() : 
        if quit ... etc

pygame.quit()
print("Code is simplified. Due time constraints, full working version requires further implementation.")
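
Output this broken can be flagged automatically before anyone tries to run it. A minimal sketch using the standard ast module, where the helper name syntax_error_of is hypothetical and the broken snippet is copied from the output above:

```python
import ast

def syntax_error_of(source):
    """Return a short description of the first syntax error, or None if the code parses."""
    try:
        ast.parse(source)
    except SyntaxError as exc:
        return f"line {exc.lineno}: {exc.msg}"
    return None

broken = "def reset_game_vars():\n    global current_scor e\n"
valid = "def reset_game_vars():\n    global current_score\n    current_score = 0\n"

print(syntax_error_of(broken))
print(syntax_error_of(valid))
```

A parse check only catches syntax problems. It would not catch the NameError from the previous step, which only appears at run time.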

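9. You might wonder whether the problem is just Q4_K_M. BF16, i.e. full precision, should work fine, right? Incorrect: the output again fails if we do not use our fix of --samplers "top_k;top_p;min_p;temperature;dry;typ_p;xtc" when using a repetition penalty.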
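
To build some intuition for why a strong repetition penalty hurts code in particular, here is a sketch of the commonly used formulation: logits of recently seen tokens are divided by the penalty when positive and multiplied by it when negative. This is an assumption about the general technique, not a copy of llama.cpp's implementation, and the tokens and logits are made up.

```python
def apply_repeat_penalty(logits, recent_tokens, penalty):
    """Push down the logits of tokens that appeared recently."""
    adjusted = dict(logits)
    for tok in set(recent_tokens):
        if tok in adjusted:
            value = adjusted[tok]
            adjusted[tok] = value / penalty if value > 0 else value * penalty
    return adjusted

# Illustrative logits: code legitimately repeats tokens such as "self" and "("
logits = {"self": 4.0, "(": 3.0, "banana": 1.0}
recent = ["self", "(", "self"]

mild = apply_repeat_penalty(logits, recent, 1.1)
harsh = apply_repeat_penalty(logits, recent, 1.5)
print(mild)
print(harsh)
```

With a penalty of 1.5 the tokens that code has to repeat lose much of their lead over unrelated tokens, which is one plausible reason the syntax degrades in step 8.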
