Fine-Tuning the DeepSeek R1 Reasoning Model on Free Resources
DeepSeek has shaken up the AI field, challenging OpenAI's dominance with a new series of advanced reasoning models. Best of all, these models are free to use and available to everyone.
In this tutorial, we fine-tune DeepSeek-R1-Distill-Llama-8B on the medical chain-of-thought dataset from Hugging Face. This distilled DeepSeek-R1 model was created by fine-tuning Llama 3.1 8B on data generated with DeepSeek-R1, and it shows reasoning capabilities similar to those of the original model.
If you are new to LLMs and fine-tuning, I strongly recommend the Introduction to LLMs in Python course [1].
Introducing DeepSeek R1
DeepSeek AI, a Chinese AI company, has open-sourced its first-generation reasoning models, DeepSeek-R1 and DeepSeek-R1-Zero, which rival OpenAI's o1 on reasoning tasks such as math, coding, and logic. See DeepSeek's official website [20] for more details.
DeepSeek-R1-Zero
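DeepSeek-R1-Zero is the first open-source model trained entirely with large-scale reinforcement learning [2] rather than supervised fine-tuning. This approach lets the model explore chain-of-thought reasoning [3] on its own, solve complex problems, and keep refining its output. It does have some problems, though: it repeats reasoning steps, its output can be hard to read, and it sometimes mixes languages, all of which hurt clarity and usability.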
DeepSeek-R1
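DeepSeek-R1 was introduced to address the shortcomings of DeepSeek-R1-Zero. It adds some initial (cold-start) data before reinforcement learning, which gives a better foundation for both reasoning and non-reasoning tasks. This multi-stage training brings the model to a level comparable to OpenAI-o1 on math, code, and reasoning benchmarks, while also improving the readability and coherence of its output.
DeepSeek Distillation
Besides the large language models that need substantial compute and memory, DeepSeek has also developed a series of distilled models. These smaller, more efficient models have been shown to retain strong reasoning performance, and they range from 1.5B to 70B parameters. Notably, DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini on several benchmarks. The smaller models inherit the reasoning patterns of the larger model, which demonstrates how effective knowledge distillation is.
Source: deepseek-ai/DeepSeek-R1 [4]
Read the blog post DeepSeek-R1: Features, o1 Comparison, Distilled Models & More [5] to learn about its key features, development process, distilled models, access, pricing, and comparison with OpenAI o1.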
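Resources Needed for Fine-Tuning
Model | GPU | CPU | RAM | Disk | Time |
DeepSeek-R1-Distill-Llama-8B | T4 x 2 (15 GB) | 4 cores | 32 GB | 200 GB | 23 minutes |
What? You think the specs above are too high? Fine, follow along and I will show you how to get all of it for free!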
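Fine-Tuning DeepSeek R1: A Step-by-Step Guide
To fine-tune the DeepSeek R1 model, follow these steps:
1. Setup
For this project we use Kaggle [7] as our cloud IDE because it offers free access to GPUs, which are often more powerful than the ones available in Google Colab. To begin, launch a new Kaggle notebook and add your Hugging Face token and your Weights & Biases token as secrets. See the Q&A at the end of this article for how to obtain the tokens.
You can add secrets by opening the Add-ons tab in the Kaggle notebook interface and selecting the Secrets option.
Once the secrets are set, install the unsloth Python package. Unsloth is an open-source framework designed to make fine-tuning large language models (LLMs) about 2x faster and more memory-efficient.
Read the Unsloth Guide: Optimize and Speed Up LLM Fine-Tuning [9] to learn about Unsloth's key features, its various functions, and how to optimize your fine-tuning workflow.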
!pip install unsloth
!pip install --force-reinstall --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git
Log in to the Hugging Face CLI using the Hugging Face API token that we securely retrieve from Kaggle Secrets.
from huggingface_hub import login
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
hf_token = user_secrets.get_secret("HUGGINGFACE_TOKEN")
login(hf_token)
Log in to Weights & Biases (wandb) with your API key and create a new project to track the experiment and the fine-tuning progress.
import wandb
wb_token = user_secrets.get_secret("wandb")
wandb.login(key=wb_token)
run = wandb.init(
    project='Fine-tune-DeepSeek-R1-Distill-Llama-8B on Medical COT Dataset',
    job_type="training",
    anonymous="allow",
)
2. Loading the Model and Tokenizer
For this project, we load the Unsloth version of DeepSeek-R1-Distill-Llama-8B [10]. We also load the model with 4-bit quantization to reduce memory usage and improve performance.
from unsloth import FastLanguageModel
max_seq_length = 2048
dtype = None
load_in_4bit = True
model, tokenizer = FastLanguageModel.from_pretrained(
model_name = "unsloth/DeepSeek-R1-Distill-Llama-8B",
max_seq_length = max_seq_length,
dtype = dtype,
load_in_4bit = load_in_4bit,
token = hf_token,
)
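To see why 4-bit loading matters on a free GPU, a rough estimate of the weight memory can help. The sketch below is only back-of-the-envelope arithmetic: the 8e9 parameter count is an assumed round number, weight_memory_gib is a hypothetical helper, and the estimate ignores activations, the KV cache, and the scaling data that 4-bit formats store alongside the weights.

```python
# Back-of-the-envelope weight memory for an ~8B-parameter model.
# ASSUMPTION: 8e9 parameters; real checkpoints differ slightly, and 4-bit
# formats store extra scaling data that this estimate ignores.


def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """Return the memory needed for the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 1024**3


N_PARAMS = 8e9  # assumed round number, not read from the checkpoint

fp16_gib = weight_memory_gib(N_PARAMS, 16)
int4_gib = weight_memory_gib(N_PARAMS, 4)

print(f"16-bit weights: {fp16_gib:.1f} GiB")
print(f" 4-bit weights: {int4_gib:.1f} GiB")
```

Under these assumptions, the 16-bit weights alone come to about 14.9 GiB, which would leave almost no headroom on a 15 GB T4, while the 4-bit weights need about 3.7 GiB.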
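3. Model Inference Before Fine-Tuning
To create a prompt style for the model, we define a system prompt with placeholders for the question and for the generated response. The prompt guides the model to think step by step and to give a logical, accurate response.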
prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.
### Instruction:
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning.
Please answer the following medical question.
### Question:
{}
### Response:
<think>{}"""
# Chinese version of the same prompt, stored under its own name so that it does not overwrite the English prompt_style used below.
prompt_style_zh = """以下是一条描述任务的指令,以及为其提供更多背景信息的输入内容。请给出一个能恰当完成该请求的回复。在回答之前,仔细思考问题,并创建一个逐步的思路链,以确保回复符合逻辑且准确。
### 指令:
你是一位在临床推理、诊断和治疗计划方面拥有高级知识的医学专家。请回答以下医学问题。
### 问题:
{}
### 回复:
<think>{}"""
In this example, we pass a medical question into prompt_style, convert the result to tokens, and then pass the tokens to the model to generate a response.
question = "A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?"
FastLanguageModel.for_inference(model)
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")
outputs = model.generate(
input_ids=inputs.input_ids,
attention_mask=inputs.attention_mask,
max_new_tokens=1200,
use_cache=True,
)
response = tokenizer.batch_decode(outputs)
print(response[0].split("### Response:")[1])
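# Chinese version of the question, for use with the Chinese version of the prompt:
question_zh = "一位 61 岁的女性,有长期在咳嗽或打喷嚏等活动时不自主漏尿但夜间无漏尿的病史,进行了妇科检查和棉签试验。基于这些发现,膀胱测压最有可能揭示她的残余尿量和逼尿肌收缩情况如何?"
[Figure: output for the English prompt]
[Figure: output for the Chinese prompt]
Even without fine-tuning, the model generated a chain of thought and reasoned before giving its final answer. The reasoning is enclosed in <think></think> tags.
So why do we still need fine-tuning? The reasoning, although detailed, is long-winded rather than concise. In addition, the final answer is presented as a list, which departs from the structure and style of the dataset we want to fine-tune on. The English output is reproduced below.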
<think>
Okay, so I have this medical question to answer. Let me try to break it down. The patient is a 61-year-old woman with a history of involuntary urine loss during activities like coughing or sneezing, but she doesn't leak at night. She's had a gynecological exam and a Q-tip test. I need to figure out what cystometry would show regarding her residual volume and detrusor contractions.
First, I should recall what I know about urinary incontinence. Involuntary urine loss during activities like coughing or sneezing makes me think of stress urinary incontinence. Stress incontinence typically happens when the urethral sphincter isn't strong enough to resist increased abdominal pressure from activities like coughing, laughing, or sneezing. This usually affects women, especially after childbirth when the pelvic muscles and ligaments are weakened.
The Q-tip test is a common diagnostic tool for stress urinary incontinence. The test involves inserting a Q-tip catheter, which is a small balloon catheter, into the urethra. The catheter is connected to a pressure gauge. The patient is asked to cough, and the pressure reading is taken. If the pressure is above normal (like above 100 mmHg), it suggests that the urethral sphincter isn't closing properly, which is a sign of stress incontinence.
So, based on the history and the Q-tip test, the diagnosis is likely stress urinary incontinence. Now, moving on to what cystometry would show. Cystometry, also known as a filling cystometry, is a diagnostic procedure where a catheter is inserted into the bladder, and the bladder is filled with a liquid to measure how much it can hold (residual volume) and how it responds to being filled (like during a cough or sneeze). This helps in assessing the capacity and compliance of the bladder.
In a patient with stress incontinence, the bladder's capacity might be normal, but the sphincter's function is impaired. So, during the cystometry, the residual volume might be within normal limits because the bladder isn't overfilled. However, when the patient is asked to cough or perform a Valsalva maneuver, the detrusor muscle (the smooth muscle layer of the bladder) might not contract effectively, leading to an increase in intra-abdominal pressure, which might cause leakage.
Wait, but detrusor contractions are usually associated with voiding. In stress incontinence, the issue isn't with the detrusor contractions but with the sphincter's inability to prevent leakage. So, during cystometry, the detrusor contractions would be normal because they are part of the normal voiding process. However, the problem is that the sphincter doesn't close properly, leading to leakage.
So, putting it all together, the residual volume might be normal, but the detrusor contractions would be normal as well. The key finding would be the impaired sphincter function leading to incontinence, which is typically demonstrated during the Q-tip test and clinical history. Therefore, the cystometry would likely show normal residual volume and normal detrusor contractions, but the underlying issue is the sphincter's inability to prevent leakage.
</think>
Based on the provided information, the cystometry findings in this 61-year-old woman with stress urinary incontinence would likely demonstrate the following:
1. **Residual Volume**: The residual volume would be within normal limits. This is because the bladder's capacity is typically normal in cases of stress incontinence, where the primary issue lies with the sphincter function rather than the bladder's capacity.
2. **Detrusor Contractions**: The detrusor contractions would also be normal. These contractions are part of the normal voiding process and are not impaired in stress urinary incontinence. The issue is not with the detrusor muscle but with the sphincter's inability to prevent leakage.
In summary, the key findings of the cystometry would be normal residual volume and normal detrusor contractions, highlighting the sphincteric defect as the underlying cause of the incontinence.<|end▁of▁sentence|>
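Since the reasoning is wrapped in <think></think> tags, the final answer can be separated from it with plain string handling. This is a minimal sketch: split_reasoning is a hypothetical helper, not part of Unsloth or transformers, and it assumes the text contains at most one </think> tag and may end with the end-of-sentence marker seen above.

```python
END_MARKER = "<|end▁of▁sentence|>"  # end-of-sentence marker seen in the output


def split_reasoning(response: str) -> tuple[str, str]:
    """Split generated text into (reasoning, answer) at the </think> tag."""
    text = response.replace(END_MARKER, "")
    reasoning, found, answer = text.partition("</think>")
    if not found:
        # No closing tag: treat the whole text as the answer.
        return "", text.strip()
    # Drop the opening tag if the text still contains it.
    reasoning = reasoning.replace("<think>", "", 1)
    return reasoning.strip(), answer.strip()


# Shortened stand-in text for illustration, not real model output.
sample = (
    "<think>\nStress incontinence, so ...\n</think>\n"
    "Normal residual volume." + END_MARKER
)
reasoning, answer = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

In the notebook, the same helper could be applied to response[0].split("### Response:")[1] to compare only the final answers before and after fine-tuning.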
4. Loading and Processing the Dataset
We slightly change the prompt style for processing the dataset by adding a third placeholder for the complex chain-of-thought column.
train_prompt_style = """Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.
### Instruction:
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning.
Please answer the following medical question.
### Question:
{}
### Response:
<think>
{}
</think>
{}"""
# Chinese version of the same training prompt, stored under its own name so that it does not overwrite train_prompt_style.
train_prompt_style_zh = """以下是一个描述任务的指令,与提供进一步上下文的输入相配对。写出一个适当完成请求的回应。在回答之前,仔细思考问题并创建一个逐步的思维链,以确保逻辑准确的回应。
### 指令:
您是医学专家,在临床推理、诊断和治疗计划方面拥有先进的知识。
请回答以下医学问题。
### 问题:
{}
### 响应:
<think>
{}
</think>
{}"""
Write a Python function that creates a "text" column in the dataset by filling the placeholders of the training prompt style with the question, the chain of thought, and the answer. The loading code further down expects this function to be named formatting_prompts_func.
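The function itself is not shown in this article, so here is a minimal sketch of what it could look like. It assumes the dataset columns are named Question, Complex_CoT, and Response (check the dataset card), and that the end-of-sequence token is appended to every example so the model learns where to stop. make_formatting_func is a hypothetical wrapper introduced here only to keep the example self-contained.

```python
def make_formatting_func(template: str, eos_token: str):
    """Build a batched formatting function for datasets.map.

    ASSUMPTION: the columns are named "Question", "Complex_CoT" and
    "Response"; adjust the keys if your copy of the dataset differs.
    """

    def formatting_prompts_func(examples):
        texts = []
        for question, cot, answer in zip(
            examples["Question"], examples["Complex_CoT"], examples["Response"]
        ):
            # Fill the three {} placeholders, then mark the end of the example.
            texts.append(template.format(question, cot, answer) + eos_token)
        return {"text": texts}

    return formatting_prompts_func


# Tiny stand-in template and batch, for illustration only.
demo_template = "Q: {}\n<think>\n{}\n</think>\n{}"
demo_func = make_formatting_func(demo_template, "<eos>")
demo_batch = {"Question": ["q1"], "Complex_CoT": ["c1"], "Response": ["r1"]}
demo_out = demo_func(demo_batch)
print(demo_out["text"][0])
```

In the notebook, the real function would be built with formatting_prompts_func = make_formatting_func(train_prompt_style, tokenizer.eos_token) before calling dataset.map.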
We load the first 500 samples of the medical chain-of-thought dataset [12] from Hugging Face, then create the text column by mapping the formatting_prompts_func function over the dataset.
from datasets import load_dataset
dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT","en", split = "train[0:500]",trust_remote_code=True)
dataset = dataset.map(formatting_prompts_func, batched = True,)
dataset["text"][0]
Dataset sample
As we can see, the text column contains the system prompt, the instruction, the chain of thought, and the answer.
"Below is an instruction that describes a task, paired with an input that provides further context. \n
Write a response that appropriately completes the request. \n
Before answering, think carefully about the question and create a step-by-step chain of thoughts to ensure a logical and accurate response.\n\n
### Instruction:\n
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning. \n
Please answer the following medical question. \n\n
### Question:\n
A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?\n\n
### Response:\n
<think>\n
Okay, let's think about this step by step. There's a 61-year-old woman here who's been dealing with involuntary urine leakages whenever she's doing something that ups her abdominal pressure like coughing or sneezing. This sounds a lot like stress urinary incontinence to me. Now, it's interesting that she doesn't have any issues at night; she isn't experiencing leakage while sleeping. This likely means her bladder's ability to hold urine is fine when she isn't under physical stress. Hmm, that's a clue that we're dealing with something related to pressure rather than a bladder muscle problem. \n\nThe fact that she underwent a Q-tip test is intriguing too. This test is usually done to assess urethral mobility. In stress incontinence, a Q-tip might move significantly, showing urethral hypermobility. This kind of movement often means there's a weakness in the support structures that should help keep the urethra closed during increases in abdominal pressure. So, that's aligning well with stress incontinence.\n\nNow, let's think about what would happen during cystometry. Since stress incontinence isn't usually about sudden bladder contractions, I wouldn't expect to see involuntary detrusor contractions during this test. Her bladder isn't spasming or anything; it's more about the support structure failing under stress. Plus, she likely empties her bladder completely because stress incontinence doesn't typically involve incomplete emptying. So, her residual volume should be pretty normal. \n\n
All in all, it seems like if they do a cystometry on her, it will likely show a normal residual volume and no involuntary contractions. Yup, I think that makes sense given her symptoms and the typical presentations of stress urinary incontinence.\n
</think>\n
Cystometry in this case of stress urinary incontinence would most likely reveal a normal post-void residual volume, as stress incontinence typically does not involve issues with bladder emptying. Additionally, since stress urinary incontinence is primarily related to physical exertion and not an overactive bladder, you would not expect to see any involuntary detrusor contractions during the test.
<|end▁of▁sentence|>"
5. Setting Up the Model
Using the target modules, we set up the model by adding low-rank adapters (LoRA) to it.
model = FastLanguageModel.get_peft_model(
model,
r=16,
target_modules=[
"q_proj",
"k_proj",
"v_proj",
"o_proj",
"gate_proj",
"up_proj",
"down_proj",
],
lora_alpha=16,
lora_dropout=0,
bias="none",
use_gradient_checkpointing="unsloth", # True or "unsloth" for very long context
random_state=3407,
use_rslora=False,
loftq_config=None,
)
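To get a feel for how small the adapters are, the number of trainable LoRA parameters can be counted by hand. The sketch below is an estimate under stated assumptions: the layer shapes are those of the Llama 3.1 8B architecture (32 layers, hidden size 4096, MLP size 14336, key/value projection size 1024), which should be verified against the model's config.json, and lora_params is a hypothetical helper.

```python
# ASSUMPTION: Llama 3.1 8B shapes (32 layers, hidden 4096, MLP 14336,
# 8 KV heads x 128 dims = 1024). Verify against the model's config.json.
HIDDEN, INTERMEDIATE, KV_DIM, N_LAYERS = 4096, 14336, 1024, 32
RANK = 16  # matches r=16 in get_peft_model above

# (in_features, out_features) of each targeted linear layer
SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(shapes: dict, rank: int, n_layers: int) -> int:
    """Count LoRA weights: each adapter adds A (rank x in) and B (out x rank)."""
    per_layer = sum(rank * (n_in + n_out) for n_in, n_out in shapes.values())
    return per_layer * n_layers


trainable = lora_params(SHAPES, RANK, N_LAYERS)
print(f"{trainable:,} trainable LoRA parameters")
```

Under these assumptions the adapters add 41,943,040 trainable parameters, roughly half a percent of an 8B-parameter model, which is why the fine-tune fits on a free GPU.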
Next, we set the training arguments and create the trainer by providing the model, the tokenizer, the dataset, and the other important training parameters that shape the fine-tuning process.
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported
trainer = SFTTrainer(
model=model,
tokenizer=tokenizer,
train_dataset=dataset,
dataset_text_field="text",
max_seq_length=max_seq_length,
dataset_num_proc=2,
args=TrainingArguments(
per_device_train_batch_size=2,
gradient_accumulation_steps=4,
# Use num_train_epochs = 1, warmup_ratio for full training runs!
warmup_steps=5,
max_steps=60,
learning_rate=2e-4,
fp16=not is_bfloat16_supported(),
bf16=is_bfloat16_supported(),
logging_steps=10,
optim="adamw_8bit",
weight_decay=0.01,
lr_scheduler_type="linear",
seed=3407,
output_dir="outputs",
),
)
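The training arguments above imply a few quantities that are worth working out explicitly. The sketch below is plain arithmetic, assuming a single GPU and the 500 samples loaded in step 4. linear_lr is a hypothetical helper that mirrors what a linear schedule with warmup is meant to do; it is not the library's own code.

```python
PER_DEVICE_BATCH = 2
GRAD_ACCUM = 4
MAX_STEPS = 60
WARMUP_STEPS = 5
PEAK_LR = 2e-4
N_GPUS = 1       # ASSUMPTION: training runs on a single GPU
N_SAMPLES = 500  # rows loaded in step 4

# Samples that contribute to one optimizer step.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * N_GPUS
# Samples processed over the whole run, and the fraction of the dataset.
samples_seen = effective_batch * MAX_STEPS
epochs = samples_seen / N_SAMPLES


def linear_lr(step: int) -> float:
    """Linear warmup to PEAK_LR, then linear decay to zero at MAX_STEPS."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = (MAX_STEPS - step) / (MAX_STEPS - WARMUP_STEPS)
    return PEAK_LR * max(0.0, remaining)


print(f"effective batch size: {effective_batch}")
print(f"samples seen: {samples_seen} ({epochs:.2f} epochs)")
for step in (0, 5, 30, 60):
    print(f"step {step:>2}: lr = {linear_lr(step):.2e}")
```

With these settings, 60 steps cover 480 samples, which is slightly less than one pass over the 500-sample subset. That is why the code comment suggests num_train_epochs = 1 for a full training run.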
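If you hit the error AttributeError: _unwrapped_old_generate, update the libraries.
# Update the libraries to the latest versions
!pip install --upgrade unsloth transformers
# Or pin specific versions (x.y.z and a.b.c are placeholders for versions known to work together)
!pip install unsloth==x.y.z transformers==a.b.c
6. Training the Model
Run the following command to start training.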
trainer_stats = trainer.train()
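Then wait while it trains. For some reason only one GPU was used, probably because multi-GPU training was not enabled; the script could be adjusted later to try that.
Training took 23 minutes. The training loss decreased gradually, which is a good sign that the model is fitting the training data.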
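One way to check the loss trend without leaving the notebook is to read the trainer's log history. The sketch below assumes the entries look like those in trainer.state.log_history from the transformers Trainer (dicts with "step" and "loss" keys). loss_curve and is_decreasing_overall are hypothetical helpers, and the numbers are made up for illustration; they are not results from this run.

```python
def loss_curve(log_history):
    """Return (step, loss) pairs from Trainer-style log entries."""
    return [(e["step"], e["loss"]) for e in log_history if "loss" in e]


def is_decreasing_overall(curve):
    """True if the last logged loss is lower than the first one."""
    return len(curve) >= 2 and curve[-1][1] < curve[0][1]


# Hypothetical values for illustration only, not results from this run.
fake_history = [
    {"step": 10, "loss": 1.9},
    {"step": 20, "loss": 1.5},
    {"step": 30, "loss": 1.4},
    {"step": 30, "train_runtime": 100.0},  # summary entry without "loss"
]

curve = loss_curve(fake_history)
print(curve)
print("loss went down:", is_decreasing_overall(curve))
```

In the notebook, the same check would be loss_curve(trainer.state.log_history) after trainer.train() has finished. With logging_steps=10 and max_steps=60, that should give about six logged points.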
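Log in to wandb.ai [6] and open the project to view the report for the training run.
If you run into problems with the code above, see the Fine-Tuning DeepSeek R1 (Reasoning Model) Kaggle notebook [13].
7. Model Inference After Fine-Tuning
To compare results, we ask the fine-tuned model the same question as before and see what has changed.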
question = "A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions?"
FastLanguageModel.for_inference(model) # Unsloth has 2x faster inference!
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")
outputs = model.generate(
input_ids=inputs.input_ids,
attention_mask=inputs.attention_mask,
max_new_tokens=1200,
use_cache=True,
)
response = tokenizer.batch_decode(outputs)
print(response[0].split("### Response:")[1])
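The style is now much closer to the dataset: the chain of thought is shorter and coherent, and the answer is direct and fits in a single paragraph. In that sense the fine-tune worked. Note, however, that in this run the content of the answer (increased residual volume and increased detrusor contractions) disagrees with the dataset's reference answer shown in step 4 (normal residual volume, no involuntary detrusor contractions), so 60 steps on 500 samples should be read as a change in format rather than proof of better medical accuracy.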
<think>
Okay, so let's think about this. We have a 61-year-old woman who's been dealing with involuntary urine loss during things like coughing or sneezing, but she's not leaking at night. That suggests she might have some kind of problem with her pelvic floor muscles or maybe her bladder.
Now, she's got a gynecological exam and a Q-tip test. Let's break that down. The Q-tip test is usually used to check for urethral obstruction. If it's positive, that means there's something blocking the urethra, like a urethral stricture or something else.
Given that she's had a positive Q-tip test, it's likely there's a urethral obstruction. That would mean her urethra is narrow, maybe due to a stricture or some kind of narrowing. So, her bladder can't empty properly during activities like coughing because the urethral obstruction is making it hard.
Now, let's think about what happens when her bladder can't empty. If there's a urethral obstruction, the bladder is forced to hold more urine, increasing the residual volume. That's because her bladder doesn't empty completely. So, her residual volume is probably increased.
Also, if her bladder can't empty properly, she might have increased detrusor contractions. These contractions are usually stronger to push the urine out. So, we expect her detrusor contractions to be increased.
Putting it all together, if she has a urethral obstruction and a positive Q-tip test, we'd expect her cystometry results to show increased residual volume and increased detrusor contractions. That makes sense because of the obstruction and how her bladder is trying to compensate by contracting more.
</think>
Based on the findings of the gynecological exam and the positive Q-tip test, it is most likely that the cystometry would reveal increased residual volume and increased detrusor contractions. The positive Q-tip test indicates urethral obstruction, which would force the bladder to retain more urine, thereby increasing the residual volume. Additionally, the obstruction can lead to increased detrusor contractions as the bladder tries to compensate by contracting more to expel the urine.<|end▁of▁sentence|>
8. Saving the Model Locally
Now let's save the adapter, the full model, and the tokenizer locally so that we can use them in other projects.
new_model_local = "DeepSeek-R1-Medical-COT"
model.save_pretrained(new_model_local)
tokenizer.save_pretrained(new_model_local)
model.save_pretrained_merged(new_model_local, tokenizer, save_method = "merged_16bit",)
9. Pushing the Model to the Hugging Face Hub
We can also push the adapter, the tokenizer, and the model to the Hugging Face Hub so that the AI community can integrate this model into their own systems.
new_model_online = "skyxiaowang/DeepSeek-R1-Medical-COT"
model.push_to_hub(new_model_online)
tokenizer.push_to_hub(new_model_online)
model.push_to_hub_merged(new_model_online, tokenizer, save_method = "merged_16bit")
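Note: to push to your own namespace, the HF token you provide must have write permission.
Wait for the upload to finish.
Once the upload completes, log in to Hugging Face and confirm that the model is there.
The next step in your learning journey is deploying the model to the cloud. You can follow the How to Deploy LLMs with BentoML guide [14], which walks step by step through deploying large language models efficiently and cost-effectively with tools such as BentoML and vLLM.
Alternatively, if you prefer to use the model locally, you can convert it to GGUF format and run it on your own machine. For that, see Fine-Tuning Llama 3.2 and Using It Locally: A Step-by-Step Guide [15], which gives detailed instructions for local use.
When fine-tuning is done, remember to shut down the Kaggle session manually to save your GPU quota.
Conclusion
The AI landscape is changing fast. The open-source community is on the rise, challenging the proprietary models that have dominated AI for the past three years. Open-source large language models (LLMs) are getting better, faster, and more efficient, which makes fine-tuning them on modest compute and memory easier than ever.
In this tutorial, we explored the DeepSeek R1 reasoning model and learned how to fine-tune its distilled version for medical question answering. A fine-tuned reasoning model can not only perform better but also be applied in critical fields such as medicine, emergency services, and healthcare.
In response to the launch of DeepSeek R1, OpenAI released two powerful tools: o3 [17], a more advanced reasoning model, and Operator [18], an AI agent powered by the new Computer-Using Agent (CUA) model that can browse websites and carry out tasks autonomously. xAI released Grok 3 with deep thinking, a large model said to be trained on 200,000 GPUs and claimed to outperform all comparable open and closed models. In my own testing, though, it was underwhelming: the free tier only allows two questions a day, and the paid plan is painfully expensive at 30 USD/month. I felt my wallet and quietly went back to DeepSeek R1. Free and good to use: what's not to love?
If writing the code step by step feels too time-consuming, no problem: I have a ready-made script [19] for you here: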
https://www.kaggle.com/code/kingabzpro/fine-tuning-deepseek-r1-reasoning-model
Aren't I good to you?
Q&A for Beginners
1. How to get a Hugging Face token
Go to the Hugging Face website [16] and log in to your account.
Click your avatar in the top-right corner and choose "Settings".
Choose "Access Tokens" in the left-hand menu.
Click "New token", give the token a name, pick a suitable permission ("read" is usually enough, but pushing the model to the Hub in step 9 requires "write"), then click "Generate a token" and copy the generated token.
2. How to get a Weights & Biases token
Go to the Weights & Biases website [6] and log in to your account.
Click your avatar in the top-right corner and choose "Settings".
In the "API Keys" section, click "Generate" and copy the generated API key.
3. Using Kaggle
Adding secrets
Enabling the free GPU
[1] Introduction to LLMs in Python: https://www.datacamp.com/courses/introduction-to-llms-in-python
[2] Reinforcement Learning: An Introduction With Python Examples: https://www.datacamp.com/tutorial/reinforcement-learning-python-introduction
[3] Chain-of-thought prompting: https://www.datacamp.com/tutorial/chain-of-thought-prompting
[4] DeepSeek-R1: https://github.com/deepseek-ai/DeepSeek-R1
[5] DeepSeek-R1: features, comparison with o1, distilled models, and more: https://www.datacamp.com/blog/deepseek-r1
[6] Weights & Biases (wandb) website: https://wandb.ai/home
[7] Kaggle: https://www.kaggle.com/
[8] Original article: https://www.datacamp.com/tutorial/fine-tuning-deepseek-r1-reasoning-model?utm_source=chatgpt.com
[9] Unsloth guide: https://www.datacamp.com/tutorial/unsloth-guide-optimize-and-speed-up-llm-fine-tuning
[10] Base model on Hugging Face: https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B
[11] Kaggle usage guide: https://blog.csdn.net/weixin_42426841/article/details/143591586
[12] Medical chain-of-thought dataset: https://huggingface.co/datasets/FreedomIntelligence/medical-o1-reasoning-SFT?row=46
[13] Fine-Tuning DeepSeek R1 (Reasoning Model) Kaggle notebook: https://www.kaggle.com/code/kingabzpro/fine-tuning-deepseek-r1-reasoning-model
[14] How to Deploy LLMs with BentoML: https://www.datacamp.com/tutorial/deploy-llms-with-bentoml
[15] Fine-Tuning Llama 3.2 and Using It Locally: A Step-by-Step Guide: https://www.datacamp.com/tutorial/fine-tuning-llama-3-2
[16] Hugging Face website: https://huggingface.co/
[17] OpenAI's o3: features, comparison with o1, release date, and more: https://www.datacamp.com/blog/o3-openai
[18] OpenAI's Operator: examples, use cases, competition, and more: https://www.datacamp.com/blog/operator
[19] Ready-made script: https://www.kaggle.com/code/kingabzpro/fine-tuning-deepseek-r1-reasoning-model
[20] DeepSeek official website: https://www.deepseek.com/
This article is reposted from AIGC前沿技术追踪. Author: 喜欢学习的小仙女
