
LLMLingua: Integrating with LlamaIndex to Compress Prompts for Efficient LLM Inference

Author: Anonymous 2023-11-27 15:06:24

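The arrival of large language models (LLMs) has spurred innovation across many fields. Driven by strategies such as chain-of-thought (CoT) prompting and in-context learning (ICL), however, prompts keep growing more complex, and that creates a computational challenge. Long prompts need substantial resources at inference time, so efficient solutions are required. This article shows how to integrate LLMLingua with LlamaIndex for efficient inference.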


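LLMLingua is a paper by Microsoft researchers published at EMNLP 2023. LongLLMLingua is a method that uses prompt compression to improve an LLM's ability to perceive key information in long-context scenarios.

How LLMLingua and LlamaIndex work together

LLMLingua emerged as a pioneering solution to the problem of lengthy prompts in LLM applications. The method compresses long prompts while preserving semantic integrity and improving inference speed. It combines several compression strategies, offering a nuanced way to balance prompt length against computational efficiency.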
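To make the trade-off between prompt length and retained information concrete, here is a minimal toy sketch of budget-constrained compression. It is not LLMLingua's algorithm: LLMLingua scores tokens with a small language model, whereas this stand-in simply treats words that are rare within the prompt as informative. The function name `toy_compress` and the sample prompt are assumptions made for illustration.

```python
import math
from collections import Counter


def toy_compress(prompt: str, target_tokens: int) -> str:
    """Keep the `target_tokens` most informative words, in original order.

    Toy stand-in: a word counts as informative when it is rare inside the
    prompt itself (high self-information under a unigram model).
    """
    words = prompt.split()
    if len(words) <= target_tokens:
        return prompt
    counts = Counter(w.lower() for w in words)
    total = len(words)
    # Self-information of each word: -log p(word); rarer words score higher.
    scores = [-math.log(counts[w.lower()] / total) for w in words]
    # Rank positions by score, breaking ties in favor of earlier positions.
    ranked = sorted(range(total), key=lambda i: (-scores[i], i))
    keep = sorted(ranked[:target_tokens])
    return " ".join(words[i] for i in keep)


prompt = (
    "the author went to the school and the author "
    "studied painting at the school"
)
compressed = toy_compress(prompt, target_tokens=6)
print(compressed)
print(len(prompt.split()), "->", len(compressed.split()), "words")
```

The toy version drops the repeated word "author" because it only looks at frequency, which shows why a real compressor needs a stronger notion of importance than word counts.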

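The advantages of integrating LLMLingua with LlamaIndex are as follows:

Integrating LLMLingua with LlamaIndex marks an important step in prompt optimization for LLMs. LlamaIndex acts as a specialized store of content tailored to a given LLM application: it indexes source documents and retrieves the passages relevant to a query. Through this integration, LLMLingua gains access to rich, domain-specific context, which it then compresses.

Combining LLMLingua's prompt compression with LlamaIndex's retrieval improves the efficiency of LLM applications. Working from the content LlamaIndex supplies, LLMLingua can tune its compression so that domain-specific context is preserved while prompt length shrinks. This speeds up inference considerably while keeping the nuances that matter in the domain.

The integration also extends LLMLingua's reach to large-scale LLM applications. Compressing the content that LlamaIndex supplies reduces the computational burden of processing lengthy prompts. The result is faster inference with key domain-specific information retained.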

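Workflow of LLMLingua with LlamaIndex

Implementing LLMLingua with LlamaIndex follows a structured process: LlamaIndex supplies the prompt content, and LLMLingua compresses it for faster inference.

1. Framework integration

First, a connection has to be established between LLMLingua and LlamaIndex. This covers access permissions, API configuration, and setting up the connection so that prompt content can be retrieved.

2. Retrieval of domain-specific prompt content

LlamaIndex acts as the specialized store holding content tailored to the LLM application. LLMLingua draws on this store: the domain-specific passages retrieved from it become the input to prompt compression.

3. Prompt compression

LLMLingua applies its prompt compression methods to streamline the retrieved content. These techniques condense lengthy prompts while maintaining semantic consistency, improving inference speed without sacrificing context or relevance.

4. Tuning the compression strategy

LLMLingua tunes its compression strategy to the content obtained from LlamaIndex. This refinement ensures that domain-specific nuances are preserved while prompt length is reduced effectively.

5. Execution and inference

Once the content from LlamaIndex has been compressed with LLMLingua's tuned strategy, the compressed prompt is ready for LLM inference. In this stage the compressed prompt is run through the LLM to produce efficient, context-aware inference.

6. Iterative refinement and improvement

The implementation is refined iteratively. This includes improving the compression algorithm, optimizing retrieval from LlamaIndex, and tuning the integration so that compressed prompts and LLM inference stay consistent and perform better.

7. Testing and validation

Where needed, testing and validation can be used to assess how efficient and effective the integration is. Performance metrics are evaluated to confirm that compressed prompts preserve semantic integrity and speed up inference without hurting accuracy.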
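The validation step can be sketched as a simple check that the expected answer still appears in the model's response after compression. This is a minimal sketch under stated assumptions: the helper names `contains_answer` and `accuracy` are made up for illustration, and substring matching is a deliberately crude stand-in for a proper evaluation metric. The two sample responses are the ones reported in the code walkthrough of this article.

```python
def contains_answer(response: str, expected: str) -> bool:
    """Return True if the expected answer appears in the response (case-insensitive)."""
    return expected.lower() in response.lower()


def accuracy(responses: list[str], expected_answers: list[str]) -> float:
    """Fraction of responses that contain their expected answer."""
    hits = sum(contains_answer(r, e) for r, e in zip(responses, expected_answers))
    return hits / len(expected_answers)


expected = ["RISD"]
original_responses = [
    "The author went to the Rhode Island School of Design (RISD) for art school."
]
compressed_responses = ["The author went to RISD for art school."]

print("original prompt accuracy:", accuracy(original_responses, expected))
print("compressed prompt accuracy:", accuracy(compressed_responses, expected))
```

With more question and answer pairs, comparing the two accuracy figures gives a first signal of whether compression is costing correctness.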

Code implementation

Next we walk through the code for using LLMLingua with LlamaIndex.

Install the packages:

# Install dependencies (notebook syntax).
!pip install llmlingua llama-index openai tiktoken -q

# Configure the OpenAI API key
import openai
openai.api_key = "<insert_openai_key>"

Get the data:

!wget "https://www.dropbox.com/s/f6bmb19xdg0xedm/paul_graham_essay.txt?dl=1" -O paul_graham_essay.txt

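Load the documents:

# Import paths follow the llama-index release that was current
# when this article was published (late 2023).
from llama_index import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    load_index_from_storage,
    StorageContext,
)

# Load the essay as LlamaIndex documents
documents = SimpleDirectoryReader(
    input_files=["paul_graham_essay.txt"]
).load_data()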

Vector store:

# Build a vector index over the documents
index = VectorStoreIndex.from_documents(documents)

# Retrieve the 10 chunks most similar to the query
retriever = index.as_retriever(similarity_top_k=10)

question = "Where did the author go for art school?"

# Ground-truth answer
answer = "RISD"

contexts = retriever.retrieve(question)

context_list = [n.get_content() for n in contexts]
len(context_list)

# Output
# 10
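The `similarity_top_k=10` setting asks the retriever for the ten chunks closest to the question. As a rough illustration of that ranking step, here is a toy sketch that scores chunks by cosine similarity over bag-of-words counts. It is a stand-in only: the real index ranks chunks by embedding vectors, and the helper names `bag_of_words`, `cosine`, and `top_k`, along with the sample chunks, are made up for this example.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercased word counts, ignoring punctuation."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: list[str], k: int) -> list[str]:
    """Return the k chunks most similar to the query, best first."""
    q = bag_of_words(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, bag_of_words(c)), reverse=True)
    return ranked[:k]


# Hypothetical chunks, for illustration only
chunks = [
    "I applied to art school at RISD",
    "Interleaf made document software",
    "We started Y Combinator in Cambridge",
]
print(top_k("Where did the author go for art school?", chunks, k=1))
```

Only the first chunk shares words with the question, so it ranks first; an embedding-based retriever would also catch passages that are related in meaning but share no exact words.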

Original prompt and response

# The response from the original (uncompressed) prompt
from llama_index.llms import OpenAI

llm = OpenAI(model="gpt-3.5-turbo-16k")
prompt = "\n\n".join(context_list + [question])

response = llm.complete(prompt)
print(str(response))

# Output
# The author went to the Rhode Island School of Design (RISD) for art school.

Set up LLMLingua

from llama_index.query_engine import RetrieverQueryEngine
from llama_index.response_synthesizers import CompactAndRefine
from llama_index.indices.postprocessor import LongLLMLinguaPostprocessor

node_postprocessor = LongLLMLinguaPostprocessor(
    instruction_str="Given the context, please answer the final question",
    target_token=300,  # target size of the compressed context, in tokens
    rank_method="longllmlingua",
    additional_compress_kwargs={
        "condition_compare": True,
        "condition_in_question": "after",
        "context_budget": "+100",
        "reorder_context": "sort",  # enable document reordering
        "dynamic_context_compression_ratio": 0.3,
    },
)

Compress with LLMLingua

from llama_index.indices.query.schema import QueryBundle

retrieved_nodes = retriever.retrieve(question)
synthesizer = CompactAndRefine()

# Postprocess (compress) the retrieved nodes; synthesis happens in a later step
new_retrieved_nodes = node_postprocessor.postprocess_nodes(
    retrieved_nodes, query_bundle=QueryBundle(query_str=question)
)

original_contexts = "\n\n".join([n.get_content() for n in retrieved_nodes])
compressed_contexts = "\n\n".join([n.get_content() for n in new_retrieved_nodes])

# Count tokens via the underlying LLMLingua compressor
# (_llm_lingua is a private attribute of the postprocessor)
original_tokens = node_postprocessor._llm_lingua.get_token_length(original_contexts)
compressed_tokens = node_postprocessor._llm_lingua.get_token_length(compressed_contexts)

Print the two results for comparison:

print(compressed_contexts)
print()
print("Original Tokens:", original_tokens)
print("Compressed Tokens:", compressed_tokens)
print("Compressed Ratio:", f"{original_tokens/(compressed_tokens + 1e-5):.2f}x")

The printed output is:

next Rtm's advice hadn' included anything that. I wanted to do something completely different, so I decided I'd paint. I wanted to how good I could get if I focused on it. the day after stopped on YC, I painting. I was rusty and it took a while to get back into shape, but it was at least completely engaging.1]

I wanted to back RISD, was now broke and RISD was very expensive so decided job for a year and return RISD the fall. I got one at Interleaf, which made software for creating documents. You like Microsoft Word? Exactly That was I low end software tends to high. Interleaf still had a few years to live yet. []

 the Accademia wasn't, and my money was running out, end year back to the
 lot the color class I tookD, but otherwise I was basically myself to do that for in993 I dropped I aroundidence bit then my friend Par did me a big A rent-partment building New York. Did I want it Itt more my place, and York be where the artists. wanted [For when you that ofs you big painting of this type hanging in the apartment of a hedge fund manager, you know he paid millions of dollars for it. That's not always why artists have a signature style, but it's usually why buyers pay a lot for such work. [6]

Original Tokens: 10719
Compressed Tokens: 308
Compressed Ratio: 34.80x
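The token counts reported above can be restated as a reduction percentage and a rough input-cost estimate. The sketch below uses the two counts from the output; the price per 1,000 input tokens is a hypothetical placeholder chosen for illustration, not a quoted rate, and `input_cost` is a helper made up for this example.

```python
original_tokens = 10719
compressed_tokens = 308

# Compression ratio and the share of input tokens removed
ratio = original_tokens / compressed_tokens
saved_fraction = 1 - compressed_tokens / original_tokens

# Hypothetical price per 1,000 input tokens, for illustration only
PRICE_PER_1K_INPUT_TOKENS = 0.001


def input_cost(tokens: int, price_per_1k: float = PRICE_PER_1K_INPUT_TOKENS) -> float:
    """Estimated cost of sending `tokens` input tokens at the given rate."""
    return tokens / 1000 * price_per_1k


print(f"Ratio: {ratio:.2f}x")                 # 34.80x
print(f"Tokens saved: {saved_fraction:.1%}")  # 97.1%
print(f"Cost before: {input_cost(original_tokens):.6f}")
print(f"Cost after:  {input_cost(compressed_tokens):.6f}")
```

Whatever the actual rate, input cost scales linearly with token count, so the saving on the context portion of the prompt tracks the 97.1% reduction.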

Verify the output:

# Answer the question from the compressed nodes only
response = synthesizer.synthesize(question, new_retrieved_nodes)
print(str(response))

# Output
# The author went to RISD for art school.

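Summary

The integration of LLMLingua with LlamaIndex shows how much the two tools can do together to optimize large language model (LLM) applications. It changes how prompts are compressed and how efficiently inference runs, paving the way for context-aware, streamlined LLM applications.

The integration speeds up inference while keeping the compressed prompt semantically intact. Tuning compression to the domain-specific content from LlamaIndex balances shorter prompts against retention of essential context, which helps keep LLM inference accurate.

In essence, combining LLMLingua with LlamaIndex goes beyond conventional prompt compression and lays the groundwork for future LLM applications that are optimized, contextually accurate, and tailored effectively to different domains. It points toward more efficient and more refined LLM applications.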

Editor: 華軒; Source: DeepHub IMBA
