DeepSeek V3 vs. V4 Architecture Infographic

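A detailed side-by-side technical infographic comparing the Transformer architectures of DeepSeek V3/R1 and DeepSeek V4, suitable for social media posts, presentations, or model-analysis visualizations.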

Prompt
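{"type":"side-by-side AI architecture comparison infographic","style":"clean technical diagram, white background, thin black outlines, rounded rectangles, dashed callout boxes, color-coded highlights, presentation-slide aesthetic, vector infographic","canvas":{"aspect_ratio":"2:1","resolution":"wide landscape"},"title_row":{"left_title":"DeepSeek V3/R1 (671 billion parameters)","right_title":"DeepSeek V4 (1.2 trillion parameters)","left_title_color":"bright orange-red","right_title_color":"bright blue"},"layout":{"columns":2,"sections":[{"title":"DeepSeek V3/R1 (671 billion parameters)","position":"left half","count":9,"labels":["Vocabulary size 129k","FeedForward (SwiGLU) module","Intermediate hidden layer dimension 2,048","MoE layer","Supports a context length of 128k tokens","First 3 blocks use a dense FFN with hidden size 18,432 instead of MoE","Sample input text","Embedding dimension 7,168","128 attention heads"]},{"title":"DeepSeek V4 (1.2 trillion parameters)","position":"right half","count":9,"labels":["Vocabulary size 160k","FeedForward (SwiGLU) module","Intermediate hidden layer dimension 3,072","MoE layer","Supports a context length of 256k tokens","First 3 blocks use a dense FFN with hidden size 24,576 instead of MoE","Sample input text","Embedding dimension 8,192","128 attention heads"]},{"title":"Bottom comparison table","position":"bottom, full width","count":10,"labels":["Total parameters","Active parameters per token","Hidden size","Sample design","DeepSeek V3/R1","Intermediate (FF)","Attention heads","Context length","Embedding dimension","Vocabulary size"]}]},"left_panel":{"background":"light gray rounded rectangle","main_stack":{"count":8,"blocks":["Tokenized text","Token embedding layer","RMSNorm 1","Multi-head Latent Attention","RMSNorm 2","MoE","Final RMSNorm","Linear output layer"]},"side_module":"RoPE connected to the left side of the attention block","attention_block":{"label":"Multi-head Latent Attention","accent":"the word Latent in orange-red text"},"feedforward_inset":{"title":"FeedForward (SwiGLU) module","count":4,"blocks":["Linear layer","SiLU activation","Linear layer","Linear layer"],"diagram":"two branches multiplied together, then projected"},"moe_inset":{"title":"MoE layer","count":5,"blocks":["Top merge node","Feed-forward network","Feed-forward network","Router","Expert count badge 256"],"details":"small black squares, 1 expert selected, arrows pointing to experts, dashed divider line"},"annotations":{"vocab":"Vocabulary size 129k","ff_dim":"Intermediate hidden layer dimension 2,048","context":"Supports a context length of 128k tokens","dense_first_blocks":"First 3 blocks use a dense FFN with hidden size 18,432 instead of MoE","resource_savings":"Resource savings: the model size is 671B, but only 1 (shared) + 8 experts are active per token; only 37B parameters are active per inference step"},"bottom_stats":{"count":10,"items":["Total parameters: 671B","Active parameters per token: 37B (1 + 8 experts)","Hidden size: 7,128","Sample design: 28,432","Intermediate (FF): 2,048","Attention heads: 128","Context length: 128k","Embedding dimension: first 3 blocks","Context length: 22G7","Vocabulary size: 129k"]}},"right_panel":{"background":"light blue rounded rectangle","main_stack":{"count":8,"blocks":["Tokenized text","Token embedding layer","RMSNorm 1","Multi-head Latent Attention","RMSNorm 2","MoE","Final RMSNorm","Linear output layer"]},"side_module":"RoPE connected to the left side of the attention block","attention_block":{"label":"Multi-head Latent Attention","accent":"the word Latent in blue text"},"feedforward_inset":{"title":"FeedForward (SwiGLU) module","count":4,"blocks":["Linear layer","SiLU activation","Linear layer","Linear layer"],"diagram":"same structure as the left panel"},"moe_inset":{"title":"MoE layer","count":5,"blocks":["Top merge node","Feed-forward network","Feed-forward network","Router","Expert count badge 384"],"details":"small black squares, 1 expert selected, arrows pointing to experts, dashed divider line, blue border emphasis"},"annotations":{"vocab":"Vocabulary size 160k","ff_dim":"Intermediate hidden layer dimension 3,072","context":"Supports a context length of 256k tokens","dense_first_blocks":"First 3 blocks use a dense FFN with hidden size 24,576 instead of MoE","resource_savings":"Resource savings: the model size is 1.2T, but only 1 (shared) + 8 experts are active per token; only 52B parameters are active per inference step"},"bottom_stats":{"count":10,"items":["Total parameters: 1.2T","Active parameters per token: 52B (1 + 8 experts)","Hidden size: 7,2B","Sample design: 28,432","Intermediate (FF): 3,072","Attention heads: 128","Context length: 256k","Embedding dimension: first 3 blocks","Context length: 22G7","Vocabulary size: 160k"]}},"global_notes":"Create an extremely detailed Transformer architecture comparison diagram with a mirrored layout. Each half contains one large model stack diagram plus 2 insets: 1 feed-forward module and 1 MoE layer. Use arrows between blocks, tiny technical labels, and connector lines from labels to the relevant components. Keep the typography compact and slide-like, with V3/R1 highlights consistently in orange-red and V4 highlights consistently in blue. Include a bottom row of compact table metrics spanning the full width. Preserve the slightly hand-crafted, information-dense infographic style with crowded annotations."}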
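
Several objects in the prompt pair a `count` field with a list (`labels`, `blocks`, or `items`). After editing a copy, a quick check can catch a list that no longer matches its declared count. The following is a minimal Python sketch under stated assumptions: the helper name `count_mismatches` is illustrative, and `EXCERPT` is an abridged fragment in the same shape as the prompt (values rendered in English), not the full prompt.

```python
import json

# Abridged excerpt in the same shape as the prompt (an assumption for the demo);
# in practice, paste the full prompt string here instead.
EXCERPT = """
{"left_panel": {
  "main_stack": {"count": 8, "blocks": [
    "Tokenized text", "Token embedding layer", "RMSNorm 1",
    "Multi-head Latent Attention", "RMSNorm 2", "MoE",
    "Final RMSNorm", "Linear output layer"]},
  "feedforward_inset": {"title": "FeedForward (SwiGLU) module", "count": 4,
    "blocks": ["Linear layer", "SiLU activation", "Linear layer", "Linear layer"]}
}}
"""

# List-valued keys that a sibling "count" field is taken to describe.
LIST_KEYS = ("labels", "blocks", "items")


def count_mismatches(node, path="$"):
    """Return (path, declared, actual) for each list whose length differs from "count"."""
    found = []
    if isinstance(node, dict):
        declared = node.get("count")
        for key in LIST_KEYS:
            value = node.get(key)
            if isinstance(declared, int) and isinstance(value, list):
                if declared != len(value):
                    found.append((f"{path}.{key}", declared, len(value)))
        # Recurse into every child so nested panels and insets are covered.
        for key, value in node.items():
            found.extend(count_mismatches(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for i, value in enumerate(node):
            found.extend(count_mismatches(value, f"{path}[{i}]"))
    return found


prompt = json.loads(EXCERPT)
print(count_mismatches(prompt))
```

An empty list means every declared count agrees with its list. The check only covers structure; it does not judge whether the labels themselves make sense.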

How to use this prompt

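  1. Copy the full prompt above.

  2. Open a platform that supports GPT Image 2 (for example, YouMind) and paste the prompt.

  3. Replace the subject, style, or details as you like, then generate.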
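
For step 3, the changes can also be made programmatically instead of by hand. This is a minimal Python sketch, assuming the prompt is held as a JSON string. The excerpt below is abridged, and the replacement values ("teal", "dark gray outlines") are hypothetical examples rather than recommended settings.

```python
import json

# Abridged, hypothetical excerpt of the prompt; paste the full string in practice.
prompt_text = (
    '{"style":"clean technical diagram, white background",'
    '"title_row":{"left_title_color":"bright orange-red",'
    '"right_title_color":"bright blue"}}'
)

prompt = json.loads(prompt_text)

# Swap one detail and extend the style description (example values only).
prompt["title_row"]["right_title_color"] = "teal"
prompt["style"] += ", dark gray outlines"

# ensure_ascii=False keeps any non-ASCII labels readable;
# compact separators keep the prompt on a single line for pasting.
edited = json.dumps(prompt, ensure_ascii=False, separators=(",", ":"))
print(edited)
```

Round-tripping through `json.loads` and `json.dumps` also confirms that the edited prompt is still valid JSON before it is pasted.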

This is a free AI prompt from the YouMind prompt library. The library holds thousands more image prompts, all free to copy and adapt.
