AI Agents Beyond ChatGPT
Zhou (Jo) Yu
Columbia University, Arklex AI

Who supports AI Agents?
- Bill Gates: "Agents are bringing about the biggest revolution in computing since we went from typing commands to tapping on icons."
- Andrew Ng: "I think AI agentic workflows will drive massive AI progress this year."
- Sam Altman: "2025 is when agents will work."
Opposing views (unattributed):
- "Current agents are just thin wrappers around LLMs."
- "Autoregressive LLMs can never reason or plan."
- "Auto-GPT's limitations in... reveal that it is far from being a practical solution."
Slides adapted from Yu Su

What are AI Agents?
- Perception: multimodal inputs including text, image, audio, video, touch, etc.
- Planning (Inner Monologue): Chain-of-Thought reasoning over tokens, powered by LLMs.
- Reflection: meta-reasoning at every step.
- Actions: function/tool calling, embodied actions.
[Figure: agent-environment loop with labels Agent, Sensors, Percepts, Reasoning, Inner Monologue, Actions, Environment. Adapted from Russell & Norvig (2020).]

AI Agent Deployment Consideration
Phases: Phase 1 Research, Phase 2 Scaling, Phase 3 Innovating
Levels: Level 1 through Level 5, with labels including "Your Work Assistant", "Agent-as-a-Service", "Autonomous Agents", "LLM", and "Proto AGI"
Level descriptions that survive:
- A completion to the human prompt ...
- An LLM-centric software system for assisting real-world tasks.
- A service-centric system with LLMs as core components for completing various tasks.
- An autonomous system with LLMs as core components for ...
- ... vehicle, an L5 agent is ...
Slide: Alex Wang @ Scale AI

Overview
3. AI agent self-improvement via tree search (Yu et al., ICLR 2025)

Background: In-Context Self-Improvement
Xiao Yu, Baolin Peng, Michel Galley, Jianfeng Gao, Zhou Yu, Teaching Language Models to Self-Improve through ...

Input: Q: Calculate (4 * 1) - (2 * 3) = ?

Few-shot prompting:
Q: Calculate 1 + 2 = ?
Ans: 3
Q: Calculate (4 * -1) + (2 * 3) = ?
Q: Calculate ...
Q: Calculate (4 * 1) - (2 * 3) = ?
Ans: -2

Chain-of-thought prompting:
Let's think step by step:
Step 1: (4 * 1) - (2 * 3) = 4 - 6
Step 2: 4 - 6 = -2
Ans: -2

Self-Improvement Prompting (Madaan et al., 2023):
Initial attempt:
Step 1: (4 * 1) - (2 * 3) = 4 - 6
Step 2: 4 - 6 = -3
Ans: -3
Prompt for feedback:
In step 2 the part "4 - 6 = -3" is incorrect. This is because...
Prompt for update:
Step 1: (4 * 1) - (2 * 3) = 4 - 6
Step 2: 4 - 6 = -2
Ans: -2
The feedback and update prompts can then be repeated on the updated answer.
Madaan, A. et al. (2023) 'Self-Refine: Iterative Refinement with Self-Feedback'

[Figure: bar charts for Multistep Arithmetic and Logical Deduction comparing Codex (175B) and LLaMa (7B) under four settings: +CoT prompt, +SI. prompt, +ft (finetune), +ft SI. demo. Surviving value labels include +2.0, -5.2, -5.1, and +4.4.]

Problem 1: small LM cannot self-improve via prompting!
Example: for the attempt "Step 1: (4 * 1) - (2 * 3) = 4 - 6, Step 2: 4 - 6 = -3, Ans: -3", the small LM's feedback is: In step 1 the part "2 * 3 = 6" is incorrect. This is because...
The feedback flags a correct step, so the error propagates!

Problem 2: small LM cannot learn "self-improvement" from LLM demonstrations!
Example: Q: Calculate 4-0*-1*8+6=?
[Figure: the small LM's successive rewrites of the expression, including "=4-(0*-1*-8)+6", "=4-(0)+6", "=4-(0+8)+6", and "=4-8+6", with its feedback.]
The demonstrations are irrelevant to the small LM's own errors!

Motivation
- Prior work shows that self-improvement (S.I.) is useful for task performance/generalization (Madaan et al., 2023).
- We find that prompt-based S.I. and simple distillation methods fail with small LMs.
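The feedback-and-update loop from the Self-Improvement Prompting slides can be sketched in a few lines of Python. This is a minimal illustration, not the Self-Refine implementation: `self_refine`, `toy_generate`, `toy_feedback`, and `toy_update` are hypothetical names, and the three toy functions are deterministic stand-ins for what would be LLM prompts in practice. The toy feedback function checks each step arithmetically, which is exactly the ability the slides report small LMs lack.

```python
import ast
import operator
from typing import Callable, List, Optional, Tuple

Steps = List[str]                      # each step is "lhs=rhs"
Critique = Optional[Tuple[int, str]]   # (index of bad step, message) or None

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

def evaluate(expr: str) -> int:
    """Evaluate an integer expression using only +, -, * and unary minus."""
    def walk(node: ast.AST) -> int:
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError(f"unsupported expression: {expr!r}")
    return walk(ast.parse(expr, mode="eval"))

def self_refine(question: str,
                generate: Callable[[str], Steps],
                feedback: Callable[[str, Steps], Critique],
                update: Callable[[str, Steps, Tuple[int, str]], Steps],
                max_rounds: int = 3) -> Steps:
    """Generate an attempt, then alternate feedback and update until clean."""
    attempt = generate(question)
    for _ in range(max_rounds):
        critique = feedback(question, attempt)
        if critique is None:           # no error found: stop refining
            break
        attempt = update(question, attempt, critique)
    return attempt

# Toy stand-ins for the three prompts (assumptions, not real model calls).
def toy_generate(question: str) -> Steps:
    # Reproduces the slide's faulty first attempt: step 2 is wrong.
    return ["(4*1)-(2*3)=4-6", "4-6=-3"]

def toy_feedback(question: str, steps: Steps) -> Critique:
    for i, step in enumerate(steps):
        lhs, rhs = step.split("=")
        if evaluate(lhs) != evaluate(rhs):
            return i, f'In step {i + 1} the part "{step}" is incorrect.'
    return None

def toy_update(question: str, steps: Steps, critique: Tuple[int, str]) -> Steps:
    i, _message = critique
    lhs, _rhs = steps[i].split("=")
    fixed = list(steps)
    fixed[i] = f"{lhs}={evaluate(lhs)}"  # recompute the flagged step
    return fixed

question = "Calculate (4*1)-(2*3)=?"
refined = self_refine(question, toy_generate, toy_feedback, toy_update)
print(refined)
```

With a real model, a feedback function that flags the wrong step (as in the Problem 1 example) would send the update down the wrong path, which is why the loop only helps when the critic is reliable.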