MASTER THE ART OF PROMPT ENGINEERING
Strategies for Optimization and Evaluation

Authors: Nazarii Drushchak, Nataliya Polyakovska, Dmytro Zikrach

White Paper

Prompt engineering has emerged as a critical discipline in large language models (LLMs), pivotal in maximizing the effectiveness and reliability of AI-enabled systems. The essence of prompt engineering lies in crafting input queries that guide these models to generate desired outputs with precision. The quality and structure of these prompts strongly correlate with the performance, stability, fairness, and practical utility of LLM systems.

In this paper, SoftServe delves into the multifaceted impact of prompt engineering, beginning with its foundational significance in enhancing LLM operations. We explore how strategic prompt design improves the accuracy and relevance of model responses and drives business outcomes by enabling more sophisticated and context-aware AI applications. Beyond its direct benefits, prompt engineering serves as a lens through which the nuances of human-AI interaction and communication become apparent, offering insights into how models perceive and process natural language inputs.

A critical aspect of harnessing the potential of prompt engineering is evaluating prompt effectiveness. SoftServe discusses various techniques to assess and measure the impact of different prompts on model performance, providing a framework for continuous improvement and optimization. Moreover, we outline strategies for improving prompts so they align closely with specific goals and use cases, demonstrating how incremental adjustments yield significant improvements in results. Throughout this paper, SoftServe highlights the paramount importance of prompt engineering in using LLMs to their full potential by offering practical insights and strategies for optimization and evaluation.

INTRODUCTION

LLMs are powerful. They offer vast potential across many applications, from writing assistance to complex problem-solving. Prompt quality crucially determines their effectiveness. A well-crafted prompt leads generative AI models to deliver pertinent and precise results. In contrast, a poor prompt may produce irrelevant outcomes. This emphasizes the importance of prompt engineering in Gen AI.

SIGNIFICANCE OF PROMPT ENGINEERING

Prompt engineering involves optimizing the technical aspects of prompts, refining queries or instructions to generate precise, contextually appropriate, and efficient responses. Techniques like few-shot learning and specificity in prompts are used to enhance the model's understanding of context and desired outcomes, minimizing errors and biases.
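To make the few-shot idea concrete, the sketch below embeds a handful of labeled examples directly in a chat prompt. It is a minimal illustration only: the review texts, labels, and the use of the OpenAI Python client (v1+) are assumptions made for this sketch, not part of the paper's methodology.

# Illustrative sketch: a few-shot sentiment-classification prompt.
# Assumes the openai package (v1+) is installed and OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()

few_shot_prompt = """Classify the sentiment of each customer review as Positive, Negative, or Neutral.

Review: "The checkout process was fast and painless."
Sentiment: Positive

Review: "My order arrived two weeks late and nobody answered my emails."
Sentiment: Negative

Review: "The product works as described."
Sentiment: Neutral

Review: "The interface is confusing, but support resolved my issue quickly."
Sentiment:"""

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": few_shot_prompt}],
    temperature=0,  # deterministic output is preferable for classification
)
print(response.choices[0].message.content)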
On the other hand, prompt design focuses on user-friendliness, ethical considerations, and inclusivity. Design considerations encompass factors such as language simplicity, inclusivity, fairness, and safety, which contribute to the overall usability and acceptability of AI systems.

In the realm of LLMs, prompt engineering and design often outweigh model retraining. Unlike retraining, which involves revising the entire AI architecture, prompt engineering focuses only on refining instructions. This targeted approach is advantageous when resource or time constraints make model retraining impractical. Moreover, prompt engineering enables quick adaptation to evolving contexts without extensive retraining cycles, which are prohibitively expensive for LLMs. By combining prompt engineering and design principles, developers create prompts that optimize AI performance and enhance user satisfaction and trust. This integrated approach emerges as a pivotal strategy for enhancing LLM performance and versatility across diverse applications.

PROMPT ENGINEERING VS. DESIGN

Prompt engineering involves the creation of instructions to elicit desired responses from LLMs, optimizing for accuracy and relevance. Prompt design, a broader process, encompasses the creation, evaluation, and distillation of prompts for user-friendliness, ethics, and inclusivity.

ASPECT     | PROMPT ENGINEERING                                               | PROMPT DESIGN
Focus      | Scientific, iterative testing, and model modification           | Artistic, strategic framing, and focus on communication
Purpose    | Optimize model output                                            | Create effective and responsible prompts
Techniques | Format specification, iterative testing, context, and structure | Clarity, simplicity, and fairness
Scope      | Narrow and focused on model optimization                         | Broad, including user experience and ethical considerations
GOOD VS. BAD PROMPTS

SoftServe explores the prompt engineering concept with examples of good and bad prompts and why this distinction matters. There is also a look at advanced techniques like chain-of-thought prompting.

EXAMPLE 1

Bad prompt:
Give me a report on sales.

Generated output example (by gpt-3.5-turbo):
The total sales for the period were $X million, with product 1, product 2, and product 3 as top sellers.

The report lacks specific details about timeframes, markets, or product lines, making it too generic for action.

Why it is bad: This prompt is overly vague and lacks context. The AI may generate a generic report on sales without considering the specific industry, timeframe, or region important to the business. This leads to an output that is not actionable or is irrelevant to the organization's needs.

Example of a good prompt:
Generate a detailed monthly sales report for Q1 2023, comparing product lines A, B, and C in the North American market, and highlight key growth drivers and areas for improvement.

Generated output example (by gpt-3.5-turbo):
Q1 2023 saw product C leading with a 10% sales increase driven by an effective marketing campaign. Product A sales grew by 5%, while product B faced a 2% decline, which suggests the need for a revised strategy. The report provides a detailed comparison and actionable insights for each product line in the North American market.

Why it is good: This prompt is precise, specifying the period (Q1 2023), products of interest (product lines A, B, and C), and geographical focus (North American market). It also asks for an analysis beyond raw numbers (key growth drivers and areas for improvement), which guides the AI to produce a comprehensive, relevant, and actionable report.
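In application code, the difference between the two prompts above comes down to the string passed to the model. The sketch below is a minimal illustration of submitting the good prompt to gpt-3.5-turbo; the system message and the use of the OpenAI Python client (v1+) are assumptions made for illustration, not prescribed by this paper.

# Minimal sketch: sending the specific sales-report prompt to gpt-3.5-turbo.
# Assumes the openai package (v1+) is installed and OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()

good_prompt = (
    "Generate a detailed monthly sales report for Q1 2023, comparing product "
    "lines A, B, and C in the North American market, and highlight key growth "
    "drivers and areas for improvement."
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a business analyst producing concise sales reports."},
        {"role": "user", "content": good_prompt},
    ],
    temperature=0.2,  # a lower temperature keeps the report focused and consistent
)
print(response.choices[0].message.content)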
EXAMPLE 2

Bad prompt:
Write a query to find customers.

Generated output example (by gpt-3.5-turbo):
SELECT * FROM customers

Why it is bad: This prompt is extremely vague and provides no context or specific request. The AI model has no information about the database schema, the desired outcome, or the complexity of the query needed. As a result, it cannot generate a meaningful or relevant SQL query.

Example of a good prompt:
Given a database with tables Customers (with fields id, name, location) and Orders (with fields id, order_date, customer_id), write an SQL query to find the names of customers who have placed more than five orders.

Generated output example (by gpt-3.5-turbo):
SELECT c.name
FROM Customers c
JOIN (
    SELECT customer_id
    FROM Orders
    GROUP BY customer_id
    HAVING COUNT(id) > 5
) o ON c.id = o.customer_id;

Why it is good: This prompt specifies the database schema and the exact requirements. It provides enough detail for the AI to understand the context and generate a precise SQL query that fulfills the request.
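When such prompts are generated inside an application rather than written by hand, the schema details can be injected through a template. The following sketch shows one possible way to do this; the helper name and the table-metadata format are hypothetical and serve only to illustrate the pattern.

# Hypothetical sketch: building a schema-aware SQL prompt from table metadata.
def build_sql_prompt(tables: dict[str, list[str]], request: str) -> str:
    """Compose a prompt that states the schema before the actual request."""
    schema_lines = [
        f"{name} (with fields {', '.join(fields)})" for name, fields in tables.items()
    ]
    return (
        "Given a database with tables "
        + " and ".join(schema_lines)
        + f", {request}"
    )

prompt = build_sql_prompt(
    tables={
        "Customers": ["id", "name", "location"],
        "Orders": ["id", "order_date", "customer_id"],
    },
    request=(
        "write an SQL query to find the names of customers "
        "who have placed more than five orders."
    ),
)
print(prompt)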
BUSINESS IMPACT

At the heart of using AI in business is prompt optimization, a critical factor in enhancing the accuracy and stability of Gen AI models. This accuracy directly influences key performance indicators (KPIs), making it a cornerstone for measuring success across various business operations. Studies indicate that optimized prompts can lead to a 10-30% increase in model accuracy, which translates to significant improvements in efficiency, customer satisfaction, and financial outcomes. It is this foundational benefit that sets the stage for a cascade of additional advantages, such as:

Cost efficiency. By increasing the precision of AI outputs from the beginning, businesses streamline processes and reduce the overhead associated with corrections. For example, customer service AI systems that more accurately understand and address customer queries significantly reduce reliance on human intervention, which yields substantial savings.

Enhanced experience. Optimized prompts ensure AI systems provide timely and highly relevant responses. This precision boosts user interactions in industries like retail, where AI-enabled personalized recommendations may lift conversion rates.

Informed decision-making. More accurate AI analysis fosters better, data-driven business decisions. For instance, AI that reliably predicts market trends in the financial sector helps organizations make smarter investment choices, potentially enhancing profitability.

Risk mitigation. Effective prompt engineering also significantly reduces AI-generated errors or hallucinations, which can lead to operational and ethical risks. This safeguards against misinformation and protects company credibility and consumer trust.

By prioritizing prompt optimization, organizations bolster the direct accuracy and effectiveness of their AI systems and unlock a host of secondary benefits, from cost savings to enhanced market positioning. This underscores the multifaceted value of this practice in today's AI-enabled landscape.
PROMPT LIFECYCLE AND EVALUATION TECHNIQUES

In prompt engineering for LLMs, the lifecycle of a prompt is as essential as the advancement of the models themselves. The lifecycle diagram gives insight into the dynamic process of developing, evaluating, and refining prompts. This lifecycle is one component of the broader LLMOps lifecycle, which oversees the entire journey from design to deployment of Gen AI applications. The focus here is on the prompt part of that comprehensive lifecycle, which will be explored in SoftServe's next series of white papers.

Design is the first phase, where the initial construction of a prompt takes place. It is thoughtfully composed to elicit the best response from the LLM. Implementation is where the prompt is introduced to the model and its initial performance is observed. The evaluation phase is pivotal in prompt engineering and is similar to evaluation of the entire LLM system, which is discussed in another paper. A variety of metrics are employed, each chosen based on the specific nature of the task and the aspects of performance to be measured, such as:

- Classical NLP tasks. For tasks like text classification or named entity recognition (NER), the F1 score, precision, and recall offer a balanced view of the model's accuracy.
- Generated content with training data. For tasks that involve content generation, such as summarization or translation, metrics like BLEU or ROUGE are essential. They compare the generated text to a reference set, which ensures linguistic quality and content fidelity. Use these when a gold-standard reference exists against which to compare the output (see the sketch after this list).
- Generated content without training data. In scenarios like creative writing, where there may not be a single correct answer, perplexity measures the prediction smoothness of the model. Lower perplexity indicates better fluency and is particularly useful when evaluating new content creation.
- Hallucination detection. These metrics are critical in fact-based tasks, such as news article generation or data-driven reports. They help maintain the credibility of the content by identifying and correcting factual inaccuracies.
- Custom task-specific metrics. Depending on the domain, such as verifying facts, solving mathematical problems, executing code, or ensuring readability, specialized metrics are developed to confirm that outputs meet the necessary criteria for success in these areas.
- LLMs as evaluators. Using other models to evaluate content provides additional validation, which is useful in tasks like content grading or peer review.
- Fairness metrics. Particularly relevant for prompts with social implications, such as job application screening tools or credit scoring systems, to ensure that outputs do not discriminate against any group.
- Human feedback. This kind of feedback is indispensable in tasks where user satisfaction is the goal, for example, with chatbots or interactive storytelling. It adds a layer of qualitative assessment that metrics alone cannot provide.
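As a concrete illustration of the first two metric families above, the sketch below scores a classification prompt with precision, recall, and F1, and a summarization prompt with ROUGE. It is a minimal example: the labels and texts are invented, and the sketch assumes the scikit-learn and rouge-score packages are installed.

# Illustrative sketch: scoring prompt outputs against references.
# Assumes scikit-learn and rouge-score are installed; the data below is invented.
from sklearn.metrics import precision_recall_fscore_support
from rouge_score import rouge_scorer

# Classification-style evaluation (e.g., a sentiment-labeling prompt).
gold_labels = ["positive", "negative", "neutral", "positive"]
model_labels = ["positive", "negative", "positive", "positive"]
precision, recall, f1, _ = precision_recall_fscore_support(
    gold_labels, model_labels, average="macro", zero_division=0
)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")

# Reference-based evaluation for generated content (e.g., a summarization prompt).
reference_summary = "Q1 sales grew 10 percent, led by product C."
generated_summary = "Sales rose 10 percent in Q1, with product C leading."
scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
rouge = scorer.score(reference_summary, generated_summary)
print(f"ROUGE-1 F={rouge['rouge1'].fmeasure:.2f}, ROUGE-L F={rouge['rougeL'].fmeasure:.2f}")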
In addition, the numbers of prompt input and output tokens are essential metrics for evaluating cost-effectiveness and response utility. They help fine-tune the prompt to ensure concise inputs lead to valuable and relevant outputs without unnecessary verbosity or truncation. Evaluating the model's efficiency with a given prompt is also crucial in LLM-based applications. It ensures that the model's responses are accurate, coherent, and delivered within an acceptable timeframe, balancing computational costs against performance.
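The token counts mentioned above can be estimated before a prompt is ever sent. The sketch below is a minimal illustration using the tiktoken package; the prompt text and the per-token price are placeholders, not actual rates.

# Minimal sketch: estimating prompt size and cost before calling the model.
# Assumes the tiktoken package is installed; the price below is a placeholder.
import tiktoken

prompt = (
    "Generate a detailed monthly sales report for Q1 2023, comparing product "
    "lines A, B, and C in the North American market."
)

encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
prompt_tokens = len(encoding.encode(prompt))

PLACEHOLDER_PRICE_PER_1K_TOKENS = 0.0005  # illustrative value only
estimated_cost = prompt_tokens / 1000 * PLACEHOLDER_PRICE_PER_1K_TOKENS
print(f"{prompt_tokens} input tokens, estimated input cost ${estimated_cost:.6f}")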
This multifaceted approach to evaluation ensures that prompts lead to efficient, fair, and effective use of LLMs. This lifecycle and the associated metrics form the backbone of prompt engineering. They support the creation of prompts that produce stable, high-quality outputs and do so economically and ethically. Refinement, iteration, validation, deployment, and maintenance are the steps that ensure a prompt remains effective, stable, and fair, and evolves along with the model and its applications.

PROMPT OPTIMIZATION FRAMEWORK
For the creation of effective prompts that yield targeted and valuable AI responses, consider a structured approach that incorporates the following steps:

1. Define your objective. Clearly state the purpose of your prompt, focusing on what the AI's response should achieve.
2. Provide context. Include relevant background information to frame the request, which enables the AI to understand the broader scope.
3. Be specific. Detail exactly what information or output is needed, including any parameters, criteria, or formats.
4. Use clear and concise language. Avoid ambiguity by choosing words carefully, aiming for clarity and brevity.
5. Include examples. When possible, offer examples to guide the AI toward the appropriate response.
6. Iterate and refine. Based on initial responses, refine prompts to improve precision and relevance.

Effective prompts require a blend of art and science tailored to each AI model's nuances. Experimentation and adaptation are key, as best practices vary. By testing different approaches and adjusting based on AI responses, prompts may be refined for optimal results. For example, transitioning from a general prompt like "summarize this article" to a more specific one, such as "provide a concise, three-sentence summary focusing on main arguments and conclusions," enhances clarity and aligns with informational goals. Iterative refinement, informed by LLM feedback, improves precision and relevance in AI interactions.
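One way to make the six steps above operational is to assemble the objective, context, specifics, and an example into a single prompt template. The sketch below is a hypothetical illustration of that assembly; the field names and sample content are not taken from this paper.

# Hypothetical sketch: assembling a prompt from the framework's elements.
from dataclasses import dataclass

@dataclass
class PromptSpec:
    objective: str   # step 1: define your objective
    context: str     # step 2: provide context
    specifics: str   # step 3: be specific (formats, criteria, parameters)
    example: str     # step 5: include an example of the desired output

    def render(self) -> str:
        # Steps 4 and 6 (clear language, iteration) are applied when editing the spec.
        return (
            f"Objective: {self.objective}\n"
            f"Context: {self.context}\n"
            f"Requirements: {self.specifics}\n"
            f"Example of the expected output: {self.example}"
        )

spec = PromptSpec(
    objective="Summarize the attached article for a busy executive.",
    context="The article covers Q1 2023 sales performance in North America.",
    specifics="Return exactly three sentences focused on main arguments and conclusions.",
    example="Sales rose 10% in Q1, led by product C; product B declined and needs a revised strategy.",
)
print(spec.render())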
When refining prompts with LLM feedback, it is insightful to consider how an LLM processes prompts using a chain-of-thought approach. For example:

Initial prompt:
My blog posts are not engaging my audience. How can I improve?

Generated thought processes and queries by LLMs:
1. Identify top strategies for increasing blog post engagement.
2. Analyze effective headlines for blog posts in Gen AI.
3. Explore SEO best practices for the blog post to boost visibility.
4. Recommend tools to analyze blog post engagement metrics.

Prompt improved by LLMs:
What are proven methods to significantly increase engagement on blog posts within the tech industry? Include headline crafting, SEO optimization techniques, and tools for engagement analysis.

Please note that improvement is an iterative process; in the end, a more comprehensive prompt is created. For example, a further iteration on the improved prompt above yields:

To boost audience engagement with my blog posts in the tech industry, I seek comprehensive strategies encompassing content creation and distribution. Specifically:

Content development:
- What are effective techniques for crafting headlines that capture attention and encourage clicks within the tech industry?

SEO optimization:
- What are the latest SEO strategies tailored for tech blogs to improve visibility on search engines?
- Can you recommend keyword research and SEO analysis tools that are particularly effective for tech industry content?

Engagement analysis and improvement:
- What metrics should I focus on to accurately measure engagement on my blog posts?
- Are there specific tools or platforms recommended for tracking these metrics and analyzing reader behavior?
- How can I use feedback and engagement data to iterate on content strategy for better performance?

Community building:
- What are proven methods for fostering a community around my tech blog that encourages regular interaction and feedback?

Please provide actionable insights and examples where applicable, emphasizing methods proven to be effective specifically within the tech blogging sphere.
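The refinement step shown above can also be scripted, with the model itself proposing each improved version of the prompt. The sketch below is a minimal, hypothetical version of such a loop; the meta-prompt wording, the fixed round count, and the use of the OpenAI Python client (v1+) are illustrative choices, not SoftServe's prescribed method.

# Illustrative sketch: asking an LLM to iteratively improve a prompt.
# Assumes the openai package (v1+); the meta-prompt wording is invented.
from openai import OpenAI

client = OpenAI()

def improve_prompt(prompt: str, rounds: int = 2) -> str:
    """Run a few refinement rounds, each asking the model to rewrite the prompt."""
    meta_prompt = (
        "You are a prompt engineer. Rewrite the following prompt so it is more "
        "specific, provides context, and states the desired output format. "
        "Return only the improved prompt.\n\nPrompt:\n{prompt}"
    )
    for _ in range(rounds):
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": meta_prompt.format(prompt=prompt)}],
            temperature=0.3,
        )
        prompt = response.choices[0].message.content.strip()
    return prompt

print(improve_prompt("My blog posts are not engaging my audience. How can I improve?"))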
Always employ the evaluation methodologies outlined earlier to determine the effectiveness of prompts. Through this informed refinement process, the quality of AI-generated content is enhanced, ensuring that prompts are effective and customized to the specific demands of the task.

CONCLUSION AND FUTURE WORK

This white paper marks the beginning of SoftServe's series on the various stages of the prompt engineering process, concentrating on optimization and evaluation. We delve into general techniques for optimizing and evaluating prompts and demonstrate their crucial role in enhancing AI model performance and user satisfaction. Effective prompt engineering is essential for organizations that want to maximize the quality of their systems. By crafting prompts tailored to specific tasks and objectives, organizations enable AI systems to generate accurate and relevant responses, which improves operational efficiency and customer experiences.

In future white papers, SoftServe will dive deeper into advanced topics like creating and refining summarization prompts through step-by-step optimization. These papers will also explore custom evaluation techniques for specific use cases. As SoftServe continues the exploration of these subjects, the objective is to give practitioners and researchers the knowledge and resources needed to use prompt engineering to its full potential in AI applications. Monitoring and evaluation are crucial components in the LLM world and serve as checkpoints to gauge the efficacy and performance of prompt engineering techniques. They provide essential feedback loops that enable the refinement and enhancement of LLM systems.
NORTH AMERICAN HQ
201 W. 5th Street, Suite 1550
Austin, TX 78701
+1 866 687 3588 (USA)
+1 647 948 7638 (Canada)

EUROPEAN HQ
30 Cannon Street
London EC4 6XH
United Kingdom
+44 333 006 4341

ABOUT US
SoftServe is a premier IT consulting and digital services provider. We expand the horizon of new technologies to solve today's complex business challenges and achieve meaningful outcomes for our clients. Our boundless curiosity drives us to explore and reimagine the art of the possible. Clients confidently rely on SoftServe to architect and execute mature and innovative capabilities, such as digital engineering, data and analytics, cloud, and AI/ML.

Our global reputation is built on more than 30 years of experience delivering superior digital solutions at exceptional speed by top-tier engineering talent to enterprise industries, including high tech, financial services, healthcare, life sciences, retail, energy, and manufacturing. Visit our website, blog, LinkedIn, Facebook, and X (Twitter) pages for more information.