{"id":226669,"date":"2025-03-02T05:20:39","date_gmt":"2025-03-02T05:20:39","guid":{"rendered":"https:\/\/globetimeline.com\/ar\/tech\/rewrite-this-title-in-arabic-ai-companies-race-to-use-distillation-to-produce-cheaper-models\/"},"modified":"2025-03-02T05:20:39","modified_gmt":"2025-03-02T05:20:39","slug":"rewrite-this-title-in-arabic-ai-companies-race-to-use-distillation-to-produce-cheaper-models","status":"publish","type":"post","link":"https:\/\/globetimeline.com\/ar\/tech\/rewrite-this-title-in-arabic-ai-companies-race-to-use-distillation-to-produce-cheaper-models\/","title":{"rendered":"AI companies race to use \u2018distillation\u2019 to produce cheaper models"},"content":{"rendered":"<p>Leading artificial intelligence firms including OpenAI, Microsoft and Meta are turning to a process called \u201cdistillation\u201d in the global race to create AI models that are cheaper for consumers and businesses to adopt.<\/p>\n<p>The technique caught widespread attention after China\u2019s DeepSeek used it to build powerful and efficient AI models based on open-source systems released by competitors Meta and Alibaba. The breakthrough rocked confidence in Silicon Valley\u2019s AI leadership, leading Wall Street investors to wipe billions of dollars of value from US Big Tech stocks.<\/p>\n<p>Through distillation, companies take a large language model \u2014 dubbed a \u201cteacher\u201d model \u2014 which generates the next likely word in a sentence. The teacher model generates data which then trains a smaller \u201cstudent\u201d model, helping to quickly transfer the knowledge and predictions of the bigger model to the smaller one. 
While distillation has been widely used for years, recent advances have led industry experts to believe the process will increasingly be a boon for start-ups seeking cost-effective ways to build applications based on the technology.<\/p>\n<p>\u201cDistillation is quite magical,\u201d said Olivier Godement, head of product for OpenAI\u2019s platform. \u201cIt\u2019s the process of essentially taking a very large smart frontier model and using that model to teach a smaller model\u2009.\u2009.\u2009.\u2009very capable in specific tasks that is super cheap and super fast to execute.\u201d<\/p>\n<p>Large language models such as OpenAI\u2019s GPT-4, Google\u2019s Gemini and Meta\u2019s Llama require massive amounts of data and computing power to develop and maintain. While the companies have not revealed precise figures for how much it costs to train large models, it is likely to be hundreds of millions of dollars.<\/p>\n<p>Thanks to distillation, developers and businesses can access these models\u2019 capabilities at a fraction of the price, allowing app developers to run AI models quickly on devices such as laptops and smartphones.<\/p>\n<p>Developers can use OpenAI\u2019s platform for distillation, learning from the large language models that underpin products like ChatGPT. OpenAI\u2019s largest backer, Microsoft, used GPT-4 to distil its Phi family of small language models as part of a commercial partnership after investing nearly $14bn in the company.<\/p>\n<p>However, the San Francisco-based start-up has said it believes DeepSeek distilled OpenAI\u2019s models to train its competitor, a move that would be against its terms of service. 
DeepSeek has not commented on the claims.<\/p>\n<p>While distillation can be used to create high-performing models, experts note that the resulting models are more limited.<\/p>\n<p>\u201cDistillation presents an interesting trade-off; if you make the models smaller, you inevitably reduce their capability,\u201d said Ahmed Awadallah of Microsoft Research, who said a distilled model can be designed to be very good at summarising emails, for example, \u201cbut it really would not be good at anything else.\u201d<\/p>\n<p>David Cox, vice-president for AI models at IBM Research, said most businesses do not need a massive model to run their products, and distilled ones are powerful enough for purposes such as customer service chatbots or running on smaller devices like phones.<\/p>\n<p>\u201cAnytime you can [make it less expensive] and it gives you the right performance you want, there is very little reason not to do it,\u201d he added.<\/p>\n<p>That presents a challenge to the business models of many leading AI firms. Even when developers use distilled models from companies like OpenAI, those models cost far less to create and run, and therefore generate less revenue. Model-makers like OpenAI often charge less for the use of distilled models as they require less computational load.<\/p>\n<p>Yet OpenAI\u2019s Godement argued that large language models will still be required for \u201chigh intelligence and high stakes tasks\u201d where \u201cbusinesses are willing to pay more for a high level of accuracy and reliability\u201d. He added that large models will also be needed to discover new capabilities that can then be distilled into smaller ones.<\/p>\n<p>Still, the company aims to prevent its large models from being distilled to train a competitor. OpenAI has teams monitoring usage and can remove access from users it suspects are generating vast amounts of data to export and train a rival, as it has apparently done with accounts it believes were linked to DeepSeek. Yet much of this action happens retroactively. 
\u201cOpenAI has been trying to protect against distillation for a long time, but it is very hard to avoid it altogether,\u201d said Douwe Kiela, chief executive of Contextual AI, a start-up building information retrieval tools for enterprises.<\/p>\n<p>Distillation is also a victory for advocates of open models, where the technology is made freely available for developers to build upon. DeepSeek has also made its recent models open to developers.<\/p>\n<p>\u201cWe\u2019re going to use [distillation] and put it in our products right away,\u201d said Yann LeCun, Meta\u2019s chief AI scientist. \u201cThat\u2019s the whole idea of open source. You profit from everyone else\u2019s progress as long as those processes are open.\u201d<\/p>\n<p>Distillation also means that model-makers can spend billions of dollars to advance the capabilities of AI systems but still face competitors that often catch up quickly, as DeepSeek\u2019s recent releases demonstrate. This raises questions about the first-mover advantage in building LLMs when their capabilities can be replicated in a matter of months.<\/p>\n<p>\u201cIn a world where things are moving so fast\u2009.\u2009.\u2009.\u2009you could actually spend a lot of money, doing it the hard way, and then the rest of the field is right on your heels,\u201d IBM\u2019s Cox said. 
\u201cSo it is an interesting and tricky business landscape.\u201d<\/p>\n<p>Additional reporting by Michael Acton in San Francisco<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Leading artificial intelligence firms including OpenAI, Microsoft and Meta are turning to a process called \u201cdistillation\u201d in the global race to create AI models that are cheaper for consumers and businesses to adopt. The technique caught widespread attention after China\u2019s DeepSeek used it to build<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[63],"tags":[],"class_list":{"0":"post-226669","1":"post","2":"type-post","3":"status-publish","4":"format-standard","6":"category-tech"},"_links":{"self":[{"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/posts\/226669","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/comments?post=226669"}],"version-history":[{"count":0,"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/posts\/226669\/revisions"}],"wp:attachment":[{"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/media?parent=226669"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/categories?post=226669"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/tags?post=226669"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}