{"id":236485,"date":"2025-03-11T09:55:53","date_gmt":"2025-03-11T09:55:53","guid":{"rendered":"https:\/\/globetimeline.com\/ar\/tech\/rewrite-this-title-in-arabic-how-inference-is-driving-competition-to-nvidias-ai-chip-dominance\/"},"modified":"2025-03-11T09:55:53","modified_gmt":"2025-03-11T09:55:53","slug":"rewrite-this-title-in-arabic-how-inference-is-driving-competition-to-nvidias-ai-chip-dominance","status":"publish","type":"post","link":"https:\/\/globetimeline.com\/ar\/tech\/rewrite-this-title-in-arabic-how-inference-is-driving-competition-to-nvidias-ai-chip-dominance\/","title":{"rendered":"How \u2018inference\u2019 is driving competition to Nvidia\u2019s AI chip dominance"},"content":{"rendered":"<p>Nvidia\u2019s challengers are seizing a new opportunity to crack its dominance of artificial intelligence chips after Chinese start-up DeepSeek accelerated a shift in AI\u2019s computing requirements. DeepSeek\u2019s R1 and other so-called \u201creasoning\u201d models, such as OpenAI\u2019s o3 and Anthropic\u2019s Claude 3.7, consume more computing resources than previous AI systems at the point when a user makes their request, a process called \u201cinference\u201d. That has flipped the focus of demand for AI computing, which until recently was centred on training or creating a model.<\/p>\n<p>Inference is expected to become a greater portion of the technology\u2019s needs as demand grows among individuals and businesses for applications that go beyond today\u2019s popular chatbots, such as ChatGPT or xAI\u2019s Grok. It is here that Nvidia\u2019s competitors \u2014 which range from AI chipmaker start-ups such as Cerebras and Groq to custom accelerator processors from Big Tech companies including Google, Amazon, Microsoft and Meta \u2014 are focusing their efforts to disrupt the world\u2019s most valuable semiconductor company. 
\u201cTraining makes AI and inference uses AI,\u201d said Andrew Feldman, chief executive of Cerebras. \u201cAnd the usage of AI has gone through the roof\u2009.\u2009.\u2009.\u2009The opportunity right now to make a chip that is vastly better for inference than for training is larger than it has been previously.\u201d<\/p>\n<p>Nvidia dominates the market for huge computing clusters such as Elon Musk\u2019s xAI facility in Memphis or OpenAI\u2019s Stargate project with SoftBank. But its investors are looking for reassurance that it can continue to outsell its rivals in far smaller data centres under construction that will focus on inference.<\/p>\n<p>Vipul Ved Prakash, chief executive and co-founder of Together AI, a cloud provider focused on AI that was valued at $3.3bn last month in a round led by General Catalyst, said inference was a \u201cbig focus\u201d for his business. \u201cI believe running inference at scale will be the biggest workload on the internet at some point,\u201d he said.<\/p>\n<p>Analysts at Morgan Stanley have estimated more than 75 per cent of power and computational demand for data centres in the US will be for inference in the coming years, though they warned of \u201csignificant uncertainty\u201d over exactly how the transition will play out. Still, that means hundreds of billions of dollars\u2019 worth of investments could flow towards inference facilities in the next few years, if usage of AI continues to grow at its current pace.<\/p>\n<p>Analysts at Barclays estimate capital expenditure for inference in \u201cfrontier AI\u201d \u2014 referring to the largest and most advanced systems \u2014 will exceed that of training over the next two years, jumping from $122.6bn in 2025 to $208.2bn in 2026. While Barclays predicts Nvidia will have \u201cessentially 100 per cent market share\u201d in frontier AI training, it will serve only 50 per cent of inference computing \u201cover the long term\u201d. 
That leaves the company\u2019s rivals with almost $200bn in chip spending to play for by 2028. \u201cThere is a huge pull towards better, faster, more efficient [chips],\u201d said Walter Goodwin, founder of UK-based chip start-up Fractile. Cloud computing providers are eager for \u201csomething that cuts out over-dependence\u201d on Nvidia, he added.<\/p>\n<p>Nvidia chief executive Jensen Huang insisted his company\u2019s chips are just as powerful for inference as they are for training, as he eyes a giant new market opportunity. The US company\u2019s latest Blackwell chips were designed to handle inference better, and many of those products\u2019 earliest customers are using them to serve up, rather than train, AI systems. The popularity of its software, based on its proprietary Cuda architecture, among AI developers also presents a formidable barrier to competitors.<\/p>\n<p>\u201cThe amount of inference compute needed is already 100x more\u201d than it was when large language models started out, Huang said on last month\u2019s earnings call. \u201cAnd that\u2019s just the beginning.\u201d<\/p>\n<p>The cost of serving up responses from LLMs has fallen rapidly over the past two years, driven by a combination of more powerful chips, more efficient AI systems and intense competition between AI developers such as Google, OpenAI and Anthropic. \u201cThe cost to use a given level of AI falls about 10x every 12 months, and lower prices lead to much more use,\u201d Sam Altman, OpenAI\u2019s chief executive, said in a blog post last month.<\/p>\n<p>DeepSeek\u2019s v3 and R1 models, which triggered a stock market panic in January largely because of what was perceived as lower training costs, have helped bring down inference costs further, thanks to the Chinese start-up\u2019s architectural innovations and coding efficiencies. 
At the same time, the kind of processing required by inference tasks \u2014 which can include far greater memory requirements to answer longer and more complex queries \u2014 has opened the door to alternatives to Nvidia\u2019s graphics processing units, whose strengths lie in handling very large volumes of similar calculations.<\/p>\n<p>\u201cThe performance of inference on your hardware is a function of how fast you can [move data] to and from memory,\u201d said Cerebras\u2019s Feldman, whose chips have been used by French AI start-up Mistral to accelerate performance of its chatbot, Le Chat.<\/p>\n<p>Speed is vital to engaging users, Feldman said. \u201cOne of the things that Google [search] showed 25 years ago is that even microseconds [of delay] reduce the attention of the viewer,\u201d he said. \u201cWe are producing answers for Le Chat in sometimes a second while [OpenAI\u2019s] o1 would have taken 40.\u201d<\/p>\n<p>Nvidia maintains its chips are just as powerful for inference as for training, pointing to a 200-fold improvement in its inference performance over the past two years. It says hundreds of millions of users access AI products through millions of its GPUs today. \u201cOur architecture is fungible and easy to use in all of those different ways,\u201d Huang said last month, whether for building large models or serving up AI applications in new ways.<\/p>\n<p>Prakash, whose company counts Nvidia as an investor, said Together uses the same Nvidia chips for inference and training today, which is \u201cpretty useful\u201d. Unlike Nvidia\u2019s \u201cgeneral purpose\u201d GPUs, inference accelerators work best when they are tuned to a particular type of AI model. In a fast-moving industry, that could prove a problem for chip start-ups which bet on the wrong AI architecture. 
\u201cI think the one advantage of general purpose computing is that as the model architectures are changing, you just have more flexibility,\u201d Prakash said, adding: \u201cMy sense is there will be a complex mix of silicon over the coming years.\u201d<\/p>\n<p>Additional reporting by Michael Acton in San Francisco<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Nvidia\u2019s challengers are seizing a new opportunity to crack its dominance of artificial intelligence chips after Chinese start-up DeepSeek accelerated a shift in AI\u2019s computing requirements. DeepSeek\u2019s R1 and other so-called \u201creasoning\u201d models, such as OpenAI\u2019s o3 and Anthropic\u2019s Claude 3.7, consume more computing<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[63],"tags":[],"class_list":{"0":"post-236485","1":"post","2":"type-post","3":"status-publish","4":"format-standard","6":"category-tech"},"_links":{"self":[{"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/posts\/236485","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/comments?post=236485"}],"version-history":[{"count":0,"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/posts\/236485\/revisions"}],"wp:attachment":[{"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/media?parent=236485"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/categories?post=236485"},{"taxonomy":"post_tag","embeddabl
e":true,"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/tags?post=236485"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}