{"id":271840,"date":"2025-04-11T05:39:35","date_gmt":"2025-04-11T05:39:35","guid":{"rendered":"https:\/\/globetimeline.com\/ar\/tech\/rewrite-this-title-in-arabic-openai-slashes-ai-model-safety-testing-time\/"},"modified":"2025-04-11T05:39:35","modified_gmt":"2025-04-11T05:39:35","slug":"rewrite-this-title-in-arabic-openai-slashes-ai-model-safety-testing-time","status":"publish","type":"post","link":"https:\/\/globetimeline.com\/ar\/tech\/rewrite-this-title-in-arabic-openai-slashes-ai-model-safety-testing-time\/","title":{"rendered":"OpenAI slashes AI model safety testing time"},"content":{"rendered":"<p>OpenAI has slashed the time and resources it spends on testing the safety of its powerful artificial intelligence models, raising concerns that its technology is being rushed out without sufficient safeguards.<\/p>\n<p>Staff and third-party groups have recently been given just days to conduct \u201cevaluations\u201d, the term given to tests for assessing models\u2019 risks and performance, on OpenAI\u2019s latest large language models, compared with several months previously.<\/p>\n<p>According to eight people familiar with OpenAI\u2019s testing processes, the start-up\u2019s tests have become less thorough, with insufficient time and resources dedicated to identifying and mitigating risks, as the $300bn start-up comes under pressure to release new models quickly and retain its competitive edge.<\/p>\n<p>\u201cWe had more thorough safety testing when [the technology] was less important,\u201d said one person currently testing OpenAI\u2019s upcoming o3 model, designed for complex tasks such as problem-solving and reasoning.<\/p>\n<p>They added that as LLMs become more capable, the \u201cpotential weaponisation\u201d of the technology increases. \u201cBut because there is more demand for it, they want it out faster. I hope it is not a catastrophic mis-step, but it is reckless. 
This is a recipe for disaster.\u201d<\/p>\n<p>The time crunch has been driven by \u201ccompetitive pressures\u201d, according to people familiar with the matter, as OpenAI races against Big Tech groups such as Meta and Google and start-ups including Elon Musk\u2019s xAI to cash in on the cutting-edge technology.<\/p>\n<p>There is no global standard for AI safety testing, but from later this year, the EU\u2019s AI Act will compel companies to conduct safety tests on their most powerful models. Previously, AI groups, including OpenAI, have signed voluntary commitments with governments in the UK and US to allow researchers at AI safety institutes to test models.<\/p>\n<p>OpenAI has been pushing to release its new model o3 as early as next week, giving some testers less than a week for their safety checks, according to people familiar with the matter. This release date could be subject to change.<\/p>\n<p>Previously, OpenAI allowed several months for safety tests. For GPT-4, which was launched in 2023, testers had six months to conduct evaluations before it was released, according to people familiar with the matter.<\/p>\n<p>One person who had tested GPT-4 said some dangerous capabilities were only discovered two months into testing.<\/p>\n<p>
\u201cThey are just not prioritising public safety at all,\u201d they said of OpenAI\u2019s current approach.<\/p>\n<p>\u201cThere\u2019s no regulation saying [companies] have to keep the public informed about all the scary capabilities\u2009.\u2009.\u2009.\u2009and also they\u2019re under lots of pressure to race each other so they\u2019re not going to stop making them more capable,\u201d said Daniel Kokotajlo, a former OpenAI researcher who now leads the non-profit group AI Futures Project.<\/p>\n<p>OpenAI has previously committed to building customised versions of its models to assess for potential misuse, such as whether its technology could help make a biological virus more transmissible.<\/p>\n<p>The approach involves considerable resources, such as assembling data sets of specialised information, such as virology, and feeding them to the model in a technique called fine-tuning.<\/p>\n<p>But OpenAI has only done this in a limited way, opting to fine-tune an older, less capable model instead of its more powerful and advanced ones.<\/p>\n<p>The start-up\u2019s safety and performance report on o3-mini, its smaller model released in January, references how its earlier model GPT-4o was able to perform a certain biological task only when fine-tuned. However, OpenAI has never reported how its newer models, such as o1 and o3-mini, would score if fine-tuned.<\/p>\n<p>\u201cIt is great OpenAI set such a high bar by committing to testing customised versions of their models. But if it is not following through on this commitment, the public deserves to know,\u201d said Steven Adler, a former OpenAI safety researcher, who has written a blog about this topic.<\/p>\n<p>\u201cNot doing such tests could mean OpenAI and the other AI companies are underestimating the worst risks of their models,\u201d he added.<\/p>\n<p>People familiar with such tests said they bore hefty costs, such as hiring external experts, creating specific data sets, and using internal engineers and computing power.<\/p>\n<p>
OpenAI said it had made efficiencies in its evaluation processes, including automated tests, which had led to a reduction in timeframes. It added there was no agreed recipe for approaches such as fine-tuning, but it was confident that its methods were the best it could do and were made transparent in its reports. It added that models, especially for catastrophic risks, were thoroughly tested and mitigated for safety.<\/p>\n<p>\u201cWe have a good balance of how fast we move and how thorough we are,\u201d said Johannes Heidecke, head of safety systems.<\/p>\n<p>Another concern raised was that safety tests are often not conducted on the final models released to the public. Instead, they are performed on earlier, so-called checkpoints that are later updated to improve performance and capabilities, with \u201cnear-final\u201d versions referenced in OpenAI\u2019s system safety reports.<\/p>\n<p>\u201cIt is bad practice to release a model which is different from the one you evaluated,\u201d said a former OpenAI technical staff member.<\/p>\n<p>OpenAI said the checkpoints were \u201cbasically identical\u201d to what was launched in the end.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>OpenAI has slashed the time and resources it spends on testing the safety of its powerful artificial intelligence models, raising concerns that its technology is being rushed out without sufficient safeguards. Staff and third-party groups have recently been given just days to conduct \u201cevaluations\u201d, 
the<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[63],"tags":[],"class_list":{"0":"post-271840","1":"post","2":"type-post","3":"status-publish","4":"format-standard","6":"category-tech"},"_links":{"self":[{"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/posts\/271840","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/comments?post=271840"}],"version-history":[{"count":0,"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/posts\/271840\/revisions"}],"wp:attachment":[{"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/media?parent=271840"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/categories?post=271840"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/globetimeline.com\/ar\/wp-json\/wp\/v2\/tags?post=271840"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}