Hanna Hajishirzi's open approach to AI challenges the industry's norms

Hanna Hajishirzi in her office at the Allen Institute for AI in Seattle. (GeekWire Photo / Todd Bishop)

Editor’s note: This series profiles five of the Seattle region’s “Uncommon Thinkers”: inventors, scientists, technologists and entrepreneurs transforming industries and driving positive change in the world. They will be recognized Dec. 12 at the GeekWire Gala. Uncommon Thinkers is presented in partnership with Greater Seattle Partners.

Hannaneh “Hanna” Hajishirzi’s quest to crack open the black box of artificial intelligence reflects a determination that has defined her pursuit of computer science since she was 10 years old.

As a kid, she loved the predictability and logic of math and computers. Chemistry? No, thanks. Too many exceptions. But when it came to studying graph theory and algebra, or programming in QBasic and Pascal, she was all in.

“Raising Hanna was both easy and hard,” said her mom, Soheila Hajishirzi. It was easy, she explained, because her daughter was always more mature than her age, and more willing to listen to logical reasoning. But it was also hard, her mom said, “because she never settled — she always challenged herself to learn more.”

Hajishirzi specializes in natural language processing, with a focus on open-source AI models. She is senior director of NLP research at the Allen Institute for AI (Ai2), and an associate professor at the University of Washington’s Paul G. Allen School of Computer Science in Seattle.

Her work is based, in part, on the belief that the world deserves to know more about powerful AI models than big tech companies are willing to reveal. In her role at Ai2, she is one of the leaders of the team behind the OLMo open language model and Tulu post-training model.

What’s different is that the non-profit institute makes all the ingredients, recipes, and secret sauce for these AI models available publicly, in the form of training data, training code, model weights, checkpoints, and other information. The idea is to improve technical understanding of AI among researchers, and accelerate scientific progress in the field overall.

“We want to innovate, build new methodologies, new advances in language modeling and generative AI research, and then even inform many other researchers and companies about what new models should look like,” Hajishirzi said.

This contrasts with popular closed AI models, in which the lack of transparency limits the ability of researchers to evaluate, improve, and build on them.

“I think a big part of why Hanna is passionate about our radically open approach is simply her commitment to science,” said Noah Smith, also of Ai2 and the UW’s Allen School, who co-manages the team responsible for OLMo and Tulu with Hajishirzi.

Fueled by a competitive spirit

Hajishirzi and the Ai2 team take pride when their models match or exceed rivals, especially closed alternatives, in core AI model performance benchmarks. This is no small achievement, especially given that Ai2 doesn’t have the vast resources of Microsoft, Amazon, or Google, or relative upstarts like Anthropic and OpenAI.

Ai2 makes up for this by focusing in part on improving the efficiency of training large language models, to reduce the computing resources and costs required.

This, too, is a point of pride — showing that it’s possible for a smart, nimble team to go toe-to-toe with tech giants.

Smith recalled, for example, how Hajishirzi was able to rally their team to focus on beating a certain AI model during the development of Tulu 3. This is the Ai2 initiative that focuses on post-training, the process of refining a language model to enhance its capabilities for specific applications. (Smith called the competing model “X,” not wanting to reveal which model they were determined to topple.)

“Hanna managed to get the team to focus on ‘beating X’ just the right amount — never sacrificing on scientific integrity, but also staying inspired to take on the challenge,” Smith said. “I think it comes down to careful listening and knowing where everyone’s ‘at’ a particular day, and interjecting with the ‘beat X’ mantra at just the right times when it will inspire.”

This chart from Ai2 shows how Tulu 3 compares to other models on specific tasks.

Announcing the new Tulu 3 models last month, Hajishirzi said they rival and, in some cases, exceed proprietary models from OpenAI, Mistral, Google, and others on benchmarks for skills like math, instruction following, and chat capabilities.

Focusing on real-world challenges

Hajishirzi’s competitive spirit goes back to her childhood.

“She was a curious kid with a deep passion for all sorts of games, from puzzles and Legos, to volleyball and dodgeball,” her mom said. Starting as a young kid, “she showed signs of being competitive, and she worked hard to be her very best in every single game she played.”

As she grew older, she began to show leadership traits among her friends and to earn their trust.

“Throughout her career she has consistently shown strong determination to be her very best at every stage in her life,” her mom said.

In high school and college, Hajishirzi’s passion for discrete math and efficient code initially fueled her interest in theoretical computer science. But then a robotics competition opened her eyes to the bigger potential.

“We didn’t succeed, but it was an interesting real-world application,” she said, remembering what it was like to apply her knowledge of computer algorithms and data structures to something tangible like a robot’s movement. “So that became very interesting to me.”

Hanna Hajishirzi delivers the luncheon keynote at the Research Showcase and Open House at the UW’s Paul G. Allen School of Computer Science & Engineering in 2023. (GeekWire Photo / Todd Bishop)

Hajishirzi grew up in Iran and attended the Sharif University of Technology, getting her bachelor’s degree in computer science and engineering. She came to the U.S. at age 20, attending the University of Illinois at Urbana-Champaign for her doctorate in computer science, focusing on AI. She has been a University of Washington professor since 2014, and an Ai2 researcher since 2018.

She is married to Ali Farhadi, who returned to Ai2 last year as CEO of the organization. Farhadi founded and led Ai2 spinout Xnor.ai as CEO, and sold the AI startup to Apple in 2020.

Challenges and opportunities ahead for Ai2’s NLP team include advancing AI in specific scientific domains, improving their language modeling capabilities, and seeking novel directions in AI research, while prioritizing education, community engagement, and underlying principles of transparency and open collaboration.

Seeing clearly through a mess

Does Hajishirzi consider herself an “Uncommon Thinker”? After initially pondering this question in our recent interview, she gave it some additional thought and followed up via email afterward.

“I think most people who think deeply are uncommon thinkers in their own way,” she wrote. “For me, I really enjoy taking on new challenges, especially ones that are inspired by real-world problems. Tackling these requires persistence, which I believe is key to successfully solving these challenges.”

Connecting ideas from different areas, and using them to take on new challenges, is one approach that Hajishirzi has found successful.

She explained, “I always believe there’s more than one way to tackle a problem, so if one approach doesn’t work, I’ll try a different path.”

Another defining personality trait: she likes things to be organized. “Even at my home, if I have extra time, I just go and organize things,” she explained.

Colleagues say this mindset translates into her work.

She is “really good at looking at a complicated, confusing mess of information and seeing clearly what can be safely ignored, what we can conclude, what we still need to figure out, and working out actionable next steps,” Smith said.

“Some people bring chaos,” he said. “Hanna brings order.”