You loved ChatGPT. Wait until you see its rivals.

2023-01-31

Connor Leahy thought this might happen. “I’ve been basically waiting for this moment for at least two years now,” says the co-founder of the AI start-up Conjecture, referring to the current buzz surrounding ChatGPT. Deeply enmeshed in the development of large language models (LLMs) since his time at the helm of EleutherAI, an open-source cooperative of machine learning engineers and enthusiasts, Leahy got his general amazement at the capabilities of these programs out of the way back in 2019, when he first glimpsed GPT-2. “Now, I feel like a lot of people are having the reaction that I had,” he says.

That reaction can generally be described as one of delight, awe and not a little foreboding. Built using GPT-3, an LLM boasting some 175 billion parameters, ChatGPT has knocked the world for six with its effortless ability to write reams of eloquent, incisive prose (and some poetry, too). Joined by other services like DALL-E 2 and Midjourney, the market for generative AI has suddenly become very hot indeed, with Microsoft poised to dramatically increase its initial $1bn investment in ChatGPT’s creator, OpenAI, with a cash injection of up to $10bn. Little wonder, then, that increasing attention is being paid to the development of alternative LLMs to GPT-3 – foundation models that other tech giants would likely snap up in a heartbeat.

ChatGPT has captured the imagination of the general public – and big tech firms eager to invest in the next, hot generative AI model. (Photo by Rmedia7 / Shutterstock)

Jurassic’s Park

One of these alternatives is Jurassic-1. Numbering some 178 billion parameters in its largest form, the LLM is the brainchild of AI21 Labs, a research start-up-turned-product and platform manager founded in 2017 with the goal of creating models specialising in language generation and comprehension. “We didn’t want to play a pure research game like DeepMind,” says Ori Goshen, AI21 Labs’ chief executive (though it, too, is flirting with the idea of its own superpowered chatbot). “We admire DeepMind, we think they’re great. But we also wanted to bring real commercial value from the get-go.”

That same instinct is present at Cohere, another AI start-up based out of Toronto. Its latest model, explains its chief operating officer Martin Kon, powers classification, semantic search and content moderation across 160 languages. “That might not be exciting to consumers who enjoy generating poems about their cats or images of dogs in sushi houses, but we feel it’s certainly incredibly exciting to CEOs and executives of companies, organisations, and governments of all sizes everywhere in the world,” he says. 


The increasing buzz surrounding generative AI seems to be helping both start-ups hit their stride, with Cohere raising $125m in Series B funding in February last year and AI21 Labs’ Jurassic-1 released on AWS’s machine learning platform SageMaker in November. This doesn’t mean that AI21 Labs is imitating OpenAI’s close relationship with Microsoft, says Goshen. While he doesn’t rule out the possibility of a much closer partnership with a large platform – “I mean, never say never,” says Goshen, highlighting AI21 Labs’ fruitful partnerships with Amazon and Google – he maintains that the company’s neutrality when it comes to cloud providers has long-term benefits, not least in allowing it to explore using the latest computing hardware, whatever the provider.

A similar mindset seems to prevail among start-ups in the LLM space, including BigScience and its BLOOM model, and Anthropic, which just released ChatGPT rival Claude to mixed reviews. As the market for new foundation models continues to heat up, Goshen predicts that demand will inevitably grow for LLMs to move away from being trained on pages scraped en masse from the internet to narrower, proprietary datasets. “That can create these more specialised models, maybe [for] specific domains, or even…models that have an understanding of a specific company,” he says.

An LLM alignment problem

It’s impossible to tell, however, whether we’re seeing the emergence of a dynamic marketplace for LLMs or the beginning of a longer process of consolidation, as Big Tech companies fight to acquire as much talent and capability for themselves as possible. What we may see, argues Leahy, is something akin to Goshen’s prediction, with the creation of smaller models designed to accomplish narrower goals. The creation of larger and more complex LLMs, however, will continue to depend on the goodwill and infrastructure of hyperscalers. “There really are only a few actors in the world who are capable of mustering the resources to train something or build something like GPT-4,” he says, referring to OpenAI’s next LLM, due to be released sometime this year. Details of the system have yet to be revealed, but many in the field predict it will surpass most other LLMs in size and complexity.

Even so, OpenAI’s CEO Sam Altman recently told Reuters that the firm wouldn’t release the model until it met strict safeguarding benchmarks. It’s a challenge Leahy sympathises with. Known in AI research as the ‘alignment problem’, it is one Leahy spends much of his time at Conjecture working on: figuring out how to bind new models to quintessentially human ethics and motivations. While he believes that generative AI has the potential to become “unimaginably positive for the world”, he worries that not enough people realise machine intelligences think very differently to the people they’re serving – and that those who do realise, like OpenAI and other start-ups, don’t yet know how to make them hold to our common set of values. It’s certainly something to consider as more LLMs hit the market in the coming year. As psychiatrist and part-time AI guru Scott Alexander recently put it in his own post about ChatGPT, ‘[t]his thing is an alien that has been beaten into a shape that makes it look vaguely human. But scratch it the slightest bit and the alien comes out.’


