
Running AI on the edge could push inferencing (and training) costs to the user.

With WebGPU now shipping in major browsers, even some large models can run "on the edge" (e.g. Stable Diffusion and Meta's LLaMA). As hardware improves and models become more efficient, some inferencing and fine-tuning will move on-device, reducing cloud costs.
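As a rough illustration, the sketch below shows the WebGPU feature check an in-browser inference runtime would typically perform before falling back to the cloud. The function name `getGpuDevice` and the fallback message are hypothetical; the `navigator.gpu` calls are part of the standard WebGPU API, and the `GPUDevice` type assumes the `@webgpu/types` declarations are installed.

```ts
// Minimal sketch, assuming a WebGPU-capable browser and @webgpu/types
// for the GPU type declarations. getGpuDevice is an illustrative name,
// not a real library function.
async function getGpuDevice(): Promise<GPUDevice | null> {
  // navigator.gpu is only defined in browsers that support WebGPU.
  if (!("gpu" in navigator)) {
    console.warn("WebGPU unavailable; falling back to cloud inference.");
    return null;
  }
  const adapter = await navigator.gpu.requestAdapter();
  if (!adapter) return null; // no suitable GPU found on this machine

  // The device is what a model runtime uses to allocate buffers and
  // dispatch compute shaders for on-device inference.
  return adapter.requestDevice();
}
```

In practice, a runtime would hand the returned device to its WebGPU backend and only route requests to a hosted API when this check fails, which is how inference cost shifts from the cloud to the user's hardware.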


Producing state-of-the-art AI innovation is a costly endeavour and could lead to a small number of dominant players. However, open-source models have historically commoditized new AI capabilities in surprisingly short periods of time.

Source: Adapted from the State of AI Report 2022; *the LLaMA model was not intentionally made open source (it leaked).
