1 minute read

Running AI on the edge could push inferencing (and training) costs to the user.

With the release of WebGPU in major browsers, even some large models can be run "on the edge" (e.g. Stable Diffusion and Meta's LLaMA). As hardware improves and models become more efficient, some inferencing and fine-tuning will be done on device, reducing cloud costs.
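A minimal sketch of what "on the edge" routing could look like: feature-detect WebGPU, then decide whether to run a model on device or fall back to a cloud API. The `hasWebGPU` and `chooseBackend` helpers are hypothetical names for illustration; the object argument stands in for the browser's `navigator`.

```javascript
// Feature-detect WebGPU support. `nav` is passed in (rather than using the
// global `navigator` directly) so the check can run outside a browser;
// in a real page you would call hasWebGPU(navigator).
function hasWebGPU(nav) {
  return typeof nav === "object" && nav !== null && "gpu" in nav;
}

// Hypothetical routing: prefer on-device inference when WebGPU is
// available, otherwise fall back to a cloud endpoint.
function chooseBackend(nav) {
  return hasWebGPU(nav) ? "on-device" : "cloud";
}
```

In a browser that ships WebGPU, `chooseBackend(navigator)` would return `"on-device"`, shifting the inference cost from the provider's cloud bill to the user's hardware.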


Producing state-of-the-art AI innovation is a costly endeavour and could lead to a small number of dominant players. However, open-source models have historically commoditized new AI capabilities in surprisingly short periods of time.

Source: Adapted from the State of AI Report 2022; *the LLaMA model was not intentionally made open source (it leaked)

Microsoft Teams vs. Slack