> Wrappers rely on OpenAI. OpenAI relies on Microsoft. Microsoft needs NVIDIA. NVIDIA owns the chips that power it all
So this is the model that investors see. The reality is quite different: people and organizations are not stupid, and they want to avoid vendor lock-in.
In reality:
* Wrappers don't rely only on OpenAI. In fact, in order to be competitive, they have to avoid OpenAI because it's terribly expensive. If they can get away with other models, the savings can be enormous, as some of these models can be 10x cheaper.
* Local models are a thing. You don't need proprietary models and API calls at all for certain uses. And these models get better and better each year.
* Nvidia is still the dominant player and this won't change in the next few years, but AMD is making real progress here. I'm not counting TPUs, as they seem to be mostly Google-specific.
* Microsoft is not in any special position here. I've implemented OpenAI API integrations through various API gateways, and it's by no means tied to Azure (see the sketch after this list).
* OpenAI's business model is based on faith at this moment. This has been debated ad nauseam, so there's no point repeating all the arguments here, but the fact is that they used to be the only game in town, then the leader, and now they are neither, but still claim to be.
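To make the gateway point concrete, here's a rough sketch: the same OpenAI-compatible client code works against any compatible backend just by swapping the base URL. The endpoint URLs and keys below are illustrative placeholders, not real deployments.

```python
# Sketch: one OpenAI-compatible client, several interchangeable backends.
# Base URLs and keys are placeholders, not real services.
from openai import OpenAI

backends = {
    "openai": OpenAI(api_key="sk-..."),  # api.openai.com by default
    "gateway": OpenAI(base_url="https://my-gateway.example.com/v1",  # any API gateway
                      api_key="gw-key"),
    "local": OpenAI(base_url="http://localhost:8000/v1",  # e.g. a local inference server
                    api_key="not-needed"),
}

def ask(backend: str, model: str, prompt: str) -> str:
    """Send one chat request to whichever backend the caller picked."""
    client = backends[backend]
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```

Nothing in that code cares whether the endpoint is Azure, another cloud, a gateway in front of several providers, or a box under your desk.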
> Local models are a thing. You don't need proprietary models and API calls at all for certain uses. And these models get better and better each year.
They are getting better so fast that I'm considering building a business that depends on much lower-cost LLM inference, so I'd be betting years of effort on it.
But the bet is also that proprietary models won't pull away with faster improvements that make local models uncompetitive even as they improve. Can local models keep up? They seem to be closing the gap now. Is that the rule, or an artifact of the early development phase?
The safer plan may be to pass the inference cost through to the user and let them pick premium or budget models according to their need, almost per request, as the Zed editor does now.
Outside of giant tech companies, there are many researchers with access to little more than a single consumer GPU card. They are highly motivated to reduce the cost of training and inference.
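Roughly what per-request selection with cost passthrough could look like; the model names and prices below are illustrative assumptions, not actual rates.

```python
# Sketch of per-request model selection with cost passed through to the user.
# Tier names, model names, and prices are made-up placeholders.
from dataclasses import dataclass

@dataclass
class ModelTier:
    model: str
    usd_per_1m_input: float
    usd_per_1m_output: float

TIERS = {
    "budget": ModelTier("small-local-model", 0.10, 0.30),
    "premium": ModelTier("frontier-model", 5.00, 15.00),
}

def estimate_cost(tier_key: str, input_tokens: int, output_tokens: int) -> float:
    """Cost shown to the user for a single request, before it is sent."""
    tier = TIERS[tier_key]
    return (input_tokens * tier.usd_per_1m_input
            + output_tokens * tier.usd_per_1m_output) / 1_000_000

# The app surfaces the estimate and lets the user choose per request,
# instead of the vendor absorbing (and marking up) the inference bill.
print(f"budget:  ${estimate_cost('budget', 2_000, 500):.5f}")
print(f"premium: ${estimate_cost('premium', 2_000, 500):.5f}")
```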
> The safer plan may be to pass the inference cost through to the user and let them pick premium or budget models according to their need, almost per request, as the Zed editor does now.
I'm working on a solution right now that uses a local/cheap model first, runs some validation, and if that validation fails, falls back to the expensive SOTA model. This is the most reasonable approach if you have a way to verify the results (which might not be easy, depending on the use case).
The use case is structuring arbitrary natural language, e.g. triple extraction. That seems to benefit from as much context and intelligence as can be applied. "Good enough" remains a case-by-case judgment.
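A rough sketch of that cheap-first / escalate-on-failure cascade applied to triple extraction. `call_model` is a stand-in for whatever OpenAI-compatible client you use (like the one sketched earlier); model names and the validation rule are assumptions for illustration.

```python
# Sketch of a cascade: try the cheap model, validate the output,
# and only escalate to the expensive model when validation fails.
import json

CHEAP_MODEL = "small-local-model"   # placeholder
EXPENSIVE_MODEL = "frontier-model"  # placeholder

PROMPT = ("Extract (subject, predicate, object) triples from the text below. "
          "Respond with a JSON array of 3-element arrays.\n\n{text}")

def call_model(model: str, prompt: str) -> str:
    raise NotImplementedError("wire this to your OpenAI-compatible client")

def valid_triples(raw: str) -> bool:
    """Cheap structural check: output parses as a list of 3-element lists."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(data, list) and all(
        isinstance(t, list) and len(t) == 3 for t in data)

def extract_triples(text: str) -> str:
    prompt = PROMPT.format(text=text)
    draft = call_model(CHEAP_MODEL, prompt)
    if valid_triples(draft):
        return draft                              # cheap model was good enough
    return call_model(EXPENSIVE_MODEL, prompt)    # escalate only on failure
```

The structural check obviously doesn't prove the triples are correct, only well-formed; how much deeper the validation can go is exactly the case-by-case part.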