You can try it here: https://inferx.net:8443/demo/

verdverm · 2026-01-31T16:01:34 1769875294

There is nothing there to "try", it's some very basic html displaying some information that doesn't mean anything to me. Looks like a status page, not a platform

Really, it looks like someone who is new to startups / b2b copy, welcome to first contact with users. Time to iterate or pivot

I would focus on design, aesthetics, and copy. Don't put any more effort into building until you have a message that resonates

pveldandi · 2026-01-31T16:11:55 1769875915

Basic html? The core of what we built is at the runtime layer. We’re capturing CUDA graphs and restoring model state directly at the GPU execution level rather than just snapshotting containers. That’s what enables fast restores and higher utilization across multiple models.

If that’s not a problem space you care about, that’s totally fair. But for teams juggling many models with uneven traffic, that’s where the economics start to matter.

pveldandi · 2026-01-31T16:16:44 1769876204

Also, For what it’s worth, this can be deployed both on-prem and in the cloud. Different teams have different constraints, so we’re trying to stay flexible on that.

pveldandi · 2026-01-31T16:26:19 1769876779

Happy to dig deeper and show how exactly it works under the hood. For context, here’s the main site where the architecture and deployment options are explained: https://inferx.net/

verdverm · 2026-01-31T16:35:09 1769877309

I don't personally have this problem. One of my clients does, so my questions are ones I'd expect the CTO to ask you in a sales call. They already have an in-house system and I suspect would not replace it with anything other an open source option or hyperscaler option.

Are you going to make this open source? That's the modus operandi around Ai and gaining adoption for those outside Big Ai (where branding is already strong)

pveldandi · 2026-01-31T16:49:08 1769878148

It’s an open-core model. The control plane is already open source and can be deployed fairly easily. We’re not trying to replace in-house systems or hyperscalers. This can run on Kubernetes and integrate into existing infrastructure. The runtime layer is where we’re focusing the differentiation.

pveldandi · 2026-01-31T16:48:07 1769878087

It’s an open-core model. The control plane is already open source and can be deployed fairly easily. We’re not trying to replace in-house systems or hyperscalers. This can run on Kubernetes and integrate into existing infrastructure. The runtime layer is where we’re focusing the differentiation.

verdverm · 2026-01-31T16:26:10 1769876770

I'm referring to the "demo" and inappropriateness of the Show HN prefix

there is nothing to try or play with, it's just content

pveldandi · 2026-01-31T16:32:53 1769877173

The demo is live. It’s meant to show how snapshot restore works inside a multi-tenant runtime, not just a prompt playground. You can interact with the deployed models and observe how state is restored and managed across them. The focus is on the runtime behavior rather than a chat UI.

verdverm · 2026-01-31T17:31:01 1769880661

Please read the first sentence of the Show HN guidelines, "show" is more specific in this context. This should be a regular submission type

https://news.ycombinator.com/showhn.html

pveldandi · 2026-01-31T17:52:46 1769881966

Fair point. I’ll repost as a regular submission instead of Show HN. The goal was to demonstrate the runtime behavior behind multi-model serving rather than a polished end-user app. Appreciate the clarification.

pveldandi · 2026-01-31T23:42:35 1769902955

Happy to engage with you if Hulu have any additional questions. Thanks for all the feedback.that was helpful