Architecting a GenAI Deployment That Actually Shipped

The situation

A company had built a working GenAI prototype — a retrieval-augmented system that gave employees fast access to internal knowledge. It worked in demo. But the project had stalled in pre-deployment review: security raised concerns about data exposure, finance balked at projected costs at production scale, and legal flagged uncertainty around how the system handled sensitive information. The prototype that had impressed everyone was suddenly the system nobody could approve.

The work

Architected the production version from the ground up around the constraints the prototype had ignored. Designed data boundaries so sensitive content stayed inside controlled access patterns. Built a cost-aware architecture that kept inference costs predictable at scale, with FinOps controls built in. Designed a NIST AI RMF-aligned governance overlay (risk register, monitoring, evaluation procedures) into the architecture rather than retrofitting it as a compliance layer. Named the trade-offs explicitly so leadership could make the deployment decision with full visibility.
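
As an illustration only, two of the ideas above (data boundaries enforced at retrieval time, and a spending cap on inference) might be sketched in Python as follows. This is not code from the engagement. Every name here (Document, filter_by_access, BudgetGuard) is hypothetical, the group labels are invented, and the per-token price is a placeholder rather than a real rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: frozenset  # hypothetical: groups permitted to read this document


def filter_by_access(docs, user_groups):
    """Keep only documents the caller is entitled to read.

    Applied before retrieved text is placed in a model prompt, so
    restricted content never leaves its access boundary.
    """
    user_groups = frozenset(user_groups)
    return [d for d in docs if d.allowed_groups & user_groups]


@dataclass
class BudgetGuard:
    monthly_budget_usd: float
    price_per_1k_tokens_usd: float  # placeholder price, an assumption
    spent_usd: float = 0.0

    def estimate(self, tokens):
        """Estimated cost in USD for a request of the given token count."""
        return tokens / 1000 * self.price_per_1k_tokens_usd

    def charge(self, tokens):
        """Record the spend, or raise if it would exceed the monthly budget."""
        cost = self.estimate(tokens)
        if self.spent_usd + cost > self.monthly_budget_usd:
            raise RuntimeError("inference budget exceeded")
        self.spent_usd += cost
        return cost
```

In a sketch like this, the filter runs on every retrieval result and the guard is charged before each model call, which is one way to make both the data boundary and the cost ceiling properties of the request path rather than after-the-fact reports.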

The result

The system shipped with security, finance, and legal aligned on the architecture before deployment, rather than raising objections after it. The same prototype that had stalled in review became the foundation of a production system because the architecture answered the questions that prototypes don’t have to.
