Nvidia unveiled the Vera Rubin AI computing platform at CES 2026, claiming up to 10x lower inference token costs and faster ...
API Gateway Response Streaming Support - You can now deploy with Amazon API Gateway REST API instead of ALB, enabling true response streaming for better latency and cost optimization. See Deployment ...