Autonomous code optimization, backed by real benchmarks
AutoPerf uses LLM pipelines (including RAG) to generate, compile, and benchmark code variants on Google Cloud. CPU support ships first; NVIDIA GPUs and CUDA are next.
Built for HPC simulation, quantitative trading, ML inference, and kernel engineering teams that need measurable speedups.
Reproducible cloud benchmarks
Pinned Google Cloud instances ensure consistent CPU characteristics for every run.
Python + C/C++ support
Start by optimizing numeric kernels, trading models, and scientific loops. CUDA is next.
LLM-guided exploration
RAG pipelines analyze your codebase, propose candidates, and focus on high-impact tweaks.
Kernel-level insights
Track vectorization, memory access, and scheduling suggestions with benchmark deltas.
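One common candidate of this kind is swapping a scalar Python loop for a vectorized NumPy equivalent. The sketch below is our own illustration (not AutoPerf output) of the before/after pair and timing delta such a report surfaces:

```python
import time
import numpy as np

def scalar_sum_of_squares(xs):
    # Baseline: pure-Python loop, one multiply and add per element.
    total = 0.0
    for x in xs:
        total += x * x
    return total

def vectorized_sum_of_squares(xs):
    # Candidate: NumPy pushes the loop into optimized C with SIMD.
    return float(np.dot(xs, xs))

xs = np.random.default_rng(0).random(1_000_000)

for fn in (scalar_sum_of_squares, vectorized_sum_of_squares):
    start = time.perf_counter()
    fn(xs)
    print(f"{fn.__name__}: {time.perf_counter() - start:.4f}s")
```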
Safe execution
Container isolation, resource limits, and audit trails keep private code under control.
Auto GitHub PRs
Review diffs, metrics, and commentary directly inside an automatically opened pull request.
How it works
Profile
Instrument your workload to surface hot kernels, loops, and call stacks.
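As a minimal sketch of this step (using Python's built-in cProfile, not AutoPerf's actual instrumentation), this is the kind of hot-function ranking that profiling produces:

```python
import cProfile
import pstats

def hot_loop(n):
    # Stand-in for a workload kernel; real targets come from your repo.
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
hot_loop(2_000_000)
profiler.disable()

# Rank functions by cumulative time to find optimization targets.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)
```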
Generate
Feed LLM + RAG pipelines with repo context to draft high-impact code variants.
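The retrieval half of such a pipeline can be sketched as cosine-similarity search over code-chunk embeddings; `embed` and `draft_variant` below are hypothetical stand-ins for whichever embedding model and LLM are wired in:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Hypothetical stand-in for a real embedding model: a deterministic
    # unit vector per string, good enough to show the retrieval shape.
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    vec = rng.random(64)
    return vec / np.linalg.norm(vec)

def top_k_context(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank repo chunks by cosine similarity to the hot-spot description.
    q = embed(query)
    return sorted(chunks, key=lambda c: float(embed(c) @ q), reverse=True)[:k]

chunks = ["def kernel(a, b): ...", "def io_helper(path): ...", "def matmul_tile(t): ..."]
context = top_k_context("optimize inner matmul loop", chunks)
prompt = "Rewrite for speed, preserving semantics:\n" + "\n".join(context)
# draft_variant(prompt)  # hypothetical LLM call returning candidate code
```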
Compile
Build candidates in sandboxed containers with pinned toolchains and dependencies.
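A minimal sketch of what sandboxing can look like in practice, assuming Docker is available and the candidate lives in ./candidate; the actual containers, images, and limits may differ:

```python
import subprocess

# Build the candidate inside a container with a pinned toolchain image,
# then run it with no network and hard CPU/memory caps.
subprocess.run(
    ["docker", "build", "-t", "candidate:latest", "./candidate"],
    check=True,
)
subprocess.run(
    [
        "docker", "run", "--rm",
        "--network=none",   # no exfiltration path for private code
        "--cpus=4",         # deterministic CPU budget
        "--memory=8g",      # hard memory cap
        "candidate:latest",
    ],
    check=True,
)
```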
Benchmark
Run reproducible trials on dedicated GCP runners to gather real performance deltas.
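The trial loop itself is straightforward: warm up, time repeated runs, compare medians. A sketch, with `baseline` and `candidate` as hypothetical stand-ins for the two builds:

```python
import statistics
import time

def bench(fn, warmup=3, trials=10):
    # Warm caches before timing anything.
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(trials):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

# Hypothetical workloads standing in for baseline and candidate builds.
baseline = lambda: sum(i * i for i in range(500_000))
candidate = lambda: sum(i * i for i in range(500_000))

b, c = bench(baseline), bench(candidate)
print(f"baseline {b:.4f}s, candidate {c:.4f}s, delta {100 * (b - c) / b:+.1f}%")
```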
Promote
Ship the winning change via an auto-generated GitHub PR complete with diffs and metrics.
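Opening that PR goes through GitHub's standard REST endpoint; a minimal sketch using the `requests` library, assuming the winning branch is already pushed (org, repo, branch, and token below are placeholders):

```python
import requests

# POST /repos/{owner}/{repo}/pulls -- standard GitHub REST API.
resp = requests.post(
    "https://api.github.com/repos/your-org/your-repo/pulls",
    headers={
        "Authorization": "Bearer <token>",
        "Accept": "application/vnd.github+json",
    },
    json={
        "title": "perf: vectorize hot kernel",   # illustrative title
        "head": "autoperf/candidate-1",          # branch with the winning variant
        "base": "main",
        "body": "Benchmark deltas, diffs, and notes go here.",
    },
)
resp.raise_for_status()
print(resp.json()["html_url"])
```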
Roadmap
Private alpha
CPU runners ship first. Join the waitlist to reserve a slot and share your workloads.
NVIDIA GPU support
A10/A100-class GPUs follow, enabling CUDA kernel tuning for ML and HPC pipelines.
Extended workflows
Broader language coverage, hybrid CPU/GPU strategies, and deeper profiling integrations.
Built for teams shipping performance-critical software
Whether you run HPC simulations, low-latency trading systems, algorithmic engines, or ML inference, AutoPerf automates the optimization loop while you stay in control of the merge.
- HPC & scientific simulation teams chasing runtime gains.
- Fintech and quant orgs optimizing latency-critical code paths.
- ML/AI engineers tuning serving kernels and feature pipelines.
FAQ
Is CPU support live?
CPU runners will ship first in the private alpha. Join the waitlist to be notified as soon as slots open.
Which languages are supported?
We start with Python and C/C++. CUDA and broader GPU workflows are on the near-term roadmap.
How do I review changes safely?
AutoPerf opens a GitHub pull request with diffs, benchmark data, and suggested notes so you stay in full control.
Where do benchmarks run?
Runs execute on pinned Google Cloud machine types to keep results reproducible across sessions.
Reserve your spot
Join the waitlist and be the first to try AutoPerf when the private alpha opens.