AI Infra2023

garak

Probes LLMs for failure modes — prompt injection, jailbreaks, data leakage, toxicity, hallucination — the way nmap scans a network. Ships 20+ attack probes that run against Hugging Face, OpenAI, Bedrock, Cohere, or any REST endpoint.

Visit Website

Introduction

Security teams have decades of tooling to scan networks for weaknesses, but an LLM long stayed a black box you simply hoped behaved. garak flips that by treating a model the way a pentester treats a server: throw thousands of adversarial inputs at it and log every place it breaks. The underlying insight is that LLM safety is empirical, not declarative — you can't read a model card and know it won't leak data or obey a jailbreak. You have to attack it and count the failures.

What Sets It Apart

Probe-and-detector architecture, not a fixed checklist. Each of its dozen-plus probe families — DAN-style jailbreaks, encoding-based injection, training-data leakage, malware generation, toxicity — pairs with detectors that score responses, so coverage grows as new attack classes land rather than freezing at release.
Model-agnostic by design. The same suite runs against Hugging Face, OpenAI, AWS Bedrock, Cohere, Groq, Replicate, or any REST endpoint, so you can benchmark a hosted API against a local model on identical attacks.
Quantified output, not vibes. Runs emit JSONL logs and per-probe hit rates, turning "is this model safer than last quarter's?" into a number you can diff in CI instead of a judgment call.

Who It's For

Great fit if you ship or fine-tune LLMs and need repeatable, attack-based evidence of robustness before release, or if you red-team models and want a standard probe library instead of hand-rolled prompts. Look elsewhere if you need guardrails that block attacks at inference time — garak finds weaknesses, it doesn't patch them — or if you expect a polished GUI; it is a CLI whose output is logs and reports, closer to nmap than to a dashboard.

Back

Information

Websitegithub.com
OrganizationsNVIDIA
AuthorsLeon Derczynski, Erick Galinkin, Jeffrey Martin, Subho Majumdar, Nanna Inie, NVIDIA
Published date2023/05/10

More Items

Reinforcement Learning Papers2026

LongStraw: Long-Context RL Beyond 2M Tokens under a Fixed GPU Budget

Changhai Zhou, Kieran Liu +18

Enables RL post-training with million-token prompts under a fixed GPU budget by evaluating shared prompt state without autograd, retaining only minimal model state, and replaying short response branches; instantiated as GRPO and demonstrated on Qwen3.6-27B and GLM-5.2 up to multi-million token execution.

RL llm qwen mLOps ai-train+1

AI Infra2026

OpenTelemetry GenAI Semantic Conventions

OpenTelemetry

Defines OpenTelemetry semantic conventions for generative AI telemetry — spans, metrics, and events for GenAI clients, the Model Context Protocol (MCP), and provider-specific integrations. Includes YAML models, human-readable docs, and reference implementations to standardize observability across GenAI deployments.

mcp mcp-client mcp-server mlops ai-api+3

AI Infra2024

TheRock

ROCm (AMD)

Provides a lightweight build platform for HIP and ROCm that supports building ROCm, PyTorch, and JAX from source, multi-architecture nightly releases, and integrated CI/CD and developer tooling for Linux and Windows.

pytorch github ai-framework ai-development docker+1