Entities · Labs
Gotit.pub
7 articles tagged with this entity.
- MalTree: Tracing Malware Evolution from Embeddings at Scale
- SlimSearcher: Training Efficiency-Aware Web Agents via Adaptive Reward Gating
- ShallowBench: Benchmarking Generative Drug Design Models on Shallow-Pocket Targets
- MacArena: Benchmarking Computer Use Agents on an Online macOS Environment
- Act As a Real Researcher: A Suite of Benchmarks Evaluating Frontier LLMs and Agentic Harnesses in Research Lifecycle
- OpenHalDet: A Unified Benchmark for Hallucination Detection across Diverse Generation Scenarios
- GP-Adapter: Gaussian Process CLIP-Adapter for Few-Shot Out-of-Distribution Detection