MedCTA: A Benchmark for Clinical Tool Agents
Topic · Agent
Raw MD only
Quick Read
LLM failed, fallback used
T2S: A Rehearsal-Based Approach for Extraction-Resistant Model Watermarking
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used
Noise-Aware Framework for Correcting Corrupted Labels
Topic · Machine Learning Frameworks
Raw MD only
Quick Read
LLM failed, fallback used
Goal-Autopilot: A Verifiable Anti-Fabrication Firewall for Unattended Long-Horizon Agents
Topic · Agent
Raw MD only
Quick Read
LLM failed, fallback used
Layer-Isolated Evaluation: Gating the Deterministic Scaffold of a Production LLM Agent with a No-LLM, Regression-Locked Test Harness
Topic · Agent
Raw MD only
Quick Read
LLM failed, fallback used
Reason, Then Re-reason: Cross-view Revisiting Improves Spatial Reasoning
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used
Can Open-Source LLM Agents Replace Static Application Security Testing Tools? An Empirical Assessment
Topic · Agent
Raw MD only
Quick Read
LLM failed, fallback used
Runtime Skill Audit: Targeted Runtime Probing for Agent Skill Security
Topic · Agent
Raw MD only
Quick Read
LLM failed, fallback used
ARGUS: Stacked Multi-View Identity Mosaic Injection for Subject-Preserving Video Generation
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used
Sparse probes and murky physics: a case study of interpretability challenges in a foundation model for continuum dynamics
Topic · Foundation Models
Raw MD only
Quick Read
LLM failed, fallback used
TAROT: Task-Adaptive Refinement of LLM-prior Graphs for Few-shot Tabular Learning
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used
Are LLMs Bad at Moral Reasoning?
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used
Sovereign Assurance Boundary: Certificate-Bound Admission for Agentic Infrastructure
Topic · Agent
Raw MD only
Quick Read
LLM failed, fallback used
LUCID: Learning Embodiment-Agnostic Intent Models from Unstructured Human Videos for Scalable Dexterous Robot Skill Acquisition
Topic · Embodied AI
Raw MD only
Quick Read
LLM failed, fallback used
When Context Returns: Toward Robust Internalization in On-Policy Distillation
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used
Information-Theoretic Decomposition for Multimodal Interaction Learning
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used
Physics-Distilled Neural Network enabled by Large Language Models for Manufacturing Process-Property Predictive Modeling
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used
Model-Based and Data-Driven Hierarchical Control and Topology Co-Design for Robust Networked Systems
Topic · Embodied AI
Raw MD only
Quick Read
LLM failed, fallback used
AVIS: Adaptive Test-Time Scaling for Vision-Language Models
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used
ConsistencyPlanner: Real-time Planning with Fast-Sampling Consistency Models
Topic · Agent
Raw MD only
Quick Read
LLM failed, fallback used