ArXiv Intelligence

MedCTA: A Benchmark for Clinical Tool Agents

Topic · Agent

T2S: A Rehearsal-Based Approach for Extraction-Resistant Model Watermarking

Topic · Other

Noise-Aware Framework for Correcting Corrupted Labels

Topic · Machine Learning Frameworks

Goal-Autopilot: A Verifiable Anti-Fabrication Firewall for Unattended Long-Horizon Agents

Topic · Agent

Layer-Isolated Evaluation: Gating the Deterministic Scaffold of a Production LLM Agent with a No-LLM, Regression-Locked Test Harness

Topic · Agent

Reason, Then Re-reason: Cross-view Revisiting Improves Spatial Reasoning

Topic · Other

Can Open-Source LLM Agents Replace Static Application Security Testing Tools? An Empirical Assessment

Topic · Agent

Runtime Skill Audit: Targeted Runtime Probing for Agent Skill Security

Topic · Agent

ARGUS: Stacked Multi-View Identity Mosaic Injection for Subject-Preserving Video Generation

Topic · Other

Sparse probes and murky physics: a case study of interpretability challenges in a foundation model for continuum dynamics

Topic · Foundation Models

TAROT: Task-Adaptive Refinement of LLM-prior Graphs for Few-shot Tabular Learning

Topic · Other

Are LLMs Bad at Moral Reasoning?

Topic · Other

Sovereign Assurance Boundary: Certificate-Bound Admission for Agentic Infrastructure

Topic · Agent

LUCID: Learning Embodiment-Agnostic Intent Models from Unstructured Human Videos for Scalable Dexterous Robot Skill Acquisition

Topic · Embodied AI

When Context Returns: Toward Robust Internalization in On-Policy Distillation

Topic · Other

Information-Theoretic Decomposition for Multimodal Interaction Learning

Topic · Other

Physics-Distilled Neural Network enabled by Large Language Models for Manufacturing Process-Property Predictive Modeling

Topic · Other

Model-Based and Data-Driven Hierarchical Control and Topology Co-Design for Robust Networked Systems

Topic · Embodied AI

AVIS: Adaptive Test-Time Scaling for Vision-Language Models

Topic · Other

ConsistencyPlanner: Real-time Planning with Fast-Sampling Consistency Models

Topic · Agent

2026-06-11 · 199 papers
