Battery-Sim-Agent: Leveraging LLM-Agent for Inverse Battery Parameter Estimation
Topic · Agent
Raw MD only
Quick Read
LLM failed, fallback used
Opt-Verifier: Unleashing the Power of LLMs for Optimization Modeling via Dual-Side Verification
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used
UI-KOBE: Knowledge-Oriented Behavior Exploration for Lightweight Graph-Guided GUI Agents
Topic · Agent
Raw MD only
Quick Read
LLM failed, fallback used
DeepSurvey: Enhancing Analytical Depth and Citation Reliability in Automated Survey Generation
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used
MINDGAMES: A Live Arena for Evaluating Social and Strategic Reasoning in Multi-Agent LLMs
Topic · Agent
Raw MD only
Quick Read
LLM failed, fallback used
Xetrieval: Mechanistically Explaining Dense Retrieval
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used
The Curse of Helpfulness: Inverse Scaling Law in Robustness to Distractor Instructions via DistractionIF
Topic · Foundation Models
Raw MD only
Quick Read
LLM failed, fallback used
VitalAgent: A Tool-Augmented Agent for Reactive and Proactive Physiological Monitoring over Wearable Health Data
Topic · Agent
Raw MD only
Quick Read
LLM failed, fallback used
CrystalXRD-Bench: Benchmarking Vision-Language Models for XRD Peak Indexing Across Diverse Crystalline Materials
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used
Towards Human-Like Interactive Speech Recognition With Agentic Correction and Semantic Evaluation
Topic · Agent
Raw MD only
Quick Read
LLM failed, fallback used
ReasonLight: A Multimodal Foundation Model-Enhanced Reinforcement Learning Framework for Zero-Shot Traffic Signal Control
Topic · Reinforcement Learning
Raw MD only
Quick Read
LLM failed, fallback used
When Does Persona Prompting Actually Help? A Retrieval and Metric Analysis of Expert Role Injection in LLMs
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used
Architecture-Sensitive Supervised Fine-Tuning for Screen-Conditioned Action Prediction: A PiSAR Benchmark
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used
Aligned but Fragile: Enhancing LLM Safety Robustness via Zeroth-Order Optimization
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used
EvoMD-LLM: Learning the Language of Species Evolution in Reactive Molecular Dynamics
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used
MiraBench: Evaluating Action-Conditioned Reliability in Robotic World Models
Topic · Reinforcement Learning
Raw MD only
Quick Read
LLM failed, fallback used
Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used
PassNet: Scaling Large Language Models for Graph Compiler Pass Generation
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used
ConMoE: Expert-Pool Consolidation via Prototype Reassignment for MoE Compression
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used
Rubric-Guided Process Reward for Stepwise Model Routing
Topic · Other
Raw MD only
Quick Read
LLM failed, fallback used