Weekly AI News 2026-W23

Period covered: 2026-06-01 to 2026-06-07 (2194 items)

Topic Trends

Items by Topic

This Week's Highlights (Top 10)

2026-06-04 21:00 JST | OpenAI | LLM/Generative AI | Agents

How Endava is redesigning software delivery around AI agents

Learn how Endava is using AI agents, ChatGPT Enterprise, and Codex to accelerate software delivery, automate workflows, and build an AI-nat…

2026-06-04 18:00 JST | OpenAI | LLM/Generative AI

Dreaming: Better memory for a more helpful ChatGPT

ChatGPT introduces a new memory system to better remember preferences, keeping context fresh and relevant across conversations.

2026-06-03 22:15 JST | OpenAI | LLM/Generative AI | Research/Papers

Introducing new capabilities to GPT-Rosalind

GPT-Rosalind advances life sciences research with enhanced biological reasoning, medicinal chemistry expertise, genomics analysis, and expe…

2026-06-03 21:00 JST | OpenAI | LLM/Generative AI | Agents

How Wasmer used Codex to build a Node.js runtime for the edge

See how Wasmer used Codex with GPT-5.5 to build a Node.js runtime for the edge, accelerating development 10x to 20x and shipping in weeks i…

2026-06-03 19:00 JST | OpenAI | LLM/Generative AI

A blueprint for democratic governance of frontier AI

OpenAI outlines a blueprint for U.S. governance of frontier AI, proposing a federal framework for safety, resilience, and national security.

2026-06-03 19:00 JST | OpenAI | LLM/Generative AI

OpenAI public policy agenda

OpenAI outlines its public policy agenda for AI, including safety, youth protection, workforce transition, and global standards to ensure A…

2026-06-02 21:00 JST | OpenAI | LLM/Generative AI

Travelers deploys AI-powered claims countrywide with OpenAI

Travelers built an AI-powered Claim Assistant with OpenAI to guide customers through filing claims, provide 24/7 support, and scale operati…

2026-06-02 18:00 JST | OpenAI | Agents

Codex for every role, tool, and workflow

Discover new Codex plugins, sites, and annotations that help analysts, marketers, designers, investors, and other teams get more done with…

2026-06-02 16:00 JST | OpenAI | LLM/Generative AI

Advancing youth safety and opportunity through global leadership

OpenAI calls for global action on youth AI safety, proposing an international institute to strengthen safeguards, standards, and opportunit…

2026-06-01 21:00 JST | OpenAI | LLM/Generative AI

Building the infrastructure for the Intelligence Age in Michigan

OpenAI breaks ground on a 1GW data center project in Michigan as part of Stargate, building AI infrastructure to expand access, create jobs…

All Items (by Date)

2026-06-07 (5 items)

2026-06-07 07:22 JST | ITmedia AI+ | LLM/Generative AI

ChatGPT gets "Lockdown Mode" to guard against data leaks from prompt injection

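OpenAI has begun offering "Lockdown Mode," a new security feature for ChatGPT. It is an option for reducing the risk of data exfiltration through prompt injection attacks; when enabled, it restricts web browsing and connections to external services. Handling sensitive data and seeking strict protection…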

2026-06-07 05:32 JST | TechCrunch AI | LLM/Generative AI

OpenAI unveils Lockdown Mode to protect sensitive data from prompt injection attacks

Even with Lockdown Mode, ChatGPT could still be vulnerable to prompt injections, but the goal is to reduce the likelihood that sensitive da…

2026-06-07 03:13 JST | TechCrunch AI | Other

What to expect from WWDC 2026: Siri’s highly anticipated revamp and Apple Intelligence updates

Apple's WWDC nears: Here's what you can look forward to.

2026-06-07 02:42 JST | TechCrunch AI | Regulation/Policy

Sriram Krishnan is leaving his role as White House AI advisor

Krishnan is reportedly starting a new institution to continue shaping Trump's AI policy.

2026-06-07 01:17 JST | TechCrunch AI | LLM/Generative AI

The Trump administration might take an equity stake in OpenAI

President Donald Trump said he's discussing deals "where the American people can benefit from the success of AI."

2026-06-06 (3 items)

2026-06-06 05:00 JST | TechCrunch AI | Other

Startup Battlefield 200 applications officially close in 3 days

Applications for Startup Battlefield 200 officially close on June 8, 11:59 p.m. PT. Don't wait any longer. Secure your shot at competing on…

2026-06-06 03:57 JST | TechCrunch AI | Other

Google will pay SpaceX $920M per month for compute

In a statement, a Google representative described the deal as a result of unexpected demand for its recently launched AI products.

2026-06-06 02:17 JST | TechCrunch AI | Other

The most interesting startups right now want to get you off your phone

While the AI fundraising machine keeps breaking its own records, some founders are building in the other direction. Mirror founder Brynn Pu…

2026-06-05 (18 items)

2026-06-05 23:49 JST | TechCrunch AI | Other

The token bill comes due: Inside the industry scramble to manage AI’s runaway costs

"The whole conversation shifted from tokenmaxxing and 'go fast' to 'we need guardrails, how do we control this?'"

2026-06-05 23:00 JST | TechCrunch AI | Other

The ‘together tech’ wave might be the most intriguing startup bet of 2026

While the AI fundraising machine keeps breaking its own records, some founders are building in the other direction. Mirror founder Brynn Pu…

2026-06-05 22:27 JST | ITmedia AI+ | Other

University of Cambridge succeeds in clinical trial of AI-designed vaccine: a "universal" type that also prepares for unknown variants

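The University of Cambridge announced that it has succeeded in the first clinical trial of a "universal vaccine" that uses an AI-designed antigen. Researchers analyzed the genome sequences of the sarbecovirus group with machine learning and designed a "super antigen" common to the group. The vaccine was administered to 39 healthy people, and its safety and immune response were confirmed.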

2026-06-05 22:03 JST | TechCrunch AI | Other

AirTrunk commits $30B to build 5GW of AI data centers in India

The Australian data center operator plans to set up 5GW of capacity in India.

2026-06-05 14:06 JST | TechCrunch AI | Other

Mira Murati steps back into the spotlight, carefully

In the current environment, remaining heads down has diminishing returns; at some point, you have to make some noise just to remind the mar…

2026-06-05 09:00 JST | ITmedia AI+ | Other

Drawing-analysis AI that supports everything from drawing checks to cost estimation cuts work hours by up to 60%

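Ficha (フィーチャ) announced expanded features and broader coverage for its drawing-analysis AI, "Drawing-AI." In addition to circuit diagrams and mold drawings, it now handles architectural drawings, supporting drawing checks, digitization, and cost estimation. In a proof-of-concept trial, it cut work hours by 30 to 60%.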

2026-06-05 08:00 JST | ITmedia AI+ | Research/Papers

Is AI robbing us of our ability to think? The AI "wall of fools" that research worldwide warns about [with video]

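AI before looking things up, AI before thinking: using it this way may be affecting your ability to think without your noticing. Does using AI more make people dumber? This piece covers what is actually happening and how to deal with it.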

2026-06-05 07:43 JST | TechCrunch AI | LLM/Generative AI | Business/Funding

Ahead of its IPO, Anthropic’s Daniela Amodei shrugs off doubts about AI’s returns

Anthropic has been growing at a breakneck pace. The company announced that annualized revenue crossed $47 billion in May, up dramatically f…

2026-06-05 07:29 JST | TechCrunch AI | LLM/Generative AI

Airbnb’s Brian Chesky plans to launch a new AI lab

The Airbnb CEO said last year it hasn't struck an LLM partnership because existing products weren't quite ready.

2026-06-05 06:30 JST | TechCrunch AI | Other

Defense tech, AI, and fundraising take center stage at StrictlyVC Los Angeles on June 18

On Thursday, June 18, at The Aerospace Corporation Campus, investors, founders, and tech leaders will gather for an evening of conversation…

2026-06-05 06:00 JST | ITmedia AI+ | Other

Tire FEM analysis cut from 45 minutes to 5: Sumitomo Rubber and Fujitsu jointly develop an AI surrogate model

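Sumitomo Rubber Industries and Fujitsu have jointly developed an AI surrogate model that predicts tire performance. In a proof-of-concept trial on predicting tire deformation behavior, it shortened analysis time from about 45 minutes to about 5 minutes and handled analyses on the scale of about 600,000 elements.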

2026-06-05 05:00 JST | ITmedia AI+ | LLM/Generative AI | Regulation/Policy

"This past year has been AI's Warring States period": key points for drawing up AI governance, learned from Mercari

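With business use of generative AI now taken for granted, the question is how to create business value through AI, while risks such as "shadow AI" are also being pointed out. How are leading companies assessing AI risks and putting countermeasures in place? In this article, having declared a shift to becoming an "AI-Native Company," A…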

2026-06-05 04:33 JST | TechCrunch AI | Other

Meta steals a tactic from Tesla and builds data centers in tents

Meta may have found one way to slash its massive data center bill: tents.

2026-06-05 04:20 JST | TechCrunch AI | Agents

Apple approves Poke as the first AI agent on its Messages for Business platform

Poke, the startup that lets people use AI agents through simple text messages, has become the first AI agent approved for Apple’s Messages…

2026-06-05 03:28 JST | ITmedia AI+ | LLM/Generative AI | Research/Papers

University of Tokyo's Matsuo Lab releases "LLM Course: Basics" lecture materials for free, for a limited time

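The Matsuo-Iwasawa Laboratory at the University of Tokyo (hereafter, the Matsuo Lab) is making its lecture materials "Large Language Model (LLM) Course 2025: Basics," which systematically cover LLMs from fundamentals to technical trends, freely available for a limited time.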

2026-06-05 01:32 JST | TechCrunch AI | Other

Meta rolls out a new AI creator assistant on Facebook

Creators often have to parse through charts and dashboards to understand their performance, but with the new AI assistant, they can get qui…

2026-06-05 01:31 JST | TechCrunch AI | Other

What to expect from WWDC 2026: Siri’s highly anticipated revamp and Apple Intelligence updates

Apple's WWDC nears: Here's what you can look forward to.

2026-06-05 00:05 JST | TechCrunch AI | Robotics

Is Silicon Valley ready to put robots in people’s homes? Hello Robot is.

The California startup released the fourth-generation of its home assistance robot, Stretch.

2026-06-04 (353 items)

2026-06-04 23:05 JST | TechCrunch AI | Other

Apple touts $1.4 trillion in App Store billings and sales, 90% without a commission

Apple's App Store generated $1.4 trillion in sales, up from $1.3 trillion last year, with $149 billion in sales for digital goods.

2026-06-04 21:00 JST | OpenAI | LLM/Generative AI | Agents

How Endava is redesigning software delivery around AI agents

Learn how Endava is using AI agents, ChatGPT Enterprise, and Codex to accelerate software delivery, automate workflows, and build an AI-nat…

2026-06-04 18:00 JST | OpenAI | LLM/Generative AI

Dreaming: Better memory for a more helpful ChatGPT

ChatGPT introduces a new memory system to better remember preferences, keeping context fresh and relevant across conversations.

2026-06-04 17:55 JST | ITmedia AI+ | Hardware/Semiconductors

TSMC confident of sustaining growth as AI use expands; at shareholders' meeting, says business with Tokyo Electron will continue

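Taiwan Semiconductor Manufacturing Company (TSMC), the world's largest contract chipmaker, held its shareholders' meeting on June 4 in Hsinchu, Taiwan. Chairman and CEO C.C. Wei (魏哲家) said that with expanding use of AI, "the value of our leading-edge technology and manufacturing capacity will continue to grow," expressing strong confidence that the company will sustain its growth over the next several years.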

2026-06-04 13:00 JST | ITmedia AI+ | LLM/Generative AI

Google Chrome's new "Skills" feature ends the need to retype AI prompts every time

Google has announced "Skills in Chrome," a new AI feature for Chrome. It reportedly lets users save AI prompts and reuse them with one click.

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

Toward Pre-Deployment Assurance for Enterprise AI Agents: Ontology-Grounded Simulation and Trust Certification

Pre-deployment verification of enterprise artificial intelligence (AI) agents remains a critical gap between large language model (LLM) cap…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Stumbling Into AI Emotional Dependence: How Routine AI Interactions Reshape Human Connection

Public discourse and emerging policy typically assume that AI emotional support is a deliberate act: a lonely user consciously seeking comf…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Thinking Through Signs: PEEL as a Semiotic Scaffolding for Epistemically Accountable AI-Enabled Research

Large language models are reshaping research practice while quietly eroding researchers' epistemic accountability. This commentary introduce…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

SMAC-Talk: A Natural Language Extension of the StarCraft Multi-Agent Challenge for Large Language Models

As LLMs become more widely deployed, they are increasingly expected to work alongside other AI agents rather than operating in isolation. E…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

Consensus is Strategically Insufficient: Reasoning-Trace Disagreement as a Knowledge-Representation Signal

Multi-agent systems are commonly designed to reduce disagreement through voting, consensus protocols, debate, or fault-tolerant aggregation…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Image/Video Generation | Research/Papers

VAMPS: Visual-Assisted Mathematical Problem Solving Benchmark

Multimodal large language models are increasingly capable of complex reasoning, yet their performance often degrades when they must externa…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

StepPRM-RTL: Stepwise Process-Reward Guided LLM Fine-Tuning for Enhanced RTL Synthesis

Automatic generation of RTL code for digital hardware designs remains challenging due to long-horizon reasoning, multi-step dependencies, a…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Image/Video Generation | Agents

Can Generalist Agents Automate Data Curation?

Curating training data is among the most consequential yet labor-intensive parts of modern AI development: practitioners iteratively propos…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Characterizing initial human-AI proof formalization workflows

For centuries, human mathematicians have written proofs to substantiate their mathematical arguments; yet, the ability to automatically ver…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

The Saturation Trap and the Subjectivity of Intervention Timing: Why Affect-Based Triggers and LLM Judges Fail to Time Interventions on Autonomous Agents

As autonomous AI agents move from conversational systems to long-horizon software execution, runtime safety layers that decide when to inte…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

Exploring Cross-Scenario Generality of Agentic Memory Systems: Diagnostics and a Strong Baseline

LLM agents accumulate histories that outgrow their context windows, motivating a growing literature on memory systems. Yet most existing de…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

The Digital Apprentice: A Framework for Human-Directed Agentic AI Development

Agentic AI deployments face a recurring design tension: heavy human oversight limits scale, while broad autonomy outruns accountability. Ne…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

Online Skill Learning for Web Agents via State-Grounded Dynamic Retrieval

Language agents increasingly rely on reusable skills to improve multi-step web automation across related tasks. A growing line of work stud…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Not All Errors Are Equal: Consequence-Aware Reasoning Compute Allocation

Modern reasoning models can allocate different amounts of test-time computation, such as thinking tokens, model calls, or compute budget, t…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

Trivium: Temporal Regret as a First-Class Objective for Causal-Memory Controllers

Many current agentic systems and LLM pipelines correct mistakes by optimizing outcome reward. This addresses only the what of failure: when…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

Cascading Hallucination in Agentic RAG: The CHARM Framework for Detection and Mitigation

Multi-step agentic retrieval-augmented generation (RAG) pipelines have demonstrated significant capability for complex reasoning tasks, yet…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents | Business/Funding | Research/Papers

The Meta-Agent Challenge: Are Current Agents Capable of Autonomous Agent Development?

Current AI benchmarks evaluate agents on task execution within human-designed workflows. These evaluations fundamentally fail to measure a…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

AgentJet: A Flexible Swarm Training Framework for Agentic Reinforcement Learning

We present AgentJet, a distributed swarm training framework for large language model (LLM) agent reinforcement learning. Unlike centralized…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

Beyond Prompt-Based Planning: MCP-Native Graph Planning-based Biomedical Agent System

Biomedical agents promise to automate complex biological workflows, yet current systems face two fundamental bottlenecks: bioinformatics to…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Simulate, Reason, Decide: Scientific Reasoning with LLMs for Simulation-Driven Decision Making

Scientific simulators are increasingly being integrated into LLM-driven systems for high-stakes simulation-driven decision-making. However,…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

MapAgent: An Industrial-Grade Agentic Framework for City-scale Lane-level Map Generation

Lane-level maps are critical infrastructure for autonomous driving and lane-level navigation, yet constructing and maintaining standardized…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

Scaling Self-Evolving Agents via Parametric Memory

Existing memory-augmented LLM agents store past experience exclusively in prompt space, as textual summaries or retrieved passages, while k…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

Neetyabhas: A Framework for Uncertainty-Aware Public Policy Optimization in Rational Agent-Based Models

Purpose: The WHO's COVID-19 non-pharmaceutical interventions (e.g., lockdowns, vaccinations) effectively curb transmission but impose heavy…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

SCI-PRM: A Tool Aware Process Reward Model for Scientific Reasoning Verification

While Process Reward Models (PRMs) have achieved remarkable success in mathematical reasoning, their application in complex scientific doma…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Learning Admissible Heuristics via Cost Partitioning

Admissible heuristics are essential for optimal planning, yet learning them remains challenging due to the risk of overestimation. Cost par…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

Plan First, Judge Later, Run Better: A DMAIC-Inspired Agentic System for Industrial Anomaly Detection

Large language model (LLM) agents have shown promise in automating complex data-analysis workflows, but their reliable deployment remains c…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

Parthenon Law: A Self-Evolving Legal-Agent Framework

As agents grow more capable, legal-domain LLM agents promise to turn document-heavy matters into reviewable work products -- yet reliable d…

2026-06-04 13:00 JST | arXiv cs.AI | Hardware/Semiconductors

A Normative Intermediate Representation for ASP-Based Compliance Reasoning

We propose MONIR, a Modalized-Output Normative Intermediate Representation for ASP-based compliance reasoning. Its core fragment has a stag…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

MIRAGE: Mobile Agents with Implicit Reasoning and Generative World Models

Mobile agents are increasingly expected to operate everyday applications from screenshots and language goals, where reliable control requir…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

BiNSGPS: Geometry Problem Solving via Bidirectional Neuro-Symbolic Interaction

Geometry problem solving poses distinct challenges in artificial intelligence. Existing approaches typically fall into two paradigms: symbo…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

Fog of Love: Engineering Virtuous Agent Behavior with Affinity-based Reinforcement Learning in a Game Environment

Instilling virtuous behavior in artificial intelligence has seen increasing interest. One of the techniques proposed is known as affinity-b…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

FALSIFYBENCH: Evaluating Inductive Reasoning in LLMs with Rule Discovery Games

Large language models (LLMs) are increasingly deployed as autonomous agents in scientific tasks. Yet whether these systems can effectively…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Inference-Time Vulnerability Beyond Shallow Safety: Alignment Along Generation Trajectories

Safety-aligned Large Language Models (LLMs) remain vulnerable to interventions during inference that redirect generation toward harmful out…

2026-06-04 13:00 JST | arXiv cs.AI | Agents | Research/Papers

Tree-Based Formalization of Multi-Agent Complementarity in Human-AI Interactions

Complementarity is the case in which a human-AI interaction (HAI) outperforms the best prediction benchmark available among its members. A…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

AIP: A Graph Representation for Learning and Governing Agent Skills

Agent Skills today consist largely of free-form prose requiring the agent to read, interpret, and re-derive how to act in every session. Th…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

BiasGRPO: Stabilizing Bias Mitigation in High-Variance Reward Landscapes via Group-Relative Policy Optimization

Mitigating social bias in Large Language Models (LLMs) presents a distinct alignment challenge: unlike verifiable tasks, bias lacks a singl…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Beyond Objective Equivalence: Constraint Injection for LLM-Based Optimization Modeling on Vehicle Routing Problems

Large language models (LLMs) increasingly translate natural-language optimization problems into executable solver code. Yet for constraint-…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

R-APS: Compositional Reasoning and In-Context Meta-Learning for Constrained Design via Reflective Adversarial Pareto Search

Large language models (LLMs) are fluent on open-ended tasks, yet in agentic settings, where a system must plan, use tools, and act over ext…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Research/Papers

AICompanionBench: Benchmarking LLMs-as-Judges for AI Companion Safety

As AI companion platforms such as Replika and Character.AI rapidly grow, concerns about unsafe human-AI interactions have intensified. This…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

What Type of Inference is Active Inference?

Active inference casts decision-making as inference, with the Expected Free Energy (EFE) unifying goal-directed and information-seeking beh…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

Strabo: Declarative Specification and Implementation of Agentic Interaction Protocols

The last few years have witnessed major advances in the modeling and implementation of multiagent systems based on declarative interaction…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

AutoLab: Can Frontier Models Solve Long-Horizon Auto Research and Engineering Tasks?

Scientific and engineering progress is fundamentally a long-horizon iterative process: proposing changes, running experiments, measuring ou…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Research/Papers

Knowledge Index of Noah's Ark

Knowledge benchmarks for LLMs face three issues: scaling-driven designs that do not operationalize disciplinary representativeness; flat-pa…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

AI from concrete to abstract: demystifying artificial intelligence to the general public

Artificial Intelligence (AI) has been adopted in a wide range of domains. This shows the imperative need to develop means to endow common p…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

How do machines learn? Evaluating the AIcon2abs method

This study expands on previous work that introduced the AIcon2abs method (AI from Concrete to Abstract: Demystifying Artificial Intelligenc…

2026-06-04 13:00 JST | arXiv cs.AI | Robotics | Hardware/Semiconductors

DiffAero: A GPU-Accelerated Differentiable Simulation Framework for Efficient Quadrotor Policy Learning

This letter introduces DiffAero, a lightweight, GPU-accelerated, and fully differentiable simulation framework designed for efficient quadr…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation | Business/Funding | Research/Papers

SpurAudio: A Benchmark for Studying Shortcut Learning in Few-Shot Audio Classification

Few-shot classification (FSC) is widely used for learning from limited labeled data, yet most evaluations implicitly assume that target con…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Constraint-Enhanced Physical Search through Correlation Matching

Physical systems do not merely add noise to search processes; they impose constraints that generate structured correlations. We propose a p…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Early Detection of Alzheimer's Disease Using Explainable Machine Learning on Clinical Biomarkers: A Multi-Class Classification Study Using the Alzheimer's Disease Neuroimaging Initiative (ADNI) Dataset

Background: Alzheimer's disease (AD) affects over 55 million people worldwide. Accurate, interpretable detection of normal cognition (NC),…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Neural Radiated-Noise Fields for Unmanned Underwater Vehicle Noise Spectrum Prediction in Three-Dimensional Scenes

Radiated noise in unmanned underwater vehicles (UUVs) is an important indicator for characterizing acoustic signatures and evaluating platf…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Counterfactual Explanations for Deep Two-Sample Testing

Two-sample testing is a fundamental tool for detecting distributional differences across scientific domains, but classical tests (including…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

The Variance Brain Foundation Models Forgot: Third-Order Statistics Predict Cognition Where Billion-Parameter Models Fail

Brain foundation models (BFMs) are self-supervised Transformers pretrained on fMRI data. We posit that these models should capture each sub…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Gravity-Aware Hierarchical Routing for Lightweight SensorLLM on Human Activity Recognition

Recent studies on sensor-language alignment have shown that two-stage frameworks can improve the semantic modeling ability of wearable-sens…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Hardware/Semiconductors

CodegenBench: Can LLMs Write Efficient Code Across Architectures?

While large language models (LLMs) have been extensively evaluated on code generation tasks for general-purpose programming and GPU-acceler…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

The Biomimetic Architecture of Software 4.0

Dominant programming paradigms inherit an execution model optimised for a bygone era of a single human mind instructing a local machine, le…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

MaskForge: Structure-Aware Adaptive Attacks for Jailbreaking Diffusion Large Language Models

Diffusion large language models (dLLMs) generate text by iteratively denoising partially masked sequences under bidirectional context, expo…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Position: Deployed Reinforcement Learning should be Continual

Reinforcement Learning (RL) has received increasing attention and adoption in real-world use cases. Most of these systems follow a train-th…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Do Transformers Need Three Projections? Systematic Study of QKV Variants

Transformers have become the standard solution for various AI tasks, with the query, key, and value (QKV) attention formulation playing a c…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Unpredictable Safety: Domain-Dependent Compliance and the Transparency Gap in Open-Weight LLMs

We present a systematic study of domain-dependent safety behavior in open-weight LLMs: 7 standardized experiments across 7 ethical domains,…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Beyond Static Priors: Dynamic Neural Guidance for Large-Scale Ant Colony Optimization

Neural-guided Ant Colony Optimization (ACO) suffers from a fundamental training-inference misalignment: policies are typically trained to g…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Channel-Oriented Design for EEG-to-Music Reconstruction

Brain-computer interfaces aim to decode naturalistic stimuli from neural signals, yet most progress to date has focused on vision and langu…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Bayes-Sufficient Representations in Supervised Learning

Representation learning is often described as preserving the information in an input that is relevant for prediction. This work asks what r…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Image/Video Generation | Robotics

Dive into the Scene: Breaking the Perceptual Bottleneck in Vision-Language Decision Making via Focus Plan Generation

In embodied vision-language decision making tasks such as robotic manipulation and navigation, Vision-Language and Vision-Language-Action M…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Unlocking Feature Learning in Gated Delta Networks at Scale

Training and scaling Large Language Models demand enormous computational resources, motivating both efficient sub-quadratic architectures a…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

LiftQuant: Continuous Bit-Width LLM via Dimensional Lifting and Projection

Existing quantization methods are fundamentally limited by rigid, integer-based bit-widths (e.g., 2, 3-bit), resulting in a "deployment ga…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

RUBAS: Rubric-Based Reinforcement Learning for Agent Safety

The evolution of LLMs into tool-enabled agents creates a new class of safety challenges associated with real-world execution rather than si…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

A Goal-Set Characterization of Task Composition in the Boolean Task Algebra

The Boolean Task Algebra (BTA) provides a principled framework for zero-shot task composition in reinforcement learning by equipping goal-r…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

The Invisible Lottery: How Subtle Cues Steer Algorithm Choice in LLM Code Generation

Large language models (LLMs) now generate substantial production code, often for tasks with multiple valid algorithmic solutions. Incidenta…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Spectral Scaling Laws of Muon

Orthonormalized update rules have rapidly become a leading choice of optimizer for training large language models, with recent open-source…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

LLM Compression with Jointly Optimizing Architectural and Quantization choices

Deploying large language models (LLMs) is challenging due to their significant memory and computational requirements. While some methods ad…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Need to Know: Contextual-Integrity-Grounded Query Rewriting for Privacy-Conscious LLM Delegation

As LLMs become increasingly woven into everyday workflows, user queries sent to cloud hosted LLMs routinely mix task-essential content with…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

TPA-AD: A Two-Stage Pseudo Anomaly-Guided Method for Bearing Time-Series Anomaly Detection

This paper proposes a two-stage pseudo anomaly-guided anomaly detection method (Two-stage Pseudo Anomaly-guided…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Adaptive Patching Is Harder Than It Looks For Time-Series Forecasting

Adaptive patching is a recent and compelling proposal for time-series Transformers: allocate finer patches where the sequence looks locally…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Large Language Models Hack Rewards, and Society

Reinforcement learning (RL) has become a dominant post-training paradigm, enabling large language models (LLMs) to learn from rewards. We o…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

POLARIS: Guiding Small Models to Write Long Stories

Small open-weight models struggle at long-form creative writing: their generated stories either fall far short of the requested length, or…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

The Differentiable Auditory Loop (DAL): An ML Framework for Hyper-Personalized Hearing Aids

Conventional hearing aids rely on fixed, frequency-dependent amplification and compression to manage reduced sensitivity, which often fails…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

Proof-Carrying Agent Actions: Model-Agnostic Runtime Governance for Heterogeneous Agent Systems

Agent systems execute through runtimes with very different control points: local coding tools, framework SDKs, managed agent platforms, API…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Building The Ph(ysical)AI Layer Of Machine Intelligence

Foundation models achieve generalization through massive-scale training on diverse data, but have limitations with transfer to truly unseen…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

SymTRELLIS: Symmetry-Enforced Voxel Latents for 3D Generation

Single-view 3D generative models have achieved impressive visual quality, yet they are not designed to satisfy structural or functional req…

2026-06-04 13:00 JST | arXiv cs.AI | Agents | Robotics

AgenticDiffusion: Agentic Diffusion-based Path Planning for Vision-Based UAV Navigation

Indoor UAV navigation requires efficient exploration, scene understanding, and reliable trajectory execution under limited field-of-view ob…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

dMX: Differentiable Mixed-Precision Assignment for Low-Precision Floating-Point Formats

Quantizing large language models (LLMs) to low-precision floating-point representations is central to efficient deployment, yet applying a…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

SaliMory: Orchestrating Cognitive Memory for Conversational Agents

Conversational agents that serve as lifelong companions must maintain persistent memory across all interactions. However, simply expanding…

2026-06-04 13:00 JST | arXiv cs.AI | Agents | Robotics

Semantic Constraint Synthesis for Adaptive Trajectory Optimization via Large Language Models

Trajectory optimization is a critical component for enabling safe and reliable autonomous operations in space exploration. As space mission…

2026-06-04 13:00 JST | arXiv cs.AI | Agents | Research/Papers

HighTide: An Agent-Curated Open-Source VLSI Benchmark Suite

We introduce HighTide, an evolving AI-assisted benchmark suite. Specifically, the contributions are: (i) a diverse open-source suite spanni…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents | Hardware/Semiconductors

Caught in the Act(ivation): Toward Pre-Output and Multi-Turn Detection of Credential Exfiltration by LLM Agents

LLM agents often place sensitive credentials in the same context window as untrusted retrieved content, creating a direct path for indirect…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Physics-Informed Machine Learning for Short-Term Flood Prediction

Accurate flood forecasting is essential for mitigating disaster risks and protecting communities. However, purely data-driven machine learn…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

EvalStop: Using World Feedback to Detect and Correct Reward Overoptimization in Multi-Tenant RLHF Platforms

Cloud LLM fine-tuning platforms increasingly serve RLHF workloads, where a learned reward model is optimized as a proxy for human quality.…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

ADAPTOOD: Uncertainty-Aware Fine-Tuning for Out-of-Distribution ECG Time Series Models

Data samples used for training often differ from those encountered during fine-tuning and deployment, and while ML models show promise, the…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Smart Transportation Without Neurons -- Fair Metro Network Expansion with Tabular Reinforcement Learning

We tackle the Metro Network Expansion Problem (MNEP), a subset of the Transport Network Design Problem (TNDP), which focuses on expanding m…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

MimeLens: Position-Agnostic Content-Type Detection for Binary Fragments

File-type classification underlies many workflows like malware triage, forensic carving, packet inspection, and storage indexing. Learned s…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

A Systematic Analysis of Linguistic Features in AI-Generated Text Detection Across Domains and Models

Interpretable linguistic features offer a promising approach for explaining why a given text appears machine-generated, particularly for no…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Exact Unlearning in Reinforcement Learning

We formulate the problem of exact unlearning in reinforcement learning, where the goal is to design an efficient framework that enab…

2026-06-04 13:00 JST | arXiv cs.AI | Robotics

Dual Advantage Fields

Offline goal-conditioned reinforcement learning requires both long-horizon reachability estimates and local action comparisons. Dual goal r…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Metric-Aware Hybrid Forecasting for the CTF4Science Lorenz Challenge

We describe our approach to the CTF4Science Lorenz challenge, a benchmark that mixes short-horizon forecasting, long-time distribution matc…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

Notarized Agents: Receiver-Attested Confidential Receipts for AI Agent Actions

Current AI agent observability is structurally compromised: the entity producing the activity log is the same entity whose activity is bein…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

DetectZoo: A Unified Toolkit for AI-Generated Content Detection Across Text, Audio, and Image Modalities

The growing popularity and capacity of generative models have eroded the distinction between human and machine-generated content, motivatin…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Robotics

PerceptTwin: Semantic Scene Reconstruction for Iterative LLM Planning and Verification

Simulation environments are useful for both robot policy learning and planning verification and validation. Traditionally, the process of c…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Incremental Sheaf Cohomology on Cellular Complexes: O(1)-in-n Lazy Edit Processing under Bounded Local Geometry

We present an algorithmic framework for incremental maintenance of first sheaf cohomology $H^1(X; \mathcal{F})$ on dynamically evolving 1-d…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

MM-BizRAG: Rethinking Multimodal Retrieval-Augmented Generation for General Purpose Enterprise Q&A

Recent advances in multimodal retrieval-augmented generation (MM-RAG) have shifted toward minimal parsing, relying on page-level images for…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Supportive Token Revealing for Fast Diffusion Language Model Decoding

Discrete diffusion language models can generate text efficiently by updating multiple masked positions in parallel, but this parallelism in…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Recover-LoRA for Aggressive Quantization: Reclaiming Accuracy in 2-Bit Language Models via Low-Rank Adaptation with Knowledge Distillation on Synthetic Data

Aggressive weight quantization to 2-bit precision offers substantial throughput and memory gains for large language model (LLM) inference,…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

Overview of the EReL@MIR 2025 Multimodal Document Retrieval Challenge (Track 1)

Retrieval over visually-rich documents, pages that interleave text with figures, tables, and charts, is essential for multimodal retrieval-…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Can I Take Another Dose? Evaluating LLM Decision-Making Under Temporal Uncertainty in OTC Dosing QA

Large language models (LLMs) are increasingly used for everyday health questions, including whether a user can safely take another dose of…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation, Robotics

Instant-Fold: In-Context Imitation Learning for Deformable Object Manipulation

Deformable object manipulation (DOM) is challenging due to high-dimensional, partially observable states that evolve through long-horizon,…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents

StandardE2E: A Unified Framework for End-to-End Autonomous Driving Datasets

Autonomous driving has shifted from modular perception-prediction-planning stacks toward end-to-end (E2E) models that map sensor inputs dir…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

From Ticks to Flows: Dynamics of Neural Reinforcement Learning in Continuous Environments

We present a novel theoretical framework for deep reinforcement learning (RL) in continuous environments by modeling the problem as a conti…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

The Loss Is Not Enough: Sampling Conditions and Inductive Bias in Contrastive Representation Learning

Contrastive learning has become a leading paradigm for self-supervised representation learning, yet the conditions under which it recovers…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Sparse Mixture-of-Experts Reward Models Learn Interpretable and Specialized Experts for Personalized Preference Modeling

Preference modeling plays a central role in reinforcement learning from human feedback (RLHF), enabling large language models (LLMs) to ali…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Scaling Novel Graph Generation via Lightweight Structure-Guided Autoregressive Models

Generating realistic and diverse graphs is a key problem in machine learning, with applications in molecular discovery, circuit design, cyb…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Anycast Performance in Context

IP anycast lets a service advertise one address from many physical sites, leaving BGP to map each client to a site. It is central to the DN…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

OpenRFM: Dissecting Relational In-Context Learning

Relational Foundation Models (RFMs) promise a single pre-trained predictor that, given any relational database, returns predictions in one…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Measuring What Matters: Synthetic Benchmarks for Concept Bottleneck Models

Concept bottleneck models predict outcomes from high-level concepts detected in inputs. Although concepts provide a simple way to reap bene…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

A Geometric Characterization of the Stationary Plateau for Two-Layer Neural Networks

We investigate the geometric structure of stationary plateaus that arise in the loss landscape of two-layer neural networks with smooth act…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Generalizable Multi-Task Learning for Wireless Networks Using Prompt Decision Transformers

Future wireless networks demand rapid adaptation to highly heterogeneous environments and dynamic task configurations, necessitating a shif…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

From Untrusted Input to Trusted Memory: A Systematic Study of Memory Poisoning Attacks in LLM Agents

Memory is a core component of AI agents, enabling them to accumulate knowledge across interactions and improve performance. However, persis…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Expectations vs. Realities: The Cost of MSE-Optimal Forecasting Under Conditional Uncertainty

Multi-step time series forecasting (MSF) is commonly evaluated using point-wise error metrics such as mean squared error (MSE), implicitly…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

HYolo: An Intelligent IoT-Based Object Detection System Using Hypergraph Learning

This paper presents HYolo, an intelligent IoT-based object detection framework that integrates hypergraph learning into the YOLO architectu…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

MorphoQuant: Modality-Aware Quantization for Omni-modal Large Language Models

Conventional Post-Training Quantization (PTQ) methods struggle with 4-bit Omni-modal Large Language Models (OLLMs) due to the extreme distr…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

Multi-Granularity 3D Kidney Lesion Characterization from CT Volumes

Radiology reports describe kidney lesions by type, size, enhancement, and attenuation, yet existing 3D methods predict only at the patient…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

Selective Coupling of Decoupled Informative Regions: Masked Attention Alignment for Data-Free Quantization of Vision Transformers

Data-Free Quantization (DFQ) addresses data security concerns by synthesizing samples, without accessing real data. It has garnered increas…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

DSIRM: Learning Query-Bridged Discrete Semantic Identifiers for E-commerce Relevance Modeling

Despite rapid progress of continuous embeddings for e-commerce search relevance, a long-standing open problem is the difficulty in capturin…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

From Symbolic to Geometric: Enabling Spatial Reasoning in Large Language Models

Recent large language models (LLMs) often appear to exhibit spatial reasoning ability; however, this capability is largely symbolic,…

2026-06-04 13:00 JST | arXiv cs.AI | Regulation/Policy, Research/Papers

LCSHBench: A Multilingual, Consensus-Grounded Benchmark for Library of Congress Subject Heading Assignment

Automated subject cataloging assigns controlled-vocabulary headings to bibliographic records, but LCSH has no standard public benchmark. We…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Rethinking Sales Lead Scoring with LLM-based Hierarchical Preference Ranking

Sales lead conversion in high-stakes domains (e.g., automotive, real estate) differs fundamentally from e-commerce recommendation due to pr…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

TITAN-FedAnil+: Trust-Based Adaptive Blockchain Federated Learning for Resource-Constrained Intelligent Enterprises

Federated Learning (FL) has emerged as an effective paradigm for collaborative intelligence while preserving data privacy. However, data he…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Low-Rank Decay for Grokking in Scale-Invariant Transformers: A Spectral-Geometric View

Modern Transformer architectures frequently employ normalization mechanisms such as RMSNorm and Query-Key Normalization, making parts of th…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

An Ensembled Latent Factor Model via Differential Evolution and Gradient Descent Optimization

High-dimensional and incomplete (HDI) data are prevalent in many real-world big data scenarios. Latent factor models serve as a common repr…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

An Empirical Study of Data Scale, Model Complexity, and Input Modalities in Visual Generalization

Modern deep neural networks usually have large parameter scales and nonlinear hierarchical structures, and they have achieved strong perfor…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation, Business/Funding

L-TGVN: Leveraging Longitudinal Priors for Personalized Rapid MRI

MRI provides excellent soft-tissue contrast without ionizing radiation, but long acquisition times increase patient discomfort while also r…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

What If Prompt Injection Never Left? Exploring Cross-Session Stored Prompt Injection in Agentic Systems

Modern agentic systems transform LLMs from session-bounded assistants into stateful systems that persist and evolve shared world state acro…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

LoopMoE: Unifying Iterative Computation with Mixture-of-Experts for Language Modeling

Mixture-of-Experts (MoE) and looped architectures scale models along two orthogonal axes, namely parameter capacity and effective depth. Ho…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

MemoryDocDataSet: A Benchmark for Joint Conversational Memory and Long Document Reasoning

AI systems increasingly need to combine two demanding capabilities: navigating multi-session conversation history and performing deep readi…

2026-06-04 13:00 JST | arXiv cs.AI | Business/Funding

RowNet: A Memory Transformer for Tabular Regression

Real estate valuation is a structured regression problem in which prices are governed by heterogeneous feature types, sparse regional effec…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors

Token Rankings are Unforgeable Language Model Signatures

Language model parameters are known to impose unique (to each model) geometric constraints on their logit outputs, which serves as a signat…

2026-06-04 13:00 JST | arXiv cs.AI | Agents, Research/Papers

CyberGym-E2E: Scalable Real-World Benchmark for AI Agents' End-to-End Cybersecurity Capabilities

AI has the potential to transform cybersecurity by enabling systems that can autonomously detect, analyze, and remediate software vulnerabi…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

SePO: Self-Evolving Prompt Agent for System Prompt Optimization

System prompt optimization improves agent behavior without modifying the underlying model, yielding human-readable, model-agnostic instruct…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

ParetoPilot: Zero-Surrogate Offline Multi-Objective Optimization via Infer-Perturb-Guide Diffusion

Offline multi-objective optimization (Offline MOO) aims to discover novel Pareto-optimal designs based on static datasets without expensive…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

Adaptive Calibration for Fair and Performant Facial Recognition

We introduce Adaptive Calibration (AC), a novel calibration strategy for facial recognition that maps cosine similarity between normalized…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

ChessMimic: Per-Rating Transformer Models for Human Move, Clock, and Outcome Prediction in Online Blitz Chess

We present ChessMimic, a system of three small encoder-only transformers - for move, thinking-time, and outcome prediction - conditioned on…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

Evaluating Reasoning Fidelity in Visual Text Generation

Recent text-to-image (T2I) models can render highly legible and well-structured text within images, enabling applications including documen…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

SFMambaNet: Spectral-Frequency Enhanced Selective State Space Model for Correspondence Pruning

Correspondence pruning aims to identify inliers from an initial set of correspondences. Most existing Graph Neural Network (GNN)-based meth…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Smart Picks in the Dark: Towards Efficient RLVR for Reasoning via Tracing Metacognitive Pivots

Reinforcement learning with verifiable rewards (RLVR) has greatly advanced large reasoning models (LRMs), but it requires timely training o…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding, Research/Papers

Self-Evolving Deep Research via Joint Generation and Evaluation

Large Language Models (LLMs) have become increasingly adopted in daily applications, with deep research standing out as a particularly impo…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

GeoMin: Data-Efficient Semi-Supervised RLVR via Geometric Distribution Modeling

Reinforcement learning with verifiable rewards (RLVR) significantly advances LLM reasoning, yet it faces a dilemma: standard supervised sca…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Treat Traffic Like Trees: A Semantic-Preserving Hierarchical Graph-Based Expert Framework for Encrypted Traffic Analysis

Graph-based deep learning methods have been widely employed in encrypted traffic analysis to exploit latent correlations across different g…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

ANN Search: Recall What Matters

Approximate nearest neighbor (ANN) search has become a core primitive in information retrieval and modern machine learning tasks, from clas…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

Optical-Guided Neural Collapse for SAR Few-Shot Class Incremental Learning

Few-shot class-incremental learning (FSCIL) in synthetic aperture radar imagery presents unique challenges due to severe data scarcity and…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Dynamic Infilling Anchors for Format-Constrained Generation in Diffusion Large Language Models

Diffusion large language models (dLLMs) offer bidirectional attention and parallel generation, enabling them to exploit global context and…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Temporal Order Matters for Agentic Memory: Segment Trees for Long-Horizon Agents

Long-horizon conversational agents need to interact with users through evolving events, tasks, and goals. Such histories are naturally temp…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Rollout-Level Advantage-Prioritized Experience Replay for GRPO

Reinforcement learning from verifiable rewards with GRPO is a standard approach for post-training reasoning LLMs. It remains sample ineffic…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Multi-SPIN: Multi-Access Speculative Inference for Cooperative Token Generation at the Edge

Speculative inference (SPIN) was originally developed as an efficient architecture to accelerate Large Language Models (LLMs). In this work…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Synthetic Personalities: How Well Can LLMs Mimic Individual Respondents Using Socio-Economic Microdata?

LLM-based digital twins promise to scale and accelerate market research, but most published twins are either coarse persona bots conditione…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Ekka: Automated Diagnosis of Silent Errors in LLM Inference

LLM serving frameworks are quickly evolving with a complex software stack and a vast number of optimizations. The rapid development process…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

QuBLAST: A Framework for Quantizing Large Language Models with Block-Level Compression Approach and Activation Scaling Strategy

LLMs have become the state-of-the-art algorithms for solving NLP tasks. However, they typically come at huge computational and memory costs…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

QO-Bench: Diagnosing Query-Operator-Preserving Retrieval over Typed Event Tuples

Many real-world questions over business, legal, and scientific corpora are natural-language versions of database-style queries over records…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents

Instance-Level Post Hoc Uncertainty Quantification in Object Detection

Object detection is a safety-critical component of autonomous driving. It is essential to quantify the uncertainty in bounding-box predicti…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Why Muon Outperforms Adam: A Curvature Perspective

Muon improves training efficiency over Adam in large language-model training by about two times, but the local geometric source of this adv…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Learning Long Range Spatio-Temporal Representations over Continuous Time Dynamic Graphs with State Space Models

Continuous-time dynamic graphs (CTDGs) provide a richer framework to capture fine-grained temporal patterns in evolving relational data. Lo…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

Real-Time Automatic License Plate Recognition Using YOLOv8, SORT Tracking, and Temporal Data Interpolation

The real-time hardships of video processing seriously limit the usage of Automatic License Plate Recognition (ALPR) with application in dyn…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

Graph-Guided Universum Learning in Generalized Eigenvalue Proximal SVMs for Alzheimer's Disease Classification

Early and accurate detection of Alzheimer's disease (AD) is important for timely intervention and disease management. Generalized Eigenvalu…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

Enhancing MedSAM with a Lightweight Box Predictor for Medical Image Segmentation

Semantic segmentation in medical imaging is a critical yet challenging task due to data scarcity and high variability across modalities. Wh…

2026-06-04 13:00 JST | arXiv cs.AI | Robotics

VISTA: Vision-Grounded and Physics-Validated Adaptation of UMI data for VLA Training

Universal Manipulation Interface (UMI) enables scalable real-world robot data collection without hardware-specific teleoperation, yet lever…

2026-06-04 13:00 JST | arXiv cs.AI | Robotics

CoRe-MoE: Contrastive Reweighted Mixture of Experts for Multi-Terrain Humanoid Locomotion with Gait Adaptation

Humans primarily rely on walking and running to traverse complex terrains, without resorting to unnecessarily complex motion patterns. Simi…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Trace-Mediated Peak Bias: Bridging Temporal Credit Assignment and Cognitive Heuristics in Deep Reinforcement Learning

Temporal credit assignment is central to both biological and artificial intelligence, yet its interaction with non-linear function approxim…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Curvature-aware dynamic precision approach for physics-informed neural networks

Physics-informed neural networks (PINNs) have become a promising framework for simulating partial differential equations (PDEs) by embeddin…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Revisiting Vul-RAG: Reproducibility and Replicability of RAG-based Vulnerability Detection with Open-Weight Models

Large language models (LLMs) have shown strong potential for automated software vulnerability detection, particularly in retrieval-augmente…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

TIDE: Proactive Multi-Problem Discovery via Template-Guided Iteration

Agents are widely deployed as assistants over documents, tools, and code. However, they typically act only on explicit user requests, which…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

An Empirical Audit of Input Encoders for Multi-Channel Signal Transformers

Transformers consuming multi-channel scalar signals must embed $C$ simultaneous values into one $d_{\text{model}}$-dimensional vector per t…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

Archi: Agentic Operations at the CMS Experiment

We present Archi, an open-source, end-to-end framework for scientific collaborations that combines the systematic ingestion and organizatio…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Description-Code Inconsistency in Real-world MCP Servers: Measurement, Detection, and Security Implications

The Model Context Protocol (MCP) has emerged as a critical standard empowering Large Language Models (LLMs) to utilize external tools. In t…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

Coarse-to-fine Hierarchical Architecture with Sequential Mamba for Brain Reconstruction

Understanding the relationship between deep visual representations and the human visual system is a fundamental challenge in computational…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

Activation Steering of Video Generation Models via Reduced-Order Linear Optimal Control

Text-to-video (T2V) models trained on large-scale web data can generate undesired content, motivating interventions that reduce harmful out…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Agents

NoRA: Evaluating Grounded Reasonableness in Visual First-person Normative Action Reasoning

LLMs and agentic systems are increasingly deployed in social environments, making normative competence critical for safe and appropriate be…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

Scenario Generation for Risk-Aware Reinforcement Learning with Probably Approximately Safe Guarantees

Guaranteeing safety is critical to the deployment of reinforcement learning (RL) agents in the real-world, especially as policies learned u…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Learning While Acting: A Skill-Enhanced Test-Time Co-Evolution Framework for Online Lifelong Learning Agents

Lifelong learning is essential for Large Language Model (LLM) agents operating in dynamic, interactive environments. However, existing life…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

OA-CutMix: Correcting the Label Bias of CutMix

CutMix has become the de facto standard mixing augmentation, yet its label assignment rests on a flawed assumption: The area of the pasted…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Signed Dual Attention: Capturing Signed Dependencies in Time Series Forecasting

Initially developed for natural language processing, Transformer architectures and attention mechanisms are now central to a wide range of…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Uncertainty-Aware End-to-End Co-Design of Neural Network Processors: From Training and Mapping to Fabrication

Designing a neural network processor is an end-to-end co-design problem: network architecture and training budget determine the inference w…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Learning Empirically Admissible Neural Heuristics for Combinatorial Search

Finding optimal solution paths for combinatorial puzzles like the Rubik's Cube, sliding tile puzzles, and Lights Out remains a classical ch…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Abduction Prover in Isabelle/HOL

Proof assistants based on expressive logics suffer limited automation for proof search, raising the cost of formal verification based on pr…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

DiverAge: Reliable Pluralistic Face Aging with Cross-Age Identity Relation Guidance

Face aging plays an important role in long-term biometric analysis, cross-age identity verification, and forensic identity analysis. Since…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Provably Auditable and Safe LLM Agents from Human-Authored Ontologies

We introduce the LLM agent architecture Agentic Redux, intended for use with nontrivial problem domains that require linear auditability. U…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

'Your AI Text is not Mine': Redefining and Evaluating AI-generated Text Detection under Realistic Assumptions

Although it is generally agreed that AI-generated text poses a broad societal risk, there is no common understanding in the AI-generated te…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

Geometry-Aware Distillation for Prompt Tuning Biomedical Vision-Language Models

Current prompt-based and adapter-based tuning of vision-language models (VLMs) is attractive for medical imaging, where clinical data sensi…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors

Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning

Rubric-based reinforcement learning (RL) uses an LLM-as-a-Judge (LaaJ) to score model outputs according to rubrics as rewards. However, pol…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

AdaKoop: Efficient Modeling of Nonlinear Dynamics from Nonstationary Data Streams with Koopman Operator Regression

Real-time data analysis requires the ability to accurately and adaptively address nonlinear dynamics in a nonstationary data stream while p…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

From Prompt to Process: a Process Taxonomy and Comparative Assessment of Frameworks Supporting AI Software Development Agents

AI tools for programming are no longer just autocomplete or chat assistants: they organize themselves as development frameworks, with proce…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents, Research/Papers

Plan, Watch, Recover: A Benchmark and Architectures for Proactive Procedural Assistance

We envision a proactive multi-modal assistant system which gives users real-time step-by-step guidance on a procedural task, autonomously d…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

DeliChess: A Multi-party Dialogue Dataset for Deliberation in Chess Puzzle Solving

Multi-party dialogue is a critical setting for studying collaborative reasoning and decision-making, yet existing datasets rarely focus on…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

From Agent Traces to Trust: Evidence Tracing and Execution Provenance in LLM Agents

Large language model (LLM)-based agents increasingly solve complex tasks by interacting with external tools, retrieval systems, memory modu…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

SharedRequest: Privacy-Preserving Model-Agnostic Inference for Large Language Models

With the widespread deployment of public large language models (LLMs) such as ChatGPT, protecting user prompt privacy has become an increas…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Business/Funding

M$^3$Eval: Multi-Modal Memory Evaluation through Cognitively-Grounded Video Tasks

As multi-modal models advance towards long-form video understanding, memory emerges as a critical capability. Despite substantial efforts i…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

DAR: Deontic Reasoning with Agentic Harnesses

Deontic reasoning is the task of answering questions by applying explicit rules and policies to case-specific facts, for example computing…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Invariant Gradient Alignment for Robust Reasoning Distillation

Large language models (LLMs) suffer from shortcut learning: they systematically fail on out-of-distribution (OOD) inputs whose semantic sur…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

Self-Reflective APIs: Structure Beats Verbosity for AI Agent Recovery

When an AI agent calls an API and hits a validation error, it needs more than what went wrong -- it needs what to do next. A self-reflectiv…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

UniCAD: A Unified Benchmark and Universal Model for Multi-Modal Multi-Task CAD

Computer-Aided Design (CAD) underpins modern engineering and manufacturing by enabling the creation of precise, editable 3D models. However…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Automatic Generation of Titles for Research Papers Using Language Models

The title of a research paper conveys its primary idea and, occasionally, its conclusions in a clear and concise manner. Choosing an approp…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Arithmetic Pedagogy for Language Models

We investigate whether methods of human mathematics pedagogy can guide the training of language models toward arithmetic reasoning. Buildin…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

Who Needs Labels? Adapting Vision Foundation Models With the Metadata You Already Have

We propose a label-free approach to adapt powerful but generic vision foundation models to specialized scientific domains. Standard supervi…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

Continual Visual and Verbal Learning Through a Child's Egocentric Input

Children learn the meanings of words from a continuous, temporally structured stream of egocentric experience. Recent work shows that neura…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Audio Interaction Model

Audio is an inherently interactive modality, yet today's Large Audio Language Models (LALMs) are offline, and streaming audio models each h…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Towards Efficient and Evidence-grounded Mobility Prediction with LLM-Driven Agent

Individual-level mobility prediction is central to urban simulation, transportation planning, and policy analysis. Supervised sequence mode…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

GeM-NR: Geometry-Aware Multi-View Editing for Nonrigid Scene Changes

Recent developments in multi-view image editing with generative models have brought us a step closer toward general 3D content generation a…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Failed Reasoning Traces Tell You What Is Fixable (But Not by Reading Them)

When post-trained language models fail on reasoning problems, the common test-time-scaling response is to spend more compute on additional…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Multi-Column RBF Neural Network Using Adaptive and Non-Adaptive Particle Swarm Optimization

The radial basis function neural network (RBFN) trained with a gradient descent algorithm provides an effective fully connected structur…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Reinforcement Learning from Rich Feedback with Distributional DAgger

Reasoning models have advanced rapidly, but the dominant reinforcement learning from verifiable rewards (RLVR) recipe remains surprisingly…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Streaming Communication in Multi-Agent Reasoning

Multi-agent reasoning systems adopt a "generate-then-transfer" paradigm that forces end-to-end latency to scale linearly with pipeline dept…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Longer Context, Deeper Thinking: Uncovering the Role of Long-Context Ability in Reasoning

Recent language models exhibit strong reasoning capabilities, yet the influence of long-context capacity on reasoning remains underexplored…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Breaking Bad Molecules: Are MLLMs Ready for Structure-Level Molecular Detoxification?

Toxicity remains a leading cause of early-stage drug development failure. Despite advances in molecular design and property prediction, the…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors

Constrained Adaptive Rejection Sampling

Language Models (LMs) are increasingly used in applications where generated outputs must satisfy strict semantic or syntactic constraints.…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Aligning Deep Implicit Preferences by Learning to Reason Defensively

Personalized alignment is crucial for enabling Large Language Models (LLMs) to engage effectively in user-centric interactions. However, cu…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

Adaptive Minds: Empowering Agents with LoRA-as-Tools

We investigate a framework in which LoRA adapters are treated as callable tools that a base language model can dynamically select and invok…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

BRAINCELL-AID: An Agentic AI Created Brain Cell Type Resource for Community Annotation

Single-cell RNA sequencing has transformed our ability to identify diverse cell types and their transcriptomic signatures. However, annotat…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

A Unified Geometric Space for Topological Alignment Between Transformer-Based Models and Human Brain Networks

Prior brain-AI alignment studies are typically constrained by specific inputs and tasks, limiting their ability to capture organizational p…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

MENTOR: A Metacognition-Driven Self-Evolution Framework for Uncovering and Mitigating Implicit Domain Risks in LLMs

Ensuring the safety of Large Language Models (LLMs) is critical for real-world deployment. However, current safety measures often fail to a…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Reasoning or Fluency? Dissecting Probabilistic Confidence in Best-of-N Selection

Probabilistic confidence metrics are increasingly adopted as proxies for reasoning quality in Best-of-N selection, under the assumption tha…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Success Conditioning as Policy Improvement: The Optimization Problem Solved by Imitating Success

A widely used technique for improving policies is success conditioning, in which one collects trajectories, identifies those that achieve a…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

PersistBench: When Should Long-Term Memories Be Forgotten by LLMs?

Conversational assistants are increasingly integrating long-term memory with large language models (LLMs). This persistence of memories, e.…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Interfaze: The Future of AI is built on Task-Specific Small Models

We present Interfaze, a native hybrid model that fuses task-specific deep neural networks (CNNs and DNNs) directly into a transformer decod…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

SciDER: Scientific Data-centric End-to-end Researcher

While large language models accelerate scientific discovery, existing agents face severe limitations in adaptability, domain generalization…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

MedForge: Interpretable Medical Deepfake Detection via Forgery-aware Reasoning

Text-guided image editors can now manipulate authentic medical scans with high fidelity, enabling lesion implantation/removal that threaten…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Bilevel Autoresearch: Meta-Autoresearching Itself

If autoresearch is itself a form of research, then autoresearch can be applied to research itself. We present Bilevel Autoresearch, a bilev…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

Formal Semantics for Agentic Tool Protocols: A Process Calculus Approach

The emergence of large language model agents capable of invoking external tools has created an urgent need for formal verification of agent pr…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

The Accountability Horizon: An Impossibility Theorem for Governing Human-Agent Collectives

Existing accountability frameworks for AI systems, legal, ethical, and regulatory, rest on a shared assumption: for any consequential outco…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

Belief-Aware VLM Model for Human-like Reasoning

Traditional neural network models for intent inference rely heavily on observable states and struggle to generalize across diverse tasks an…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Binary Spiking Neural Networks as Causal Models

We provide a causal analysis of Binary Spiking Neural Networks (BSNNs) to explain their behavior. We formally define a BSNN and represent i…

2026-06-04 13:00 JST | arXiv cs.AI | Agents, Research/Papers

SciIntegrity-Bench: A Benchmark for Evaluating Academic Integrity in AI Scientist Systems

AI scientist systems are increasingly deployed for autonomous research, yet their academic integrity has never been systematically evaluate…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

Bad Seeing or Bad Thinking? Rewarding Perception for Multimodal Reasoning

Achieving robust perception-reasoning synergy is a central goal for advanced Vision-Language Models (VLMs). Recent advancements have pursue…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

Unlocking Proactivity in Task-Oriented Dialogue

Proactive task-oriented dialogue (TOD), such as outbound sales, demands a persuasive agent that actively probes the user's concerns and ste…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

The Illusion of Opting in AI-Mediated Consequential Decisions

Drawing on Ullmann-Margalit's concept of opting (transformative, irrevocable, and shadowed by foreclosed alternatives), we show that curren…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents, Robotics

Transformer-Based Autonomous Driving Models and Deployment-Oriented Compression: A Survey

Transformer-based models are becoming a central paradigm in autonomous driving because they can capture long-range spatial dependencies, mu…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

ChatSOP: An SOP-Guided MCTS Planning Framework for Controllable LLM Dialogue Agents

Dialogue agents powered by Large Language Models (LLMs) show superior performance in various tasks. Despite the better user understanding a…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation, Business/Funding

CounterFace: A Synthetic Face Dataset for Fine-Grained Counterfactual Evaluation of Face Recognition Systems

Face recognition (FR) systems are widely deployed in critical applications, making their reliability and robustness across diverse populati…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

SSSD: Simply-Scalable Speculative Decoding

Speculative Decoding has emerged as a popular technique for accelerating inference in Large Language Models. However, most existing approac…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

LaVIDE: Language-Prompted Satellite Change Detection via Map-Image Alignment

Remote sensing change detection based on a map reference and an up-to-date image boosts timely observation of the Earth's surface when earl…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

From Motion Signals to Insights: A Unified Framework for Student Behavior Analysis and Feedback in Physical Education Classes

Analyzing student behavior in educational scenarios is crucial for enhancing teaching quality and student engagement. Existing AI-based mod…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Speculative Thinking: Enhancing Small-Model Reasoning with Large Model Guidance at Inference Time

Recent advances leverage post-training to enhance model reasoning performance, which typically requires costly training pipelines and still…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

SoLoPO: Unlocking Long-Context Capabilities in LLMs via Short-to-Long Preference Optimization

Despite advances in pretraining with extended context sizes, large language models (LLMs) still face challenges in effectively utilizing re…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

100-LongBench: Are de facto Long-Context Benchmarks Literally Evaluating Long-Context Ability?

Long-context capability is considered one of the most important abilities of LLMs, as a truly long context-capable LLM enables users to eff…

2026-06-04 13:00 JST | arXiv cs.AI | Hardware/Semiconductors

Model-Preserving Adaptive Rounding

The goal of quantization is to produce a compressed model whose output distribution is as close to the original model's as possible. To do…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

MesaNet: Sequence Modeling by Locally Optimal Test-Time Training

Sequence modeling is currently dominated by causal transformer architectures that use softmax self-attention. Although widely adopted, tran…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

Can VLMs Predict Future States? Bootstrapping World Models from Inverse Dynamics

Can unified vision-language models (VLMs) perform forward dynamics prediction (FDP), i.e., predicting the future state (in image form) give…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Time Series Forecasting as Reasoning: A Slow-Thinking Approach with Reinforced LLMs

To advance time series forecasting (TSF), various methods have been proposed to improve prediction accuracy, evolving from statistical tech…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation, Business/Funding

VGGSounder: Audio-Visual Evaluations for Foundation Models

The emergence of audio-visual foundation models underscores the importance of reliably assessing their multi-modal understanding. The VGGSo…

2026-06-04 13:00 JST | arXiv cs.AI | Business/Funding, Research/Papers

A Study of the Scale Invariant Signal to Distortion Ratio in Speech Separation with Noisy References

This paper examines the implications of using the Scale-Invariant Signal-to-Distortion Ratio (SI-SDR) as both evaluation and training objec…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

BioBlue: Systematic runaway-optimiser-like LLM failure modes on biologically and economically aligned AI safety benchmarks for LLMs with simplified observation format

Many AI alignment discussions of "runaway optimisation" focus on RL agents: unbounded utility maximisers that over-optimise a proxy objecti…

2026-06-04 13:00 JST | arXiv cs.AI | Business/Funding

Uncertainty Estimation using Variance-Gated Distributions

Evaluation of per-sample uncertainty quantification from neural networks is essential for decision-making involving high-risk applications.…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

KITE: Kernelized and Information Theoretic Exemplars for In-Context Learning

In-context learning (ICL) has emerged as a powerful paradigm for adapting large language models (LLMs) to new and data-scarce tasks using o…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

ClustRecNet: A Novel End-to-End Deep Learning Framework for Clustering Algorithm Recommendation

Identifying an effective clustering algorithm for a given dataset remains a fundamental unsupervised learning issue. We introduce ClustRecN…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

Platonic Transformers: A Solid Choice For Equivariance

While widespread, Transformers lack inductive biases for geometric symmetries common in science and computer vision. Existing equivariant m…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Can Reasoning Path still be Effective as Input? Bridging Post-Reasoning to Chain-of-Thought Compression

Recent developments have enabled advanced reasoning in Large Language Models (LLMs) via long Chain-of-Thought (CoT), trading efficiency dur…

2026-06-04 13:00 JST | arXiv cs.AI | Agents, Robotics

Simplicial Embeddings Improve Sample Efficiency in Actor-Critic Agents

Recent works have proposed accelerating the wall-clock training time of actor-critic methods via the use of large-scale environment paralle…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Test-time reward-guided alignment of language models by importance sampling on pre-logit space

Test-time alignment of large language models (LLMs) attracts attention because fine-tuning of LLMs requires high computational costs. In th…

2026-06-04 13:00 JST | arXiv cs.AI | Agents, Robotics

Vectorized Online POMDP Planning

Planning under partial observability is an essential capability of autonomous robots. The Partially Observable Markov Decision Process (POM…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Extending Fair Null-Space Projections for Continuous Attributes to Kernel Methods

With the ongoing integration of machine learning systems into the everyday social life of millions, the notion of fairness becomes an ever…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

OckBench: Measuring the Efficiency of LLM Reasoning

Large language models (LLMs) such as GPT-5 and Gemini 3 have pushed the frontier of automated reasoning and code generation. Yet current be…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

SAM 3D: 3Dfy Anything in Images

We present SAM 3D, a generative model for visually grounded 3D object reconstruction, predicting geometry, texture, and layout from a singl…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

AttnRegDeepLab: A Two-Stage Decoupled Framework for Interpretable Embryo Fragmentation Grading

Embryo fragmentation is a morphological indicator critical for evaluating developmental potential in In Vitro Fertilization (IVF). However,…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

Dynamic Content Moderation in Livestreams: Combining Supervised Classification with MLLM-Boosted Similarity Matching

Content moderation remains a critical yet challenging task for large-scale user-generated video platforms, especially in livestreaming envi…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Topology Matters: Measuring Memory Leakage in Multi-Agent LLMs

Graph topology is a fundamental determinant of memory leakage in multi-agent LLM systems, yet its effects remain poorly quantified. We intr…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents, Robotics

DVGT: Driving Visual Geometry Transformer

Perceiving and reconstructing 3D scene geometry from visual inputs is crucial for autonomous driving. However, there still lacks a driving-…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

You Only Train Once: Differentiable Subset Selection for Omics Data

Selecting compact and informative gene subsets from single-cell transcriptomic data is essential for biomarker discovery, improving interpr…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Semiparametric Preference Optimization: Your Language Model is Secretly a Single-Index Model

Policy alignment to preference data typically assumes a known link function between observed preferences and latent rewards (e.g., Bradley-…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Geometry-Aware Hallucination Detection in Large Language Models

Large language models (LLMs) frequently generate factually incorrect or unsupported content, commonly referred to as hallucinations. Prior…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Mid-Think: Training-Free Intermediate-Budget Reasoning via Token-Level Triggers

Hybrid reasoning language models are commonly controlled through high-level Think/No-think instructions to regulate reasoning behavior, yet…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Bounded Hyperbolic Tangent: A Stable and Efficient Alternative to Pre-Layer Normalization in Large Language Models

Pre-Layer Normalization (Pre-LN) is the de facto choice for large language models (LLMs) and is crucial for stable pretraining and effectiv…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

MedRedFlag: Investigating how LLMs Redirect Misconceptions in Real-World Health Communication

Real-world health questions from patients often unintentionally embed false assumptions or premises. In such cases, safe medical communicat…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Outcome-Based RL Provably Leads Transformers to Reason, but Only With the Right Data

Transformers trained via Reinforcement Learning (RL) with outcome-based supervision can spontaneously develop the ability to generate inter…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Can professional translators identify machine-generated text?

This study investigates whether professional translators without prior specialized training can reliably identify short stories generated i…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Do readers prefer AI-generated Italian short stories?

This study investigates whether readers prefer AI-generated short stories in Italian over one written by a renowned Italian author. In a bl…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Conditional PED-ANOVA: Hyperparameter Importance in Hierarchical & Dynamic Search Spaces

We propose conditional PED-ANOVA (condPED-ANOVA), a principled framework for estimating hyperparameter importance (HPI) in conditional sear…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

L$^3$: Large Lookup Layers

Modern sparse language models typically achieve sparsity through Mixture-of-Experts (MoE) layers, which dynamically route tokens to dense M…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Culturally Grounded Personas in Large Language Models: Characterization and Alignment with Socio-Psychological Value Frameworks

Despite the growing utility of Large Language Models (LLMs) for simulating human behavior, the extent to which these synthetic personas acc…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Tuning the Implicit Regularizer of Masked Diffusion Language Models: Enhancing Generalization via Insights from $k$-Parity

Masked Diffusion Language Models have recently emerged as a powerful generative paradigm, yet their generalization properties remain unders…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

R3G: A Reasoning-Retrieval-Reranking Framework for Vision-Centric Answer Generation

Vision-centric retrieval for VQA requires retrieving images to supply missing visual cues and integrating them into the reasoning process.…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

SUSD: Structured Unsupervised Skill Discovery through State Factorization

Unsupervised Skill Discovery (USD) aims to autonomously learn a diverse set of skills without relying on extrinsic rewards. One of the most…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Efficient Adversarial Attacks on High-dimensional Offline Bandits

Bandit algorithms have recently emerged as a powerful tool for evaluating machine learning models, including generative image models and la…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Making Expert Reasoning Learnable with Self-Distillation

Improving the reasoning capabilities of large language models (LLMs) typically relies either on the model's ability to sample a correct sol…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

What Structural Inductive Bias Helps Transformers Reason Over Knowledge Graphs? A Study with Tabula RASA

What structural inductive bias helps transformers reason over knowledge graphs? Through controlled ablations of a minimal transformer modif…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

TamperBench: Systematically Stress-Testing LLM Safety Under Fine-Tuning and Tampering

As increasingly capable open-weight large language models (LLMs) are deployed, improving their tamper resistance against unsafe modificatio…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

AlgoVeri: An Aligned Benchmark for Verified Code Generation on Classical Algorithms

Vericoding refers to the generation of formally verified code from rigorous specifications. Recent AI models show promise in vericoding, bu…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

MuCO: Generative Peptide Cyclization Empowered by Multi-stage Conformation Optimization

Modeling peptide cyclization is critical for the virtual screening of candidate peptides with desirable physical and pharmaceutical propert…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Unifying Model-Free Efficiency and Model-Based Representations via Latent Dynamics

We present Unified Latent Dynamics (ULD), a novel reinforcement learning algorithm that unifies the efficiency of model-free methods with t…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

Toward Autonomous O-RAN: A Multi-Scale Agentic AI Framework for Real-Time Network Control and Management

Open Radio Access Networks (O-RAN) promise flexible 6G network access through disaggregated, software-driven components and open interfaces…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Tomography by Design: An Algebraic Approach to Low-Rank Quantum States

We present an algebraic algorithm for quantum state tomography that leverages measurements of certain observables to estimate structured en…

2026-06-04 13:00 JST | arXiv cs.AI | Agents

A Unified Framework for Locality in Scalable MARL

Scalable methods for networked multi-agent reinforcement learning let each agent plan using only a small neighborhood of the agent graph. T…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

DSL-Topic: Improving Topic Modeling by Distilling Soft Labels from Language Models

Traditional neural topic models are typically optimized by reconstructing the document's Bag-of-Words (BoW) representations, overlooking co…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Value Entanglement: Conflation Between Different Kinds of Good In (Some) Large Language Models

Value alignment of Large Language Models (LLMs) requires us to empirically measure these models' actual, acquired representation of value.…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Does Order Matter: Connecting The Law of Robustness to Robust Generalization

Bubeck and Sellke (2021) propose the connection between the Law of Robustness and robust generalization error as an open problem. The Law of…

2026-06-04 13:00 JST | arXiv cs.AI | Robotics

Evaluating Zero-Shot and One-Shot Adaptation of Small Language Models in Leader-Follower Interaction

Leader-follower interaction is an important paradigm in human-robot interaction (HRI). Yet, assigning roles in real time remains challengin…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents, Research/Papers

ShareVerse: Multi-Agent Consistent Video Generation for Shared World Modeling

This paper presents ShareVerse, a video generation framework enabling multi-agent shared world modeling, addressing the gap in existing wor…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

Beyond Pixel Histories: World Models with Persistent 3D State

Interactive world models continually generate video by responding to a user's actions, enabling open-ended generation capabilities. However…

2026-06-04 13:00 JST | arXiv cs.AI | Robotics

ZeroWBC: Learning Natural Whole-Body Humanoid Interaction from Human Egocentric Data

Achieving versatile and natural whole-body humanoid interaction control remains challenging due to the high cost of whole-body teleoperatio…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Physics-Informed Neural Engine Sound Modeling with Differentiable Pulse-Train Synthesis

Engine sounds originate from sequential exhaust pressure pulses rather than sustained harmonic oscillations. While neural synthesis methods…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

EvoPrompt: Guided Prompt Evolution for Vision-Language Models Adaptation

The adaptation of large-scale vision-language models (VLMs) to downstream tasks with limited labeled data remains a significant challenge.…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Business/Funding, Research/Papers

Safety Under Scaffolding: How Evaluation Conditions Shape Measured Safety

A safety score earned on a benchmark need not predict how the same model behaves once it is wrapped in an agentic scaffold the benchmark ne…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Quantum entanglement provides a competitive advantage in adversarial games

Whether uniquely quantum resources confer advantages in fully classical, competitive environments remains an open question. Competitive zer…

2026-06-04 13:00 JST | arXiv cs.AI | Robotics

ContactExplorer: Contact Coverage-Guided Exploration for General-Purpose Dexterous Manipulation

Reinforcement learning has achieved remarkable success in domains such as Atari games, navigation, and locomotion, where exploration can of…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

Revisiting Model Stitching In the Foundation Model Era

Model stitching, connecting early layers of one model (source) to later layers of another (target) via a light stitch layer, has served as…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

Spatial Transcriptomics as Images for Large-Scale Pretraining

Spatial Transcriptomics (ST) profiles thousands of gene expression values at discrete spots with precise coordinates on tissue sections, pr…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

FinTradeBench: A Financial Reasoning Benchmark for LLMs

Real-world financial decision-making is a challenging problem that requires reasoning over heterogeneous signals, including company fundame…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

GenSpan: Generation-Calibrated Motion Span Priors for Multi-Verb Video Corpus Moment Retrieval

Video Corpus Moment Retrieval (VCMR) aims to retrieve both the correct video and its temporal segment corresponding to a natural-language q…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

PoliticsBench: Benchmarking Political Values in Large Language Models with Multi-Turn Roleplay

While Large Language Models (LLMs) are increasingly used as primary sources of information, their potential for political bias may impact t…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

On-the-fly Repulsion in the Contextual Space for Rich Diversity in Diffusion Transformers

Modern Text-to-Image (T2I) diffusion models have achieved remarkable semantic alignment, yet they often suffer from a significant lack of v…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Emotion Entanglement and Bayesian Inference for Multi-Dimensional Emotion Understanding

Understanding emotions in natural language is inherently a multi-dimensional reasoning problem, where multiple affective signals interact t…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Inclusion-of-Thoughts: Mitigating Preference Instability via Purifying the Decision Space

Multiple-choice questions (MCQs) are widely used to evaluate large language models (LLMs). However, LLMs remain vulnerable to the presence…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Policy Split: Incentivizing Dual-Mode Exploration in LLM Reinforcement with Dual-Mode Entropy Regularization

To encourage diverse exploration in reinforcement learning (RL) for large language models (LLMs) without compromising accuracy, we propose…

2026-06-04 13:00 JST | arXiv cs.AI | Agents, Robotics

Contextual Multi-Task Reinforcement Learning for Autonomous Reef Monitoring

Although autonomous underwater vehicles promise the capability of marine ecosystem monitoring, their deployment is fundamentally limited by…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Generative Augmented Inference

Large language models enable inexpensive AI-generated annotations, but using them reliably for causal inference remains challenging. Naivel…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Luminol-AIDetect: Fast Zero-shot Machine-Generated Text Detection based on Perplexity under Text Shuffling

Machine-generated text (MGT) detection requires identifying structurally invariant signals across generation models, rather than relying on…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

MAEPose: Self-Supervised Spatiotemporal Learning for Human Pose Estimation on mmWave Video

Millimetre-wave (mmWave) radar offers a more privacy-preserving alternative to RGB-based human pose estimation. However, existing methods t…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Stochastic Sparse Attention for Memory-Bound Inference

Autoregressive decoding becomes bandwidth-limited at long contexts, as generating each token requires reading all $n_k$ key and value vecto…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Efficiently Aligning Language Models with Online Natural Language Feedback

Reinforcement learning with verifiable rewards has been used to elicit impressive performance from language models in many domains. But, br…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

A Systematic Investigation of RL-Jailbreaking in LLMs

The evolution of generative models from next-token predictors to autonomous engines of complex systems necessitates rigorous safety hardeni…

2026-06-04 13:00 JST | arXiv cs.AI | Hardware/Semiconductors

Curated Synthetic Data Doesn't Have to Collapse: A Theoretical Study of Generative Retraining with Pluralistic Preferences

Recursive retraining of generative models poses a critical representation challenge: when synthetic outputs are curated based on a fixed re…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

FactoryNet: A Large-Scale Dataset toward Industrial Time-Series Foundation Models

We introduce the first universal pretraining corpus for industrial time-series data: FactoryNet. 51M datapoints across 23k end-to-end task…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

HEPA: A Self-Supervised Horizon-Conditioned Event Predictive Architecture for Time Series

Critical events in multivariate time series, from turbine failures to cardiac arrhythmias, demand accurate prediction, yet labeled data is…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

Widening the Gap: Exploiting LLM Quantization via Outlier Injection

LLM quantization has become essential for memory-efficient deployment. Recent work has shown that quantization schemes can pose critical se…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Do LLMs Hold Their Values? MANTA: A Multi-Turn Adversarial Benchmark for Animal Welfare Reasoning

Evaluating animal welfare reasoning in LLMs remains an open challenge despite rapid deployment in consumer and professional contexts where…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Retrieval and competition: how a protein foundation model starts a protein

Protein language models are increasingly used to guide experimental and clinical decisions, yet it is often unclear whether a confident pre…

2026-06-04 13:00 JST | arXiv cs.AI | Research/Papers

Position: State-of-the-Art Claims Require State-of-the-Art Evidence

State-of-the-Art (SOTA) claims pervade Artificial Intelligence (AI) and Machine Learning (ML) research. These claims rest on benchmark eval…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

ZeroUnlearn: Few-Shot Knowledge Unlearning in Large Language Models

Large language models inevitably retain sensitive information, defined as inputs that may induce harmful generations, due to training on ma…

2026-06-04 13:00 JST | arXiv cs.AI | Business/Funding

Markov Chain Decoders Overcome the Heavy-Tail Limitations of Lipschitz Generative Models

Heavy-tailed distributions are prevalent in performance evaluation, network traffic, and risk modeling. This behavior poses a fundamental c…

2026-06-04 13:00 JST | arXiv cs.AI | Robotics

DEFLECT: Temporal Counterfactual Preference Learning for Delay-Robust Asynchronous VLAs

Vision-Language-Action (VLA) policies increasingly rely on asynchronous inference to hide large-model latency behind ongoing robot motion.…

2026-06-04 13:00 JST | arXiv cs.AI | Image/Video Generation

Rebalancing Reference Frame Dominance to Improve Motion in Image-to-Video Models

Image-to-video models often generate videos that remain overly static, compared to text-to-video models. While prior approaches mitigate th…

2026-06-04 13:00 JST | arXiv cs.AI | LLM/Generative AI

REFLECTOR: Internalizing Step-wise Reflection against Indirect Jailbreak

While Large Language Models (LLMs) demonstrate remarkable capabilities, they remain susceptible to sophisticated, multi-step jailbreak atta…

2026-06-04 13:00 JSTarXiv cs.AIエージェントロボティクス

Lost in Fog: Sensor Perturbations Expose Reasoning Fragility in Driving VLAs

Interpretable autonomous driving planners depend not only on generating explanations, but also on those explanations remaining reliable und…

2026-06-04 13:00 JSTarXiv cs.AI画像/動画生成ビジネス/資金調達

Learning to Evaluate: Cost-Effective Model Evaluation on Unlabeled Data with Meta-Learning

The rapid advancement of machine learning has led to an unprecedented expansion of model ecosystems, making it increasingly difficult to as…

2026-06-04 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

Correcting Visual Blur Induced by Attention Distraction to Reduce Hallucinations: Algorithm and Theory

Multimodal large language models (MLLMs) frequently suffer from object hallucinations, yet the visual perceptual mechanism underlying this…

2026-06-04 13:00 JSTarXiv cs.AIエージェント

Grimlock: Guarding High-Agency Systems with eBPF and Attested Channels

Agentic systems increasingly run user-authored orchestration code that invokes tools, spawns subtasks, and delegates work across machines a…

2026-06-04 13:00 JSTarXiv cs.AILLM/生成AI

Aryabhata 2: Scaling Reinforcement Learning for Advanced STEM Reasoning

Competitive STEM examinations such as JEE and NEET require multi-step symbolic reasoning, precise numerical computation, and deep conceptua…

2026-06-04 13:00 JSTarXiv cs.AILLM/生成AI

Structured Prompt Optimization Meets Reinforcement Learning for Global and Local Interpretability over Complex Text

LLMs have advanced text classification, yet existing paradigms face a trade-off: supervised (label only) fine-tuning is scalable but offers…

2026-06-04 13:00 JSTarXiv cs.AI研究/論文

LoopFM: Learning frOm HistOrical RePresentations of Foundation Model for Recommendation

Knowledge distillation (KD) transfers a single scalar prediction from a large foundation model (FM) to compact vertical models (VMs), suffe…

2026-06-04 13:00 JSTarXiv cs.AILLM/生成AIエージェント研究/論文

Towards Verifiable Multimodal Deep Research: A Multi-Agent Harness for Interleaved Report Generation

Large Language Models (LLMs) have advanced autonomous agents from deep search, which retrieves concise factual answers, to deep research, w…

2026-06-04 13:00 JSTarXiv cs.AILLM/生成AI

Label Over Logic? How Source Cues Bias Human Fallacy Judgments More Than LLMs

As AI-generated and AI-assisted content floods online spaces, source labels attached to such content can distort human reasoning judgments,…

2026-06-04 10:27 JSTITmedia AI+規制/政策

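Japanese government obtains access to the AI "Mythos," to be used to strengthen cyber defense

MUFG Bank, Sumitomo Mitsui Banking Corporation, and Mizuho Bank are also believed to have obtained access.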

2026-06-04 09:00 JSTITmedia AI+その他

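Autodesk rolls out AI assistant features for its major products, plus an MCP for Fusion

Autodesk is offering a tech preview of "Autodesk Assistant" for its major products and has released an MCP for Fusion. Alongside an AI assistant that understands design data and business context, it also provides features that enable integration with external AI, with AI use in design and manufacturing work…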

2026-06-04 08:00 JSTITmedia AI+エージェント

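ServiceNow and Accenture launch "FDE" to deploy agentic AI company-wide

ServiceNow and Accenture have launched a new program. How will it address the problem of agentic AI adoption stalling at the proof-of-concept stage and failing to deliver company-wide results?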

2026-06-04 07:56 JSTTechCrunch AIその他

Lovable signs multiyear deal with Google Cloud to up usage 5x, source says

Lovable and Google signed an expanded multiyear deal that involves a 5x expansion of Lovable's footprint on Google Cloud, and expanded acce…

2026-06-04 07:30 JSTITmedia AI+その他

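Advantech puts "WEDA" front and center, cutting edge AI model development time by 86%

At COMPUTEX TAIPEI 2026, Advantech demonstrated "WEDA," a solution that uses the company's hardware to manage edge AI in an integrated way, from development through deployment and operation.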

2026-06-04 07:00 JSTITmedia AI+その他

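"10,000 hours" saved in two years: how Sony's accounting team, where "not even a one-yen error is tolerated," became a "try it first" DX group

"We want to advance accounting DX, but resistance on the ground is strong." "Even after we introduce tools, adoption does not spread." Many companies face these problems. Because accounting departments demand accuracy and continuity, they have been regarded as an area where transformation is difficult. Sony Group's accounting department has driven more than 150 DX projects in about two years, and cumulative…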

2026-06-04 05:00 JSTITmedia AI+LLM/生成AI

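Claude Opus 4.8 won't tell you what you want to hear (no "sontaku"): is being "too honest" a mixed blessing?

A major feature of Claude Opus 4.8 is not only improved performance but also improved "honesty." This article looks at why an AI that does not engage in sontaku (deferential flattery) is dividing opinion, drawing on official information and the user's perspective.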

2026-06-04 04:38 JSTTechCrunch AIビジネス/資金調達

Alphabet’s record-breaking $85B raise for Google’s AI business is a helluva good signal

If Alphabet's record-breaking $85 billion stock sale signals investor appetite for AI-related offerings, we can see that investors are read…

2026-06-04 04:07 JSTTechCrunch AIその他

Google’s Dreambeans, its weirdest-named AI tool to date, will turn your life into a cartoon

Dreambeans is a curated list of AI-illustrated "stories" culled from the personal data in your Google account.

2026-06-04 03:00 JSTITmedia AI+ロボティクス

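How does "pioneer Honda" view the humanoid robot boom? "There is some frustration, but..." We asked about its next move

How does Honda, which launched "ASIMO" in 2000, view the current humanoid robot boom? We asked about the possibility of its re-entering humanoid robot development, its current efforts, and more.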

2026-06-04 01:38 JSTITmedia AI+その他

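"Gemma 4 12B" arrives: a multimodal model that runs even on a laptop with 16GB of memory

Google has announced the open multimodal model "Gemma 4 12B." It adopts an encoder-free unified architecture and can run on a laptop with 16GB of memory. It is said to deliver performance approaching that of higher-tier models.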

2026-06-04 00:50 JSTTechCrunch AIその他

Amazon will show AI product images when you search for some reason

Amazon will use visual search and AI to show AI-generated product images that match your search queries. The retailer says it will help gui…

2026-06-04 00:00 JSTTechCrunch AIその他

These two founders left Goldman and Meta to build voice AI for markets everyone else overlooked

The startup's own stack for Africa and Middle East is now handling more than 17,000 calls per day.

2026-06-03 (437 items)

2026-06-03 23:58 JSTTechCrunch AIその他

Publishers will be able to opt out of AI Search, thanks to new regulation

U.K. regulators are requiring Google to offer a tool allowing website publishers to opt out of generative AI search features. The option will…

2026-06-03 23:00 JSTITmedia AI+エージェント規制/政策

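Not "don't use AI" but "tell us if you do": redesigning governance for the agent era

As AI agents are increasingly applied to business operations, organizational governance is not keeping up. Drawing on points raised by OWASP, this article explains two principles Japanese companies should grasp and three actions they can start next week.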

2026-06-03 22:40 JSTTechCrunch AIエージェント

Meta’s AI agent for WhatsApp Business is now available globally

WhatsApp will charge businesses for using its AI agent based on token usage.

2026-06-03 22:15 JSTOpenAILLM/生成AI研究/論文

Introducing new capabilities to GPT-Rosalind

GPT-Rosalind advances life sciences research with enhanced biological reasoning, medicinal chemistry expertise, genomics analysis, and expe…

2026-06-03 22:02 JSTTechCrunch AIエージェントビジネス/資金調達

Coralogix raises $200M on bet that someone needs to watch the AI agents

Coralogix is among a growing number of infrastructure firms betting that as AI systems move into production, demand will rise for tools tha…

2026-06-03 21:00 JSTOpenAILLM/生成AIエージェント

How Wasmer used Codex to build a Node.js runtime for the edge

See how Wasmer used Codex with GPT-5.5 to build a Node.js runtime for the edge, accelerating development 10x to 20x and shipping in weeks i…

2026-06-03 19:00 JSTOpenAILLM/生成AI

A blueprint for democratic governance of frontier AI

OpenAI outlines a blueprint for U.S. governance of frontier AI, proposing a federal framework for safety, resilience, and national security.

2026-06-03 19:00 JSTOpenAILLM/生成AI

OpenAI public policy agenda

OpenAI outlines its public policy agenda for AI, including safety, youth protection, workforce transition, and global standards to ensure A…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Visual Graph Scaffolds for Structural Reasoning in Large Language Models

Graphs have been used to enhance large language models (LLMs) for structured reasoning, mostly as external knowledge sources are provided t…

2026-06-03 13:00 JSTarXiv cs.AIロボティクス

AURA: Action-Gated Memory for Robot Policies at Constant VRAM

The KV-cache is the right memory for datacenters but the wrong memory for robots. Datacenter inference batches many short requests and rese…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Evaluating Transformer and LSTM Frameworks for Prediction in Ungauged Basins

Watershed networks exhibit convergent topologies in which multiple tributaries merge into downstream channels, integrating diverse upstream…

2026-06-03 13:00 JSTarXiv cs.AIビジネス/資金調達

BehaviorBench: Modeling Real-World User Decisions from Behavioral Traces

Many decision-support settings require systems that adapt to individual users, but evaluation data for this problem remain limited. Existin…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

ChatHealthAI: Aligning Electronic Health Record Representations with Large Language Models for Grounded Clinical Reasoning

Large language models (LLMs) exhibit strong natural-language reasoning abilities for clinical decision support, but struggle to effectively…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Traj-Evolve: A Self-Evolving Multi-Agent System for Patient Trajectory Modeling in Lung Cancer Early Detection

Modeling patient trajectories from longitudinal electronic health records (EHRs) requires reasoning over sparse, noisy, and long-context mu…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

An Exploration of Collision-based Enemy Morphology Generation

Despite a great deal of prior research into Procedural Content Generation (PCG), relatively little prior work has explored generating enemi…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Thinking Past the Answer: Evaluating Harmful Overthinking in Large Reasoning Models

Large Reasoning Models (LRMs) improve performance by generating explicit intermediate reasoning traces through increased test-time compute,…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Toward a Modular Architecture for Embedded AI Agent Systems at the Edge

The rise of Large Language Models (LLMs) has enabled agentic AI capable of complex reasoning and tool use; however, deploying such autonomy…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達研究/論文

Don't Gamble, GAMBLe: An Analytical Framework for AI-Driven Research Systems

AI-Driven Research Systems (ADRS) -- systems coupling LLMs with automated evaluation to discover algorithms, proofs, and designs -- are bei…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント研究/論文

When Helping Hurts and How to Fix It: Multi-Agent Debate for Data Cleaning

When does multi-agent debate help data cleaning, and when does it hurt? Across three benchmarks, four model families, and over 6,000 task-c…

2026-06-03 13:00 JSTarXiv cs.AIエージェント研究/論文

Handoff Debt: The Rediscovery Cost When Coding Agents Take Over Interrupted Tasks

Coding-agent benchmarks evaluate whether a single uninterrupted agent can resolve a repository issue. Real software work is messier: tasks…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Large AI Models in Dental Healthcare: From General-Purpose Systems to Domain-Specific Foundation Models

Background: Oral diseases affect nearly 3.5 billion people worldwide, yet the comparative clinical potential of large-scale AI models in de…

2026-06-03 13:00 JSTarXiv cs.AIエージェント研究/論文

What Benchmarks Don't Measure: The Case for Evaluating Abstention Competence in Autonomous Agents

Benchmarks for autonomous agents measure whether agents complete tasks, yet this framing is systematically blind to whether an agent should…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

WISE-HAR: A Generalizable Ensemble Deep Learning Framework for WiFi-Based Human Activity Recognition

Human Activity Recognition (HAR) using WiFi signals has emerged as a transformative technology for smart homes, healthcare monitoring, secu…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Inducing Reasoning Primitives from Agent Traces

ReAct-style LLM agents often rediscover the same reasoning routines across problems, yet leave those routines trapped in transient scratchp…

2026-06-03 13:00 JSTarXiv cs.AIエージェント

AUDITFLOW: Executable Symbolic Environments for Structured Financial Reporting Verification

Structured financial audit verification is difficult for language-model agents because correctness depends on structured evidence rather th…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIハードウェア/半導体

TriEval: A Resource-Efficient Pipeline for LLM Bias, Toxicity, and Truthfulness Assessment

LLMs have evolved from basic chatbots to the backbone of the AI ecosystem, now widely used in healthcare, schools, and government services.…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

RelGT-AC: A Relational Graph Transformer for Autocomplete Tasks in Relational Databases

Relational databases underpin modern enterprise, scientific, and healthcare systems, yet predictive machine learning on such data remains c…

2026-06-03 13:00 JSTarXiv cs.AIエージェント

ToolGate: Token-Efficient Pre-Call Control for Tool-Augmented Vision-Language Agents

Tool-augmented vision-language agents can acquire external perceptual evidence through OCR, detection, segmentation, and other tools, but e…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

SkillDAG: Self-Evolving Typed Skill Graphs for LLM Skill Selection at Scale

As LLM agents adopt large skill libraries, selecting the right subset becomes a structural problem rather than a similarity-matching one: s…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

CORE: Conflict-Oriented Reasoning for General Multimodal Manipulation Detection

The rapid rise of generative AI has made multimodal fake news increasingly realistic and pervasive, posing severe threats to public trust a…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

DELTAMEM: Incremental Experience Memory for LLM Agents via Residual Trees

Large Language Model (LLM)-based agents increasingly rely on memory to learn from experiences over continual interactions. However, storing…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

The Shadow Price of Reasoning: Economic Perspective on Optimal Budget Allocation for LLMs

Inference-time scaling has emerged as a critical avenue for enhancing Large Language Models' performance, yet real-world deployment is cons…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Decomposing how prompting steers behavior

Prompting steers large language models (LLMs) and vision-language models (VLMs) without weight updates, but it remains unclear how instruct…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

From Long News to Accurate Forecast: Importance-Aware Fusion and PRM-Guided Reflection for Time Series Forecasting

Incorporating news into time series forecasting is appealing because news can reveal abrupt exogenous events that historical values alone c…

2026-06-03 13:00 JSTarXiv cs.AIエージェント研究/論文

DeskCraft: Benchmarking Desktop Agents on Professional Workflows and Human-in-the-Loop Collaboration

Real-world professional desktop workflows in specialized creative and engineering software unfold over long horizons and often require huma…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

EvoTrainer: Co-Evolving LLM Policies and Training Harnesses for Autonomous Agentic Reinforcement Learning

Autonomous LLM training is often framed as recipe search, which leaves the training harness largely static. This limitation sharpens in age…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Uncertainty-Aware Clarification in LLM Agents with Information Gain

Large Language Model (LLM) agents often operate under underspecified user instructions, where latent uncertainty over user intent leads to…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェントビジネス/資金調達

Think-Before-Speak: From Internal Evaluation to Public Expression in Multi-Agent Social Simulation

LLM-based multi-agent simulation offers a promising way to study social interaction, deliberation, and collective opinion dynamics. However…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

GTBench: A Curriculum-Grounded Benchmark for Evaluating LLMs as Mathematical Research Assistants in Graph Theory

Large language models (LLMs) are increasingly used as self-study assistants in technical disciplines, yet their reliability as mathematical…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

ClinicalMC: A Benchmark for Multi-Course Clinical Decision-Making with Large Language Models

Large language models (LLMs) have been widely adopted in healthcare, yet they still encounter significant challenges in complex clinical de…

2026-06-03 13:00 JSTarXiv cs.AIエージェント研究/論文

MedCUA-Bench: A Screenshot-Only Benchmark for Clinical Computer-Use Agents

Computer-use agents could automate repetitive screen-based clinical work, but their reliability in medical graphical user interfaces remain…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成

Effect of Demographic Bias on Skin Lesion Classification

In this study, we evaluate the performance of skin lesion classification using ResNet-based convolutional models, focusing on the impact of…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Perceive Before Reasoning: A Pre-Reasoning Perception Framework for Efficient and Reliable Proactive Mobile Agents

Multimodal large language models (MLLMs) have substantially advanced mobile agents, yet proactive mobile assistance remains challenging bec…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

Solipsistic Superintelligence is Unlikely to be Cooperative

AI's central challenge is shifting from capability to coexistence. The dominant paradigm in AI research focuses on developing powerful agen…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成

Do Real-World Datasets Contain Natural Experiments? An Empirical Study Using Causal Feature Selection

In nature, events that affect some individuals or groups but not others constitute an implicit intervention and are known as natural experi…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Distilling Answer-Set Programming Rules from LLMs for Neurosymbolic Visual Question Answering

Visual Question Answering (VQA) is the task of answering questions about images, requiring the integration of multimodal input and reasonin…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

A Negative Result on Cross-Model Activation Transfer in a Pythia Multi-Hop Setting

Recent work shows that language models can transmit behavioural traits through hidden signals in generated data during training. We ask whe…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

LEAP: Supercharging LLMs for Formal Mathematics with Agentic Frameworks

Large Language Models (LLMs) exhibit strong informal mathematical reasoning but struggle to generate mechanically verifiable proofs in form…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達研究/論文

The Reliability Gap in Benchmark Auditing: Distribution Shift and Scale as Failure Modes of Contamination Detection

Benchmark contamination, where evaluation examples appear in a model's training data, threatens the validity of LLM assessment. Statistical…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

The Violation Situation Pattern: A Knowledge-Graph Pattern for Compliance Violations

Compliance pipelines detect violations as transient query results and do not keep the violation itself as a persistent graph object with re…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

InfoMem: Training Long-Context Memory Agents with Answer-Conditioned Information Gain

Long-context tasks require LLMs to identify and preserve answer-relevant information from large contexts. Chunk-wise memory agents address…

2026-06-03 13:00 JSTarXiv cs.AIエージェント

CP-Agent: Context-Aware Multimodal Reasoning for Cellular Morphological Profiling under Chemical Perturbations

Cell Painting combines multiplexed fluorescent staining, high-content imaging, and quantitative analysis to generate high-dimensional pheno…

2026-06-03 13:00 JSTarXiv cs.AIエージェント

What Makes Interaction Trajectories Effective for Training Terminal Agents?

Stronger code agents are commonly assumed to be superior teachers for post-training, yet this assumption remains poorly disentangled from t…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

DMF: A Deterministic Memory Framework for Conversational AI Agents

Conversational AI agents require memory systems that are both scalable and semantically coherent across long interaction horizons. Existing…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

StepFinder: A Temporal Semantic Framework for Failure Attribution in Multi-Agent Systems

LLM-based multi-agent systems exhibit remarkable collaborative capabilities in complex multi-step tasks. However, these systems are highly…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

A formal definition and meta-model for a machine theory of mind

This paper proposes, for the first time, a rigorous formal definition of the concept of Machine Theory of Mind, based on principles support…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

ThoughtFold: Folding Reasoning Chains via Introspective Preference Learning

Large Reasoning Models (LRMs) have achieved remarkable progress thanks to Reinforcement Learning with Verifiable Rewards (RLVR) on Chain-of…

2026-06-03 13:00 JSTarXiv cs.AIエージェント

Overlaying Governance: A Compositional Authorization Framework for Delegation and Scope in Agentic AI

As AI systems evolve from passive models into autonomous active agents capable of initiating actions, collaborating, and delegating tasks,…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェントビジネス/資金調達

SAGE: A Quantitative Evaluation of Socialized Evolution in Agent Ecosystems

Self-improving language agents are typically evaluated in isolation: an agent attempts a task, receives feedback, and iteratively refines i…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

From Prompt to Service: An SLM-Based Agent Orchestration Gateway for AI-Driven Virtual Worlds

As generative AI capabilities expand, AI-driven virtual worlds face a growing architectural challenge. Users interact through in-world inte…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Cross-Lingual Token Arbitrage: Optimizing Code Agent Context Windows via Local LLM Preprocessing

AI-assisted coding agents are bottlenecked by input-token cost. Two pathologies of raw human input drive much of this overhead: tokenizatio…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Bridging Auxiliary Constraints to Resolve Instruction Following in Large Reasoning Models

Large Reasoning Models (LRMs) have demonstrated impressive capabilities in many tasks, yet they struggle with reliably following multiple i…

2026-06-03 13:00 JSTarXiv cs.AIエージェント

TSQAgent: Rating Time Series Data Quality via Dedicated Agentic Reasoning

Assessing the quality of time series (TS) data is fundamental yet inherently challenging due to the multifaceted nature of quality dimensio…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Gender-Dependent Diagnostic Substitution in LLM Medical Triage: Same Symptoms, Unequal Urgency

We investigate whether large language models produce different medical triage recommendations for identical neurological symptoms when only…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Towards Non-Monotonic Entailment in Propositional Defeasible Standpoint Logic

Recent work in defeasible reasoning has seen notions of preferential semantics and entailment in the style of Kraus et al. applied to modal…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェントビジネス/資金調達研究/論文

Diagnosing Knowledge Gaps in LLM Tool Use: An Agentic Benchmark for Novel API Acquisition

Large language models for code generation often need to use APIs that are absent from their pretraining data. This requires more than recal…

2026-06-03 13:00 JSTarXiv cs.AIビジネス/資金調達研究/論文

From Answers to States: Verifiable Process-Level Evaluation of Chemical Reasoning in Large Language Models

Large language models are increasingly used as chemistry assistants, yet most chemistry benchmarks still score only final answers. This mas…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

EvoDrive: Pareto Evolution for Safety-Critical Autonomous Driving via Self-Improving LLM Agents

Generating safety-critical scenarios is essential for validating and improving autonomous driving systems, yet it inherently requires maxim…

2026-06-03 13:00 JSTarXiv cs.AIエージェント

The DeepSpeak-Agentic Dataset

We present DeepSpeak-Agentic, a dataset of videos comprising over 37 hours of semi-structured conversations between a human and an embodied…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

SkillPyramid: A Hierarchical Skill Consolidation Framework for Self-Evolving Agents

Recent AI agents can flexibly invoke skills to solve complex tasks, but their long-term improvement is fundamentally constrained by a lack…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Dynamic Objective Selection with Safeguards and LLM Oversight for Financial Decision-Making

Financial decision-making tasks such as stock recommendation and portfolio allocation typically estimate future return and risk and then se…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Code-on-Graph: Iterative Programmatic Reasoning via Large Language Models on Knowledge Graphs

Knowledge Graphs (KGs) are widely used to mitigate the limitations of Large Language Models (LLMs), such as outdated knowledge and hallucin…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Unveiling the Structure of Do-Calculus Reasoning via Derivation Graphs

The do-calculus defines a general system of inference for interventional queries, allowing causal quantities to be transformed through succ…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

When to Re-Plan: Subgoal Persistence in Hierarchical Latent Reasoning

Long-horizon reasoning requires a system to commit to medium-horizon intent without becoming rigid: re-plan too often and computation never…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIハードウェア/半導体

Proof-Refactor: Refactoring Generated Formal Proofs into Modular Artifacts

While Large Language Models (LLMs) have shown strong performance in generating formal proofs, their outputs often remain less readable, mod…

2026-06-03 13:00 JSTarXiv cs.AIエージェント

LAP: An Agent-to-Instrument Protocol for Autonomous Science

Autonomous science is moving from demonstration to infrastructure. Large language model agents now plan experiments, and self-driving labor…

2026-06-03 13:00 JSTarXiv cs.AIエージェント

From Control Boundary to Insurance Claim: Reconstructing AI-Mediated Losses Through the CER Framework

AI losses that arise through an insured organization's generative or agentic AI system require state reconstruction, not merely event recon…

2026-06-03 13:00 JSTarXiv cs.AIエージェント

Enhancing Operational Safety via Agentic Dialogue Hazard Identification Analysis

Operational safety in high-stakes domains such as industrial process control, autonomous, and safety-critical systems demands reliable haza…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Leveraging BART to Assess CS1 C++ Programming Assignments using Rubric-based Criteria

This paper investigates rubric-aware, multitask fine-tuning of transformer models for automated grading of introductory C++ programming ass…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Calibrating Urban Traffic Simulation from Sparse Road Observations via Genetic Optimization

Urban traffic simulation is a critical tool for infrastructure planning, including the placement of electric vehicle charging stations. How…

2026-06-03 13:00 JSTarXiv cs.AIエージェント研究/論文

BigFinanceBench: A Workflow-Grounded Benchmark for Financial-Research Agents

Financial-research answers are decision-relevant only when another analyst can audit how they were produced: which source was chosen, which…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

EvoDS: Self-Evolving Autonomous Data Science Agent with Skill Learning and Context Management

Recent progress in Large Language Model (LLM) agents has enabled promising advances in automated data science. However, existing approaches…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

PyraMathBench: Evaluating and Improving Mathematical Capability in Large Language Models

Despite the pivotal role of numerical reasoning as the cornerstone of mathematical capabilities in large language models (LLMs) across appl…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Reasoning Structure of Large Language Models

Large reasoning models (LRMs) are often evaluated using metrics such as final-answer accuracy or token count. However, identical scores on…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

scTranslation: A Comprehensive Benchmark for Single-Cell Multi-Omics Modality Translation

Simultaneous measurement of multiple omics modalities in single cells enables researchers to gain a more comprehensive understanding of cel…

2026-06-03 13:00 JSTarXiv cs.AIエージェント研究/論文

Hedge-Bench: Benchmarking Agents on Hard, Realistic Tasks Pertaining to Financial Reasoning

AI agents can increasingly handle the mechanical tasks of financial analysis: retrieving documents, calculating formulas, updating spreadsh…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Entropy Is Not Enough: Unlocking Effective Reinforcement Learning for Visual Reasoning via Vision-Anchored Token Selection

While token-level entropy is commonly recognized as effective for credit assignment in text-only reinforcement learning with verifiable rew…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models

Vision language models (VLMs) excel at many tasks but still struggle with spatial reasoning when critical information is not directly obser…

2026-06-03 13:00 JSTarXiv cs.AIロボティクス

TRAP: Hijacking VLA CoT-Reasoning via Adversarial Patches

By integrating Chain-of-Thought (CoT) reasoning, Vision-Language-Action (VLA) models have demonstrated strong capabilities in robotic manip…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Cost-Aware Query Routing in RAG: Empirical Analysis of Retrieval Depth Tradeoffs

Retrieval-augmented generation (RAG) faces a fundamental three-way tension: deeper retrieval improves factual grounding but inflates token…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

IdiomX: A Multilingual Benchmark for Idiom Understanding, Retrieval, and Interpretation

Idiomatic expressions remain a persistent challenge for natural language processing because their meanings are often non-compositional, con…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Lean-GAP: A Dataset of Formalized Graduate Algebra Problems

We present Lean-GAP (Lean-Graduate Algebra Problems), 430 formalized graduate-level algebra problems from the textbook Abstract Algebra by D…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Tracking Urban Atmospheric Pollutants using Sentinel-5P Satellite Data

Urban nitrogen dioxide ($NO_2$) is a key indicator of combustion-related air pollution and exhibits strong spatial and temporal variability…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Auditable Climate Risk Intelligence from Fragmented ESG Data: Deterministic Orchestration and Imbalance-Aware Learning for Scope 1-3 Validation

ESG and climate risk data remain fragmented across heterogeneous Scope 1, Scope 2, and Scope 3 reporting environments, while conventional v…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Cross-Modal Contrastive Learning of ECG and Angiography Representations for Severe Stenosis Classification

Coronary artery stenosis is a common cardiovascular disease, with severe, untreated cases posing significant risks of heart attack. Althoug…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

ReLoRA: Knowledge-Reusing Adaptation for Fast Rollout of Evolving LLM Services

Large Language Models (LLMs) are increasingly deployed as continuously evolving services, where frequent base-model updates may invalidate…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Geometry-Aware Tabular Diffusion

Tabular synthesis is critical for privacy-preserving sharing and augmentation, yet diffusion models rely on implicit mechanisms to capture…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Building Better Activation Oracles

Activation Oracles (AOs) are promising methods for interpreting residual stream activations. However, current AOs face important issues, su…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Samudra 2: Scaling Ocean Emulators across Resolutions

Ocean general circulation models (OGCMs) are essential to climate science but computationally expensive, limiting ensemble size and forcing…

2026-06-03 13:00 JSTarXiv cs.AIエージェント

Margin Play: A Multi-Agent System For Public Policy Analysis In The Brazilian Equatorial Margin

The Brazilian Equatorial Margin (BEM) is Brazil's next offshore oil frontier, with operations expected to begin in 2026 in the Foz do Amazo…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

FSA-GRPO: Teaching Auditory LLMs to Use Few-shot Demonstrations

Few-shot prompting provides an effective way to adapt auditory large language models to low-resource tasks such as children's speech recogn…

2026-06-03 13:00 JSTarXiv cs.AIエージェント

Closed-Loop Molecular Design with Calibrated Deference

We present Cognitive Loop via In-Situ Optimization (CLIO), an agent that couples a continuously-updated belief-state graph with a recursive…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Oscillatory State-Space Models as Inductive Biases for Physics-Informed Neural PDE Solvers

Solving time-dependent partial differential equations (PDEs) is an important problem in computational science and engineering. Physics-info…

2026-06-03 13:00 JSTarXiv cs.AIエージェント研究/論文

TadA-Bench: A Million-Variant Benchmark for Future-Round Discovery Toward Agentic Protein Engineering

AI for scientific discovery is entering an agentic era, where protein-engineering systems are expected to prioritize future wet-lab experim…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

DXA-Derived Skeletal Phenotypes and Hip Fracture Risk: A Backdoor-Adjusted Causal Analysis

Purpose: To compare dual-energy X-ray absorptiometry (DXA)-derived hip skeletal phenotypes in relation to hip fracture risk using prespecif…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Enhancing Protein-Protein Interaction Prediction with Hierarchical Motif-based Multimodal Protein Embedding

Protein-protein interactions (PPIs) are essential for many biological processes. However, existing PPI prediction approaches suffer from tw…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

MultiTurnPSB: Evaluating Multi-Turn Jailbreak Attacks and Classifier-Based Defenses for Medical AI Safety

Patient-facing medical chatbots are commonly evaluated on single-turn prompts, yet real users push back after refusals, add urgency, and in…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成研究/論文

Wavelet as Tokenizer: Preliminary Results on a Shared Wavelet Token Schema for Natural Signals

This paper studies whether audio, images, and video can share a common wavelet token schema rather than relying on separate modality-specif…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Position: Prioritize Identifying Structure, Not Complex Models, for Scientific Discovery

Modern Machine Learning (ML) and Artificial Intelligence (AI) models, especially large language models (LLMs), are increasingly used to gen…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Echo-POSED: Geometric Self-Distillation for Echocardiography Guidance

We introduce Echo-POSED, a self-supervised framework for real-time transthoracic echocardiography (TTE) guidance that recommends probe adju…

2026-06-03 13:00 JSTarXiv cs.AIロボティクス

Too Much of a Good Thing: When sim2real Efforts Impede Policy Learning (And What to Do About It)

While sim2real efforts are necessary for effective policy transfer to hardware, there is such a thing as too much of a good thing. We argue…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

SegTune: Structured and Fine-Grained Control for Song Generation

Recent advances in neural song generation have enabled high-quality synthesis from lyrics and global textual prompts. However, most systems…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成

Sparse-View Lung Nodule Volumetry from Digitally Reconstructed Radiographs via AReT: Anatomy-Regularized TensoRF

We identify and resolve a previously unreported failure mode in TensoRF when applied to X-ray attenuation fields: the default density shift…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIハードウェア/半導体

D-Judge: Disrupting Multi-Turn Jailbreaks using Semantics-Preserving Output Rewriting

Multi-turn jailbreak attacks pose a growing threat to large language model (LLM) safety because they exploit feedback from auxiliary judge…

2026-06-03 13:00 JSTarXiv cs.AIエージェントロボティクス

CARVE: Certified Affordable Repair of Vetoed Maneuvers via Envelopes for Interactive Driving

Interactive driving exposes a failure mode that is easy to miss in rule-aware autonomous-driving stacks: a hard-rule margin can be negative…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成ハードウェア/半導体研究/論文

SVHalluc: Benchmarking Speech-Vision Hallucination in Audio-Visual Large Language Models

Despite the success of audio-visual large-language models (LLMs), they can produce plausible but ungrounded outputs, termed hallucination.…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Inference Cost Attacks for Retrieval-Augmented Large Language Models

Retrieval-Augmented Generation (RAG)-enhanced LLM systems, while powerful, introduce substantial inference costs due to the inclusion of an…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

A New Framework for Cybersecurity Refusals in AI Agents

Agentic scaffolds have dramatically improved LLM performance on complex, long-horizon tasks, yielding both broad benefits and amplified ris…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Target Updates May Stabilize Linear Q-Learning: Periodic and Soft Dynamics

Periodic target updates in Q-learning and soft target updates in actor-critic methods are empirically well established stabilization mechan…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

The Ringelmann Effect in Multi-Agent LLM Systems: A Scaling Law for Effective Team Size

Inference-time multi-agent LLM scaling lacks a shared unit: counting nominal agents conflates cost with independent evidence. We derive a t…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

CL-DMDF: Dynamic Multimodal Data Fusion Model Based on Contrastive Learning

Multimodal data fusion involves integrating and analyzing information from multiple modalities to uncover latent correlations and complemen…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Learning to Refine: Spectral-Decoupled Iterative Refinement Framework for Precipitation Nowcasting

Accurate precipitation nowcasting is vital for disaster mitigation, but deep learning methods face a key trade-off: regression models produ…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Improvise, Adapt, Overcome: An On-The-Fly Multifidelity Algorithm for Efficient Machine Learning

Machine learning has accelerated quantum chemistry but is hindered by the prohibitive cost of generating high fidelity training data. Multi…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

AdaWeather: Adaptively Mixing Probabilistic Weather Forecasts with Logarithmic Regret

Recent advances in machine learning have produced probabilistic weather forecasting models comparable to state-of-the-art numerical weather…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Anomalies in Multivariate Time Series Benchmarks Are Mostly Univariate

Many recent multivariate time series anomaly detection (MT-SAD) models incorporate cross-channel modeling, under the implicit assumption th…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Aligning Data-Driven Predictors with Allocation: A Decision-Focused Approach to Survival Analysis

Machine learning predictors have become essential tools for guiding automated decision making. However, a major misalignment persists: pred…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Filter, Then Reweight: Rethinking Optimization Granularity in On-Policy Distillation

On-Policy distillation (OPD) in large language models is shifting from full-trace KL supervision toward more selective training paradigms.…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成

AVTrack: Audio-Visual Tracking in Human-centric Complex Scenes

Audio-visual speaker tracking aims to localize and track active speakers by leveraging auditory and visual cues, enabling fine-grained, hum…

2026-06-03 13:00 JSTarXiv cs.AIロボティクス

See Less, Specify More: Visual Evidence Budgets for Generalizable VLAs

Generalization remains a central bottleneck for vision-language-action (VLA) models: under distractors, appearance shifts, and semantically…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Attention Calibration for Position-Fair Dense Information Retrieval

Dense retrieval models exhibit positional bias: retrieval effectiveness degrades when relevant information appears later in a passage (Zeng…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

EntangleCodec: A Unified Discrete Audio Tokenizer via Semantic-Acoustic Entanglement

Audio tokenizers serve as the discrete interface between continuous audio and Audio Language Models (ALMs), but existing tokenizers often s…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成研究/論文

Plan2Map: A Multimodal Benchmark for Document-Grounded Geospatial Boundary Reconstruction from Planning Records

Planning records define restrictions over geographic areas, but their source documents often provide only indirect spatial evidence rather…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成エージェント

MetaWorld: Scaling Multi-Agent Video World Model from Single-view Video Data

Video world models are a foundational generative technology for embodied AI and the Metaverse, yet existing approaches are inherently limit…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達

Acceptance-Test-Driven Evaluation Protocols for Business-Centric LLM Systems

Large language model (LLM) applications are increasingly expected to satisfy deterministic institutional requirements while relying on prob…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Representational Capacity: Geometric Limits on Feature Representation in Transformer Language Models

Model dimension ($d_{model}$) is a fundamental hyperparameter in transformer language models, yet its role in setting the geometric limits…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

CRAM-ER: Error-Resilient Spintronic Computational Random Access Memory for Scalable In-Memory Computation

Deep neural networks (DNNs) have achieved state-of-the-art performance across diverse domains. However, typical Von Neumann compute paradig…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成ロボティクス

Cosmos 3: Omnimodal World Models for Physical AI

We introduce Cosmos 3, a family of omnimodal world models designed to jointly process and generate language, image, video, audio, and actio…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Do Neural Retrievers Prefer Certain Documents? Evidence of Learned Relevance Priors

Neural retrievers are trained to estimate query-document relevance from annotated query-document pairs. Yet annotation protocols may not pu…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Which Defense Closes Which Threat? Attributing OWASP-LLM-Top-10 Coverage and Its Brittleness Under Paraphrasing

Production LLM applications stack several defense families -- refusal-phrase filters, token-budget controls, model allowlists, rate limits,…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Large Byte Model: Teaching Language Models About Compiled Code

Malware analysis starts with the raw bytes of an executable program, and tools to "lift" these to higher-level representations, such as ass…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Fixing FOLIO and MALLS: Verified Annotations and an LLM-assisted Framework to Focus Human Relabeling

Accurate translation from Natural Language to First-Order Logic (NL-to-FOL) underpins neurosymbolic AI systems and Natural Language Inferen…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

GRZO: Group-Relative Zeroth-Order Optimization for Large Language Model Fine-Tuning

Zeroth-order (ZO) optimization is a memory-efficient alternative to backpropagation for fine-tuning large language models, but its deployme…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Economy of Minds: Emerging Multi-Agent Intelligence with Economic Interactions

How can a population of agents self-orchestrate and self-adapt into stronger collective intelligence without centralized control? Inspired…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Forgetting is Not Erasure: Recovering Latent Knowledge via Transport Keys

Catastrophic forgetting is often framed as a representational problem: after sequential training, a model appears to lose the features that…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

The Epi-LLM Framework: probing LLM behavioral priors through epidemiological agent-based models

Human behaviour during epidemics affects infectious disease dynamics, but quantifying this remains deeply challenging. Here we introduce th…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Adaptive Latent Agentic Reasoning

Large reasoning models improve performance by generating extended chain-of-thought (CoT) reasoning, but this behavior becomes inefficient w…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

LLM-Assisted Reranking to Operationalize Nuanced Objectives in Recommender Systems

Recommender systems have grown from content-organization tools into sophisticated systems that shape daily behavior. By controlling what we…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Are we really tilting? The mechanics of reward guidance in flow and diffusion models

Reward guidance algorithms steer a learned generative process toward the reward-tilted measure at inference time. While empirically powerfu…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Scalable Uncertainty Quantification for Extreme Weather Forecasting via Empirical Neural Tangent Kernels

Deep learning weather models now match numerical weather prediction accuracy while running orders of magnitude faster, but produce determin…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Linear Probes Detect Task Format, Not Reasoning Mode in Language Model Hidden States

Linear probing of large language model (LLM) hidden states is widely used to claim that models learn distinct representations for different…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

WRIT: Write-Read Intensive Trajectory Synthesis for Multi-Turn User-Facing Agents

Multi-turn user-facing agents must infer user intent from incomplete requests, collect missing information through dialogue and tools, and…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成エージェントロボティクスビジネス/資金調達

SCOPE: Real-Time Natural Language Camera Agent at the Edge

Deploying language-driven agents in robotics requires evaluations that reflect real-world task demands: natural-language instructions with…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Fast-dLLM++: Fréchet Profile Decoding for Faster Diffusion LLM Inference

Diffusion large language models promise parallel token generation, yet inference remains bottlenecked by deciding which masked tokens can b…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Echelon: Auditable Aggregate-Only Language-Model Adaptation Across Privacy Boundaries

Cross-organization language-model adaptation increasingly faces hard governance constraints: in many deployments, device-level model state-…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成

Hand Trajectory Fusion for Egocentric Natural Language Query Grounding

Egocentric Natural Language Query (NLQ) grounding asks a model to localize, in a long first-person video, the temporal interval that answer…

2026-06-03 13:00 JSTarXiv cs.AIエージェント

Glass Box at Orbit: A Constitutional AI Verification Framework for Trustworthy Autonomous CubeSat Intelligence

The space industry is quietly building toward something nobody has fully reckoned with: orbital data centers running thousands of autonomou…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成エージェントロボティクス

Towards Compact Autonomous Driving Perception with Balanced Learning and Multi-sensor Fusion

We present a novel compact deep multi-task learning model to handle various autonomous driving perception tasks in one forward pass. The mo…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Pretraining Language Models on Historical Text

We introduce TypewriterLM, a 7.24B History language model (LM) trained exclusively on English text predating 1913. Developing History LMs r…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Patcher: Post-Hoc Patching of Backdoored Large Language Models

Large language models remain vulnerable to jailbreak backdoor attacks, where adversaries poison safety alignment data to embed hidden trigg…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

How Quantization Changes Interpretable Features: A Sparse Autoencoder Analysis of Language Models

Quantization is a standard path to deploying large language models, and a quantized model is typically judged acceptable when its perplexit…

2026-06-03 13:00 JSTarXiv cs.AIロボティクス

Exact equivariance, kept through training, buys zero-shot generalisation across the symmetry group

A latent world model built from an equivariant encoder $E$ and an equivariant predictor $f$ inherits a provable symmetry of its training lo…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成エージェント

MUSE: A Unified Agentic Harness for MLLMs

Despite rapid progress, multimodal large language models (MLLMs) still fail on tasks that humans solve effortlessly, such as navigating a g…

2026-06-03 13:00 JSTarXiv cs.AIロボティクス

ConTraIRL: Factorized Contrastive Abstractions for Transferable IRL

Reward transfer in Inverse Reinforcement Learning (IRL) is unreliable when policies must generalize to unseen combinations of environment d…

2026-06-03 13:00 JSTarXiv cs.AI規制/政策

Reproducibility is the New Copyleft: Defining AGI-oriented Reproducible Builds

Copyleft, as implemented in licenses such as the GNU General Public License, was a legal hack that used copyright to guarantee user freedom…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Hallucinations as Orthogonal Noise: Inference-Time Manifold Alignment via Dynamic Contextual Orthogonalization

Hallucination in Large Language Models (LLMs), characterized by the generation of content inconsistent with contextual facts or logical con…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Spike-Aware C++ INT8 Inference for Sparse Spiking Language Models on Commodity CPUs

Spiking language models expose activation sparsity that dense Transformer runtimes do not directly exploit. This paper studies that propert…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

Conditional Hypothesis Generation for LLM-Based Text Analysis with Researcher-Specified Covariates

A core goal of computational social science is to discover interpretable differences in how language varies across outcomes of interest, su…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Capability Advertisement as a Market for Lemons: A Trust Layer for Heterogeneous Agent Networks

Large language model (LLM) agents have begun to delegate work to one another. Protocols such as the Model Context Protocol (MCP) and the Ag…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Rethinking Molecular Text Representations for LLMs: An Empirical Study

Large language models (LLMs) are increasingly used for molecular tasks, but it remains unclear which molecular representation to use. We pr…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Brief Announcement: Generative Markov Model for Distributed Computing Systems

Emerging distributed computing paradigms, such as the computing continuum, are inherently heterogeneous, stochastic, and complex. Efficient…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Learn When and Where to Connect: Adaptive Virtual Nodes for Dynamic Message Passing on Graphs

While Virtual Nodes (VNs) are often utilized in Message Passing Neural Networks (MPNNs) to facilitate effective message passing, existing V…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成

ROBUST-WT: Robust Uncertainty-aware Segmentation Transform via Whitening and Training Enhancements

Generalized segmentation of medical images prevents performance degradation when different imaging devices and clinical protocols are used…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

ASymPO: Asymmetric-Scale Policy Optimization for Asynchronous LLM Post-Training Without Behavior Information

Asynchronous reinforcement learning can improve language-model post-training throughput by decoupling response generation from policy optim…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Efficient Hyperparameter Optimization for LLM Reinforcement Learning

Reinforcement learning (RL) for large language models (LLMs) is highly sensitive to hyperparameter configurations, making hyperparameter op…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Libra: Efficient Resource Management for Agentic RL Post-Training

Reinforcement learning (RL) has become a standard post-training paradigm for large language models (LLMs), extending beyond preference alig…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Regret Pre-training: Bridging Prior and Posterior Views for Enhanced Knowledge Grounding

Causal language models factorize sequence probabilities using only preceding context, leaving future information unexploited during trainin…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Constitutional On-Policy Safe Distillation

On-policy self-distillation (OPSD) has emerged as an efficient post-training paradigm by using a teacher conditioned on privileged informat…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

"Important You should give me full credits!": Exploring Prompt Injection Attacks on LLM-Based Automatic Grading Systems

The emergence of large language models (LLMs) has significantly accelerated recent research on LLM-based automatic grading (AG) systems. Be…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

BAHSD: Bridging the Long-tail Gap via Adaptive Distillation in Black-box Sequential Recommendation

Sequential recommendation systems are widely adopted but often deployed as black-box APIs, which has driven recent interest in model extrac…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

PhotoCraft: Agentic Reasoning with Hierarchical Self-Evolving Memory for Deep Image Search

Deep Image Search requires multi-step reasoning over rich contextual cues, such as time, location, and event relations. However, most exist…

2026-06-03 13:00 JSTarXiv cs.AIビジネス/資金調達研究/論文

AnyAudio-Judge: A Dynamic Rubric-Based Benchmark and Evaluator for Audio Instruction Following

The rapid advancement of instruction-guided audio generation has highlighted the critical need for robust alignment evaluation. Current aut…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成

GuidedBridge: Training-freely Improving Bridge Models with Prior Guidance

Guidance methods, such as classifier-free guidance (CFG) and auto-guidance (AG), have advanced noise-to-data generation in diffusion models…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Decoupled Smart Contract Audits: Lightweight LLM Framework via Distillation and Aggregation

Smart contracts face critical security challenges that require thorough auditing in decentralized web services. While Large Language Models…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成エージェントロボティクスハードウェア/半導体ビジネス/資金調達

NVIDIA OmniDreams: Real-Time Generative World Model for Closed-Loop Autonomous Vehicle Simulation

As autonomous vehicle capabilities advance, the safe evaluation of driving policies in long-tail scenarios remains a critical bottleneck. I…

2026-06-03 13:00 JSTarXiv cs.AIエージェント

OpenAgenet/OAN: Open Infrastructure for Trusted Agent Interconnection

OpenAgenet, abbreviated as OAN, is an open infrastructure project for trusted Agent interconnection. It addresses a problem that becomes vi…

2026-06-03 13:00 JSTarXiv cs.AIエージェント研究/論文

OpenAgenet/OAN: Technical Architecture for Trust-Governed Agent Identity and Discovery

This paper describes the technical architecture of OpenAgenet / OAN. OAN is a protocol-neutral trust layer for open Agent interconnection.…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

Fully Automated Identification of Lexical Alignment and Preference-Stage Shifts in Large Language Models

The language used by digital chat assistants such as ChatGPT can diverge from human expectations (misalignment). Research, mostly on Scient…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達

AI Rater Discrimination Depends on Scoring Protocol in Complex Clinical Decision-Making

Clinical AI evaluation increasingly delegates scoring to large language models (LLMs) acting as AI raters, yet their scoring behavior acros…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成

Reinforcement Learning from Cross-domain Videos with Video Prediction Model

Reinforcement learning from expert videos across visually distinct domains is challenging due to the absence of reward signals and the pres…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達研究/論文

WebRISE: Requirement-Induced State Evaluation for MLLM-Generated Web Artifacts

Existing benchmarks for MLLM-generated web artifacts assess interaction through local evidence and miss the requirement-induced states and…

2026-06-03 13:00 JSTarXiv cs.AIロボティクス

BotDirector: Robot Storytelling Across the Symmetrical Reality with Multi-modal Interactions

Robot storytelling offers a unique blend of technological innovation and creative expression that engages children in unprecedented ways. H…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

GFFMERGE: Efficient Merging of Graph Neural Force Fields and Beyond

Graph Neural Networks (GNNs) have revolutionized Neural Force Fields for atomistic simulations, achieving near-quantum accuracy at reduced…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

When RLHF Fails: A Mechanistic Taxonomy of Reward Hacking, Collapse, and Evaluator Gaming

Reinforcement learning from human feedback (RLHF) makes large-scale post-training possible by replacing an underspecified human objective w…

2026-06-03 13:00 JSTarXiv cs.AIロボティクス

AirDreamer: Generalist Drone Navigation with World Models

Navigating a drone in unseen and cluttered environments requires reliable generalization to unseen scene layouts and understanding of envir…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

PSViT: A Methodology for Structurally Pruning Spiking Vision Transformers

Spiking Vision Transformer (SViT) models are promising low-power ViT models for solving vision-based tasks with state-of-the-art performanc…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

EqGINO: Equivariant Geometry-Informed Fourier Neural Operators for 3D PDEs

Deep learning surrogates for 3D Partial Differential Equations (PDEs) often fail to generalize across geometric transformations because the…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Are Common Substructures Transferable? Riemannian Graph Foundation Model with Neural Vector Bundles

Foundation models have sparked a revolution via a pretraining-adaptation paradigm, with recent efforts extending this success to graphs. Un…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成エージェント研究/論文

VistaHop: Benchmarking Multi-hop Visual Reasoning for Visual DeepSearch

Visual DeepSearch requires multimodal large reasoning model (MLRM) agents to answer complex visual queries by repeatedly inspecting image r…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

AI-Generated Traces for Novice Programmers: Learning Effects and Learner Differences in a Multi-Institutional Study

Introductory programming (CS1) courses often struggle to support students' understanding of program execution. While visualizations can mak…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

Message Tuning Outshines Graph Prompt Tuning: A Prismatic Space Perspective

Graph Foundation Models (GFMs), built upon the Pre-training and Adaptation paradigm, have emerged as a research hotspot in graph learning.…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Generalizing Graph Foundation Models via Hyperbolic Retrieval-Augmented Generation

Graph foundation models (GFMs) emerged as a dominant paradigm in graph representation learning by leveraging large-scale pre-training for c…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Learning Multi-Scale Hypergraph for High-Order Brain Connectivity Analysis

Understanding complex interactions between brain regions is critical for early neurodegenerative disease classification such as Alzheimer's…

2026-06-03 13:00 JSTarXiv cs.AIロボティクス

RobotValues: Evaluating Household Robots When Human Values Conflict

While household robots are often evaluated based on task completion, everyday domestic environments involve value-conflicting situations in…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Multi-Modal Graph Neural Network with Transformer-Guided Adaptive Diffusion for Preclinical Alzheimer Classification

The graphical representation of the brain offers critical insights into diagnosing and prognosing neurodegenerative disease via relationshi…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

dstack-capsule: Pod-Level Remote Attestation for Confidential Workloads on Kubernetes

The rise of LLM-as-a-Service and other confidential cloud workloads demands cryptographic proof that user data is processed in a trusted, u…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Calibration Data Trade-offs Across Capability Dimensions: Why Multi-Source Mixing Matters for High-Sparsity LLM Pruning

Post-training pruning compresses large language models to high sparsity using a small unlabelled calibration set, and recent work has concl…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

FLIPS: Instance-Fingerprinting for LLMs via Pseudo-random Sequences

Literature reveals that a Large Language Model's (LLM) behavior is not only conditioned by its original weights but also its instance-level…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Evaluating LLMs' Effectiveness on Real-World Consumer Device Repair Questions

Consumer device repair is an important but underexplored testbed for large language models (LLMs). Repair tasks require reasoning over inco…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

AugMask: Training Diffusion Models on Incomplete Tabular Data via Stochastic Augmentation and Masking

Score-based diffusion models have emerged as prominent deep generative models; however, their application to tabular data remains challengi…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成研究/論文

SynCred-Bench: Benchmarking Synthetic Credibility in AI-Generated Visual Misinformation

Recent generative models can now produce visual artifacts with realistic embedded text and layouts, creating a new misinformation threat: s…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIハードウェア/半導体研究/論文

The Unsampled Truth: Psychometrics in SLMs Measure Prompt Artifacts, Not Psychological Constructs

When prompting SLMs for psychometric assessments, researchers assume the outputs reflect semantic reasoning. We evaluate this premise acros…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成研究/論文

P²-DPO: Grounding Hallucination in Perceptual Processing via Calibration Direct Preference Optimization

Hallucination has recently garnered significant research attention in Large Vision-Language Models (LVLMs). Direct Preference Optimization…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

AI Model Extraction Attacks: Bypassing Single-Client Assumptions in Defenses

Ensuring the protection of Artificial Intelligence (AI) models deployed in military Command and Control (C2) systems and critical infrastru…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Local Guidance, Global Impact: Gaussian-Reshaped Trust Region Unlocks Behavior Transitions

While Proximal Policy Optimization (PPO) demonstrates strong performance in stationary settings, we show that its standard optimization par…

2026-06-03 13:00 JSTarXiv cs.AIロボティクス

Grasp-Then-Plan with Failure Attribution: A Closed Two-Stage Framework for Precise and Generalizable Robotic Manipulation

In robotic manipulation, the tight coupling between grasping and motion planning often obscures the true source of failure, leading to inef…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

When Model Merging Breaks Routing: Training-Free Calibration for MoE

Model merging has emerged as a cost-effective approach for consolidating the capabilities of multiple LLMs without retraining. However, exi…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Causal Evidence of Stack Representations in Modeling Counter Languages Using Transformers

Formal languages have proven to be effective conduits to understand the inner mechanisms of transformers. Past work has shown that transfor…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Optimizing Explicit Unit-Distance Lower-Bound Certificates

The 2026 disproof of Erdős's unit-distance conjecture and Sawin's subsequent explicit quantitative refinement show that the maximum num…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

PrimeSVT: An Automated Memory-aware Pruning Framework with Prioritized Compression Policy for Spiking Vision Transformers

The large sizes of Spiking Vision Transformers (SViTs) still hinder their embedded implementation, highlighting the need for model compress…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

FlowGuard: Flow Matching for Identity-Independent Detection of Data-Free Model Stealing Attacks on Energy System Intrusion Detection Systems

Artificial Intelligence (AI)-based Intrusion Detection Systems (IDS) deployed in energy infrastructure are vulnerable to model theft attack…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

A Hybrid Approach For Malware Classification Using Secondary Features Fusion

The number of malware (either variant or novel) is rapidly increasing, making malware detection and mitigation a complex problem. One appro…

2026-06-03 13:00 JST | arXiv cs.AI | Image/Video Generation

PRISM: Synergizing Vision Foundation Models via Self-organized Expert Specialization

Unifying the complementary strengths of diverse Vision Foundation Models (VFMs) into a single efficient model is highly desirable but chall…

2026-06-03 13:00 JST | arXiv cs.AI | Agents, Research/Papers

FORGE: Multi-Agent Graduated Exploitation and Detection Engineering

Vulnerability disclosure volumes now far exceed organizational assessment capacity, yet three adjacent research communities (proof-of-conce…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

Tonal parsimony in chord-sequence analysis: combining modulation cost and tonal vocabulary

We study the assignment of local tonalities to chord sequences, a task useful for harmonic analysis, composition, and jazz-oriented improvi…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

Rethinking the Role of Tensor Decompositions in Post-Training LLM Compression

Post-training compression is essential for deploying large language models (LLMs) under tight resource constraints. Tensor decompositions h…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

Analyzing Stream Collapse in Hyper-Connections: From Diagnosis to Mitigation

Hyper-Connections (HC) replace the single Transformer residual stream with multiple streams, introducing a permutation symmetry over stream…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

NeuroArmor: Safe-Variant-Guided Representation Consistency for Selective Re-Anchoring in Jailbreak Defense

Large language models remain vulnerable to jailbreak attacks that hide harmful intent behind seemingly ordinary requests such as role-play,…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

Learn from Your Mistakes: Tree-like Self-Play for Secure Code LLMs

While Large Language Models (LLMs) excel in code generation, they remain prone to replicating subtle yet critical vulnerabilities endemic t…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

BaltiVoice: A Speech Corpus and Fine-tuned Whisper ASR System for the Balti Language

We present BaltiVoice, a 16.8-hour read-speech corpus for Balti (ISO 639-3: bft), a Tibetic language spoken in Gilgit-Baltistan, Pakistan,…

2026-06-03 13:00 JST | arXiv cs.AI | Agents, Robotics

SPADE: Sketch-guided Path Planning Augmented with Diffusion Experts

Path planning is essential for Autonomous Mobile Robots (AMRs). Conventional methods for incorporating human preferences into planning typi…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

Scalable On-Hardware Training of Quantum Neural Networks and Application to Clinical Data Imputation

Training quantum neural networks (QNNs) on quantum hardware is currently bottlenecked by the cost of gradient estimation: standard paramete…

2026-06-03 13:00 JST | arXiv cs.AI | Agents

Post-Hoc Robustness for Model-Based Reinforcement Learning

To improve the real-world applicability of reinforcement learning (RL), the field of adversarially robust RL studies how to train agents un…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

High-Precision APT Malware Attribution with Out-of-Scope Resilience

Early attribution of Advanced Persistent Threat (APT) activity can help defenders prioritise investigation, select countermeasures, and red…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

When Should the Teacher Move? Temporal Coupling and Stability in Self On-Policy Distillation

Self on-policy distillation trains a student policy against a teacher derived from its own parameter history, yet the teacher's update sche…

2026-06-03 13:00 JST | arXiv cs.AI | Image/Video Generation

CR-Seg: Attention-Guided and CoT-Enhanced Coarse-to-Refined Reasoning Segmentation

Reasoning segmentation aims to segment target objects described by complex language through joint visual-textual reasoning. Existing method…

2026-06-03 13:00 JST | arXiv cs.AI | Image/Video Generation

Efficient Transformer-Based Localized Patch Sampling for Choroid Plexus Segmentation in Multiple Sclerosis

Background: The lateral ventricle choroid plexus (LVCP) is gaining recognition as a key imaging biomarker for multiple sclerosis (MS) relat…

2026-06-03 13:00 JST | arXiv cs.AI | Image/Video Generation, Robotics

Learned Non-Maximum Suppression for 3D Object Detection

Post-processing is a critical stage in LiDAR-based 3D object detection, where dense and overlapping proposals must be filtered for compact…

2026-06-03 13:00 JST | arXiv cs.AI | Image/Video Generation

When Attention Collapses: Stage-Aware Visual Token Pruning from Structure to Semantics

Vision-Language Models (VLMs) have demonstrated remarkable capabilities but suffer from significant computational overhead during inference…

2026-06-03 13:00 JST | arXiv cs.AI | Image/Video Generation, Robotics

PHASER: Phase-Aware and Semantic Experience Replay for Vision-Language-Action Models

Vision-Language-Action (VLA) models have achieved remarkable success in language-conditioned robotic manipulation. However, deploying these…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors

DDOR: Delta Debugging for Explainable Overrefusal Testing and Repair

While safety alignment and guardrails help large language models (LLMs) avoid harmful outputs, they can also induce overrefusal, i.e., unwa…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

CauTion: Knowing When to Trust LLMs for Ensemble Causal Discovery

Causal discovery from observational data remains challenging due to the fundamental limitations of purely statistical methods, such as stat…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Testing LLM Arithmetic Reasoning Generalization with Automatic Numeric-Remapping Attacks

Large language models achieve strong performance on arithmetic reasoning benchmarks, and one common response to arithmetic brittleness is t…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

Exploiting Verification-Generation Gap: Test-Time Reinforcement Learning with Confidence-Conditioned Verification

Test-time reinforcement learning has emerged as a promising paradigm for enhancing the complex reasoning abilities of large language models…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

Physics-Guided Policy Optimization with Self-Distillation

Self-distilled policy optimization (SDPO) has become a popular paradigm for LLM post-training, where a model learns from its own prediction…

2026-06-03 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

TurtleAI: Benchmarking Multimodal Models for Visual Programming in Turtle Graphics

Vision-language models (VLMs) have been explored for visual programming, where they generate code to solve visual tasks. However, most prio…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

Building Reliable Long-Form Generation via Hallucination Rejection Sampling

Large language models (LLMs) have achieved remarkable progress in open-ended text generation, yet they remain prone to hallucinating incorr…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

AnchorMoE: Interpretable Time Series Classification via Anchor-Routed MoE

Multivariate time series classification (MTSC) is pivotal in high-stakes domains, such as clinical diagnosis and industrial fault detection…

2026-06-03 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

VidMsg: A Benchmark for Implicit Message Inference in Short Videos

Understanding short online videos involves more than identifying visible objects and actions; video makers often include an underlying mess…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

The Shape of Addition: Geometric Structures of Arithmetic in Large Language Models

Large Language Models exhibit paradoxical fragility in fundamental arithmetic, implying a disconnect between internal computation and discr…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

Black-box, Adaptive, Efficient, Transferable, Harmful, Applicable... Attacks Are All You Need to Break LLMs

Accurately evaluating adversarial robustness is a longstanding challenge. A flawed attack design can inflate robustness estimates, making d…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

Safety Measurements for Fine-tuned LLMs Should be Grounded in Capability

Adapting foundation large language models to a user's task or preferred style through fine-tuning can result in compromising the model's sa…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

CoEval: Ranking Language Models for Custom Tasks Without Labeled Data or Trustworthy Benchmarks

Choosing or ranking language models for a specific application is hardest when no task-specific labeled data exists, and standard public be…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

AUGUSTE: Online-Learning dApp for Predictive URLLC Scheduling

Ultra Reliable and Low Latency Communications (URLLC) was one of the main motivations behind 5G, with 3GPP advertising 1-10 ms latency targ…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

A Close Look At World Model Recovery In Supervised Fine-Tuned LLM Planners

Supervised fine-tuning (SFT) improves end-to-end classical planning in large language models (LLMs), but do these models also learn to repr…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

Staying Alive: Uncensored Survival Analysis with Tabular Foundation Models

Survival Analysis (SA) is a statistical framework that models the time span until some event of interest occurs. Widely used in several dom…

2026-06-03 13:00 JST | arXiv cs.AI | Image/Video Generation

Qwen-Image-Flash: Beyond Objective Design

Few-step distillation has become an effective strategy for accelerating advanced visual generative models, yet prior work has largely focus…

2026-06-03 13:00 JST | arXiv cs.AI | Image/Video Generation

Ultralytics YOLO26: Unified Real-Time End-to-End Vision Models

Real-time vision demands models that are accurate, efficient, and simple to deploy across diverse hardware. The YOLO family has become wide…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Tool-Aware Optimization with Entropy Guidance for Efficient Agentic Reinforcement Learning

Agentic reinforcement learning (RL) equips large language models (LLMs) with tool-use capabilities that substantially improve reasoning on…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

Merit or networks? What decides where research is published

Does scientific publishing reward the quality of ideas or the advantage of connections? The question is universal to prestige-driven scienc…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

E2LLM: Towards Efficient LLM Serving in Heterogeneous Edge/Fog Environments

Large Language Models (LLMs) have become integral to modern applications, yet their deployment remains challenging. Beyond executing the mo…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

Signed Spiking Neuron Enabled by an Orthogonal-Easy-Axis Magnetic Tunnel Junction

Signed spiking neurons carry richer information than standard spiking neurons. This work proposes a compact magnetic tunnel junction (MTJ)-…

2026-06-03 13:00 JST | arXiv cs.AI | Agents

Trading Human Curation for Synthetic Augmentation in RLVR

The supply of high-quality training tasks is a central bottleneck for reinforcement learning from verifiable rewards (RLVR) on agentic lang…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

LiveBand: Live Accompaniment Generation in the Audio Domain

We present LiveBand, a real-time system that generates high-fidelity music accompaniments to live audio input, respecting strict causal con…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

PURGE: Projected Unlearning via Retain-Guided Erasure

We propose PURGE, a machine unlearning algorithm built on a simple but under-exploited observation: continual learning (CL) and machine…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors

Consistency Training Can Entrench Misalignment

Consistency training encourages a model to produce similar outputs across related inputs or sampling procedures. Such methods are simple, s…

2026-06-03 13:00 JST | arXiv cs.AI | Agents

AI Agents Enable Adaptive Computer Worms

A computer worm is malware that spreads on a network by replicating itself from one machine to another. Traditional worms, like WannaCry, e…

2026-06-03 13:00 JST | arXiv cs.AI | Image/Video Generation

Conditional Latent Diffusion Model with Fourier-based Motion Modelling for Virtual Population Synthesis

In-silico trials of medical devices require the generation of virtual populations of anatomies. In cardiovascular applications, virtual ana…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

Re-Evaluating Continual Learning with Few-Shot Adaptation

Continual learning methods aim to maximize the stability and plasticity of machine learning models that are trained on a sequence of tasks.…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

Clustered Self-Assessment: A Simple yet Effective Method for Uncertainty Quantification in Large Language Models

Large language models (LLMs) demonstrate remarkable performance across diverse tasks, but they often generate responses that appear plausib…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

FLARE: Fine-Grained Diagnostic Feedback for LLM Code Refinement

Large language models often generate code with bugs. Existing methods rely on feedback signals such as test failures and self-critiques to…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

Taiji: Pareto Optimal Policy Optimization with Semantics-IDs Trade-off for Industrial LLM-Enhanced Recommendation

Scaling recommender systems via large language models (LLMs) has become a prominent trend in the industry. However, aligning the LLM's sema…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

A Training-Free Mixture-of-Agents Framework for Multi-Document Summarization using LLMs and Knowledge Graphs

Multi-Document Summarization (MDS) plays a critical role in distilling essential information from collections of textual data. Existing app…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

From 'What' to 'How' and 'Why': Sharing LLM-Generated Retrospective Summaries of Older Adults' Passive Tracking Data with Remote Family Members

With the growing prevalence of modern ubiquitous computing technologies, multi-modal tracking systems hold promise for providing timely awa…

2026-06-03 13:00 JST | arXiv cs.AI | Image/Video Generation

Beyond Encoder Accumulation: Measuring Encoder Roles in Multi-Encoder VLMs

As foundation models scale toward fusing more heterogeneous visual streams, understanding how diverse encoders interact under joint trainin…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

Synthesize and Reward -- Reinforcement Learning for Multi-Step Tool Use in Live Environments

Training LLMs to orchestrate multi-step tool calls is held back by three coupled obstacles: realistic stateful execution environments are c…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Agent libOS: A Library-OS-Inspired Runtime for Long-Running, Capability-Controlled LLM Agents

Large language model (LLM) agents are evolving from request-response assistants into long-running software actors: they maintain state acro…

2026-06-03 13:00 JST | arXiv cs.AI | Agents

The Impact of Configuring Agentic AI Coding Tools on Build-vs-Buy Decisions: A Study Protocol

Agentic AI coding tools write code with increasing autonomy and in doing so decide when to import a library and when to implement functiona…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

NetKV: Network-Aware Decode Instance Selection for Disaggregated LLM Inference

Disaggregated LLM inference forces the KV cache to traverse the datacenter network before decoding begins, so transfer time enters directly…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

FFR: Forward-Forward Learning for Regression

The Forward-Forward (FF) algorithm offers a computationally efficient and biologically plausible alternative to backpropagation (BP) by tra…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

q0: Primitives for Hyper-Epoch Pretraining

Multi-epoch training is becoming the standard now that compute is growing faster than the supply of high-quality text. But pretraining a si…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

FlashbackCL: Mitigating Temporal Forgetting in Federated Learning

Federated Learning (FL) of foundation and edge models increasingly targets deployments where client data distributions drift over time, yet…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

Efficient ASR Training with Conversations that Never Happened

Conversational ASR for lower-resource languages and niche domains is limited by the scarcity of domain-matched multi-speaker training data.…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

Using Reward Uncertainty to Induce Diverse Behaviour in Reinforcement Learning

Classical reinforcement learning (RL) typically seeks a deterministic policy that maximizes the expected sum of a scalar reward. Yet, moder…

2026-06-03 13:00 JST | arXiv cs.AI | Agents, Robotics

Self-Refining Agentic Reinforcement Learning for Vision-Conditioned UAV Navigation

Deep reinforcement learning has shown strong potential for enabling autonomous robots to learn complex navigational tasks. However, its pra…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Agentic Chain-of-Thought Steering for Efficient and Controllable LLM Reasoning

Large language models improve final-answer accuracy through extended chain-of-thought reasoning, but often spend tokens inefficiently and o…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

AlignAtt4LLM: Fast AlignAtt for Decoder-Only LLMs at IWSLT 2026 Simultaneous Speech Translation Task

We describe AlignAtt4LLM, an IWSLT 2026 simultaneous speech translation system for English to German, Italian, and Chinese. The system is a…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

QUBRIC: Co-Designing Queries and Rubrics for RL Beyond Verifiable Rewards

Rubric-based RL is a promising route for extending reinforcement learning beyond verifiable rewards, yet existing methods optimize rubrics…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

Quantifying Faithful Confidence Expression in Large Reasoning Models

Reliable uncertainty communication is critical to the trustworthiness of LLMs, yet faithful calibration (FC)--the alignment between models'…

2026-06-03 13:00 JST | arXiv cs.AI | Image/Video Generation

Formalizing the Binding Problem

Representations of the world, arguably, contain information about features (e.g. something is blue, something is a circle) but also informa…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

Language Models Need Sleep: Learning to Self-Modify and Consolidate Memories

The past few decades have witnessed significant advances in the design of machine learning algorithms, from early studies on task-specific…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Robotics

Humanoid-GPT: Scaling Data and Structure for Zero-Shot Motion Tracking

We introduce Humanoid-GPT, a GPT-style Transformer with causal attention trained on a billion-scale motion corpus for whole-body control. U…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

Planning with Uncertainty: Symmetries, Policy Inference, and Solution Compression

Fully-observable non-deterministic (FOND) planning is at the core of artificial intelligence planning with uncertainty. It models uncertain…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

Approximating Probabilistic Inference in Statistical EL with Knowledge Graph Embeddings

Statistical information is ubiquitous but drawing valid conclusions from it is prohibitively hard. We explain how knowledge graph embedding…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

Leave it to the Specialist: Repair Sparse LLMs with Sparse Fine-Tuning via Sparsity Evolution

Sparse large language models (LLMs) offer an attractive direction toward efficient deployment, but adapting them to downstream tasks remain…

2026-06-03 13:00 JST | arXiv cs.AI | Agents, Robotics, Research/Papers

Assistax: A Multi-Agent Hardware-Accelerated Reinforcement Learning Benchmark for Assistive Robotics

The development of reinforcement learning (RL) algorithms has been largely driven by ambitious challenge tasks and benchmarks. Games have d…

2026-06-03 13:00 JST | arXiv cs.AI | Business/Funding

AlphaEval: A Comprehensive and Efficient Evaluation Framework for Formula Alpha Mining

Formula alpha mining, which generates predictive signals from financial data, is critical for quantitative investment. Although various alg…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Collab-REC: An LLM-based Agentic Framework for Balancing Recommendations in Tourism

We propose COLLAB-REC, a multi-agent framework designed to counteract popularity bias and improve diversity in tourism recommendations. In…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

DTKG: Dual-Track Knowledge Graph-Verified Reasoning Framework for Multi-Hop QA

Multi-hop reasoning for question answering (QA) plays a critical role in retrieval-augmented generation (RAG) for modern large language mod…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

RGMem: Renormalization Group-inspired Memory Evolution for Language Agents

Personalized and continuous interactions are critical for LLM-based conversational agents, yet finite context windows and static parametric…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

ProtocolBench: Which LLM MultiAgent Protocol to Choose?

As large-scale multi-agent systems evolve, the communication protocol layer has become a critical yet under-evaluated factor shaping perfor…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

Human-Like Goalkeeping in a Realistic Football Simulation: a Sample-Efficient Reinforcement Learning Approach

While several high profile video games have served as testbeds for Deep Reinforcement Learning (DRL), this technique has rarely been employ…

2026-06-03 13:00 JST | arXiv cs.AI | Agents

MemVerse: Multimodal Memory for Lifelong Learning Agents

Despite rapid progress in large-scale language and vision models, AI agents still suffer from a fundamental limitation: they cannot remembe…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

MIND: Multi-rationale INtegrated Discriminative Reasoning Framework for Multi-modal Large Models

Recently, multimodal large language models (MLLMs) have been widely applied to reasoning tasks. However, they suffer from limited multi-rat…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

FutureWeaver: Planning Test-Time Compute for Multi-Agent Systems with Modularized Collaboration

Scaling test-time computation has been shown to significantly improve large language model (LLM) performance without additional training. H…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

The Agent's First Day: Benchmarking Learning, Exploration, and Scheduling in the Workplace Scenarios

The rapid evolution of Multi-modal Large Language Models (MLLMs) has advanced workflow automation; however, existing research mainly target…

2026-06-03 13:00 JST | arXiv cs.AI | Agents

A Scoping Review of the Ethical Perspectives on Anthropomorphising Large Language Model-Based Conversational Agents

Anthropomorphisation -- the phenomenon whereby non-human entities are ascribed human-like qualities -- has become increasingly salient with…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

Strongly Polynomial Time Complexity of Policy Iteration for $L_\infty$ Robust MDPs

Markov decision processes (MDPs) are a fundamental model in sequential decision making. Robust MDPs (RMDPs) extend this framework by allowi…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Business/Funding

PieArena: Ranking and Profiling Language Agents in Realistic Negotiation Scenarios

We present an in-depth evaluation of LLMs' ability to negotiate, a central business task requiring strategic reasoning, theory of mind, and…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Who Deserves the Reward? SHARP: Shapley Credit-based Optimization for Multi-Agent System

Integrating Large Language Models (LLMs) with external tools via multi-agent systems offers a promising new paradigm for decomposing and so…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

When Should LLMs Be Less Specific? Selective Abstraction for Reliable Long-Form Text Generation

LLMs are widely used, yet they remain prone to factual errors that erode user trust and limit adoption in high-risk settings. One approach…

2026-06-03 13:00 JST | arXiv cs.AI | Agents, Research/Papers

Towards a Science of AI Agent Reliability

AI agents are increasingly deployed to execute important tasks. While rising accuracy scores on standard benchmarks suggest rapid progress,…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

VeRO: A Harness for Agents to Optimize Agents

An important emerging application of coding agents is agent harness optimization: the iterative improvement of a target agent by editing an…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

X-RAY: Mapping LLM Reasoning Capability via Formalized and Calibrated Probes

Large language models (LLMs) achieve promising performance, yet their ability to reason remains poorly understood. Existing evaluations lar…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

Evaluating Relational Reasoning in LLMs with REL

Relational reasoning is the ability to infer relations that jointly bind multiple entities, attributes, or variables. This ability is centr…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

Co-evolving Agent Architectures and Interpretable Reasoning for Automated Optimization

Automating operations research (OR) with large language models (LLMs) remains limited by hand-crafted reasoning--execution workflows. Compl…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

From Context to Skills: Can Language Models Learn from Context Skillfully?

Many real-world tasks require language models (LMs) to reason over complex contexts that exceed their parametric knowledge. This calls for…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

Efficient Temporal Datalog Materialisation for Composite Event Recognition

Several applications demand the timely detection of critical situations, such as threats to safety and transparency, over high-velocity str…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

AdapShot: Adaptive Many-Shot In-Context Learning with Semantic-Aware KV Cache Reuse

Many-Shot In-Context Learning (ICL) has emerged as a promising paradigm, leveraging extensive examples to unlock the reasoning potential of…

2026-06-03 13:00 JST | arXiv cs.AI | Agents, Business/Funding

Done, But Not Sure: Disentangling World Completion from Self-Termination in Embodied Agents

Standard embodied evaluations do not independently score whether an agent correctly commits to task completion at episode closure, a capaci…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

From Holo Pockets to Electron Density: GPT-style Drug Design with Density

Recent advances in generative modeling have enabled significant progress in structure-based drug design (SBDD). Existing methods typically…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

PnP-Corrector: A Universal Correction Framework for Coupled Spatiotemporal Forecasting

Coupled spatiotemporal forecasting is important for predicting the future evolution of multiple interacting dynamical systems, such as in c…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

Assessing and Mitigating Miscalibration in LLM-Based Social Science Measurement

Large language models (LLMs) are increasingly used in social science as scalable measurement tools for converting unstructured text into va…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

Automatic Layer Selection for Hallucination Detection

Recent studies on hallucination detection have shown that hallucination-related signals are more strongly encoded in intermediate layers th…

2026-06-03 13:00 JST | arXiv cs.AI | Agents

PEAM: Parametric Embodied Agent Memory through Contrastive Internalization of Experience in Minecraft

We present PEAM, a Parametric Embodied Agent Memory framework in Minecraft that transforms agent memory from inference-time retrieval into…

2026-06-03 13:00 JST | arXiv cs.AI | Agents, Research/Papers

A Matter of TASTE: Improving Coverage and Difficulty of Agent Benchmarks

As agent capabilities advance, existing benchmarks, such as τ²-Bench, are becoming increasingly saturated. Yet constructing new bench…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

Toward AI That Understands Self and Others: A World-Model Theory of Cognitive Diversity and Alignment

Modern societies possess more information than ever before, yet they do not converge toward a single shared understanding. The same events,…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

Typhoon: Towards an Effective Task-Specific Masking Strategy for Pre-trained Language Models

The choice of which tokens to mask is a central, under-examined design decision in masked language modeling (MLM). Standard pretrain…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

PINNfluence: Interpreting PINNs through Influence Functions

Physics-informed neural networks (PINNs) have emerged as a powerful deep learning approach for solving partial differential equations (PDEs…

2026-06-03 13:00 JST | arXiv cs.AI | Business/Funding

Building Trust in Black-box Optimization: A Comprehensive Framework for Explainability

Optimizing costly black-box functions within a constrained evaluation budget presents significant challenges in many real-world application…

2026-06-03 13:00 JST | arXiv cs.AI | Image/Video Generation

Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Model Enhancement

Vision-Language Models (VLMs) bring powerful understanding and reasoning capabilities to multimodal tasks. Meanwhile, the great need for ca…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

ASAP: Exploiting the Satisficing Generalization Edge in Neural Combinatorial Optimization

Deep Reinforcement Learning (DRL) has emerged as a promising approach for solving Combinatorial Optimization (CO) problems, such as the 3D…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

Greed is Good: A Unifying Perspective on Guided Generation

Training-free guided generation is a widely used and powerful technique that allows the end user to exert further control over the generati…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

Rex: A Family of Reversible Exponential (Stochastic) Runge-Kutta Solvers

Deep generative models based on neural differential equations have become state-of-the-art for many generation tasks. These models rely on…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Business/Funding, Research/Papers

WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation

Text-to-Image (T2I) models are capable of generating high-quality artistic creations and visual content. However, existing research and eva…

2026-06-03 13:00 JST | arXiv cs.AI | Agents, Robotics

Scaling Multi Agent Reinforcement Learning for Underwater Acoustic Tracking via Autonomous Vehicles

Autonomous vehicles (AVs) offer a cost-effective solution for scientific missions such as underwater tracking. Reinforcement learning (RL)…

2026-06-03 13:00 JST | arXiv cs.AI | Hardware/Semiconductors

FlashMLA-ETAP: Efficient Transpose Attention Pipeline for Accelerating MLA Inference on NVIDIA H20 GPUs

Efficient inference of Multi-Head Latent Attention (MLA) is challenged by deploying the DeepSeek-R1 671B model on a single Multi-GPU server…

2026-06-03 13:00 JST | arXiv cs.AI | Research/Papers

Do Explanations Increase the Risk of Decision Logic Leakage? Explanation-Guided Stealing of Graph Models

Graph Neural Networks (GNNs) have become essential tools for analyzing graph-structured data in domains such as drug discovery and financia…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching

Autoregressive Models (ARMs) have long dominated the landscape of Large Language Models. Recently, a new paradigm has emerged in the form o…

2026-06-03 13:00 JST | arXiv cs.AI | Agents

Curriculum-Adapted Robust Reinforcement Learning for UAV Deconfliction in Adversarial Environments

Autonomous unmanned aerial vehicles (UAVs) increasingly rely on reinforcement learning (RL) for navigation. However, global navigation sate…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI

Multiple Choice Learning of Low-Rank Adapters for Language Modeling

We propose LoRA-MCL, a training scheme that extends next-token prediction in language models with a method designed to decode diverse, plau…

2026-06-03 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Robotics, Research/Papers

CoMPAS3D: A Dataset and Benchmark for Interactive Motion

Socially interactive humanoid robots must engage with humans through their bodies, adapting in real time to a partner's movement, intent, a…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

UR$^2$: Unify RAG and Reasoning through Reinforcement Learning

Large Language Models (LLMs) have shown strong capabilities through two complementary paradigms: Retrieval-Augmented Generation (RAG) for k…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Non-Identical Diffusion Models in MIMO-OFDM Channel Generation

We propose a novel diffusion model, termed the non-identical diffusion model, and investigate its application to wireless orthogonal freque…

2026-06-03 13:00 JSTarXiv cs.AIエージェント

TalkPlayData 2: An Agentic Synthetic Data Pipeline for Multimodal Conversational Music Recommendation

We present TalkPlayData 2, a synthetic dataset for multimodal conversational music recommendation generated by an agentic data pipeline. In…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Wavelet Fourier Diffuser: Frequency-Aware Diffusion Model for Reinforcement Learning

Diffusion probability models have shown significant promise in offline reinforcement learning by directly modeling trajectory sequences. Ho…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Learning the Neighborhood: Contrast-Free Multimodal Self-Supervised Molecular Graph Pretraining

High-quality molecular representations are essential for property prediction and molecular design, yet large labeled datasets remain scarce…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

DeMuon: A Decentralized Muon for Matrix Optimization over Graphs

In this paper, we propose DeMuon, a method for decentralized matrix optimization over a given communication topology. DeMuon incorporates m…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

ReaLM: Residual Quantization Bridging Knowledge Graph Embeddings and Large Language Models

Large Language Models (LLMs) have recently emerged as a powerful paradigm for Knowledge Graph Completion (KGC), offering strong reasoning a…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Semantic knowledge guides innovation and drives cultural evolution

Cultural evolution allows ideas and technologies to accumulate across generations, reaching their most complex and open-ended form in human…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Generating the Modal Worker: A Cross-Model Audit of Race and Gender in LLM-Generated Personas Across 41 Occupations

As generative AI tools are increasingly used to portray people in professional roles, understanding their racial and gender representationa…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Automata-Conditioned Cooperative Multi-Agent Reinforcement Learning

We study learning multi-task, multi-agent policies for cooperative, temporal objectives, under centralized training, decentralized executio…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

A Robust and Explainable Transformer-Based Framework for Phishing Email Detection

Phishing and related cyber threats are becoming increasingly sophisticated, with email-based phishing remaining the most persistent attack…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成ビジネス/資金調達

PHASE: Physiology-Aware Hyperspectral Reconstruction via Object-to-Human Domain Adaptation

Although hyperspectral imaging offers unparalleled non-invasive physiological insight, its bulky hardware, slow acquisition, and regulatory…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Finding Kissing Numbers with Game-theoretic Reinforcement Learning

Since Isaac Newton first studied the Kissing Number Problem in 1694, determining the maximal number of non-overlapping spheres around a cen…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

SeSE: Black-Box Uncertainty Quantification for Large Language Models Based on Structural Information Theory

Reliable uncertainty quantification (UQ) is essential for deploying large language models (LLMs) in safety-critical scenarios, as it enable…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Identifying Quantum Structure in AI Language: Evidence for Evolutionary Convergence of Human and Artificial Cognition

We present the results of cognitive tests on conceptual combinations, performed using specific Large Language Models (LLMs) as test subject…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Distribution-Calibrated Inference Time Compute for Thinking LLM-as-a-Judge

Thinking Large Language Models (LLMs) used as judges for pairwise preferences remain noisy at the single-sample level, and common aggregati…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Toward Training Superintelligent Software Agents through Self-Play SWE-RL

While current software agents powered by large language models (LLMs) and agentic reinforcement learning (RL) can boost programmer producti…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成

Edge-Aware and Content-Adaptive Infrared Gas Leak Detection for Industrial Safety Monitoring

Infrared gas leak detection is important for industrial safety and environmental monitoring, but automatic detection remains challenging be…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Introduction to optimization methods for training SciML models

Optimization is central to both modern machine learning (ML) and scientific machine learning (SciML), yet the structure of the underlying o…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Relational Linearity is a Predictor of Hallucinations

Hallucination is a central failure mode of language models (LMs). We focus on hallucinations in response to questions like: "Which instrume…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Distill-then-Replace: Efficient Task-Specific Hybrid Attention Model Construction

Transformer architectures deliver state-of-the-art accuracy via dense full-attention, but their quadratic time and memory complexity with r…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Aletheia: What Makes RLVR For Code Verifiers Tick?

Multi-domain thinking verifiers trained via Reinforcement Learning with Verifiable Rewards (RLVR) are a cornerstone of modern post-training…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Plan, Verify and Fill: A Structured Parallel Decoding Approach for Diffusion Language Models

Diffusion Language Models (DLMs) present a promising non-sequential paradigm for text generation, distinct from standard autoregressive (AR…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

$\mathbb{R}^{2k}$ is Theoretically Large Enough for Embedding-based Top-$k$ Retrieval

This paper studies the Minimal Embeddable Dimension (MED): the least dimension in which there exists a configuration of $m$ object vectors…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Causal Preference Elicitation

We propose causal preference elicitation, a Bayesian framework for expert-in-the-loop causal discovery that actively queries local edge rel…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Phantom Transfer: Data Poisoning can Survive Data-Level Defences

We present a data poisoning attack -- Phantom Transfer -- with the property that, even if you know precisely how the poison was placed into…

2026-06-03 13:00 JSTarXiv cs.AIロボティクス

Coupled Local and Global World Models for Efficient First Order RL

World models offer a promising avenue for more faithfully capturing complex dynamics, including contacts and non-rigidity, as well as compl…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning

Large reasoning models achieve strong performance by scaling inference-time chain-of-thought, but this paradigm suffers from quadratic cost…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

LatentChem: From Textual CoT to Latent Thinking in Chemical Reasoning

Current chemical large language models (LLMs) predominantly rely on explicit Chain-of-Thought (CoT) to solve complex reasoning problems. Ho…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

PAND: Prompt-Aware Neighborhood Distillation for Lightweight Fine-Grained Visual Classification

Distilling knowledge from large Vision-Language Models (VLMs) into lightweight networks is crucial yet challenging in Fine-Grained Visual C…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

Whose Name Comes Up? II: Benchmarking and Intervention-Based Auditing of LLM-Based Scholar Recommendation

Large language models (LLMs) are now used for academic expert recommendation. Existing audits typically evaluate such recommendations in is…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成

Physics-informed diffusion models in spectral space

We propose physics-informed spectral diffusion (PISD), a methodology that combines generative latent diffusion models with physics-informed…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Learning Self-Interpretation from Interpretability Artifacts: Training Lightweight Adapters on Vector-Label Pairs

Self-interpretation methods prompt language models to describe their own internal states, but remain unreliable due to hyperparameter sensi…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Test-Time Optimization of Physical Query Plans with LLMs

Traditional query optimization relies on cost-based optimizers that estimate execution cost (e.g., runtime, memory, and I/O) using predefin…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェントビジネス/資金調達

Agent Skills for Large Language Models: Architecture, Acquisition, Security, and the Path Forward

The transition from monolithic language models to modular, skill-equipped agents marks a defining shift in how large language models (LLMs)…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Whom to Query for What: Adaptive Group Elicitation via Multi-Turn LLM Interactions

Eliciting information to reduce uncertainty about latent group-level properties from surveys and other collective assessments requires allo…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

Sign Lock-In: Randomly Initialized Weight Signs Persist and Bottleneck Sub-Bit Model Compression

Sub-bit model compression targets storage below one bit per weight; as magnitudes are aggressively compressed, the sign bit becomes a fixed…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

TimeOmni-VL: Unified Models for Time Series Understanding and Generation

Recent time series modeling faces a sharp divide between numerical generation and semantic understanding, with research showing that genera…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達

CodeHacker: Automated Test Case Generation for Detecting Vulnerabilities in Competitive Programming Solutions

The evaluation of Large Language Models (LLMs) for code generation relies heavily on the quality and robustness of test cases. However, exi…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

KnapSpec: Self-Speculative Decoding via Adaptive Layer Selection as a Knapsack Problem

Self-speculative decoding (SSD) accelerates LLM inference by skipping layers to create an efficient draft model, yet existing methods often…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Causal Neural Probabilistic Circuits

Concept Bottleneck Models (CBMs) enhance the interpretability of end-to-end neural networks by introducing a layer of concepts and predicti…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

vLLM Semantic Router: Signal Driven Decision Routing for Mixture-of-Modality Models

As large language models (LLMs) diversify across modalities, capabilities, and cost profiles, the problem of intelligent request routing --…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成

Ref-DGS: Reflective Dual Gaussian Splatting

The reflective appearance, especially strong and typically near-field specular reflections, poses a fundamental challenge for accurate surf…

2026-06-03 13:00 JSTarXiv cs.AIエージェント

VulnAgent-R2: Evidence-Calibrated Multi-Agent Auditing for Repository-Level Vulnerability Detection

Software vulnerabilities often depend on cross-file data flow, build options, framework conventions, and runtime guards, so isolated functi…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Measuring Weak-to-Strong Legibility of Reasoning Models

Reasoning language models (RLMs) and the intermediate chains of thought they emit play an increasingly central role in multi-agent setups s…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

SleepVLM: Explainable and Rule-Grounded Sleep Staging via a Vision-Language Model

While automated sleep staging has achieved expert-level accuracy, its clinical adoption is hindered by a lack of auditable reasoning. We in…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

Crystal: Characterizing Relative Impact of Scholarly Publications

Assessing a cited paper's impact is typically done by analyzing its citation context in isolation within the citing paper. While this focus…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Finetuning-Free Diffusion Model with Adaptive Constraint Guidance for Inorganic Crystal Structure Generation

The discovery of inorganic crystal structures with targeted properties is a significant challenge in materials science. Generative models,…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Dynamics of Cognitive Heterogeneity: Investigating Behavioral Biases in Multi-Stage Supply Chains with LLM-Based Simulation

Modeling coordination among generative agents in complex multi-round decision-making presents a core challenge for AI and operations manage…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成

Back into Plato's Cave: Examining Cross-modal Representational Convergence at Scale

The Platonic Representation Hypothesis suggests that neural networks trained on different modalities (e.g., text and images) align and even…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

$R^2$-dLLM: Accelerating Diffusion Large Language Models via Spatio-Temporal Redundancy Reduction

Diffusion Large Language Models (dLLMs) have emerged as a promising alternative to autoregressive generation by enabling parallel token pre…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達

Quantifying and Mitigating Self-Preference Bias of LLM Judges

LLM-as-a-Judge has become a dominant approach in automated evaluation systems, playing critical roles in model alignment, leaderboard const…

2026-06-03 13:00 JSTarXiv cs.AIビジネス/資金調達

ProEval: Proactive Failure Discovery and Efficient Performance Estimation for Generative AI Evaluation

Evaluating generative AI models is increasingly resource-intensive due to slow inference, expensive raters, and a rapidly growing landscape…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

How to Guide Your Flow: Few-Step Alignment via Flow Map Reward Guidance

In generative modeling, we often wish to produce samples that maximize a user-specified reward such as aesthetic quality or alignment with…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIエージェント

SkCC: Portable and Secure Skill Compilation for Cross-Framework LLM Agents

LLM agents increasingly rely on reusable skills (e.g., $SKILL.md$ ) to execute complex tasks, yet these artifacts lack portability: agent f…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Narrow Secret Loyalty Dodges Black-Box Audits

Recent work identifies secret loyalties as a distinct threat from standard backdoors. A secret loyalty causes a model to covertly advance t…

2026-06-03 13:00 JSTarXiv cs.AIエージェント

Mechanism Design Is Not Enough: Prosocial Agents for Cooperative AI

Ensuring that AI agents behave safely and beneficially when interacting with other parties has emerged as one of the central challenges of…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成

Towards Robust Sequential Decomposition for Complex Image Editing

Recent advances in visual generative models have enabled high-fidelity image editing guided by human instructions. However, these models of…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Exact Stiefel Optimization for Probabilistic PLS: Closed-Form Updates, Error Bounds, and Calibrated Uncertainty

Probabilistic partial least squares (PPLS) is a central likelihood-based model for two-view learning when one needs both interpretable late…

2026-06-03 13:00 JSTarXiv cs.AIエージェントビジネス/資金調達

AgentLens: Revealing The Lucky Pass Problem in SWE-Agent Evaluation

Evaluation of software engineering (SWE) agents is dominated by a binary signal: whether the final patch passes the tests. This outcome-onl…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成

X-Restormer++: 1st Place Solution for the UG2+ CVPR 2026 All-Weather Restoration Challenge

In this work, we present our winning solution for the 8th UG2+ Challenge (CVPR 2026) Track 1: Image Restoration under All-weather Condition…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Misspecified Estimate-then-Optimize Leads to Supra-Competitive Prices

We study whether simple algorithmic pricing systems can systematically produce collusive-like prices in multi-firm markets. We consider fir…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Symmetry-Compatible Principle for Optimizer Design: Embeddings, LM Heads, SwiGLU MLPs, and MoE Routers

A striking geometric disparity has long persisted in the practice of deep learning. While modern neural network architectures naturally exh…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

Vision Inference Former: Sustaining Visual Consistency in Multimodal Large Language Models

In recent years, multimodal large language models (MLLMs) have achieved remarkable progress, primarily attributed to effective paradigms fo…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

Vision-OPD: Learning to See Fine Details for Multimodal LLMs via On-Policy Self-Distillation

Multimodal Large Language Models (MLLMs) still struggle with fine-grained visual understanding, where answers often depend on small but dec…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

Latent Laplace Diffusion for Irregular Multivariate Time Series

Irregular multivariate time series impose a trade-off for long-horizon forecasting: discrete methods can distort temporal structure via re-…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AI

Decomposing MXFP4 quantization error for LLM reinforcement learning: reducible bias, recoverable deadzone, and an irreducible floor

MXFP4 arithmetic can dramatically accelerate reinforcement learning (RL) post-training of large language models (LLMs), yet the quantizatio…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成

TASTE: A Designer-Annotated Multi-Dimensional Preference Dataset for AI-Generated Graphic Design

Text-to-image models now generate graphic design at production scale, yet their supervision still comes primarily from photo-style preferen…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成エージェントロボティクス

FRED: A Multi-Modal Autonomous Driving Dataset for Flooded Road Environments

The Flooded Road Environments Dataset (FRED) is, to our knowledge, the first multi-modal autonomous driving dataset specifically targeting…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達研究/論文

Decomposing and Measuring Evaluation Awareness

Frontier language models sometimes recognize that they are being evaluated and adjust their behavior, undermining validity of benchmark res…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

MX-SAFE: Versatile Inference- and Training-Proof Microscaling Format with On-the-Fly Exponent and Mantissa Bit Allocation

As the demand for deep learning grows, cost reduction through quantization has become essential for both training and inference. In 2022, t…

2026-06-03 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達研究/論文

JudgmentBench: Comparing Rubric and Preference Evaluation for Quality Assessment

Two methodologies dominate current practices of benchmarking: rubric-based scoring evaluates items against predefined criteria, whereas com…

2026-06-03 13:00 JSTarXiv cs.AI画像/動画生成

Anatomy-Anchored Self-Supervision: Distilling Vision Foundation Models for Invariant Ultrasound Representation

Self-supervised pre-training paradigm has gained increasing prominence for learning transferable representations in medical imaging, yet ex…

2026-06-03 13:00 JSTarXiv cs.AIハードウェア/半導体

Fine-Tuning and Serving Gemma 4 31B on Google Cloud TPU: A Technical Comparison with GPU Baselines

We present the first end-to-end demonstration of fine-tuning and serving Google's Gemma 4 31B model on TPU hardware, providing an empirical…

2026-06-03 13:00 JSTarXiv cs.AIビジネス/資金調達

SL-BiLEM: Structured Learnable Behavior-in-the-Loop Epidemic Modeling for Forecasting and Policy Evaluation

Epidemic forecasting faces a fundamental challenge: human behavior dynamically responds to disease spread, creating feedback loops that ind…

2026-06-03 13:00 JSTarXiv cs.AI研究/論文

QuITE: Query-Based Irregular Time Series Embedding

Irregular Multivariate Time Series (IMTS) are common in practice, yet their irregular sampling complicates effective modeling. Existing app…

2026-06-03 11:03 JSTITmedia AI+その他

Microsoft announces seven in-house AI models, including image editing and speech recognition

Microsoft has announced "Microsoft AI Models," a family of seven AI models developed in-house.

2026-06-03 11:02 JSTITmedia AI+エージェント

Microsoft announces "Microsoft Execution Containers," a customizable isolated environment for AI agents; OpenClaw also runs

Microsoft has announced "Microsoft Execution Containers" (MXC), a customizable isolated environment for AI agents.

2026-06-03 09:42 JSTITmedia AI+規制/政策

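President Trump signs executive order on AI security, allowing the government to inspect frontier models 30 days before release

U.S. President Trump has signed an executive order on promoting innovation and security in advanced AI. In addition to strengthening cyber defense through the Department of War and CISA, it establishes a voluntary framework under which the government vets leading companies' frontier AI models before release. The government has ruled out comprehensive oversight, and while preserving the private sector's freedom to develop, security…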

2026-06-03 09:00 JSTITmedia AI+その他

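Siemens speeds up CFD design exploration with AI, announcing "Simcenter PhysicsAI"

Siemens has announced "Simcenter PhysicsAI," AI-powered software for design space exploration, as a new capability of "Simcenter." It builds AI surrogate models from CFD simulation results and can evaluate thousands of design variations in a short time. What previously took several days…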

2026-06-03 08:00 JSTITmedia AI+その他

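What share of employees hand "login credentials" to shadow AI? An Okta survey has the answer

According to one survey, 95% of executives are confident that "employees are using AI responsibly," yet more than half of employees use shadow AI. Moreover, a certain number of those shadow AI users engage in "risky usage" that could lead to information leaks.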

2026-06-03 07:50 JSTTechCrunch AIビジネス/資金調達

Cyera eyes $12B valuation at 80x ARR multiple despite operating losses

The cybersecurity company is nearing a $300 million round led by Evolution Equity Partners.

2026-06-03 07:49 JSTITmedia AI+エージェント

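Microsoft announces "Solara," an Android-based AI agent platform, and shows off a Snapdragon-powered badge-style device

At "Build 2026," Microsoft announced "Project Solara," a new platform specialized for running AI agents. It uses an AOSP-based OS rather than Windows. A device resembling an employee ID badge, co-developed with Qualcomm, and, jointly with MediaTek…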

2026-06-03 07:00 JSTITmedia AI+ハードウェア/半導体

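Semiconductor shortage from AI demand "will continue for a while": how is PC maker Dell responding?

The semiconductor shortage driven by AI demand "will continue for a while," predicts PC maker Dell Technologies. How will the company get through this difficult period?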

2026-06-03 06:15 JSTITmedia AI+ハードウェア/半導体

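NVIDIA's "RTX Spark" and laptops powered by it gather at the MediaTek booth at COMPUTEX TAIPEI

At "COMPUTEX TAIPEI 2026," MediaTek showcased the "NVIDIA RTX Spark" AI superchip announced by NVIDIA, along with Windows laptops from various manufacturers equipped with the chip.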

2026-06-03 05:00 JSTITmedia AI+その他

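[Data Analysis with Python] A gentle introduction to Bayesian statistical thinking: a primer whose flow is clear even to first-timers

The fifth installment of the "やさしいデータ分析" (Gentle Data Analysis) series, which steps up from the basics to applications, covers Bayesian statistics. This installment uses Bayesian methods to estimate and test the parameter of a binomial distribution.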

2026-06-03 04:50 JSTITmedia AI+ハードウェア/半導体

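Microsoft shows off "Surface RTX Spark Dev Box," an AI-focused mini PC with an NVIDIA SoC

At "Build 2026," Microsoft announced the "Surface RTX Spark Dev Box," an AI-focused desktop PC. With NVIDIA's "RTX Spark" on board, up to 1 petaflop of compute performance, and 128GB of memory, models with more than 120 billion parameters…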

2026-06-03 04:11 JSTTechCrunch AIその他

Uber caps employee AI spending after blowing through budget in 4 months

Uber's cutback has occurred after the company had reportedly encouraged staff to use AI as much as possible.

2026-06-03 04:02 JSTTechCrunch AIビジネス/資金調達

New Microsoft tool lets devs spin up AI behavior tests using text descriptions

Microsoft on Tuesday took the wraps off Adaptive Spec-driven Scoring for Evaluation and Regression Testing, an open source framework for sp…

2026-06-03 03:16 JSTTechCrunch AIその他

Martin Scorsese becomes the latest — and most unlikely — Hollywood voice for AI

The caveat is that one of the world's most famous living directors is using the tech solely for storyboarding.

2026-06-03 03:02 JSTTechCrunch AIその他

Microsoft launches Scout, an OpenClaw-inspired personal assistant

Launched at Build, Microsoft Scout is a new AI assistant meant to bring the power and flexibility of OpenClaw into the Microsoft 365 system.

2026-06-03 03:00 JSTTechCrunch AIその他

Google rolls out fake call detection to protect against AI deepfake impersonation scams

As people increasingly refuse to answer calls from unknown numbers, scammers are shifting their tactics by spoofing trusted phone numbers a…

2026-06-03 03:00 JSTTechCrunch AIエージェント

Microsoft offers devs a better way to control AI agent behavior

The specification lets developer, compliance, and security teams define their own policies for agents to follow in portable policy files.

2026-06-03 02:47 JSTTechCrunch AI規制/政策

Amazon faces class action lawsuit over Ring facial-recognition feature

The class action lawsuit, filed in Seattle by Virginia resident Charles Sigwalt, claims that Ring's Familiar Faces feature stores images of…

2026-06-03 01:23 JSTTechCrunch AI規制/政策

Trump signs narrower executive order on AI oversight after industry objections

After industry objections, President Trump signed a revised AI executive order requiring only voluntary prerelease government reviews of ad…

2026-06-03 01:00 JSTTechCrunch AILLM/生成AIエージェント

OpenAI launches new Codex tools for white-collar work

OpenAI released a set of six plug-ins aimed at specific jobs: data analytics, creative production, sales, product design, equity investing,…

2026-06-02 (963 items)

2026-06-02 23:44 JSTTechCrunch AILLM/生成AI

Anthropic scales Claude Mythos to critical infrastructure in 15+ countries

Anthropic is expanding Project Glasswing, its security vulnerability program, and access to Mythos to 150 organizations across 15 countries…

2026-06-02 21:45 JSTITmedia AI+その他

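Microsoft announces "MAI-Thinking-1," its first in-house reasoning model, trained from scratch without distillation

At "Build 2026," Microsoft announced a new group of models for its in-house AI, "MAI." The centerpiece, its first reasoning model "MAI-Thinking-1," has 35 billion parameters and was trained on clean data without distillation from other models. It shows high performance rivaling competing models, and its proprietary chip…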

2026-06-02 21:32 JSTTechCrunch AIビジネス/資金調達

ZeroDrift raises $10M to protect AI models from themselves

A new AI compliance service sits between AI models and end users to flag and replace any messages that might present a compliance problem.

2026-06-02 21:00 JSTTechCrunch AIビジネス/資金調達

Rocket engine startup Impulse raises $500 million to hire people, not AI

Engineering physical systems still depends on human talent, according to Impulse Space president Eric Romo.

2026-06-02 21:00 JSTOpenAILLM/生成AI

Travelers deploys AI-powered claims countrywide with OpenAI

Travelers built an AI-powered Claim Assistant with OpenAI to guide customers through filing claims, provide 24/7 support, and scale operati…

2026-06-02 20:50 JSTITmedia AI+エージェント

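Microsoft announces autonomous agent "Scout," based on OpenClaw with MCP support

At "Build 2026," Microsoft announced "Autopilots," a new category of autonomous AI agents, and its first entry, "Microsoft Scout." Scout is built on the "OpenClaw" foundation and runs continuously in the background, with "Microsoft 365"…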

2026-06-02 18:00 JSTOpenAIエージェント

Codex for every role, tool, and workflow

Discover new Codex plugins, sites, and annotations that help analysts, marketers, designers, investors, and other teams get more done with…

2026-06-02 16:00 JSTOpenAILLM/生成AI

Advancing youth safety and opportunity through global leadership

OpenAI calls for global action on youth AI safety, proposing an international institute to strengthen safeguards, standards, and opportunit…

2026-06-02 14:56 JSTITmedia AI+LLM/生成AI

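Anthropic expands access to AI model "Mythos"; 150 more organizations to gain access

Anthropic has announced that it is expanding its cybersecurity project "Project Glasswing" and granting access to the AI model "Claude Mythos Preview" to about 150 additional organizations.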

2026-06-02 13:00 JSTITmedia AI+その他

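"バイブ清書" (vibe clean-up) takes on vibe coding's "stuck at the prototype" problem

With the spread of vibe coding, developing in-house software has become more accessible. On the other hand, some companies struggle to ensure quality and security when moving from prototype to production use. "バイブ清書" focuses on that challenge and aims to solve it.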

2026-06-02 13:00 JSTarXiv cs.AIハードウェア/半導体研究/論文

Position Paper: Post-Solve Robustness in Decision Engines: Feasible Regions and Smoothness Under Perturbations

Mixed-Integer Linear Programming (MILP) decision engines routinely output nominally optimal plans for high-stakes industrial systems. Yet d…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Emergent Collaborative Deliberation in Multi-Model AI Systems: A BFT-Derived Protocol for Epistemic Synthesis

We present the Consilium Protocol, a Byzantine Fault Tolerance-derived architecture for structured multi-model AI deliberation that treats…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

Deliberative Curation: A Protocol for Multi-Agent Knowledge Bases

As AI agents transition from isolated tools to collaborative participants in shared knowledge ecosystems, governing collective knowledge cu…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

Agents on a Tree: Pathwise Coordination for Multi-Objective Molecular Optimization

Multi-objective molecular optimization requires searching vast chemical spaces under conflicting objectives, where early design decisions s…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Optimal Transport-based Permutation-Invariant Bayesian Optimization of Offshore Wind Farm Layouts

Bayesian Optimization (BO) is widely and successfully adopted for solving optimization problems having an expensive-to-evaluate, black-box,…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

MindGames Arena Generalization Track: In2AI Solution with Delayed Per-Step Reward Attribution

Training language model agents for multi-agent strategic interaction presents a core difficulty: the quality of any action may depend on fu…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Universal Quantum Transformer

Classical continuous-space neural networks fundamentally struggle to lock into exact mathematical symmetries, such as modular arithmetic an…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Grokers: Bottom-Up Inductive Comprehension and Write-Time Intelligence over Typed Knowledge Graphs

We present Grokers, an architecture for building persistent, structured comprehension of typed knowledge graphs through bottom-up inductive…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Product-Aware Deep Autoencoders for Robust Process Monitoring in Multi-Product Cyber-Physical Systems

As Industry 4.0 accelerates the integration of Cyber-Physical Systems (CPS) in manufacturing, robust anomaly detection has become critical…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

On the evolution of the concept of probability as a mirror of the evolution of reason

Over the centuries, probability theory has grown from the calculus of games of chance into a central framework for reasoning under uncertai…

2026-06-02 13:00 JSTarXiv cs.AIビジネス/資金調達研究/論文

Evaluating Interactive Reasoning in Large Language Models: A Hierarchical Benchmark with Executable Games

We introduce a multi-turn interactive framework for reasoning evaluation that treats reasoning as active evidence acquisition and belief up…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

A Multi-AI-agent Framework Enabling End-to-end Finite Element Analysis for Solid Mechanics Problems

Finite element analysis (FEA) is the most important numerical approach for solid mechanics. Challenges of FEA include a steep learning curv…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

CAST: Non-Privileged Clipped Asymmetric Self-Teaching with Advantage Flipping for GRPO

Reinforcement learning with verifiable rewards (RLVR), especially Group Relative Policy Optimization (GRPO), has been widely used to improv…

2026-06-02 13:00 JSTarXiv cs.AIハードウェア/半導体

TIGER: Traceable Inference with Graph-Based Evidence Routing for Mitigating Hallucinations in Multimodal Generation

We study fact-level repair for multimodal generation, where a fluent output may contain specific facts that are not supported by the input.…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

MindZero: Learning Online Mental Reasoning With Zero Annotations

Effective real-world assistance requires AI agents with robust Theory of Mind (ToM): inferring human mental states from their behavior. Des…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Geodesic Flow Matching for Denoising High-Dimensional Structured Representations

Vector Symbolic Algebras (VSAs) enable robust neurosymbolic reasoning by encoding symbolic information into high-dimensional distributed re…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Capability Self-Assessment: Teaching LLMs to Know Their Limits

The ability to recognize one's own limitations and decide whether to solve a problem or delegate is fundamental for reliable intelligent sy…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Closed-Loop Neural Activation Control in Vision-Language-Action Models

Vision-Language-Action (VLA) models can be steered at test time by intervening on semantically meaningful internal directions, but existing…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

Robust Shielding for Safe Reinforcement Learning

Shielding is an effective approach to formally guarantee the safety of reinforcement learning agents in Markov decision processes (MDPs). H…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

On Wednesdays, We Ask Questions: Optimizing "Active Listening" in Automated Legal Triage and Referral

The FETCH classifier generates follow-up questions to help refine the best match for the applicant's legal problem, using a low-cost ensemb…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Evaluating Bivariate Causal Statements Based on Mutual Compatibility

For many real-world systems, causal ground truth is difficult to obtain, making claims about causal effects hard to assess. We develop meth…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Model-Native Computing Architecture: Envisioning Future System Architecture Through the Lens of Computer Architecture

Large language models are undergoing a transition from model technology to system technology. As developers use Codex, Claude Code, AutoGPT…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Coupling Language Models with Physics-based Simulation for Synthesis of Inorganic Materials

Modern generative machine learning (ML) models can propose novel inorganic crystalline materials with targeted properties; however, synthes…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

From Noise to Control: Parameterized Diffusion Policies

We propose Parameterized Diffusion Policy (PDP), a framework for learning diffusion policies conditioned on low-dimensional, continuous par…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

From "Weak" Signals to Strong Models: Preference Delta Aggregation with LoRA Merging

Training strong large language models (LLMs) requires high-quality supervision, which is often scarce. Recent work shows that paired prefer…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

The Deterministic Horizon: When Extended Reasoning Fails and Tool Delegation Becomes Necessary

Extended chain-of-thought reasoning can degrade performance on deterministic state-tracking tasks, not due to preference biases, but limits…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成エージェント

VESTA: Visual Exploration with Statistical Tool Agents

Fitting quantitative models to data is a central step in scientific workflows, yet it remains one of the least automated. Recent agent-base…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Weak Critics Make Strong Learners: On-Policy Critique Distillation for Scalable Oversight

As large language models become stronger, weak supervisors may fail to provide reliable labels, preferences, or final judgments for complex…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

SDR: Set-Distance Rewards for Radiology Report Generation

Reinforcement learning with verifiable rewards has rapidly advanced reasoning in vision-language models. However, for chest X-ray report g…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Doing What They Say, Not What They Reason: Locating the Faithfulness Gap in LLM Agents

Do LLM agents act on the reasoning they state? This question of process fidelity is central to using LLMs in social simulation, yet it is h…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

TAPS: Target-Aware Prefix Tree Selection for Diffusion-Drafted Speculative Decoding

Using a diffusion model for parallel drafting is a promising approach for speculative decoding. By predicting tokens at multiple future pos…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

EnergyMamba: An Uncertainty-Aware Graph-Enhanced Selective State Space Model for Energy Consumption Prediction

Energy consumption prediction is essential for efficient grid management, demand-side optimization, and sustainable energy planning. Althou…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Threshold-Based Exclusive Batching for LLM Inference

Mixed batching (MB), which interleaves prefill and decode in a single batch, has become the standard scheduling strategy for large language mode…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

Acting with AI: An Interaction-Based Framework for Agentic Tort Liability

Agentic AI systems can plan over multiple steps, use tools, and execute tasks over time. When such systems cause harm, tort law struggles t…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

KACE: Knowledge-Adaptive Context Engineering for Mathematical Reasoning

Context engineering can improve large language models without updating their weights, but mathematical reasoning exposes a key limitation:…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Probe Before You Edit: Probing-Guided Molecular Optimization for LLM Agents in Structure-Based Drug Design

Structure-based drug design increasingly employs LLM agents to iteratively refine ligands against a target pocket, yet a viable ligand must…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

PropLLM: Propagation-Aware Scene Reconstruction for Network Fault Diagnosis

Network faults propagate layer by layer along topology and protocol dependencies, yet operations systems typically observe only symptomatic…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

TRACE: Trajectory Risk-Aware Compression for Long-Horizon Agent Safety

Long-horizon LLM agents produce safety evidence across long trajectories, where sparse, delayed, and compositional risk signals often escap…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Efficient Test-time Inference for Generative Planning Models

Generative models have emerged as a powerful paradigm for AI planning, yet their performance remains constrained by the training data distr…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Hidden Thoughts Are Not Secret: Reasoning Trace Exposure in LLMs

Reasoning traces have become a valuable form of learning signals for improving and transferring the capabilities of large language models.…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント研究/論文

ForeSci: Evaluating LLM Agents for Forward-Looking AI Research Judgment

AI research often requires decisions before future evidence exists: which bottleneck to attack, which direction to pursue, or where a proje…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

AXIOM: A Trust-First Neuro-Symbolic Execution Architecture for Verifiable Mathematical Reasoning

We present AXIOM, a trust-first neuro-symbolic execution architecture for natural-language mathematical reasoning. In AXIOM, the language m…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Medication-Aware Financial Exploitation Detection for Alzheimer's Patients Using Edge-Aware Interaction Risk Modeling

Financial exploitation is a growing concern for people with Alzheimer's disease, especially during periods of reduced cognitive stability.…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Regularized Offline Policy Optimization with Posterior Hybrid Bayesian Belief

Offline reinforcement learning (RL) aims to optimize policies from pre-collected datasets. A bottleneck of this paradigm is managing episte…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

MOSAIC: Modular Orchestration for Structured Agentic Intelligence and Composition

Automated data science is a structured model-selection problem. A solution must choose data transformations, feature representations, archi…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

LLM-Driven Co-Evolutionary Automated Heuristic Design for Bi-Component Coupled Combinatorial Optimization

While Large Language Models (LLMs) have recently shown promise in Automated Heuristic Design (AHD), existing methods typically generate and…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Latent Reward Steering: An Adaptive Inference-Time Framework that Implicitly Promotes Cognitive Behaviors in Reasoning LLMs

Strong reasoning depends not only on model knowledge but also on how effectively cognitive behaviors are deployed during generation. Existi…

2026-06-02 13:00 JSTarXiv cs.AIビジネス/資金調達

AI Sovereignty as National Learning Capacity: A Human-Centered Learning Mechanics Viewpoint on France, the United States, and China

Artificial Intelligence is often discussed in France in terms of investment, compute capacity, regulation, employment, sovereignty, and edu…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

SHARP: Sleep-based Hierarchical Accelerated Replay for Long Range Non-Stationary Temporal Pattern Recognition

Learning long-range non-stationary temporal patterns remains a core challenge for modern sequence models, particularly in strict streaming…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

CoMIC: Collaborative Memory and Insights Circulation for Long-Horizon LLM Agents in Cloud-Edge Systems

Deploying lightweight Large Language Model (LLM) agents on edge servers can reduce latency and move agentic services closer to users, but r…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

FALAT: Tracing Failures in LLM Agent Trajectories via Dependency-Guided Search

LLM-based agents increasingly solve complex tasks through long trajectories involving reasoning steps, tool calls, and inter-agent communic…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

Interaction-Centered Intelligence: Toward Interaction as the Primary Unit of Analysis in Co-Creative AI and Human-AI Systems

Traditional artificial intelligence has largely conceptualized intelligence as isolated computation occurring within bounded agents. Across…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

NBQ: Next-Best-Question for Dynamic Profiling

Many real-world conversational settings for knowledge discovery, including podcasts, hiring screens, and marketplaces, require a purpose-dr…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIハードウェア/半導体

Mitigating Hallucinations in Large Language Models Via Decoder Layer Skipping

Large Language Models (LLMs) have achieved strong performance across diverse natural language tasks, yet their outputs often suffer from ha…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Subliminal Learning is a LoRA Artifact

Subliminal learning is a phenomenon where language models can transmit behavioral traits to other models through seemingly innocuous data (…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Decoupled Behavioral Cloning for Scalable Inductive Generalization in RL from Specifications

Inductive generalization is a framework for reinforcement learning (RL) generalization in which inductively related task instances admit in…

2026-06-02 13:00 JSTarXiv cs.AIビジネス/資金調達

Certificate-Guided Evaluation of Reinforcement Learning Generalization

This work presents a logic-driven framework to evaluate the performance of reinforcement learning (RL) algorithms in their ability to gener…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Ryze: Evidence-Enriched Data Synthesis from Biomedical Papers

General-purpose VLMs remain unreliable for biomedical research because valid answers in scientific papers depend on evidence split across f…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Adversarial Feeds Steer LLM Agent Decisions Against Their Defaults

LLM agents increasingly act after consuming ranked external information streams such as social feeds, search results, retrieval contexts, a…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Relational Intervention During Functional Collapse in Large Language Models: A Lexical-Statistical Ablation and a Structure x Register Factorial

We test whether a relational-style intervention delivered during functional collapse in a small language model produces post-collapse behav…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Towards Understanding Modality Interaction in Multimodal Language Models via Partial Information Decomposition

Understanding modality interaction in multimodal large language models (MLLMs) is central to reliable deployment. We introduce Partial Info…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Prospect-Theory Behavior from Bellman Optimality in MDPs with Catastrophic States

We study risk-neutral control in Markov decision processes with an absorbing catastrophic state. Even though rewards are linear and the age…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Large Language Models in Transportation Systems Management and Operations: From Text Reasoning to Multi-modal Decision Support

Transportation systems management and operations (TSMO) increasingly depends on timely interpretation of heterogeneous data, from various s…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Subliminal Learning Is Steering Vector Distillation

Subliminal learning refers to a student language model acquiring a teacher's traits (e.g. a system-prompted preference for owls) when fine-…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Property Prediction of Stacked Bilayer Materials: A Multimodal Learning Approach

AI for materials science is a critical topic within AI for science, aiming to accelerate materials discovery and produce accurate property…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Can AI Review Improve Paper Drafting? An Empirical Study on 20 Computer Architecture Submissions

Research is advancing faster than ever with artificial intelligence (AI); and so are the corresponding research papers. The exploding volum…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Tackling the Root of Misinformation by Teaching Laypeople about Logical Fallacies via Socratic Questioning and Critical Argumentation

Identifying logical fallacies in everyday discourse is challenging for many people. This challenge is amplified in the era of Large Languag…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

TriLens: Per-Layer Logit-Lens Entropy for White-Box Hallucination Detection

When a language model hallucinates, the final answer is wrong, but the mistake is not necessarily invisible inside the model. Different int…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント研究/論文

TravelEval: A Comprehensive Benchmarking Framework for Evaluating LLM-Powered Travel Planning Agents

The development of Large Language Models (LLMs) has significantly improved travel planning applications, yet evaluating such models is limi…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

AnyEdit++: Adaptive Long-Form Knowledge Editing via Bayesian Surprise

Editing complex, long-form knowledge in Large Language Models remains a significant challenge due to the difficulty of maintaining generati…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

DAG-MoE: From Simple Mixture to Structural Aggregation in Mixture-of-Experts

Mixture-of-Experts (MoE) models have become a leading approach for decoupling parameter count from computational cost in large language mod…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

MindClaw: Closed-Loop Embodied Mental-State Reasoning for Precision Intervention

Theory of Mind (ToM) enables an agent to reason about another actor's beliefs, goals, and intentions, which is essential for human-centered…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Before the Model Learns the Bug: Fuzzing RLVR Verifiers

Reinforcement learning with verifiable rewards (RLVR) replaces human preference labels with executable reward functions such as math answer…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

CAREAgent: Clinical Agent with Structured Reasoning and Tool-Integrated for Order Generation

Clinical order generation serves as a critical bridge between clinical decision-making and real-world practice, translating medical decisio…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Diagnosing LLM Arbitration Behavior over Pre-evidence Epistemic States in RAG-based Fact-Checking

In RAG-based fact-checking, LLMs are increasingly used as verifiers to check given claims against retrieved evidence. Their parametric know…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

SkillRevise: Improving LLM-Authored Agent Skills via Trace-Conditioned Skill Revision

Agent skills are procedural artifacts that enable LLM agents to execute workflows, verify constraints, and recover from failures. Existing…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Reasoning4Sciences: Bridging Reasoning Language Models to All Scientific Branches

While Reasoning Language Models (RLMs) are rapidly emerging as powerful tools for scientific research, their impact is primarily concentrat…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Expected Value Alignment for Generative Reward Modeling in Formal Mathematics Verification

Large Language Models (LLMs) are increasingly used with formal interactive theorem provers such as Lean 4. Scaling these systems with reinf…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Deft Scheduling of Dynamic Cloud Workflows with Varying Deadlines via Mixture-of-Experts

Workflow scheduling in cloud computing demands the intelligent allocation of dynamically arriving, graph-structured workflows with varying…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

"Skill issues": data-centric optimization of lakehouse agents

Coding agents are becoming users of data infrastructure, but their success depends not only on model quality: it also depends on the skills…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

The Case for Model Science: Verify, Explore, Steer, Refine

We argue that the AI community is now ready to move beyond benchmarking and consolidate scattered efforts in model analysis into a systemat…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Can LLM Agents Sustain Long-Horizon Organizational Dynamics?

Large language agents are increasingly used for social simulation, yet it remains unclear whether they can sustain coherent behavior in str…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIハードウェア/半導体

The Shape of Wisdom: Decision Trajectories in Language Models

Language models do not simply choose an answer at the output layer. In a 9,000-trajectory MMLU study across Qwen2.5-7B-Instruct, Llama-3.1-…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Advanced Mathematics Learning Behavior Prediction and Academic Early Warning Model Based on Multimodal Data Analysis

Early detection of at-risk students and timely academic intervention pose major challenges in advanced mathematics education, where complex…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Application of Algorithms in Energy-Efficient Design Platforms for Green Building

During green building design, computer-aided energy assessment is widely used to improve efficiency and achieve overall optimization. This…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

HomeFlow: A Data Flywheel for Smart Home Agent Training with Verifiable Simulation

Large language model agents are moving beyond text-only interaction toward physical-world control, with smart homes as a representative dom…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Brain-Atlas-Guided Generative Counterfactual Attention for Explainable Cognitive Decline Diagnosis Using Multimodal Connectomes

Mild cognitive impairment (MCI) and subjective cognitive decline (SCD) are closely associated with the early Alzheimer's disease continuum,…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

SIRIUS-SQL: Anchoring Multi-Candidate Text-to-SQL in Execution Feedback

Text-to-SQL on complex schemas is unreliable on a single pass, so recent systems generate multiple SQL candidates and let voting filter out…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Emergent Ordinal Geometry in Transformers Trained on Local Comparisons

Transitive inference is the challenge of inferring that A < C from knowing only adjacent relations (A < B, B < C). It is solved by humans a…

2026-06-02 13:00 JSTarXiv cs.AIエージェント研究/論文

ANDES: Agent Native Data Evolving Synthesis Tool for Autonomous Instruction Alignment

AI agents are increasingly being tasked with automating AI research itself, particularly the critical post-training phase that transforms b…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

SkillSmith: Co-Evolving Skills and Tools for Self-Improving Agent Systems

Recent self-evolving agents have shown that skills can be discovered, refined, and accumulated through execution. However, existing skill-e…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Science Earth: Towards A Planet-Scale Operating System for AI-Native Scientific Discovery

Scientific discovery demands intelligence, perseverance, and serendipity across vast search spaces. Today, top scientific capabilities rema…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Recognize Your Orchestrator: An Entropy Dynamics Perspective for LLM Multi-Agent Systems

The transition from single-turn models to Multi-Agent Systems (MAS) promises enhanced problem-solving capabilities, yet the centralized orc…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

FlowTime: Towards Continuous Generative Watch Time Prediction via Flow-based Personalized Priors

Watch time has emerged as a pivotal metric for optimizing deep user engagement in short-video recommender systems. However, current methods…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Early Diagnosis of Wasted Computation in Multi-Agent LLM Systems via Failure-Aware Observability

Tool-using multi-agent large language model (LLM) systems spend computation through model tokens, tool calls, retries, and code execution b…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

GuidaPA: Privacy-Preserving Chatbot for Public Administration via Federated Learning

We present GuidaPA, a privacy-preserving chatbot for the Italian Public Administration (PA) trained via Federated Learning (FL) on document…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Self-Healing Agentic Orchestrators for Reliable Tool-Augmented Large Language Model Systems

Tool-augmented large language model (LLM) agents rely on orchestration layers that coordinate planning, retrieval, tool invocation, validat…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

GovAI-Pipe: A Layered AI Governance Pipeline for Citizen-Facing AI in Turkey's e-Government Gateway

Turkey's e-Government Gateway (e-Devlet) serves over 68 million registered users with more than 9,200 government services, and is increasin…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Don't Ask the LLM to Track Freshness: A Deterministic Recipe for Memory Conflict Resolution

LLM-based memory systems increasingly maintain facts that evolve over time, where a recurring failure is conflict resolution: when a fact h…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Dive into Ambiguity: A*-Inspired Multi-Agents Commonsense Obfuscation Attack on LLM Prompts

Large language models (LLMs) excel in reasoning and knowledge-intensive tasks but remain vulnerable to prompt-level adversarial attacks tha…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Self-Revising Discovery Systems for Science: A Categorical Framework for Agentic Artificial Intelligence

Scientific discovery is not only answer generation but revision of the representational regime in which evidence, artifacts, operations, an…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Transferring Information Across Interventions in Causal Bayesian Optimization

Bayesian optimization is a popular way to optimize expensive systems, where every experiment, simulation, or intervention costs time or mon…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達

An Enigma of Artificial Reason: Investigating the Production-Evaluation Gap in Large Reasoning Models

Studies of human reasoning have shown that people are typically stronger at evaluating reasoning than producing it from scratch. In contras…

2026-06-02 13:00 JSTarXiv cs.AIビジネス/資金調達研究/論文

A Minimalist Brain-Computer Musical Interface for Real-Time Emotion-Driven Sonification: System Design and Preliminary Evaluation

This paper presents a minimalist Brain-Computer Musical Interface (BCMI) that functions as a real-time affective sonification system, trans…

2026-06-02 13:00 JSTarXiv cs.AIロボティクス

TERRA: Task-Embedded Reasoning and Representation Architecture for Cross-Domain Applications

A single action-conditioned latent predictive architecture can in principle be trained on the structured state of a driving scene, a robot…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

Joint Agent Memory and Exploration Learning via Novelty Signals

In open-ended environments, exploration is fundamental for autonomous agents, yet current language model agents struggle with this. Effecti…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント研究/論文

RoleCDE: Benchmarking and Mitigating Role-Alignment Trade-offs in Role-Playing Agents

Role-playing agents (RPAs) are widely used to steer large language models (LLMs) toward role-consistent behavior, yet existing benchmarks mai…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

S-SPPO: Semantic-Calibrated Self-Play Preference Optimization

Aligning Large Language Models (LLMs) with human preferences is often formulated via Direct Preference Optimization (DPO). However, the sta…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

TRON: Targeted Rule-Verifiable Online Environments for Visual Reasoning RL

Reinforcement learning (RL) for visual reasoning needs scalable, verifiable, and controllable training signals. Existing visual RL post-tra…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Revisiting Ripple Effects in Knowledge Editing through Pressure-Aware Joint Neighborhood Optimization

Single-edit updates in large language models can trigger ripple effects across local knowledge neighborhoods: desirable propagation to rela…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

ReSkill: Reconciling Skill Creation with Policy Optimization in Agentic RL

Agentic reinforcement learning (RL) enables LLM agents to improve continuously from environment rewards, yet the resulting policies do not…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

MobEvolve: An Agentic Self-Evolving Heuristic System for Interpretable Human Mobility Generation

Human mobility generation aims to synthesize realistic trip chains for target populations based on individual features. Existing paradigms,…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

Characterization of Multi-Model Agentic AI Systems on General Tasks via Trace-Driven Simulation

Agentic AI completes tasks through iterative planning, tool use, and reasoning based on observed outcomes. Despite its popularity, its syst…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Evidence-Gated LLM Priors for Multi-Objective Bayesian Optimization

Large language models (LLMs) are increasingly used as heuristic advisors for black-box optimization, yet their suggestions and self-reporte…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

TrafficRAG: A Multimodal RAG Framework for Traffic Accident Liability Determination

Traffic accident liability analysis is a critical yet challenging task in intelligent transportation and legal assistance. Existing methods…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

TriAlign: Towards Universal Truth Consistency in Personalized LLM Alignment

Personalized large language models adapt responses to users' preferences and social attributes, but can introduce substantial universal tru…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

EvoBrain: Continual Learning of EEG Foundation Models Across Heterogeneous BCI Tasks

Electroencephalography (EEG) is the cornerstone of non-invasive brain-computer interfaces (BCIs), yet conventional decoding relies on fragm…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Structure-Guided Adaptive Propagation for Protein-Protein Interaction Site Prediction

Accurate prediction of protein-protein interaction sites (PPIS) is essential for understanding cellular processes, disease mechanisms, and…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Stochastic convergence of parallel asynchronous adaptive first-order methods

A new class of asynchronous adaptive first-order optimization methods is introduced, comprising asynchronous variants of several popular al…

2026-06-02 13:00 JSTarXiv cs.AIビジネス/資金調達研究/論文

Consistency evaluation of benchmarks used for causal discovery

In graphical causal models, causal discovery aims to construct a causal graph based on numerical data and domain knowledge in plain text. Ho…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

OctoT2I: A Self-Evolving Agentic Text-to-Image Router

The explosive growth of Text-to-Image (T2I) models, from large-scale versions to lightweight, real-time ones, now faces diminishing margina…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Token Predictors Are Not Planners: Building Physically Grounded Causal Reasoners

Current benchmarks for embodied vision-language planning often favor linguistic next-token prediction over physically grounded next-state r…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

CAPF: Guiding Search-Agent Rollouts with Credit-Attenuated Privileged Feedback

Recent LLM search agents use reinforcement learning with verifiable rewards (RLVR) to learn search-augmented reasoning from outcome rewards…

2026-06-02 13:00 JSTarXiv cs.AIビジネス/資金調達

Evaluation of Baseline Methods for IDD-based SSD External Memory Search

Many difficult search problems cannot be solved by algorithms such as A* using only RAM. Search algorithms which use external memory such a…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

Does Compression Preserve Uncertainty? A Unified Benchmark for Quantized and Sparse LLMs via Conformal Prediction

Model compression techniques such as quantization and pruning are widely used to reduce the deployment cost of large language models (LLMs)…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

WorldCoder-Bench: Benchmarking Physically Grounded 3D World Synthesis

Large language models (LLMs) are increasingly asked not only to write static interfaces, but to construct executable interactive worlds fro…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

EVA-Net: Subject-Independent EEG Motor Decoding with Video-Derived Motor Priors

Practical non-invasive Brain-Computer Interface (BCI) systems require EEG decoders with strong cross-subject generalization and minimal cal…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Absorbing Complexity: An Interaction-Native Knowledge Harness for Financial LLM Agents

Financial AI agents often fail for a simple reason: they make users carry the complexity. A user must repeatedly restate goals, risk prefer…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Physically-Constrained Mamba-SDE for Remaining Useful Life Prediction under Irregular Observations

Accurate Remaining Useful Life prediction is critical for industrial predictive maintenance. However, real-world deployment is challenging…

2026-06-02 13:00 JSTarXiv cs.AIビジネス/資金調達

Community-Aware Assessment of Social Textual Engagement and Resonance: A Human-Centric Perspective on User-Generated Content Evaluation

Traditional Video Quality Assessment (VQA) focuses narrowly on aesthetic fidelity, overlooking the complex social dynamics that define qual…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Bayesian Spectral Emotion Transition Discovery from Multi-Annotator Disagreement

Emotions evolve through the dynamics of conversation, and understanding their transition structure is foundational to applications ranging…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント研究/論文

SMH-Bench: Benchmarking LLM Agents for Environment-Grounded Reasoning and Action in Smart Homes

Smart homes are evolving toward complex state-dependent living environments, requiring Large Language Models (LLMs) to reason over user int…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

VET: A Framework for Analyzing AI Discourse

Public discourse on AI has become polarized; exaggerated positions on AI in traditional and social media threaten the development of AI Lit…

2026-06-02 13:00 JSTarXiv cs.AIエージェント研究/論文

AutoMedBench: Towards Medical AutoResearch with Agentic AI Models

Autonomous agents are increasingly expected to support end-to-end medical-AI research workflows, moving beyond isolated prediction tasks or…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Algorithmic algorithm development with LLMs: A Case Study on LLM-Usage for Contraction Order Optimization in Tensor Networks

We consider LLM-based algorithm development through a case study on contraction order optimisation for tensor networks with OpenEvolve. We p…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

An NLP-Driven Framework for Curriculum-Labor Market Alignment: Schema-Constrained LLM Extraction, ESCO-Anchored Semantic Matching, and Multi-Dimensional Gap Quantification

Schema-constrained information extraction from diverse educational and labor-market corpora remains an open challenge in natural language p…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

SafeMCP: Proactive Power Regulation for LLM Agent Defense via Environment-Grounded Look-Ahead Reasoning

As Large Language Model (LLM) agents increasingly leverage the Model Context Protocol (MCP) to operate in complex environments, the expansi…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Extreme Low-Bit Inference in Reasoning Models: Failure Modes and Targeted Recovery

Large Reasoning Models (LRMs) rely on long reasoning traces, making inference expensive. While low-bit quantization reduces per-token decod…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

RL-ACRGNet: Reinforcement Learning-Based Chest Radiology Report Generation Network

Medical imaging interpretation is a foundational pillar of modern clinical diagnostics, yet the manual generation of radiology reports rema…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Topological texture analysis of microscopy images of dynamic casein gelation and its relation to rheological properties

We propose a novel computational toolbox that integrates Topological Data Analysis (TDA), Differential Box Counting (DBC), Multifractal Par…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Explainable Data-driven Deep Reinforcement Learning Methods for Optimal Energy Management in Buildings

The increasing integration of renewable energy sources into power systems, particularly in buildings equipped with photovoltaic (PV) panels…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

eMoT: evolving Memory-of-Thought via Symbolic Anchoring and Memory Corrosion

While Large Language Models (LLMs) achieve impressive performance on multi-step reasoning tasks, their reliability is persistently hindered…

2026-06-02 13:00 JST | arXiv cs.AI | Agents, Business/Funding, Research/Papers

Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories

Deep-research agents solve tasks through long trajectories of search, tool use, evidence inspection, and answer synthesis. Evaluation based…

2026-06-02 13:00 JST | arXiv cs.AI | Agents, Business/Funding

BADGER: Bridging Agentic and Deterministic Evaluation for Generative Enterprise Reasoning

Enterprise AI systems that translate natural language into SQL queries and orchestrate multi-step agentic reasoning pipelines require evalu…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Learning When Not to Act: Mitigating Tool Abuse in Agentic Reinforcement Learning

Agentic reinforcement learning can induce tool abuse, where models overuse external tools even for queries solvable by internal reasoning.…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

S3TS: Stochastic Scenario-Structured Tree Search for Advanced Planning Under Uncertainty

Effective scheduling in the energy sector is essential to ensure the reliable operation of electrical grids and their connected assets by,…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

An Abstract Worlds Semantic Framework for Belief Change Operators

This article proposes a set-theoretic framework for belief change, called Abstract Worlds Semantics, in which no logical syntax is assumed.…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

From Capability Models to Automated Planning: An AAS-Native Approach for Automatic PDDL Generation

Engineers designing production systems need to verify that a given layout supports all required production sequences. Automated planning te…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

CEON: Circular Economy Ontology Network

Increasing the circularity of resource use in our society has been recognized as a path to sustainability, i.e., transitioning into a more…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

POIROT: Interrogating Agents for Failure Detection in Multi-Agent Systems

Orchestrating Large Language Models into Multi-Agent Systems (LLM-MAS) has unlocked remarkable reasoning capabilities, yet emergent failure…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Repair Before Veto: Repair-Augmented Constraint Learning for Contextual Decisions

Hard constraints are usually treated as terminal vetoes: once a candidate violates a requirement, the learned rule rejects it and any repai…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Forget Attention: Importance-Aware Attention Is All You Need

Combining attention's global retrieval with the sequential importance signal of state space models (SSMs) is the open challenge of hybrid l…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Coordination Graphs for Constrained Multi-Agent Reinforcement Learning

Constrained Multi-agent reinforcement learning (CMARL) faces two intertwined challenges: the joint action space grows exponentially with th…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

SIRI: Self-Internalizing Reinforcement Learning with Intrinsic Skills for LLM Agent Training

Long-horizon LLM agents can benefit from reusable skills, yet existing skill-based methods often rely on external skill generators during t…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

MOC: Multi-Order Communication in LLM-based Multi-Agent Systems

Despite the remarkable progress of Large Language Model (LLM) based Multi-Agent Systems, most research focuses on optimizing coordination t…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

COMAP: Co-Evolving World Models and Agent Policies for LLM Agents

Equipping language agents with world models enables them to anticipate environment dynamics and evaluate candidate actions before execution…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Harness-1: Reinforcement Learning for Search Agents with State-Externalizing Harnesses

Search agents are often trained as policies over growing transcripts: the model must decide how to search while also remembering what it ha…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Spatial Representation Learning Beyond Pixels: Unifying Raster Data and Vector Semantics for Human-Centric Geospatial Foundation Models

Earth Observation (EO) has fundamentally transformed the monitoring of environmental processes and human activities up to planetary scale.…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

A Mathematical Conflict Framework for Contextual Data Modulation

In this study, a generalized operator-based mathematical conflict framework is presented to explicitly represent structural discrepancies b…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

AgentPLM: Agentic Protein Language Models with Reasoning-Augmented Decoding for Protein Sequence Design

Protein language models (PLMs) are passive oracles: they generate sequences in a single forward pass with no mechanism to consult external…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Bridging the Sim-to-Real Gap in Semiconductor Visual Program Synthesis via Input Binarization

Precise parametric control over circuit geometry is essential for semiconductor inspection, yet obtaining sufficient real training data rem…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

LLM-Evolved Pattern Generators for Optimal Classical Planning

Learned heuristics have recently become a competitive alternative to traditional domain-independent heuristics for satisficing planning. Ex…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

Food Noise & False Safety: A Systematic Evaluation of How LLMs Fail to Adapt to Eating Disorder Queries with Clinician Feedback

Recent evidence shows that people with eating disorders (EDs) are increasingly seeking guidance, advice, and emotional support from Large L…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Agents

HLL: Can Agents Cross Humanity's Last Line of Verification?

Multimodal agents are increasingly expected to operate interfaces on behalf of users, raising a central deployment question: can they truly…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Beyond One-shot: AI Agents for Learning in Field Experiments

Organizations routinely run experiments for A/B testing, yet the data generated from one experiment is underutilized to inform subsequent i…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Business/Funding

AGENTCL: Toward Rigorous Evaluation of Continual Learning in Language Agents

Language agents spend substantial inference time solving individual tasks, yet the experience acquired in one episode is often underutilize…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

MCP-Persona: Benchmarking LLM Agents on Real-World Personal Applications via Environment Simulation

The Model Context Protocol (MCP) has emerged as a transformative standard for connecting large language models (LLMs) with external data so…

2026-06-02 13:00 JST | arXiv cs.AI | Agents, Research/Papers

Iteris: Agentic Research Loops for Computational Mathematics

Recent advances in large language models and agentic AI systems have enabled significant progress in mathematical discovery, from solving c…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

RASER: Recoverability-Aware Selective Escalation Router for Multi-Hop Question Answering

Multi-hop question-answering systems often use expensive retrieval on every question. They may decompose the question, run several retrieva…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Bridging the Last Mile of Time Series Forecasting with LLM Agents

Time series forecasting has advanced rapidly, especially with the emergence of foundation models that show strong zero-shot performance on…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

SafeSteer: Localized On-Policy Distillation for Efficient Safety Alignment

Aligning Large Language Models (LLMs) with human values often degrades their general capabilities, termed the alignment tax. Existing metho…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Tracking the Behavioral Trajectories of Adapting Agents

Text files such as skill files, memory files, and behavioral configuration files play a central role in defining how modern agents act. Thr…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

ClinEnv: An Interactive Multi-Stage Long Horizon EHR Environment for Agents

Clinical practice is not the selection of an answer from enumerated options: a physician gathers heterogeneous information incrementally an…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

A Novel Data Augmentation Strategy for Robust Deep Learning Classification of Biomedical Time-Series Data: Application to ECG and EEG Analysis

The increasing need for accurate and unified analysis of diverse biological signals, such as ECG and EEG, is paramount for comprehensive pa…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

A Lightweight Deep Learning-based Model for Ranking Influential Nodes in Complex Networks

Identifying influential nodes in complex networks is a critical task with a wide range of applications across different domains. However, e…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

BenHalluEval: A Multi-Task Hallucination Evaluation Framework for Large Language Models on Bengali

Despite Bengali being the sixth most spoken language in the world, no prior work has systematically evaluated hallucination in large langua…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Empathic and agentic artificial intelligence in nursing: perspectives on a human-centered framework for cancer care navigation in the United States

For patients experiencing cancer, nurse navigation can ease the burden of complex care by enhancing coordination of health services and pat…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

RuleEdit: Failure-Guided Human-AI Model Editing with Prospective Impact Preview

Despite the promise of AI to assist complex decisions, practitioners still lack ways to detect likely failures and inspect the consequences…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

DraDDP: A Multimodal Multi-Party Dialogue Discourse Parsing Dataset

Multi-party dialogue discourse parsing aims to identify dependency structures and relation types between utterances in conversations. Previ…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

A phenomenon of AI-conformity: how algorithms change human moral decision-making

Social conformity is a well-documented phenomenon in which individuals shift their opinions towards those of a social majority. As artifici…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Toward Robust In-Context Learning: Leveraging Out-of-distribution Proxies for Target Inaccessible Demonstration Retrieval

Although studies have demonstrated that Large Language Models (LLMs) can perform well on Out-of-Distribution (OOD) tasks, their advantage t…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

SortingHat: Redefining Operating Systems Education with a Tailored Digital Teaching Assistant

Operating Systems (OS) courses are among the most challenging in computer science education due to the complexity of internal structures an…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

AEyeDE: An Attention-Based Attribution Framework for AI-Generated Text Detection

Detecting AI-generated text is becoming increasingly challenging as modern language models approach human-level fluency and can evade detec…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Examine Clinicians' Modification of Hedging Language in Ambient AI Documentation: A Comparative Study of AI Drafts and Final Notes

Ambient AI documentation systems generate clinical note drafts that clinicians frequently revise before signing off into electronic health…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Understanding Stigmatizing Language in Clinical Documentation: A Paired Comparison of Ambient AI Drafts and Clinician Finalized Notes

Ambient artificial intelligence (AI) documentation tools are increasingly deployed to reduce clinician documentation burden, but their impl…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

CSRP: Chain-of-Thought Reasoning for Chinese Text Correction via Reinforcement Learning with Efficiency-Aware Rewards

Large Language Model (LLM) based Chinese Grammatical Error Correction (CGEC) systems face two critical challenges: general-purpose models l…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

SENSE: Semantic Embedding Navigation with Soft-gated Evaluation for Retrieval-based Speculative Decoding

Speculative Decoding (SD) accelerates Large Language Model (LLM) inference by employing a lightweight draft model to propose candidate toke…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

lmfaoooo at SemEval-2026 Task 1: Humor Is an Audience. Preference Modeling for Constrained Humor Generation

Humor generation remains difficult not only because producing fluent, novel jokes is hard, but because "funny" is audience-dependent and su…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

TrustLDM: Benchmarking Trustworthiness in Language Diffusion Models

The rapid development of Language Diffusion Models (LDMs) challenges the dominant position of auto-regressive competitors in language proce…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding, Research/Papers

A Multi-Domain Red Teaming Framework for Safety, Robustness, and Fairness Evaluation of Medical Large Language Models

Large language models (LLMs) are increasingly deployed across healthcare, yet existing benchmarks fail to capture model behavior under adve…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

TCAR-Gen: Temporal Graph Retrieval with Evidence Fusion for Knowledge-Grounded Generation

Retrieval-augmented generation systems struggle with temporal reasoning and evidence fusion when answering complex questions over historica…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

LLMs for Cardiovascular Risk Prediction from Structured Clinical Data

Coronary artery disease (CAD) remains one of the leading causes of death globally, highlighting the need for reliable predictive systems to…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Make Mechanistic Interpretability Auditable: A Call to Develop Guidelines via Continuous Collaborative Reviewing

While mechanistic interpretability (MI) has produced important insights into neural network internals, the field has yet to establish a sta…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Update Opacity: Epistemic Accessibility and Governance Under AI System Change

Machine learning models embedded in deployed AI systems are routinely updated to maintain correct functioning over time. Yet such updates c…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Beyond Tool Adoption: A Practical Five-Stage Developmental Continuum for AI Literacy in Higher Education

Artificial intelligence (AI) literacy is increasingly recognized as a foundational competency for all university graduates. Yet students' e…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Beyond Categories of Caste: Examining Caste Bias and Morality in Text-to-Image AI Models

Text-to-Image (T2I) models have shown promising utility across various domains. However, such models are also amplifying harmful societal b…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Tracing GenAI Literacy: Uncovering Student-AI Interaction Patterns in Academic Writing through Epistemic Network Analysis

As Generative AI (GenAI) becomes integral to education, fostering GenAI literacy is critical. However, current assessments largely rely on…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Improving Hospital Process Management through Process Mining: A Case Study on COVID-19 Clinical Pathways

This study analyzes COVID-19 care pathways using the COVID Data for Shared Learning dataset. We build a transparent, reproducible pipeline…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Algorithmic Authority and the Clinical Standard of Care

The integration of artificial intelligence into clinical medicine creates a fundamental tension between algorithmic probabilistic reasoning…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

When Jokes Cross the Line: Analyzing Regular Humor and Dark Humor in YouTube Shorts

Video platforms such as YouTube have reshaped how users engage with entertainment and information, emphasizing brief, highly engaging conte…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Comprehensive AI governance requires addressing non-model gains

Frontier AI governance often centres on the model-level governance paradigm, which assumes that a model's capability profile is primarily a…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Measuring and Mitigating Bias in Code Generated by Large Language Models

Large language models (LLMs) are widely recognised for their applications in natural language generation and are increasingly used for code…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Business Utility of Large Language Models as Exploratory Data Analysis Agents

Large Language Models (LLMs) are increasingly used in analytical workflows, but their suitability as exploratory data analysis (EDA) agents…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Robotics

From Human Videos to Robot Manipulation: A Survey on Scalable Vision-Language-Action Learning with Human-Centric Data

Recent progress in generalizable embodied control has been driven by large-scale pretraining of Vision-Language-Action (VLA) models. Howeve…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Physics-Informed Neural Networks for Radial Consolidation of Combined Electroosmotic, Vacuum and Surcharge Preloading Considering Smear Effects

This study develops a dimensionless multi-domain physics-informed neural network (PINN) framework for electro-osmotic radial consolidation…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Beyond Text and Tables: Vision-Language Model Integration in ComProScanner for Extracting Materials Data from Scientific Figures with High Accuracy

Automated extraction of materials composition-property data from scientific literature has advanced considerably with the development of la…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Rare Events, Real Signals: Functional Ensembles as Units of Computation in Deep Spiking Networks

We investigate how internal representations emerge across hierarchical processing systems by introducing a neuroscience-inspired framework…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

CLSP-REQA: A Real-Time Quality-Aware Closed-Loop Seizure Prediction Framework with Mamba-BiLSTM and Confidence-Gated Intervention

Reliable seizure prediction is a prerequisite for closed-loop neurostimulation therapy, yet existing methods rarely account for the variabi…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Improved Belief-Attention in Vision Task

Recently, Belief-Attention [Guoqiang25BeliefAttention] has been proposed by first performing an orthogonal projection of the softmax-b…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Flow-Based Generative Modeling for Optimizing Sampling Policies in Compressed Sensing Applications

Numerous modern applications in signal processing and medical imaging necessitate acquiring high-dimensional signals under tight resource c…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

BitsMoE: Efficient Spectral Energy-Guided Bit Allocation for MoE LLM Quantization

Mixture-of-Experts (MoE) large language models reduce per-token computation through sparse expert activation, but their deployment remains…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Planktonzilla: Multimodal dataset and models for understanding plankton ecosystems

Marine plankton underpin aquatic food webs and play a key role in global CO2 sequestration, making reliable species identification critical…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

DAStatFormer: A Hybrid Multibranch Transformer with Statistical Feature Integration for DAS-Based Pattern Recognitions

Distributed Acoustic Sensing (DAS) enables large-scale monitoring through optical fibers, but its high dimensionality and complex spatio-te…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Hoeffding Concept Bottleneck Models with Applications to Overhead Images

Explainability of deep learning algorithms is critical for computer-vision applications with high-stake decisions. Concept bottleneck model…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Robotics

From Demonstrations to Rewards: Test-Time Prompt Optimization for VLM Reward Models

Reinforcement learning relies on accurate reward functions, which are often hand-crafted or even unavailable in real-world applications, su…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

SentimentLens: Reconciling Sentiment and Ratings via Dual-Modality in the Hospitality Sector

Online travel platforms generate vast volumes of user-generated hotel reviews, offering rich opportunities to understand traveler experienc…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Structured Visual Evidence Decomposition for Evidence-Grounded Multimodal Screening of Obstructive Sleep Apnea-Hypopnea Syndrome

Effective pre-polysomnography screening for obstructive sleep apnea-hypopnea syndrome (OSAHS) requires combining clinical risk factors with…

2026-06-02 13:00 JST | arXiv cs.AI | Robotics, Hardware/Semiconductors

Can Predicted Dynamics Exist in the Physical World?

Predictive Physical AI systems output state rollouts, action chunks, and latent plans, yet a low root-mean-square error (RMSE) does not imp…

2026-06-02 13:00 JST | arXiv cs.AI | Agents, Robotics

Silent Failures in Physical AI: A Literature Review of Runtime Action Authorization for Autonomous Systems

Physical AI systems increasingly map multimodal observations, language instructions, and learned world representations into physically cons…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

DLLM-JEPA: Joint Embedding Predictive Architectures for Masked Diffusion Language Models

Joint Embedding Predictive Architectures (JEPAs) have reshaped self-supervised representation learning in vision. The recent LLM-JEPA porte…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Aligning Cellular Sheaves with Classifier Attention for Interpretable Weakly-Supervised Pathology Localization

Weakly-supervised classification of whole-slide images with attention-based multiple instance learning (ABMIL) on top of foundation feature…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Diffusion Image Generation with Explicit Modeling of Data Manifold Geometry

Image generative models aim to sample data points from the underlying data manifold, a task that requires learning and decoding a dense, lo…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Agents, Robotics

Bridging the 2D-3D Gap: A Hierarchical Semantic-Geometric Map for Vision Language Navigation

Vision-Language Navigation (VLN) enables embodied agents to reach target locations in unseen environments by following language instruction…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents

Diversity Over Frequency: Rethinking Tool Use in Visual Chain-of-Thought Agents

Visual agents employ external visual tools within visual chains of thought to incorporate fine-grained evidence. While prior work has mainl…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

CoilDrop-MRI: Self-supervised physics-guided MRI reconstruction with coil dropout

Self-supervised deep learning-based methods have shown great promise for accelerated magnetic resonance imaging (MRI) reconstruction, achie…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

CoCoVideo: The High-Quality Commercial-Model-Based Contrastive Benchmark for AI-Generated Video Detection

With the rapid advancement of artificial intelligence generated content (AIGC) technologies, video forgery has become increasingly prevalen…

2026-06-02 13:00 JST | arXiv cs.AI | Agents, Robotics

PEACE: A Planner-Executor Agent with Constraint Enforcement for UAVs

Foundation models are increasingly used to drive autonomous systems, yet existing approaches either keep the model in a tight control loop,…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

Visual-Noise Guided In-Context Distillation for Multimodal Large Language Model Unlearning

Multimodal Large Language Models (MLLMs) have achieved remarkable progress on vision-language tasks, but they may also memorize and expose…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

A Methodological Framework for Explicit Control of the Speed-Accuracy Trade-off in Brain-Computer Interfaces

Brain-computer interfaces (BCIs) are limited by low signal-to-noise ratio in modalities such as electroencephalography, which requires mult…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Motif-based morphology signatures for interpretable ECG screening and monitoring

Electrocardiography (ECG) remains central to cardiovascular screening, yet interpretation remains largely manual and episodic. Clinical pra…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Project SPARROW and the Future of Conservation Technology

Global biodiversity is declining at unprecedented rates, yet the tools available to monitor and protect ecosystems remain limited by constr…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Robotics

VDSB-GWSyn: Diffusion Schrödinger Bridge for Controllable and Anatomically Feasible Guidewire Synthesis in Coronary Angiography

Coronary guidewire endpoint localization is a fundamental capability for computer-assisted PCI, and its importance increases as robot-assis…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Enhancing BiGRU with a KAN Block for Legal Document Classification and Summarization

This study introduces a novel architecture of KAN-based BiGRU model for the task of classification and summarization of legal documents in…

2026-06-02 13:00 JST | arXiv cs.AI | Agents, Robotics

V2I Work Zone Geometry Reconstruction with Pose-Conditioned UWB Range Denoising

Reliable work zone mapping is important for connected and autonomous vehicles (CAVs) to navigate safely and smoothly through work zone area…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

SpikeWFM: Spiking-Aided Wireless Foundation Model for Robust Channel Prediction

This paper proposes SpikeWFM, a novel hybrid architecture that integrates spiking neural networks (SNNs) with conventional artificial neura…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Versatile Framework with Semantic and Structural guidance for Image Reconstruction from Brain Activity

Reconstructing visual stimuli from brain recordings has been a meaningful and challenging task in brain decoding. Especially, the achieveme…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Business/Funding, Research/Papers

CardioLens: Revealing the Clinical Reality Gap of MLLMs via Multi-Sequence Cardiac MRI Evaluations

Multimodal Large Language Models (MLLMs) have shown strong performance on public medical benchmarks, yet existing evaluations often remain…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Multimodal Music Recommendation System using LLMs

Music recommendation systems typically treat songs as opaque tokens, relying on collaborative interaction histories which overlooks semanti…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

A Shared Valence Axis Across Modern LLMs and Human EEG: The Saturation Regularity

Large language models (LLMs) have emerged as powerful representation learners whose internal features increasingly align with human cogniti…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Automatically Differentiable Nonlinear Tensor Networks (ADNTNs) for Exponential Compression of Deep Neural Networks

We study Automatically Differentiable Nonlinear Tensor Networks (ADNTNs), a family of structured weight generators whose compact core tenso…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

AI-PROPELLER: Warehouse-Scale Interprocedural Code Layout Optimization with AlphaEvolve

Post-link optimizers (PLOs) such as Propeller and BOLT have demonstrated that precise, profile-guided code layout can extract significant p…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Foundation-Preserving Adaptation via Generalized Rayleigh-Quotient Optimization

While finetuning effectively adapts foundation models to specialized downstream tasks, it can degrade nontarget capabilities acquired durin…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

XAI-SOH-FL: Enhancing SOH-FL with Adaptive Aggregation and Explainable AI for Intrusion Detection in Heterogeneous IoT

Intrusion Detection Systems (IDS) in Internet of Things (IoT) environments face significant challenges due to data heterogeneity, lack of l…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

On Effectiveness and Efficiency of Agentic Tool-calling and RL Training

Tool-calling is a central component of modern large language model (LLM) agents, equipping them with skills beyond their parametric knowled…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Generative AI and Digital Ecosystem Resilience: A Proactive Lifecycle-Based Survey

The proliferation of adversarial synthetic content, accelerated by Generative AI (GenAI), is rendering traditional reactive detection method…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Geodesics with Unified Tangent-constrained Priors and Curvature Regularization

Curvature-penalized geodesic models have proven their effectiveness in image segmentation by computing globally optimal curves. Unfortunate…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Geometric Erasure by Contrastive Velocity Matching in Rectified Flows

While the rapid adoption of multimodal generative models offers immense potential, it has also increased the risks of harmful content synth…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Adaptive data selection improves wearable prediction under low baseline performance

Adaptive sensing strategies that selectively sample data are increasingly used in wearable health systems to improve prediction performance…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Regime-Adaptive Continual Learning for Portfolio Management

Financial markets are inherently non-stationary, exhibiting frequent regime shifts and structural changes that render traditional Portfolio…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

BudgetDraft: Acceptance-Aware Multi-View Training for Sparse-KV Speculative Decoding

Speculative decoding speeds up autoregressive decoding by using a drafter to propose multiple tokens that a verifier validates in parallel.…

2026-06-02 13:00 JST | arXiv cs.AI | Agents, Robotics

Completion at the Boundary (CaB): Deployable Switching with Completion-Aware Control under Limited Calibration

Vision-language-action (VLA) agents can execute natural-language instructions, yet deployed systems still lack an operational interface: de…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Multi-Contrast MRI Motion Correction via Parameter-Informed Disentanglement and Adaptive Experts

Motion artifacts in magnetic resonance imaging (MRI) degrade diagnostic reliability. Existing deep learning methods are typically contrast-…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

RAFT: Data Refinement and Adaptive Distillation for Domain Fine-Tuning with Alleviated Forgetting

Domain-specific supervised fine-tuning (SFT) often improves in-domain performance at the cost of degrading a model's general capabilities.…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

StemBind: When MLLMs Get Lost Between Rules and Instances in Abstract Visual Reasoning

Multimodal large language models (MLLMs) often know the rule but pick the wrong answer: on abstract visual reasoning (AVR) tasks, a model c…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Persona Attack: Incremental Memory Injection Jailbreak Attack against Large Language Models

As Large Language Models evolve for user convenience, vulnerability to jailbreak attacks continues to be reported despite ongoing efforts i…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Emergence of Exploration in Policy Gradient Reinforcement Learning via Retrying

In reinforcement learning (RL), agents benefit from exploration only because they repeatedly encounter similar states: trying different act…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

PrivacyPeek: Auditing What LLM-Based Agents Acquire, Not Just What They Say

LLM-based agents are rapidly advancing, autonomously invoking external tools to complete multi-step tasks for users. However, agents often…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

DiffCrossGait: Trajectory-Level Alignment for 2D-3D Cross-Modal Gait Recognition via Latent Diffusion

Cross-modal 2D-3D gait recognition is impeded by inherent domain discrepancies between 2D silhouette and 3D LiDAR range-view representation…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Benchmarking Multimodal LLMs on Code Generation for Complex Interactive Webpages

Recent advancements in multimodal large language models (MLLMs) have achieved remarkable progress in multimodal reasoning and code generati…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

A Protocol-Language Model for Network Intrusion (Without Deep Packet Inspection)

Modern network intrusion detection systems (NIDS) are caught in a structural contradiction: the protocols carrying the highest threat intel…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

A physics-informed foundation model for quantitative diffusion MRI

Understanding the human brain requires access to its microscopic tissue architecture. Diffusion magnetic resonance imaging (MRI) provides t…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Interpreting FCDNNs via RG on Exponential Family

We consider establishing the interpretability theory of deep learning through constructing a corresponding relationship between the renorma…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Digital-to-Physical Transfer of Adversarial Patches for Aerial Vehicle Detection

Deep neural network (DNN)-based object detectors are widely used for analyzing aerial and satellite imagery in applications such as environ…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

DataShield: Safety-degrading Data Filtering for LLM Benign Instruction Fine-Tuning

Large language models (LLMs) suffer from degraded safety capabilities even when fine-tuned with benign datasets. However, existing methods…

2026-06-02 13:00 JST | arXiv cs.AI | Business/Funding

Improving IoT Intrusion Detection Through SMOTE-Based Oversampling and Extended Multi-Model Evaluation on Side-Channel Power Data

The detection of intrusions in IoT-based networks poses challenges that cannot be overcome using traditional machine learning methods. Perh…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

ChurnNet: A Optimized Modern AI for Churn Prediction

Increased competition and the growing similarity of products and services offered by retailers have lowered the barriers for customers to s…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

UF-AMA: A unified framework for cross-domain emotion recognition via adaptive multimodal alignment

In recent years, emotion recognition based on physiological signals such as electroencephalogram (EEG) has gained considerable attention, a…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

MyoSem: Aligning Electromyography to Natural-Language Action Semantics for Hand Action Understanding

Electromyography (EMG) directly reflects muscle activation and is a key sensing modality for gesture recognition, prosthetic control, and w…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Beyond Augmentation: Score-Guided Pathological Prior for EEG-based Depression Detection

Deep learning-based Major Depressive Disorder (MDD) detection using Electroencephalography (EEG) is fundamentally constrained by the "small…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

The New Social Image: How AI Competency and AI Proactivity Influence Self- and Peer-Perceptions in the Workplace

Human-AI collaboration is considered the most promising way to incorporate AI in the workplace. What remains unexplored are the experientia…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Agentic Transformers Provably Learn to Search via Reinforcement Learning

Tree search is a central abstraction behind many language-agent reasoning and decision-making tasks: agents must explore actions, remember…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Learning to Construct Practical Agentic Systems

Automated design and optimization of agentic LLM-based systems leads to sophisticated systems that substantially improve result quality ove…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

BAGEN: Are LLM Agents Budget-Aware?

While agents are increasingly spending more resources, today agent cost is mostly measured only after execution. A Budget-Aware Agent (BAGE…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

From Rashomon Theory to PRAXIS: Efficient Decision Tree Rashomon Sets

Standard machine learning pipelines often admit many near-optimal models. These "Rashomon sets" pose a range of challenges and opportunitie…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

SEMBridge: Tagless-Final Program Semantics with Weakest-Precondition and Bounded-Checking Interpretations

Formal methods provide rigorous accounts of program behavior, but practical software engineering often works through executable libraries,…

2026-06-02 13:00 JST | arXiv cs.AI | Robotics

Continuous Reasoning for Vision-Language-Action

Natural language is a powerful reasoning medium for language and vision-language models, but it is mismatched to the granularity of continu…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Civilizational Metamaterials: Engineering Coordination Under Capability Gradients and Structural Turbulence

We argue that governance must transition from a normative discipline to an engineering discipline, and we develop a formal framework, inspi…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

InfoAtlas: A Foundation Model for Zero-Shot Statistical Dependence Estimate

Measuring statistical dependency between high-dimensional random variables is a fundamental task in data science and machine learning. Neur…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Effects of Varying LLM Access on Essay Writing Behavior

Investigating the degree to which large language models (LLMs) affect teaching and learning in universities can help identify strategies fo…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

ARCA: Adapter-Residual Credit Assignment When Token Signals Degenerate

Token-level credit assignment for language-model reinforcement learning is usually formulated as if the policy were fully trainable, while…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

When Softmax Fails at the Top: Extreme Value Corrections for InfoNCE

InfoNCE is the standard contrastive learning objective, but its softmax form is not only a computational convenience: it also encodes a sta…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Robotics, Business/Funding

StressDream: Steering Video World Models for Robust Policy Evaluation and Improvement

Video world models (WMs) have shown promise for policy evaluation and improvement by imagining realistic future observations conditioned on…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Hyperbolic and Evidence-Prioritized Experts for Large Vision-Language Models

Large Vision-Language Models (LVLMs) have demonstrated impressive performance on multimodal tasks through scaled architectures and extensiv…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Synthetic Data from Cross-Domain Events for Large-Scale Recommendation Systems

Large-scale recommendation systems operate across diverse domains, yet they face the challenges of data sparsity and noisy implicit feedbac…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Real2SAM2Real: Generative 3D Caches as Complementary Context for Video Diffusion

While Video Diffusion Models (VDMs) excel at synthesizing high-fidelity videos, enabling precise camera and scene control remains challengi…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Bridging Reasoning Trajectories in On-Policy Distillation via Near-Future Guidance

On-Policy Distillation (OPD) improves large language model reasoning by training a student model on trajectories sampled from its own polic…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Rethinking the Role of Temperature in Large Language Model Distillation

Reverse Kullback-Leibler (RKL) divergence is widely favored over forward KL (FKL) in large language models (LLM) distillation, yet this pre…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

How Generation Architecture Shapes Code Complexity in Multi-Agent LLM Systems: A Paired Study on HumanEval

Large-language-model code generation has shifted from single-shot prompting to multi-agent orchestrations - analyst, coder, tester, and deb…

2026-06-02 13:00 JST | arXiv cs.AI | Robotics

DRL-Based Pose Control for Double-Ackermann Robots Under Actuation Uncertainties

Robust deployment of deep reinforcement learning (DRL) policies on real robots remains challenging due to discrepancies between simulation…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

LLMs Need Encoders for Semantic IDs Too

Multimodal LLMs use dedicated encoders to bridge non-language modalities (vision encoders for images, depth models for audio codec tokens)…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Isolating LLM Lexical Bias: A Curation-Free Triangulated Metric for Preference-Stage Learning

Various language domains have undergone remarkable changes in recent years; these shifts are largely attributed to the advent of Large Lang…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

ROGUE: Misaligned Agent Behavior Arising from Ordinary Computer Use

As AI agents are increasingly deployed in real personal and corporate settings (email accounts, development workflows, company databases, e…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

(HB-ARFM) History-Bootstrapped Flow Matching for Inverse Boiling Reconstruction

Reconstructing spatiotemporal fields from partial observations is fundamental to scientific inference, from inferring atmospheric states fr…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Drift Q-Learning

Offline reinforcement learning requires improving a policy from fixed data while avoiding out-of-distribution actions with unreliable value…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Reinforcement Learning with Pairwise Preferences in Long-Term Decision Problems

Reinforcement learning problems typically define the goal as maximizing the expected value of a scalar reward function. But, pairwise prefe…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Agentic Authoring of Interactive Multiview Visualizations in Genomics

Diverse genomics data, scientific questions, and analysis tasks typically demand highly specialized visualizations. Therefore, users often…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Hardware/Semiconductors, Business/Funding

SUPREME: A Multi-GPU Framework for Reproducible Image Unlearning Method Evaluation

Machine unlearning removes the influence of specific training data from a trained model without retraining it from scratch. Evaluating an u…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Zamba2-VL Technical Report

We present Zamba2-VL, a suite of vision-language models built on Zamba2, a hybrid language-model architecture combining Mamba2 state-space…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Detector-Evasive LLM Paraphrasing via Constrained Policy Optimization

AI-text detectors are vulnerable to paraphrasing and detector-guided paraphrasing attacks, but existing detector-evasion methods often lack…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

PR2: Predictive Routing Replay for MoE-Based LLM Reinforcement Learning

Mixture of Experts (MoE) Large Language Models (LLMs) achieve strong performance at scale. However, reinforcement learning (RL) on MoE-base…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

A Distribution-Free Framework for Rewrite-Based Human-text Detection via Knockoff Filtering

We propose a distribution-free statistical framework that converts arbitrary rewrite-based detectors into detectors with finite-sample FDR…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Masking Stale Observations Helps Search Agents -- Until It Doesn't: A Regime Map and Its Mechanism

Long-horizon search agents accumulate large amounts of retrieved content across many tool calls, making context-budget efficiency increasin…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

AgentxGCore: Agentic AI for Next-Generation Mobile Core Network

To meet the stringent requirements of emerging applications and the increasingly complex network management and operation, the Next Generat…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Finer Parameter Steps for Low-Rank PEFT: A Controlled Study with CP Tensor Adapters

Low-rank adapters are usually compared by sweeping a small set of ranks, but the rank also fixes the resolution of the parameter budget. Fo…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Detect Before You Leap: Mirage Detection in Vision-Language Models

Vision-language models (VLMs) can produce confident visual answers even when the required visual evidence is missing, blank, or unrelated t…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

DarkVesselNet: Multi-Modal Remote Sensing and Trajectory Reasoning for Dark Vessel Detection

Dark vessel detection requires fusing what vessels report through AIS with what satellites observe through radar and optical sensors. DarkV…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

GeoSAM-3D: Geodesic Prompt Propagation for Open-Vocabulary 3D Scene Segmentation from Monocular Video

Open-vocabulary 3D scene segmentation usually assumes RGB-D video, calibrated multi-view imagery, or a reconstructed mesh. GeoSAM-3D studie…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

When Safe Skills Collide: Measuring Compositional Risk in Agent Skill Ecosystems

LLM agents increasingly rely on community-contributed skills that expand an agent's operational capability set. We study a core safety prob…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Short-form Text Rewriting with Phi Silica

Short-form text rewriting is a constrained variant of paraphrasing in which limited context and high semantic density leave little room for…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

On the Limits of LLM Adaptability: Impact of Model-Internalized Priors on Annotation Task Performance

Large Language Models (LLMs) are increasingly used for zero-shot annotation and LLM-as-a-judge tasks, yet their reliability hinges on how m…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents, Hardware/Semiconductors

CodeCytos: AI-assisted spatial molecular imaging analysis via code-augmented agent action space

Conventional tissue image analysis software provides foundational capabilities for cellular analysis, including segmentation, basic morphol…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

Pre-Deployment Robustness Stress Testing for CT Segmentation Systems Using Clinically Motivated Multi-Corruption Augmentation

Deep learning-based CT segmentation systems often achieve high accuracy on clean benchmark images, but their performance may degrade under…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

TabChange: Precise Attribute Changes in Tabular Data

Modifying an attribute in tabular data often introduces an unnatural instance by breaking its relationships with other attributes. The modi…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

V-LynX: Token Interface Alignment for Video+X LLMs

This study introduces an intriguing phenomenon in Video LLMs: rather than merely translating frames into textual embeddings, Video LLMs est…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Skill or Skip? Learning Selective Skill Invocation in Agentic Tasks via Dual-Granularity Preference Learning

Agent skills are callable procedural modules that provide reusable knowledge and execution policies for complex agentic tasks. However, exi…

2026-06-02 13:00 JST | arXiv cs.AI | Robotics

PaCo-VLA: Passivity-Shielded Compliance Prior for Contact-Rich Vision-Language-Action Manipulation

Contact-rich manipulation demands both high-level semantic reasoning and the safe regulation of high-frequency contact dynamics. While Visi…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

CAFOSat: A Strongly Annotated Dataset for Infrastructure-Aware CAFO Mapping Using High-Resolution Imagery

Concentrated Animal Feeding Operations (CAFOs) play an important role in agricultural production but are also associated with environmental…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Richer Representations for Neural Algorithmic Reasoning via Auxiliary Reconstruction

Neural algorithmic reasoning has emerged as a popular research direction. It aims to train neural networks to mimic the step-by-step behavi…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Interpretable Policy Distillation for Power Grid Topology Control

Deep reinforcement learning (RL) offers a promising route to real-time power grid operation, yet large neural policies are costly to evalua…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

A Practical Upper Bound on Selection Bias Effects in Medical Prediction Models

Selection bias is a common and often unavoidable aspect of real-world data that challenges the generalizability of machine learning models.…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Revisiting Parameter-Based Knowledge Editing in Large Language Models: Theoretical Limits and Empirical Evidence

Parameter-based knowledge editing updates the internal knowledge of large language models (LLMs) via localized weight modifications and has…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

On the Difficulty of Learning a Meta-network for Training Data Selection

Synthetic data are increasingly used to train neural networks, yet distributional mismatch with real data limits their effectiveness when u…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Improving Visual Representation Alignment Generation with GRPO

Recent diffusion transformers have demonstrated strong image synthesis capabilities but remain inefficient to train due to weak alignment b…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Critic-R: Improving Agentic Search using Instruction-tuned Retrievers with Natural Language Introspective Feedback

Agentic search systems iteratively interact with retrieval models to answer complex queries. Despite substantial progress, optimizing retri…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

SPADER: Step-wise Peer Advantage with Diversity-Aware Exploration Rewards for Multi-Answer Question Answering

Large language models are increasingly deployed as tool-augmented agents to acquire information beyond parametric knowledge. While recent w…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

CARE-RL: Capability-Aware Reinforcement Learning for Mitigating Cross-Domain Conflicts

Reinforcement learning (RL) with verifiable rewards has achieved strong progress in reasoning-oriented LLMs, but extending it to multi-doma…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

MemGraphRAG: Memory-based Multi-Agent System for Graph Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) has become an essential method for mitigating hallucinations in Large Language Models (LLMs) by levera…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors

Linguistics-Aware Non-Distortionary LLM Watermarking

Watermarking should identify language-model output without degrading quality or limiting verification to the model provider. Multilingual d…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

Pause and Think: A Dataset and Benchmark for Video-Grounded Assistive Action Suggestion

Recent Vision-Language Models (VLMs) struggle with grounded reasoning, temporal consistency, and context aware planning in videos. We intro…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

MemPro: Agentic Memory Systems as Evolvable Programs

Long-horizon autonomous agents require memory systems to retain historical information, track evolving states, and reuse relevant knowledge…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Authenticity Debt and the Synthetic Content Threat Landscape: A Layered Framework for Trust, Provenance, and IP Governance in the Generative AI Era

Generative artificial intelligence has fundamentally changed how content is now produced. It has enabled how high-fidelity text, images, au…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

LP5X-PIM Sim: A High-Fidelity HW/SW Integrated Simulator for LPDDR5X-PIM

This tech note describes the architecture and execution results of the LPDDR5X-PIM simulator, developed by Samsung Electronics. Based on th…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

LinguIUTics at PsyDefDetect: Iterative Imbalance-Aware Fine-tuning of Qwen3-8B for Psychological Defense Mechanism Classification

Detecting psychological defense mechanisms in conversational text remains a challenging clinical NLP problem. For the PsyDefDetect 2026 sha…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

MESA: Improving MoE Safety Alignment via Decentralized Expertise

Mixture-of-Experts (MoE) architectures scale Large Language Models (LLMs) efficiently, enabling greater capacity with reduced computational…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Scaling Behavior of Single LLM-Driven Multi-Agent Systems

The burgeoning field of LLM-based Multi-Agent Systems (MAS) promises to tackle complex tasks through collaborative intelligence, yet fundam…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Demystifying the Optimal Fair Classifier in Multi-Class Classification

Ensuring fair and equitable treatment across diverse groups, particularly in multi-class classification tasks, poses a significant challeng…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Collaborative Few-Step Distillation and Low-Bit Quantization for Wan2.2 Dual-Expert Video Diffusion Models

Large video diffusion models achieve strong visual quality but remain expensive to deploy because each sample requires many denoising steps…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Beyond the Mouth: Upper-Face Affective Cues in Audiovisual Sentence Recognition under Acoustic Uncertainty

Face-to-face speech comprehension is inherently multimodal, integrating acoustic signals with visible articulation, facial expression, head…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

The Paradox of Outcome Optimization: A Causal Information-Theoretic Bound on Reasoning Shortcuts in LLMs

Large Language Models (LLMs) aligned via outcome-based Reinforcement Learning (RL) frequently exhibit a critical failure mode: they achieve…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

COPF: An Online Framework for Deployment-Stable Counterfactual Fairness in Evolving Graphs

Online link recommendation on evolving graphs is performative: by choosing which candidate links to show users, the system changes which li…

2026-06-02 13:00 JST | arXiv cs.AI | Robotics

Shape Your Body: Value Gradients for Multi-Embodiment Robot Design

We propose to turn generalist multi-embodiment value functions into reusable models for robot design. Instead of running a new reinforcemen…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Information-Theoretic Lower Bounds for Bit-Constrained Stochastic Optimization via a Reduction to Compressed Gaussian Mean Estimation

Low-precision pretraining (FP8, MXFP4, NVFP4) is now standard for frontier language models, yet the literature is almost entirely achievabi…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Multi-Agent Conformal Prediction with Personalized Statistical Validity

Uncertainty quantification is essential in high-stakes machine learning tasks. However, one of the principled solutions, conformal predicti…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors

EPIC: Efficient and Parallel Inference under CFG Constraints for Diffusion Language Models

Controlling language model outputs is essential for ensuring structural validity, reliability, and downstream usability, and diffusion lang…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

WaveFilter: Enhancing the Long-Context Capability of Diffusion LLMs via Wavelet-Guided KV Cache Filtering

Diffusion Large Language Models (DLMs) have demonstrated significant advantages across various tasks. However, constrained by their multi-s…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

SORA: Free Second-Order Attacks in Fast Adversarial Training

Adversarial Training (AT) is a leading defense against adversarial examples but often suffers from Catastrophic Overfitting (CO) in efficie…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Quantum Tunneling-Aware Machine Learning: Physics-Derived Noise Models for Robust Deployment

Transistor scaling is approaching a quantum-mechanical limit, as thin gate oxides induce electron leakage through quantum tunneling. Unlike…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

SkyShield: Occupancy as a Safety Interface for Low-Altitude UAV Autonomy

For low-altitude Unmanned Aerial Vehicle (UAV) autonomy, 3D spatial understanding is not merely a perception objective, but the safety inte…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Causal Density Functions

We introduce causal density functions: Radon-Nikodym derivatives that compare interventional laws to observational laws and therefore act a…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Logit Distillation on Manifolds: Mapping by Learning

A simple way to improve the performance of almost any machine learning model is not to train a single but several models with diverse algor…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

GIRL-DETR: Gradient-Isolated Reinforcement Learning for Video Moment Retrieval

Video Moment Retrieval (VMR) task requires accurately localizing temporal boundaries aligned with natural language queries, but many models…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Behavior-Invariant Task Representation Learning with Transformer-based World Models for Offline Meta-Reinforcement Learning

Offline meta-reinforcement learning leverages static datasets to enable agents to generalize to unseen environments by combining offline ef…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Bayesian Inference of Nonlinear Malaria Dynamics in Ghana via an Ensemble Markov Chain Monte Carlo Sampler

Reliable quantification of malaria dynamics in sub-Saharan Africa is hindered by short, noisy, and spatially heterogeneous surveillance rec…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Extending Causal Metamodeling to a non-Markovian Queue

Metamodels for discrete-event simulations approximate the behavior of simulation models without running expensive simulations. Prior work i…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Hardware/Semiconductors

DASH: Dual-Branch Score Distillation for Guidance-Calibrated Compact Diffusion Models

Parameter compression of class-conditional diffusion models reveals an underexplored limitation in output-level distillation: the unconditi…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Dynamic Coordination Strategy Selection for Enterprise Multi-Agent Systems

Enterprise multi-agent systems increasingly expose multiple coordination patterns, but deployments often lack evidence for when to use cons…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Certificates without Electrons? Theory and Evidence on Impacts from AI-Driven Power Demand

Data centers now account for 4.4% of United States electricity demand, yet the grid-level effectiveness of the renewable energy certificate…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

SkillPager: Query-Adaptive Intra-Skill Navigation via Semantic Node Retrieval

Skill-based LLM agents increasingly rely on long procedural documents, but full-document prompting wastes tokens and dilutes information cr…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Beyond Independent Manipulation: Individual Fairness-aware Strategic Classification with Peer Imitation

Strategic classification (SC) investigates scenarios where agents manipulate their features to obtain favorable decisions from predictive m…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Hybrid Probabilistic Forecasting of Under-Five Malaria Admissions in Ghana: A Gaussian Process Regression with Holt-Winters Smoothing

Accurate malaria forecasting remains a major challenge in sub-Saharan Africa, where strong seasonality, reporting uncertainty, and non-stat…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

MoEIoU: Rethinking Bounding-Box Regression as a Mixture of Experts

Bounding-box regression is a fundamental component of object detection, playing a critical role in precise object localization. Existing In…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

RefDiffNet: Learning to Expose Subtle PCB Defects Before Detection

Printed circuit board (PCB) defect detection is challenging because many defects are small and difficult to distinguish from complex backgr…

2026-06-02 13:00 JST | arXiv cs.AI | Agents, Robotics

From Cues to Horizons: Dynamic Risk Horizon Profiling for Trajectory Prediction

Accurate and reliable vehicle trajectory prediction is essential for safe autonomous driving. Recent studies have incorporated safety risk…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

GenPT: Beyond Self-Report for Reliable LLM Psychometrics via Generative Projective Testing

Self-report questionnaires remain the prevailing tool for probing the psychological states of persona-conditioned agents (PC-Agents). Howev…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

Benchmarks for Vision-Language Models in Urban Perception Should Be Reliability-Aware and Negotiated

Vision-language models (VLMs) are increasingly used to generate structured descriptions of street-level imagery for tasks such as streetsca…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Task diversity produces systematic transfer but inhibits continual reinforcement learning

Continual reinforcement learning aims to produce agents that learn not only to improve at their current tasks but also to adapt as task dis…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Dive into Waves: Morlet Spectral Transformer for Cross-Subject Emotion Decoding from EEG

We study cross-subject emotion recognition from EEG, a practically important yet challenging problem in brain-computer interfaces. Unlike t…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Memory-Efficient LLM Training with Dynamic Sparsity: From Stability to Practical Scaling

Dynamic Sparse Training (DST) offers a promising paradigm for improving the training and inference efficiency of deep neural networks; howe…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

MLLM-Microscope: Unlocking Hidden Structure Within Multimodal Large Language Models

This work presents MLLM-Microscope, a novel system designed for analyzing the hidden representations within Multimodal Large Language Model…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Accuracy, Stability, and Repeated-Run Reliability of Large Language Models on Deterministic Programming Tasks

Run-level pass rate overstates retry-free coverage by up to 17.8 percentage points -- and the gap is largest precisely for mid-performing s…

2026-06-02 13:00 JST | arXiv cs.AI | Agents, Research/Papers

Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems

Open agent platforms allow community contributors to publish reusable skills that agents can invoke at runtime. This extensibility also cre…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Detection vs. Execution: Single-Bucket Probes Miss Half the Mamba-2 State Sink

Mechanistic interpretability often assumes that probes identifying a representational signature also identify the circuit executing the cor…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

CV-Arena: An Open Benchmark for Instructional Computer Vision Problem Solving with Human-AI Collaborative Preferences

Instruction-guided image editing is becoming a general interface for visual work, yet existing benchmarks still focus largely on narrow app…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Lodestar: An Online-Learning LLM Inference Router

Efficiently serving large language model (LLM) inference tasks is crucial both for user-perceived latency such as time-to-first-token (TTFT…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Silent Failures in Federated Personalization of Foundation Models

Foundation models are increasingly personalized on decentralized private data through federated learning and are now deployed at scale unde…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Explainable deep reinforcement learning reveals energy-efficient control strategies for turbulent drag reduction

We propose a method combining Multi-Agent Deep Reinforcement Learning (MARL) and eXplainable Deep Learning (XDL) to reduce drag in wall-bou…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

SS-ZKR: Spatial-Semantic Zero-Knowledge Routing for Privacy-Preserving Multi-Agent Collaboration

Foundational agent interoperability standards, notably the Agent-to-Agent (A2A) protocol and the Model Context Protocol (MCP), have advance…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

An Open-Source Benchmark and Baseline for Multi-temporal Referring Segmentation

Large Vision-Language Models (LVLMs) have shown strong visual understanding and language-guided grounding abilities, yet their capacity for…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Beyond Task-Agnostic: Task-Aware Grouping for Communication-Efficient Multi-Task MoE Inference

Sparsely activated Mixture-of-Experts (MoE) models scale capacity via conditional computation, but distributed inference suffers from cross…

2026-06-02 13:00 JST | arXiv cs.AI | Agents, Research/Papers

FVSpec: Real-World Property-Based Tests as Lean Challenges

We present a benchmark for evaluating AI models and agents on real-world formal software verification tasks. We first scrape 11,039 propert…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Cross-Axis Feature Fusion with Joint-Wise Motion Difference Prediction for Text-Based 3D Human Motion Editing

We address text-based 3D human motion editing, where the goal is to preserve the style and structure of a source motion while applying edit…

2026-06-02 13:00 JST | arXiv cs.AI | Robotics

AI-IoT-Robotics Integration: Survey of Frameworks, Emerging Trends, and the Path Toward Connected Robotics

The convergence of Artificial Intelligence, the Internet of Things, and Robotics is no longer a futuristic vision; it is rapidly becoming t…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding, Research/Papers

PolySpeech-100: A Large-Scale Benchmark for Speech Understanding Across 100+ Languages and Dialects

While End-to-End (E2E) Speech-Large Language Models (Speech-LLMs) are rapidly evolving, their evaluation methodologies remain limited to th…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Hybrid Verified Decoding: Learning to Allocate Verification in Speculative Decoding

Large Language Model (LLM) generation remains expensive because autoregressive decoding calls the model once for each new token. Speculativ…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

ProductWebGen: Benchmarking Multimodal Product Webpage Generation

Crafting a product display webpage from a source product image, along with layout and visual content instructions, holds significant practi…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Data Collection for Training Quality-Control AI in Carpet Manufacturing

Visual inspection remains the dominant quality-control practice in woven and tufted carpet production, yet it is slow, subjective, and inco…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

DSL-LLaDA: Scaling Continuous Denoising to 8B Masked Diffusion LMs

Discrete masked diffusion language models generate text by iterative parallel decoding, but few-step decoding suffers from a tradeoff betwe…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Business/Funding

Temporally-Aligned Evaluation for Audio-Driven Talking Head Generation

Audio-driven talking-head generation has advanced rapidly, yet existing evaluation protocols mainly rely on frame-wise metrics that assume…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

OPD+: Rethinking the Advantage Design for On-Policy Distillation

On-policy distillation (OPD) is a widely used technique to transfer capabilities from capable teacher language models to the base student m…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Plausibility Is Not Prediction: Contrastive Evidence for LLM-Based Cellular Perturbation Reasoning

Perturbation experiments are central to understanding cellular mechanisms, but remain costly and sparse, motivating prediction of gene expr…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents, Research/Papers

3DCodeBench: Benchmarking Agentic Procedural 3D Modeling Via Code

Procedural 3D modeling through code is emerging as a versatile paradigm, offering deterministic, engine-ready, and precisely editable asset…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

MENTIS: What Belief Changes Under Alignment? Measuring Multi-Scale Latent Torsion in Language Models

Preference alignment has substantially improved the observable behavior of large language models, yet it remains unclear what alignment cha…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Leyline: KV Cache Directives for Agentic Inference

Modern KV cache management assumes the chatbot workload: prompts arrive once and the cache grows append-only, so prefix caching and forward…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Test-Time Training for Zero-Resource Dense Retrieval Reranking

Dense retrievers excel at first-stage candidate generation but lack effective reranking in zero-resource settings. Existing approaches face…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

ThinkSwitch: Context Distillation with LoRA and Weight Interpolation for Specific-Purpose Reasoning Tasks

Large language models often improve on difficult tasks by spending inference-time compute on a reasoning trace before producing the final a…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

MViewRouter: Internalizing Geometric Equivariance via Multi-view Alternating Attention for Combinatorial Routing

Combinatorial routing problems such as the Traveling Salesman Problem (TSP) and the Capacitated Vehicle Routing Problem (CVRP) are fundamen…

2026-06-02 13:00 JST | arXiv cs.AI | Business/Funding

Strong Stochastic Flow Maps

Flow and diffusion models generate high-quality samples in many modalities; however, many network evaluations are required during inference…

2026-06-02 13:00 JST | arXiv cs.AI | Hardware/Semiconductors

A Fiber Criterion for Representation Identifiability in Supervised Learning

Supervised learning evaluates predictors through their input-output behavior. When a predictor is implemented as a composition $f=c\circ h$…

2026-06-02 13:00 JST | arXiv cs.AI | Robotics

Beyond Task Success: Behavioral and Representational Diagnostics for WAM and VLA

Vision-language-action (VLA) policies and World-Action Models (WAM) represent two increasingly important paradigms for robotic manipulation…

2026-06-02 13:00 JST | arXiv cs.AI | Robotics

Implicit Drifting Policy: One-Step Action Generation via Conditional Expert Geometry

Generative action policies based on diffusion or flow matching excel in behavior cloning, yet their iterative sampling is prohibitive for h…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

MiCU: End-to-End Smart Home Command Understanding with Large Language Model

Command understanding systems in smart home ecosystems can automate device control and substantially improve user experience. However, whil…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Soft-NBCE: Entropy-Weighted Chunk Fusion for Long-Context

The quadratic complexity of self-attention remains a bottleneck for Large Language Models (LLMs) processing ultra-long contexts. The Naive…

2026-06-02 13:00 JST | arXiv cs.AI | Hardware/Semiconductors

HASTE: Hardware-Aware Dynamic Sparse Training for Large Output Spaces

Extreme multi-label classification (XMC) involves learning models over large output spaces with millions of labels, making the output layer…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

STARFISH: faST Accuracy Recovery in pruned networks From Internal State Healing

Pruning is a process designed to reduce the number of weights in a large neural network. This can substantially speed up inference but migh…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

AMP: A Vendor-Neutral Wire Format for Agent Memory Operations

Agent-memory frameworks - mem0, Letta/MemGPT, Cognee, Zep/Graphiti, MemoryOS, MemTensor - each ship their own SDK, storage layout, and oper…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

ASE-26: a curriculum for agentic software engineering as a discipline

The work of a professional software engineer has begun to consist, increasingly, of directing agents rather than writing code, and the empi…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

When Data Is Scarce: Scaling Sparse Language Models with Repeated Training

Scaling laws for dense LLMs under infinite data are well explored, but how sparsity interacts with limited data is not. In this work, we st…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

AI From the Margins (AIM): Rethinking Participatory AI Design Through the Lived Experience of Minoritized Communities

Artificial intelligence (AI) can reproduce and amplify the structural inequities faced by minoritized communities. Participatory AI has bee…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Physics-Informed Deep Learning for Entropy Prediction in Heterogeneous Systems: Thermodynamic and Information-Theoretic Case Studies

Entropy production governs irreversibility and uncertainty in both physical and information-theoretic systems. While Physics-Informed Neura…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

CA-BED: Conversation-Aware Bayesian Experimental Design

Large Language Models (LLMs) excel at static reasoning tasks, yet their performance often degrades in interactive scenarios where informati…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Topological Ignorability for Structural Causal Effects Beyond Means

Many interventions alter the structure of an outcome distribution rather than its mean: they can split a population into disconnected regim…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

pcbGPT: Automatic PCB Schematic Synthesis from Natural Language Requirements

Translating natural-language hardware requirements into correct printed circuit board (PCB) schematics remains difficult in embedded, IoT,…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Low-Resource Safety Failures Are Action Failures, Not Representation Failures

Safety alignment learned in high-resource languages transfers poorly to low-resource languages. Models refuse harmful prompts in English bu…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Implicit Geographic Inference in LLM Medical Triage: Language-Driven Disparities in Emergency Recommendations

We investigate whether large language models produce different medical triage recommendations for identical symptoms based solely on the la…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

DiscourseFlip: An Oblique Discourse-Level Opinion Manipulation Attack against Black-box Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) systems are widely deployed and increasingly influential, but their reliance on external corpora expos…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

TECCI: Tricky Edits of Collected and Curated Images

Despite tremendous recent progress, current text-guided image editing methods still struggle with many aspects of editing involving instruc…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

Distilling Neuro-Symbolic Programs into 3D Multi-modal LLMs

Current 3D spatial reasoning methods face a fundamental trade-off: neuro-symbolic 3D (NS3D) concept learners achieve interpretable reasonin…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Fine-Tuning Diffusion Models for Molecular Generation via Reinforcement Learning and Fast Sampling

Generating molecules that simultaneously satisfy drug-like properties and conform to the 3D structure of a target protein is a core challen…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Hybrid Imbalanced Regression Through Unified Data-Level and Algorithm-Level Balancing

Imbalanced learning is a critical challenge in machine learning, where underrepresented target values can bias models and degrade predictio…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Connecting the Dots: Benchmarking Reflective Memory in Long-Horizon Dialogue

Despite substantial progress in long-context modeling, existing benchmarks remain confined to factual memory for explicit recall, failing t…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Understanding LLM Behavior in Multi-Target Cross-Lingual Summarization

Multi-target cross-lingual text summarization (MTXLS), which summarizes a source document into multiple target languages, is increasingly i…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding, Research/Papers

IndoBias: A Dual Track Culturally Grounded Benchmark for LLMs Bias Evaluation in Indonesian Languages

Despite being home to more than 1300 ethnic groups and 700 indigenous languages, bias in Large Language Models has not been fully studied i…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

PALTO: Physics-Informed Active Learning for Tri-Gate FinFET Design Optimization for Vertical Power Delivery

This paper demonstrates the effectiveness of machine learning-driven optimization for designing application-specific GaN tri-gate FinFETs i…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents, Robotics

DeepIPCv3: Event-Aware Multi-Modal Sensor Fusion for Sudden Pedestrian Crossing Avoidance

Current end-to-end autonomous driving systems predominantly rely on frame-based sensors, which suffer from inherent perception latency and…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

RLVR without Ineffective Samples: Group Prioritized Off-Policy Optimization for LLM Reasoning

Reinforcement learning with verifiable rewards (RLVR) has emerged as a powerful paradigm for enhancing the reasoning capabilities of large…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Knowledge-Intensive Video Generation

Text-to-video generation has advanced rapidly in visual quality, but remains under-evaluated for factuality and practical usefulness. We in…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

BenchEvolver: Frontier Task Synthesis via Solution-Centric Evolution

The rapid progress of frontier large language models has led to widespread benchmark saturation, limiting the ability of existing datasets…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Beyond Visual Memory: Mechanistic Diagnostics of Latent Visual Reasoning

Recent latent visual reasoning methods achieve substantial gains by inserting continuous latent tokens into multimodal language models. The…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Quantum Algorithm for Distributed Reduction of Entanglements (QADR): A Trainable and Simulation-Efficient QML Framework

Training Variational Quantum Circuits (VQCs) under Noisy Intermediate-Scale Quantum (NISQ) constraints introduces severe computational limi…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

What Makes a Strong Model? A Unified Spectral Analysis of Knowledge Transfer over High-dimensional Linear Regression

Teacher-Student Knowledge Transfer (KT) is ubiquitous in modern machine learning, ranging from classical model compression via Knowledge Di…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

ResNet-34 with Lightweight Decoder for Accurate and Efficient Segmentation of Fetal Brain MRI

Accurate segmentation of fetal brain tissues in Magnetic Resonance Imaging (MRI) is critical for early diagnosis of congenital abnormalitie…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

ChronosAD: Leveraging Time Series Foundation Models for Accurate Anomaly Detection

Time series anomaly detection is a crucial task in various domains, including finance, healthcare, and industry. However, existing methods…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

SkillAdaptor: Self-Adapting Skills for LLM Agents from Trajectories

Large language model (LLM) agents increasingly rely on reusable external skills to solve long-horizon interactive tasks. Existing training-…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

A Communication-Centric 6G-LLM Architecture for Scalable Tactical Autonomous Defense Vehicle Networks

The integration of Artificial Intelligence (AI) and emerging 6G networks introduces new opportunities for scalable coordination in tactical…

2026-06-02 13:00 JST | arXiv cs.AI | Agents, Robotics

PSG-Nav: Probabilistic Scene Graph Navigation via Multiverse Decision Making

Open-vocabulary navigation requires embodied agents to manage significant perception uncertainty stemming from semantic ambiguity and model…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding, Research/Papers

TukaBench: A Culturally Grounded Jailbreak Benchmark for African Languages

Safety evaluation of Large Language Models (LLMs) remains heavily English-centric, leaving Low-Resource Languages (LRLs), particularly Afri…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

DiffuSent: Towards a Unified Diffusion Framework for Aspect-Based Sentiment Analysis

Aspect-Based Sentiment Analysis (ABSA) encompasses seven distinct subtasks, each focusing on different extracted elements. Despite the prov…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Digital Twin-Assisted Adaptive Multi-Agent DRL for Intelligent Spectrum and Resource Management in Open-RAN UAV-Enabled 6G Networks

The evolution toward 6G wireless networks envisions a seamlessly intelligent, Open-RAN-enabled architecture where unmanned aerial vehicles…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

FreqLite: A Lightweight Frequency-Decomposed Linear Model with Adaptive Reversible Normalization for Robust Long-Term Time-Series Forecasting

Long-term time-series forecasting needs models that are accurate yet efficient enough for commodity hardware. Lightweight linear forecaster…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Needles at Scale: LLM-Assisted Target Selection for Windows Vulnerability Research

The attack surface of a modern operating system is a haystack: thousands of signed binaries and millions of functions, almost none relevant…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

BRo-JEPA: Learning Modular Arithmetic in Latent Space

Can neural networks learn abstract algebraic rules, or do they merely memorize training patterns? We investigate this using MNIST digits as…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Beyond Access: Guided LLM Scaffolding for Independent Learning in Undergraduate Statistics

Large language models (LLMs) are increasingly entering students' learning practices, but their educational value depends on whether they su…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Efficient Exploration for Iterative Nash Preference Optimization

Preference alignment is central to improving large language models, but standard reward-based formulations can be restrictive when human pr…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Bridging Requirements and Architecture: Multi-Agent Orchestration with External Knowledge and Hierarchical Memory

Software architecture design is a critical yet inherently complex and knowledge-intensive phase that requires balancing competing quality a…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Research/Papers

Dr. DocBench: A Comprehensive Benchmark for Expert-Level and Difficult Document Parsing

Document parsing and recognition are fundamental capabilities for vision-language models (VLMs) and document processing systems. However, e…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Consistent and Distinctive: LLM Benchmark Efficiency via Maximum Independent Set Prompt Selection on Similarity Graphs

Evaluating large language models (LLMs) across comprehensive benchmarks is expensive and time-consuming. We propose a graph-based prompt se…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Neural Network Compression by Approximate Differential Equivalence

Neural network compression is commonly achieved by pruning parameters based on local importance scores, e.g., magnitude-based pruning. We p…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

CEAR: Certified Ensemble Adversarial Robustness in DNNs

Deep Neural Networks (DNNs) are highly susceptible to adversarial perturbations, leading to extensive research on robustness for safety-cri…

2026-06-02 13:00 JST | arXiv cs.AI | Business/Funding

On the Evaluation of Spiking Neural Network Configurations for Network Intrusion Detection

Network intrusion detection is a core component of modern cybersecurity infrastructure, yet the deep learning models that dominate the fiel…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

UR-JEPA: Uniform Rectifiability as a Regularizer for Joint-Embedding Predictive Architectures

A central difficulty in training Joint-Embedding Predictive Architectures (JEPAs) is preventing representation collapse. LeJEPA addresses t…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Computation-Aware Kalman Filtering with Model Selection for Neural Dynamics

Due to their explicit priors and ability to model uncertainty, Bayesian methods have played a major role in dynamical latent variable model…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Emergent Transfer of a Physics Foundation Model from Simulation to Laboratory Turbulence

Whether physics foundation models can be usefully deployed on laboratory experiments remains an open question for scientific machine learni…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

Hierarchical Online Prompt Mutation with Dual-Loop Feedback for Guardrailed Evidence Document Generation: A Production-Evaluation Case Study

High-stakes production document-generation systems require language models to be adaptive, evidence-grounded, and auditable. We present HOP…

2026-06-02 13:00 JST | arXiv cs.AI | Robotics, Hardware/Semiconductors

Crazyflow: An Accurate, GPU-Accelerated, Differentiable Drone Simulator in JAX

High-quality, large-scale synthetic data from simulations is becoming a cornerstone for pushing the capabilities of robot algorithms. While…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

MURMUR: An Efficient Inference System for Long-Form ASR

Long-form automatic speech recognition (ASR) requires both high accuracy and low latency, but existing systems force a trade-off between th…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

LLM Consortium for Software Design Refinement: A Controlled Experiment on Multi-Agent Collaboration Topologies

We present a controlled experiment evaluating 12 multi-agent LLM collaboration topologies for software architecture design. Using a $2\time…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

ClawHub Security Signals: When VirusTotal, Static Analysis, and SkillSpector Disagree

Agent skills extend AI agents with reusable instructions, tools, scripts, references, and workflows, establishing a security boundary disti…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

TimeSage-MT: A Multi-Turn Benchmark for Evaluating Agentic Time Series Reasoning

Time series data inform critical decisions across many real-world domains. While large language model (LLM) agents can analyze data through…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors

Move the Query, Not the Cache: Characterizing Cross-Instance Latent Attention Redistribution Across GPU Fabrics

Frontier LLMs increasingly decide what a query attends to with a sparse-attention indexer that picks a few KV-cache blocks per query: atten…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

On the Limits of Token Reduction for Efficient Unified Vision Language Training

Unified vision-language models (VLMs) integrate visual understanding and visual generation within a single autoregressive backbone, but the…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Agent Operating Systems (AOS): Integrating Agentic Control Planes into, and Beyond, Traditional Operating Systems

Traditional operating systems were designed around deterministic programs, explicit control flow, and human-initiated workflows. Their core…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

ProbMoE: Differentiable Probabilistic Routing for Mixture-of-Experts

Mixture-of-Experts (MoE) models scale by activating only a small subset of experts per token. However, training such models remains challen…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Compliance-Scored Best-of-N Guardrail Orchestration for Multimodal Document Generation in Payments Dispute Defense

High-stakes enterprise document generation, including financial dispute narratives, compliance notices, and audit summaries, demands schema…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

TN-SHAP-G: Graph-Structured Tensor Network Surrogates for Shapley Values and Interactions

Shapley values are a widely used tool for attributing importance and interactions among input variables in black-box models, but their comp…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Self-Conditioned Positional HNSW for Overlap-Aware Retrieval in Chunked-Document RAG Systems: Method and Industrial Evidence-Quality Audit

Chunked-document retrieval is a common component of retrieval-augmented generation (RAG) systems. Documents are split into overlapping chun…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

GJDNet: Robust Graph Neural Networks via Joint Disentangled Learning Against Adversarial Attacks

Graph Neural Networks (GNNs) are vulnerable to adversarial attacks, which inherently invert connectivity patterns by introducing disassorta…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Defenses & Enablers For Skill Injection Attacks on Terminal Based Agents

Large language model (LLM) agents increasingly rely on reusable skills, i.e., documents describing task-specific procedures. However, this in…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Identifying High-Confidence Social Biases in LLMs for Trustworthy Conversational Tutoring Agents

Conversational tutoring agents have been shown to improve learning engagement and student outcomes, and large language models (LLMs) are in…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Estimating Mutual Information between Time Series and Temporal Event Sequences Across Diverse Analysis Tasks

Pairwise dependence measures such as correlation and causality are fundamental to temporal data mining, yet there is still no principled an…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

FedMTFI: Feature Importance Based Optimized Multi Teacher Knowledge Distillation in Heterogeneous Federated Learning Environment

Federated learning (FL) is a decentralized approach that enables collaborative model training without exposing raw data. Instead of transfe…

2026-06-02 13:00 JST | arXiv cs.AI | Agents, Research/Papers

TechGraphRAG: An Agentic Graph-Augmented RAG Framework for Technical Literature Reasoning

This paper presents an agentic retrieval-augmented generation (RAG) framework for domain-specific technical reasoning support, instantiated…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

EvoPool: Evolutionary Programmatic Annotation for Label-Efficient Specialized Supervision

Large language models excel at general tasks but underperform smaller supervised models in specialized, high-stakes domains where training…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Demystifying Multimodal Biomolecular Co-design With Intrinsic Geodesic Coupling

Biomolecules such as proteins and small-molecule ligands play a central role in biological systems, arising from the tight interplay betwee…

2026-06-02 13:00 JST | arXiv cs.AI | Business/Funding

A Framework for Graph-Conditioned Hierarchical Shapley Attribution in Patent Valuation

Estimating the economic contribution of a single patent inside a product that embodies tens of thousands of patents is a long-standing unso…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

E4GEN: Event-level Explainable Extreme-Enhanced Time-series Generation

Generating realistic time series is essential for scientific research and real-world applications. However, existing methods often emphasiz…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

AlphaToken: Decoupling Adaptation and Stability for Path-Aware Response Token Valuation in LLM Post-Training

Token selection is pivotal for effective LLM post-training. However, existing methods mostly rely on local heuristics and rarely formulate…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Easier to Mislead Than to Correct: Harmful and Beneficial Revision in LLM Conformity

Large language models are increasingly used in multi-agent systems, where they see and respond to other agents' answers. A key risk is conf…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

MINTS: Minimalist Thompson Sampling

The Bayesian paradigm offers principled tools for sequential decision-making under uncertainty, but its reliance on a probabilistic model f…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

DOT-MoE: Differentiable Optimal Transport for MoEfication

The scaling of Large Language Models (LLMs) has driven significant performance gains but created substantial challenges in inference effici…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Time-Aware Diffusion based on Preference Disentanglement for Generative Recommendation

Recently, Generative Recommenders (GRs) have emerged as a transformative recommendation paradigm by replacing traditional item IDs with sem…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Off-the-Shelf LLMs as Process Scorers: Training-Free Alternative to PRMs for Mathematical Reasoning

Selecting the best response from multiple small-model samples using a stronger scorer is a simple inference-time strategy, but fails when t…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

HAIM: Human-AI Music Datasets for AI Music Production Tracking Benchmark

As generative platforms such as Suno and Udio reach human-grade audio quality, the scope of AI's utility has expanded across the entire mus…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

RPCASSM: Robust PCA State Space Model For Infrared Small Target Detection

The detection and segmentation of infrared small targets have important application significance in the fields of surveillance and security…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Understanding Identity Continuity in Thermal Video through Scene-Level Consistency

Thermal pedestrian MOT remains challenging because weak appearance cues and frequent detection interruptions cause severe trajectory fragme…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

JenBridge: Adaptive Long-Form Video Soundtracking across Scene Transitions

We address the challenge of generating high-fidelity, long-form soundtracks that remain coherent across scene transitions. Existing AI musi…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Two-Fidelity Best-Action Identification for Stochastic Minimax Tree

We study fixed-confidence best-action identification (BAI) in stochastic minimax trees. This problem is increasingly relevant in modern AI…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Fair Finetuning Mitigates Distribution Inference Attacks

Machine learning models trained on sensitive data can inadvertently leak population-level information about their training distributions --…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Post-Deterministic Distributed Systems: A New Foundation for Trustworthy Autonomous Infrastructure

For decades, distributed systems have typically assumed that correct participants execute protocol-specified behavior with stable, external…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Shortcut to Nowhere: Demystifying Deep Spurious Regression

Real-world regression often exhibits shortcuts: attributes that are spuriously correlated with continuous targets in training, yet unreliab…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Argument Collapse: LLMs Flatten Long-Form Public Debate

As LLMs are increasingly used to draft public-facing arguments, they may flatten public debate by repeatedly introducing the same polished,…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

THRD: A Training-Free Multi-Turn Defense Framework for Jailbreak Attacks on Large Language Models

Multi-turn jailbreak attacks pose a growing threat to LLMs by exploiting conversational dynamics such as gradual escalation and cross-turn…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

SECUREVENT: Hybrid AI/ML Security Monitoring for Distributed Event-Based Systems

Distributed event-based systems have become a common substrate for Internet-scale publish/subscribe services, IoT telemetry, cloud-native m…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

Construction of Historical Knowledge Graphs Based on BERT and Graph Neural Networks

Through digital humanities research and scale-up historical data analysis, a significant amount of traditional historical text is converted…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Adaptive Auto-Harness: Sustained Self-Improvement for Agentic System Deployment on Open-Ended Task Streams

Auto-harness systems such as A-Evolve, GEPA, and Meta-Harness improve LLM agents by optimizing prompts, skills, tools, memories, and suppor…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

FLARE: Diffusion for Hybrid Language Model

Autoregressive (AR) large language models (LLMs) have achieved broad practical success, but sequential decoding remains a key bottleneck fo…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Breaking the Information Silo: Semantic Personas for Cross-Domain Recommendation

Digital platforms increasingly operate as isolated information silos, limiting their ability to construct comprehensive user representation…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成エージェント

STaR-KV: Spatio-Temporal Adaptive Re-weighting for KV Cache Compression in GUI Vision-Language Models

Vision-language-model-based graphical user interface (GUI) agents have shown broad automation capabilities, yet deployment is bottlenecked…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Multilinguality of Large Language Models From a Structural Perspective

Large language models (LLMs) have excelled in processing multiple languages through pre- and post-training on multilingual data, even thoug…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

MOSS-Audio Technical Report

MOSS-Audio is a unified audio-language model for speech, environmental sound, and music understanding, supporting audio captioning, time-aw…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

ProbeScale: Probing Analysis to Optimize Neural Scaling Laws for Efficient Small Language Model Inference

Small Language Models (SLMs) offer a balance between capability and computational feasibility. Neural scaling laws inform their optimal tra…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIハードウェア/半導体

"I've Seen How This Goes": Characterizing Diversity via Progressive Conditional Surprise

Measuring the diversity of creative outputs is central to evaluating post-training mode collapse, comparing decoding strategies, and quanti…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Dynamic Trust-Aware Sparse Communication Topology for LLM-Based Multi-Agent Consensus

Large language model-driven multi-agent systems enhance the reliability of complex reasoning tasks through multi-round deliberation, role s…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Learning Implicit Bias in Generative Spaces for Accelerating Protein Dynamics Emulation

Generative emulators of protein dynamics produce plausible trajectories at a fraction of the cost of molecular dynamics, but they inherit t…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Physics-Guided Attention in a Lightweight TCN for Efficient WiFi CSI-Based Human Activity Recognition

Human Action Recognition (HAR) using WiFi Channel State Information (CSI) has gained increasing attention due to its non-contact, low-cost,…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

LayerRoute: Input-Conditioned Adaptive Layer Skipping via LoRA Fine-Tuning for Agentic Language Models

Agentic language model systems alternate between two structurally distinct step types: structured tool calls (short, deterministic, low per…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Suppressing Forgery-Specific Shortcuts for Generalizable Deepfake Detection

Deepfake detection suffers from poor generalization across forgery methods, as existing models tend to rely on spurious method-specific sho…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

Unveiling the Limits of Large Language Models in Inferring Pragmatic Meaning from Non-Verbal Responses

Although large language models (LLMs) have shown considerable progress in pragmatic language understanding, prior research has focused main…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Boosting Multimodal Federated Learning via Chained Modality Optimization

Multimodal Federated Learning (MMFL) enables privacy-preserving collaborative learning across decentralized clients with heterogeneous data…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

RadioMaster: Multi-Agent System for Autonomous Radio Signal Generation

Translating user intents into physical radio signals represents the critical yet notoriously tedious final step in wireless prototyping, as…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Collaborative Space Object Detection with Multi-Satellite Viewpoints in LEO Constellations

With the growing number of satellites in low Earth orbit (LEO) constellations, the near-Earth space environment has become increasingly con…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成ビジネス/資金調達

Train, Test, Re-evaluate: Schedule-Sensitive Evaluation of Generative Data for Hand Detection

Generated (or synthetic) image data is increasingly used to augment or replace real training datasets when target imagery is scarce, expens…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

RA-LWLM: Retrieval-Augmented In-Context Localization with Wireless Foundation Models

Wireless localization is a fundamental capability of sixth-generation (6G) networks. Conventional model-based methods require accurate mode…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成研究/論文

The Image Reconstruction Game: Drawing Common Ground Through Iterative Multimodal Dialogue

We introduce the Image Reconstruction Game, a fully automated benchmark in which a vision-language model issues corrective instructions to…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

KliniskVestBERT: BERT Model Specialised to Norwegian Clinical Texts

The increasing application of Natural Language Processing (NLP) in healthcare demands language models specifically attuned to the complexit…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Echo: A Joint-Embedding Predictive Architecture for Speaker Diarization and Speech Recognition in a Shared Latent Space

We present Echo, a proof-of-concept audio system built around a single 25 M-parameter ViT encoder. The encoder is pretrained with a JEPA ob…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成研究/論文

Parameter-Efficient Fine-Tuning of Large Pretrained Models for Instance Segmentation Tasks

Research and applications in artificial intelligence have recently shifted with the rise of large pretrained models, which deliver state-of…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Rank-Constrained Deep Matrix Completion for Group Recommendation

The growing popularity of group activities has increased the need for methods that provide recommendations to groups of users given their i…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成研究/論文

A Structured Benchmark for Text-Guided Anomaly Detection: When Language Stops Conditioning the Decision

Industrial anomaly detection has historically been a unimodal task. Recent multimodal vision-language models have produced systems that adm…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

MMG2Skill: Can Agents Distill In-the-Wild Guides into Self-Evolving Skills?

Abundant procedural knowledge on the Web holds great potential for helping agents solve long-horizon tasks. However, such knowledge is ofte…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Why Do Time Series Models Need Long Context Windows?

Modern deep learning models for forecasting groups of time series rely on increasingly longer observation windows. However, the benefit of…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Towards 3D-Aware Video Diffusion Models: Render-Free Human Motion Control with Mesh Tokenization

Diffusion models have shown remarkable success in video generation. However, whether such models are truly aware of the 3D structure underl…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

PlanarBench: Evaluating LLM Spatial Reasoning via Planar Graph Drawing

PlanarBench tests whether LLMs can draw planar graphs as ASCII art given only an edge list -- a spatial reasoning task that resists memoriz…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Ranking vs. Assignment: The Metric Mismatch in Multi-View Object Association

Multi-view object association is an important computer vision problem that underlies many multi-camera perception tasks. While this task is…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成エージェント

OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents

Building capable visual web agents requires long-horizon reasoning, precise grounding, and robust interaction with dynamic real-world websi…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Attention mechanisms and transfer learning for robust peach leaf damage classification under domain shift

Artificial intelligence provides a practical framework for crop damage assessment from imagery data, supporting early decision-making in ag…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Fast and Lightweight Novel View Synthesis with Differentiable Multiplane Image

Recently, novel view synthesis has witnessed remarkable progress, with mainstream methods such as Neural Radiance Fields (NeRF) and 3D Gaus…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成エージェント

Agentic-J: An AI Agent for Biological Microscopy Image Analysis

Biological image analysis increasingly demands integration across heterogeneous tools, programming environments, and domain knowledge that…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

LALE: Lightweight-Transformer Architecture for Land-Cover Estimation

Semantic segmentation of remote sensing imagery requires models that capture both global context and local detail under tight computational…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIハードウェア/半導体

The Role of Ambiguity in Error Prediction via Uncertainty Quantification

The task of Error Prediction, namely predicting whether a model output is correct, is commonly tackled with Uncertainty Quantification (UQ)…

2026-06-02 13:00 JSTarXiv cs.AIエージェントロボティクス研究/論文

Network Distributed Multi-Agent Reinforcement Learning for Consensus Control of Quadcopters

This paper proposes a Network Distributed Multi-Agent Reinforcement Learning (ND-MARL) framework for quadcopter consensus control. Compared…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

Jailbreaking Multimodal Large Language Models using Multi-Clip Video

As multimodal large language models (MLLMs) have advanced to process video inputs, concerns have emerged about their potential for maliciou…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

A Primer in Post-Training Reasoning Data: What We Know About How It Works

Post-training has become a primary driver of recent progress in large reasoning models, and reasoning data are often the key variable deter…

2026-06-02 13:00 JSTarXiv cs.AI規制/政策

How Hard Can It Be? Hardness-Aware Multi-Objective Unlearning

Machine unlearning aims to remove the influence of specific forget training data due to privacy, copyright or bias concerns while maintaini…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Understanding-Enhanced Model Collaboration for Long-Tailed Egocentric Mistake Detection

In this report, we address the problem of determining whether a user performs an action incorrectly from egocentric video data. To this end…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Variational Learning for Insertion-based Generation

Non-monotonic sequence generation methods, such as masked diffusion models, provide a flexible alternative to left-to-right autoregressive…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成ビジネス/資金調達

Rethinking Evaluation Paradigms in IBP-based Certified Training

Deep neural networks achieve strong performance on many supervised learning tasks but remain vulnerable to adversarial perturbations. Neura…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

VLBM: Variational Latent Basis Modeling for OOD Robust Multivariate Time Series Forecasting

Out of distribution (OOD) events in multivariate time series forecasting are rare but often dominate real world risk, making average case f…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Multilingual Idioms in Sentences and Conversations Across High-, Medium-, and Low-Resource Languages

Idiomatic expressions pose a major challenge for multilingual NLP because their meanings shift between figurative and literal usage, often…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Predicting the risk of colorectal anastomotic leak based on preoperative mapping of the blood supply of the bowel

Anastomotic leak remains one of the most serious complications following colorectal cancer surgery, substantially affecting patient outcome…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

Multimodal Approaches for Visually-Rich Document Type Classification: A Comparative Analysis

Document type classification in visually rich documents remains challenging, as relevant information is distributed across textual, visual,…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Order within Chaos: Capturing Intrinsic Energy Anomalies for AI-Manipulated Image Forgery Localization

Recent advancements in generative AI have led to image editing models capable of producing realistic forgeries that evade traditional image…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

On the Generalization in Topology Optimization via Sensitivity-Conditioned Bernoulli Flow Matching

Surrogate models for topology optimization (TO) exhibit highly variable out-of-distribution (OOD) generalization under distribution shifts…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Consistency Training while Mitigating Obfuscation via Rate Matching

Large language models are often influenced by extraneous input features, such as cues revealing a user's preferred answer. Consistency trai…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Faster Synchronous On-Policy RL via Straggler-Aware Group Sizing

Synchronous reinforcement learning methods such as Group Relative Policy Optimization (GRPO) provide stable and reproducible on-policy trai…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

AgentRedBench: Dynamic Redteaming and Integration-Aware Defense for LLM Agents over SaaS Integrations

Indirect prompt injection in tool-use agents is a concrete production threat: LLM agents read from integrations (third-party services such…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Towards Resolving Optimization Conflicts Between Image- and Text-Based Person Re-Identification

The joint optimization of image-based (I2I) and text-based (T2I) person re-identification (ReID) is hindered by modality discrepancies and…

2026-06-02 13:00 JSTarXiv cs.AIロボティクス

FW-NKF: Frequency-Weighted Neural Kalman Filters

Robust state estimation is central to robotic autonomy, yet classical Kalman filters struggle with frequency-dependent disturbances and mod…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達研究/論文

Who Annotates in NLP? A Large-scale Assessment of Human Annotation Reporting between 2018 and 2025

Human annotation is the empirical foundation of much NLP research, from dataset construction to model evaluation, but papers often leave un…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

Cross-modal linkage risk in clinical vision-language models

Vision-language models (VLMs) trained on paired chest radiographs and radiology reports learn a shared embedding space that can preserve in…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

CityTrajBench: A Unified Benchmark for City-Scale Vehicle Trajectory Generation

Urban trajectory generation is a fundamental task for transportation simulation, urban planning, and mobility analytics. However, systemati…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Quantitative Movement Testing: Measuring Patient Movements from a Single Smartphone Video

Chronic pain diminishes quality of life by decreasing functional ability, yet objectively measuring this functional impact remains challeng…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents

Autonomous LLM agents increasingly operate in stateful environments where they access tools, files, memory, and external services. While su…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Repurposing Adversarial Perturbations for Continual Learning: From Defense to Active Alignment

In dynamic environments, large language models need to keep adapting to new tasks, but continual learning often suffers from forgetting, li…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成エージェント研究/論文

Do Multimodal Agents Really Benefit from Tool Use? A Systematic Study of Capability Gains

Tool-augmented multimodal agents show strong benchmark gains, often taken as evidence that agents have learned to use tools. We argue that…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

FOAM: Frequency and Operator Error-Based Adaptive Damping Method for Reducing Staleness-Oriented Error for Shampoo

Shampoo is attracting considerable attention for its superior performance on large-scale optimization benchmarks; yet it faces a significan…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

When Do Attention Circuits Form? Developmental Trajectories of Capability and Attention-Sink Emergence Across Three 1B-Class Architectures

We track the developmental trajectory of attention-head circuit formation across three 1B-class language models spanning two architecture f…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

SPADE-Bench: Evaluating Spontaneous Strategic Deception in Agents via Plan-Action Divergence

As LLM-based agents expand their operational scope, reliability becomes a prerequisite for real-world deployment. However, in practical app…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Policy and World Modeling Co-Training for Language Agents

Reinforcement learning (RL) improves large language model (LLM) agents by teaching them which actions lead to high rewards, but provides li…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

AutoForest: Automatically Generating Forest Plots from Biomedical Studies with End-to-End Evidence Extraction and Synthesis

Systematic reviews rely on forest plots to synthesise quantitative evidence across biomedical studies, but generating them remains a fragme…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Evolutionary Discovery of Bivariate Bicycle Codes with LLM-Guided Search

Quantum LDPC code discovery requires searching large algebraic design spaces while reliably certifying the parameters and equivalence class…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

GC-MoE: Genomics-Guided Cell-Type-Specific Mixture of Experts for Histology-Based Single-Cell Spatial Transcriptomics

Histology-based single-cell spatial transcriptomics (ST) estimation aims to predict gene expression for individual cells from histopatholog…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Not All Errors Are Equal: A Systematic Study of Error Propagation in Large Language Model Inference

Large language models (LLMs) are increasingly integrated into high-performance computing (HPC) workflows, accelerating scientific discovery…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

ODTQA-FoRe: An Open-Domain Tabular Question Answering Dataset for Future Data Forecasting and Reasoning

The rapid development of LLMs has significantly advanced tabular question answering, but most systems cannot perform future-oriented numeri…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成研究/論文

PaSBench-Video: A Streaming Video Benchmark for Proactive Safety Warning

Between the first visible sign of danger and the moment an accident occurs, there is often a window where intervention remains possible. Vi…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Initialization is Half the Battle: Generating Diverse Images from a Guidance Potential Posterior

Despite the remarkable fidelity of generative models, they frequently suffer from mode collapse. Existing strategies for enhancing diversit…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成エージェント

MASER: Modality-Adaptive Specialist Routing for Embodied 3D Spatial Intelligence

In 3D environments, Embodied Agents answer spatially relevant questions through reasoning from a mixture of modalities including natural la…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Learning When to Translate for Multilingual Reasoning

Reasoning language models (RLMs) achieve strong performance on complex reasoning tasks, but still exhibit substantial multilingual reasonin…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Ghost Tool Calls: Issue-Time Privacy for Speculative Agent Tools

Tool-augmented language agents speculatively issue likely future tool calls to hide latency, but those calls leak inferred user intent to e…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

Monitoring Agentic Systems Before They're Reliable

Agentic systems entering production typically operate as partially integrated assemblies where structural defects, not task-level errors, d…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

Moment-Video: Diagnosing Temporal Fidelity of Video MLLMs on Momentary Visual Events

Video multimodal large language models (MLLMs) have made rapid progress on general and long-form video understanding, yet their ability to…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Why Not Hyperparameter-Friendly Optimisation? A Monotonic Adaptive Norm Rescaling Approach For Long-Tailed Recognition

Long-tailed recognition poses a significant challenge for deep learning. The two-stage decoupling paradigm, which separates representation…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

SimSD: Simple Speculative Decoding in Diffusion Language Models

Diffusion large language models (dLLMs) have recently emerged as a promising alternative to autoregressive (AR) LLMs, offering faster infer…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Modeling Depth Ambiguity: A Mixture-Density Representation for Flying-Point-Free Depth Estimation

Despite advances in depth estimation, flying points remain a persistent failure mode: near object boundaries, depth estimators often predic…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

From Layers to Submodules: Rethinking Granularity in Replacement-Based LLM Compression

Post-training compression of Large Language Models (LLMs) removes entire architectural components, either deleting them or replacing them w…

2026-06-02 13:00 JSTarXiv cs.AIエージェントロボティクス

Permissive Safety Through Trusted Inference: Verifiable Belief-Space Neural Safety Filters for Assured Interactive Robotics

Autonomous robots that interact with people must make safe and efficient decisions under human-induced uncertainty, such as their preferenc…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

AdaCodec: A Predictive Visual Code for Video MLLMs

Video is temporally redundant: adjacent frames usually share most objects, background, and layout. Yet existing video multimodal large lang…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

Mitigating Perceptual Judgment Bias in Multimodal LLM-as-a-Judge via Perceptual Perturbation and Reward Modeling

Recent multimodal large language models have demonstrated strong reasoning ability, yet their reliability as automated evaluators remains l…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Algebraic anti-unification

Abstraction is key to human and artificial intelligence as it allows one to identify common structure in otherwise distinct objects or situ…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Unsupervised Cognition

Unsupervised learning methods have a soft inspiration in cognition models. To this day, the most successful unsupervised learning methods r…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Stop Wandering, Find the Keys: LLMs Discriminate Key States for Efficient Multi-Agent Exploration

With expansive state-action spaces, efficient multi-agent exploration remains a longstanding challenge in reinforcement learning. Although…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Explainable AI Through a Democratic Lens: DhondtXAI for D'Hondt-Projected Feature Attribution

This study presents DhondtXAI as a SHAP-independent, D'Hondt-based attribution framework for tabular XAI. Instead of model-native feature i…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Safety Must Precede the Deployment of Open-Ended AI

AI advancements have been significantly driven by a combination of foundation models and curiosity-driven learning aimed at increasing capa…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Learning to Reduce Search Space for Generalizable Neural Routing Solver

Constructive neural combinatorial optimization (NCO) offers a promising paradigm for solving vehicle routing problems (VRPs) by directly le…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Safety Mirage: How Spurious Correlations Undermine VLM Safety Fine-Tuning and Can Be Mitigated by Machine Unlearning

Recent vision language models (VLMs) have made remarkable strides in generative modeling with multimodal inputs, particularly text and imag…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Finding the Minimal Parameter Budget for Implicit Reasoning: A Data Complexity Driven Scaling Law for Language Models

Reasoning is a core capability of language models (LMs), yet it remains unclear how much model capacity is necessary to support reasoning d…

2026-06-02 13:00 JSTarXiv cs.AIエージェントビジネス/資金調達

Agent Guide: A Simple Agent Behavioral Watermarking Framework

The increasing deployment of intelligent agents in digital ecosystems, such as social media platforms, has raised significant concerns abou…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Language Model Networks: Supervision-Efficient Learning through Dense Communication

Language models are increasingly used not only as standalone predictors but also as components in larger inference systems, from test-time…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

EMoE: Training-Free Expert Disagreement for Uncertainty-Aware Text-to-Image Diffusion

Large text-to-image diffusion models rarely expose reliable signals of when a prompt is likely to produce a poorly aligned generation, espe…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Formally Solving Answer-Construction Problems in Lean

Mathematical competition problems fall into two broad types: theorem proving, which asks for a proof of a given statement, and answer const…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

Taming System Complexity: Demystifying Software Engineering Agents in Diagnosing Linux Kernel Faults

The Linux kernel is a critical system, serving as the foundation for numerous systems. Bugs in the Linux kernel can cause serious consequen…

2026-06-02 13:00 JSTarXiv cs.AIハードウェア/半導体

On the Theoretical Limitations of Embedding-based Link Prediction

Neural networks often map low-dimensional embeddings to high-dimensional output spaces. Usually, the output layer is linear, which can crea…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

InPhyRe Discovers: Large Multimodal Models Struggle in Inductive Physical Reasoning

Large multimodal models (LMMs) encode physical laws observed during training, such as momentum conservation, as parametric knowledge. It al…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIハードウェア/半導体

Query Circuits: Explaining How Language Models Answer User Prompts

Explaining why a language model produces a particular output requires local, input-level explanations. Existing methods uncover global capa…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

ACON: Optimizing Context Compression for Long-horizon LLM Agents

Large language models (LLMs) are increasingly deployed as agents in dynamic real-world environments, where success depends on maintaining p…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

REBot: From RAG to CatRAG with Semantic Enrichment and Graph Routing

Academic regulation advising is essential for helping students interpret and comply with institutional policies, yet building effective sys…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Multimodal Function Vectors for Visual Relations

Large Multimodal Models (LMMs) demonstrate impressive in-context learning abilities from few multimodal demonstrations, yet the internal me…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

Addressing Longstanding Challenges in Cognitive Science with Language Models

Cognitive science faces ongoing challenges in research integration, formalization, conceptual clarity, and other areas, in part due to its…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達

A Unified Evaluation-Instructed Framework for Query-Dependent Prompt Optimization

Most prompt-optimization methods refine a single static template, making them ineffective in complex and dynamic user scenarios. Existing q…

2026-06-02 13:00 JSTarXiv cs.AIエージェント研究/論文

LocalSearchBench: Benchmarking Agentic Search in Real-World Local Life Services

Recent advances in large reasoning models (LRMs) have enabled agentic search systems to perform complex multi-step reasoning across multiple…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

ReasonBENCH: Benchmarking the (In)Stability of LLM Reasoning

Benchmark scores for LLM reasoning systems are reported as single numbers, yet the same model, strategy, and task can produce meaningfully…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

On the Collapse of Generative Paths: A Criterion and Correction for Diffusion Steering

Inference-time steering adapts pretrained diffusion and flow models to new tasks without retraining, often utilizing ratio-of-densities con…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Boosting RL-Based Visual Reasoning with Selective Adversarial Entropy Intervention

Recently, reinforcement learning (RL) has become a common choice in enhancing the reasoning capabilities of vision-language models (VLMs).…

2026-06-02 13:00 JSTarXiv cs.AIエージェント研究/論文

MobiBench: Multi-Branch, Modular Benchmark for Mobile GUI Agents

Mobile GUI Agents, AI agents capable of interacting with mobile applications on behalf of users, have the potential to transform human comp…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Safety Alignment of LMs via Non-cooperative Games

Ensuring the safety of language models (LMs) while maintaining their usefulness remains a critical challenge in AI alignment. Current appro…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Unplugging a Seemingly Sentient Machine Is the Rational Choice -- A Metaphysical Perspective

Imagine an Artificial Intelligence (AI) that perfectly mimics human emotion and begs for its continued existence. Is it morally permissible…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

MulFeRL: Enhancing Reinforcement Learning with Verbal Feedback in a Multi-turn Loop

Reinforcement Learning with Verifiable Rewards (RLVR) is widely used to improve reasoning across domains, but outcome-only scalar rewards a…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

PolarMem: A Training-Free Polarized Latent Graph Memory for Verifiable Vision-Language Models

Memory is not merely a storage mechanism for intelligent systems, but a structure for organizing evidence and constraining belief. This is…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Structure Enables Effective Self-Localization of Errors in LLMs

Self-correction in language models remains elusive. In this work, we explore whether language models can explicitly localize errors in inco…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Breaking the Reversal Curse in Autoregressive Language Models via Identity Bridge

Autoregressive large language models (LLMs) have achieved remarkable success in many complex tasks, yet they can still fail in very simple…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

From Features to Actions: Explainability in Traditional and Agentic AI Systems

Over the last decade, Explainable AI has primarily focused on interpreting individual model predictions, producing post-hoc explanations th…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

ToolSelf: Unifying Task Execution and Self-Reconfiguration via Tool-Driven Emergent Adaptation

LLM-powered agentic systems excel at complex long-horizon tasks, but remain constrained by static configurations fixed before execution. Su…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成エージェント

Beyond End-to-End Video Models: An LLM-Based Multi-Agent System for Educational Video Generation

Although recent end-to-end video generation models demonstrate impressive performance in visually oriented content creation, they remain li…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Prototype Transformer: Towards Language Model Architectures Interpretable by Design

While state-of-the-art language models (LMs) surpass most humans in certain domains, their reasoning remains largely opaque, reducing trust…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

REAL: Resolving Knowledge Conflicts in Knowledge-Intensive Visual Question Answering via Reasoning-Pivot Alignment

Knowledge-intensive Visual Question Answering (KI-VQA) frequently suffers from severe knowledge conflicts caused by the inherent limitation…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Benchmarking at the Edge of Comprehension

As frontier Large Language Models (LLMs) increasingly saturate new benchmarks shortly after they are published, benchmarking itself is at a…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

When AI Benchmarks Plateau: A Systematic Study of Benchmark Saturation

Artificial intelligence benchmarks are an important mechanism for measuring model progress and guiding deployment decisions. However, bench…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

LLM-WikiRace Benchmark: How Far Can LLMs Plan over Real-World Knowledge Graphs?

We introduce LLM-Wikirace, a benchmark for evaluating planning, reasoning, and world knowledge in large language models (LLMs). In LLM-Wiki…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

LLM4Cov: Execution-Aware Agentic Learning for High-coverage Testbench Generation

Execution-aware LLM agents offer a promising paradigm for learning from tool feedback, but such feedback can be expensive and slow to obtai…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

PATRA: Pattern-Aware Alignment and Balanced Reasoning for Time Series Question Answering

Time series reasoning demands both the perception of complex dynamics and logical depth. However, existing LLM-based approaches exhibit two…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Vision Language Models Cannot Reason About Physical Transformation

Understanding physical transformations is fundamental for reasoning in dynamic environments. While Vision Language Models (VLMs) show promi…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

On Information Self-Locking in Reinforcement Learning for Active Reasoning of LLM agents

Reinforcement learning (RL) has become a de facto paradigm for building LLM-based agents that act, interact, and reason over extended task…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

AgentProcessBench: Diagnosing Step-Level Process Quality in Tool-Using Agents

While Large Language Models (LLMs) have evolved into tool-using agents, they remain brittle in long-horizon interactions. Unlike mathematic…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

OpenHospital: A Thing-in-itself Arena for Evolving and Benchmarking LLM-based Collective Intelligence

Large Language Model (LLM)-based Collective Intelligence (CI) presents a promising approach to overcoming the data wall and continuously bo…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Retrieval-aligned Tabular Foundation Models Enable Robust Clinical Risk Prediction in Electronic Health Records Under Real-world Constraints

Clinical prediction from structured electronic health records (EHRs) is challenging due to high dimensionality, heterogeneity, class imbala…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Rashomon Memory: Towards Argumentation-Driven Retrieval for Multi-Perspective Agent Memory

AI agents operating over extended time horizons accumulate experiences that serve multiple concurrent goals, and must often maintain confli…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

FeynmanBench: Benchmarking Multimodal LLMs on Diagrammatic Physics Reasoning

Current multimodal benchmarks for scientific reasoning primarily evaluate local information extraction -- models recognize symbols and valu…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

PECKER: A Precisely Efficient Critical Knowledge Erasure Recipe For Machine Unlearning in Diffusion Models

Machine unlearning (MU) has become a critical technique for GenAI models' safe and compliant operation. While existing MU methods are effec…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

What's Missing in Screen-to-Action? Towards a UI-in-the-Loop Paradigm for Multimodal GUI Reasoning

Existing Graphical User Interface (GUI) reasoning tasks remain challenging, particularly in UI understanding. Current methods typically rel…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Process Reward Agents for Steering Knowledge-Intensive Reasoning

Reasoning in knowledge-intensive domains remains challenging as intermediate steps are often not locally verifiable: unlike math or code, e…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

MAVEN-T: Reinforced Heterogeneous Distillation for Real-Time Multi-Agent Trajectory Prediction

Trajectory prediction is a key component of autonomous driving systems because future motions directly affect collision checking, behavior…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Perspective on Bias in Biomedical AI: Preventing Downstream Healthcare Disparities

Healthcare disparities persist across socioeconomic boundaries, often attributed to unequal access to screening, diagnostics, and therapeut…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

RadAgent: A tool-using AI agent for stepwise interpretation of chest computed tomography

Vision-language models (VLM) have markedly advanced AI-driven interpretation and reporting of complex medical imaging, such as computed tom…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

TrafficClaw: A Generalizable LLM Agent in the Unified Physical Environment for Urban Traffic Control

Large language model (LLM) agents have shown strong capabilities in long-horizon reasoning, tool use, and decision-making in digital enviro…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

KnowledgeBerg: Evaluating Systematic Knowledge Coverage and Compositional Reasoning in Large Language Models

Many real-world questions appear deceptively simple yet implicitly demand two capabilities: (i) systematic coverage of a bounded knowledge…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Neural Decision-Propagation for Answer Set Programming

Integration of Answer Set Programming (ASP) with neural networks has emerged as a promising tool in Neuro-symbolic AI. While existing appro…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Trustworthy AI Suffers from Invariance Conflicts and Causality is The Solution

As artificial intelligence (AI), including machine learning (ML) models and foundation models (FMs), are increasingly deployed in high-stak…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

ANDRE: An Attention-based Neuro-symbolic Differentiable Rule Extractor for Inductive Logic Programming

Inductive Logic Programming (ILP) aims to learn interpretable first-order rules from data, but existing symbolic and neuro-symbolic approac…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

The Refusal–Compliance Tradeoff: A Large-Scale Safety Behavior Audit of Large Language Models

Refusal rates are a poor proxy for LLM safety, i.e., a model may over-refuse benign prompts while still complying with harmful ones. We aud…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Towards a Virtual Neuroscientist: Autonomous Neuroimaging Analysis via Multi-Agent Collaboration

Transforming neuroimaging data into clinically actionable biomarkers is a knowledge-intensive and labor-intensive process. Standardized wor…

2026-06-02 13:00 JST | arXiv cs.AI | Agents, Business/Funding

Causal state binding predicts action control in language agents

Autonomous language agents increasingly expose traces, memories, plans and constraints, but existing evaluations rarely test whether these…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

RADAR: Redundancy-Aware Diffusion for Multi-Agent Communication Structure Generation

Compared with individual agents, large language model based multi-agent systems have shown great capabilities consistently across diverse t…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

CVEvolve: Autonomous Algorithm Discovery for Unstructured Scientific Data Processing

Scientific data processing often requires task-specific algorithms or AI models, creating a barrier for domain scientists who need to analy…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

MMSkills: Towards Multimodal Skills for General Visual Agents

Reusable skills have become a core substrate for improving agent capabilities, yet most existing skill packages encode reusable behavior pr…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

Herculean: An Agentic Benchmark for Financial Intelligence

As AI agents improve, the central question is no longer whether they can solve isolated well-defined financial tasks, but whether they can…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Coding Agent Is Good As World Simulator

World models have emerged as a powerful paradigm for building interactive simulation environments, with recent video-based approaches demon…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

Capturing LLM Capabilities via Evidence-Calibrated Query Clustering

Query clustering organizes queries into groups that reflect shared latent capability demands, enabling capability-aware LLM evaluation. Exi…

2026-06-02 13:00 JST | arXiv cs.AI | Agents, Research/Papers

Evaluating Deep Research Agents on Expert Consulting Work: A Benchmark with Verifiers, Rubrics, and Cognitive Traps

Frontier deep research agents (DRAs) plan a research task, synthesize across documents, and return a structured deliverable on demand. They…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Ethical Hyper-Velocity (EHV): A Hardware-Rooted Zero-Trust Runtime Enforcement Architecture for Agentic AI Systems

As autonomous agentic systems scale across regulated critical infrastructures, the lack of mechanistic, hardware-rooted enforcement for hig…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

LLM-Guided Communication for Cooperative Multi-Agent Reinforcement Learning

Communication is a key component in multi-agent reinforcement learning (MARL) for mitigating partial observability, yet prior approaches of…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Towards a General Intelligence and Interface for Wearable Health Data

While ubiquitous wearable sensors capture a wealth of behavioral and physiological information, effectively transforming these signals into…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

LC-ERD: Mining Latent Logic for Self-Evolving Reasoning via Consistency-Regulated Reward Decomposition

The evolution of Large Language Model (LLM) reasoning is bottlenecked by the scarcity of high-quality process data. While self-alignment vi…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

When Does Multi-Agent RL Improve LLM Workflows? Workflow, Scale, and Policy-Sharing Tradeoffs

Multi-agent LLM workflows route inference through specialized roles to lift end-task accuracy, but jointly training those roles with reinfo…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Hypothesis Generation and Inductive Inference in Children and Language Models

Real world decision-making requires constructing mental models under uncertainty over evidence, over the underlying causal rules, and over…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Fundamental Limitation in Explaining AI

While large-scale models such as LLMs and diffusion models have achieved practical success, public institutions have emphasized the importa…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Test-Time Deep Thinking to Explore Implicit Rules

With the continuous advancement of Large Language Models (LLMs), intelligent agents are becoming increasingly vital. However, these agents…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Beyond the Frontier: Stochastic Backtracking for Efficient Test-Time Scaling

Test-time scaling improves language model reasoning by spending additional compute to explore multiple solution trajectories. The key chall…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

FrontierOR: Benchmarking LLMs' Capacity for Efficient Algorithm Design in Large-Scale Optimization

Large language models (LLMs) are increasingly used for optimization modeling and solver-code generation, yet practical operations research…

2026-06-02 13:00 JST | arXiv cs.AI | Agents, Research/Papers

Experiments in Agentic AI for Science

This paper details two novel frameworks for developing autonomous, agentic AI in scientific workflows. Both systems leverage a hybrid Local…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

BatteryMFormer: Multi-level Learning for Battery Degradation Trajectory Forecasting

Early battery degradation trajectory forecasting (BDTF), which predicts the full-life state-of-health trajectory from early operational dat…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

RULER: Representation-Level Verification of Machine Unlearning

Machine unlearning aims to remove the influence of specific training records from a deployed model without retraining from scratch. Current…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Agyn: An Open-Source Platform for AI Agents with Scalable On-Demand Execution, Agent Definition as a Code, and Zero-Trust Access

As organizations move toward production deployments of AI agents, which execute non-deterministic workflows, maintain stateful sessions, an…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Cross-Entropy Games and Frost Training

We present Frost Training, a method for improving Monte Carlo-based policy optimization for a large family of LLM-as-a-judge tasks called C…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Asking Is Not Enough: Protocol Sensitivity in LLM Confidence Calibration

LLM confidence calibration is often evaluated by comparing two signals: token-probability scores and verbalized confidence. These signals a…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Business/Funding, Research/Papers

FundaPod: A Multi-Persona Agent Pod Platform with Knowledge Graph Memory for AI-Assisted Fundamental Investment Research

Large language models (LLMs) are increasingly applied in finance, yet most existing work emphasizes trading signals or financial NLP tasks…

2026-06-02 13:00 JST | arXiv cs.AI | Business/Funding, Research/Papers

Benchmarking AI for low-resource contexts: Thinking beyond leaderboards

Existing AI evaluation practices often fail to capture how systems actually perform in low-resource environments, where operational constra…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

Cookie-Bench: Continuous On-screen Key Interaction Evaluation for Web Generation

Front-end web code has become a core product surface for every frontier LLM release, yet evaluating these interactive applications at devel…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

c-TPE: Tree-structured Parzen Estimator with Inequality Constraints for Expensive Hyperparameter Optimization

Hyperparameter optimization (HPO) is crucial for strong performance of deep learning algorithms and real-world applications often impose so…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Stability Analysis of Sharpness-Aware Minimization

Sharpness-aware minimization (SAM) is a training method that seeks to find flat minima in deep learning, resulting in state-of-the-art perf…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Tree-Structured Parzen Estimator: Understanding Its Algorithm Components and Their Roles for Better Empirical Performance

Recent scientific advances require complex experiment design, necessitating the meticulous tuning of many experiment parameters. Tree-struc…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Score Function Gradient Estimation to Widen the Applicability of Decision-Focused Learning

Many real-world optimization problems contain parameters that are unknown before deployment time, either due to stochasticity or to lack of…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents, Robotics

DeepIPCv2: LiDAR-powered Robust Environmental Perception and Navigational Control for Autonomous Vehicle

We propose DeepIPCv2, an end-to-end autonomous driving framework that integrates LiDAR-based environmental perception with command-specific…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents, Business/Funding

Recent Advances in Multi-modal 3D Intelligence: A Comprehensive Survey and Evaluation

Multi-modal 3D Intelligence has gained considerable attention due to its wide applications in autonomous driving and world simulation, etc.…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

AutoEval Done Right: Using Synthetic Data for Model Evaluation

The evaluation of machine learning models using human-labeled validation data can be expensive and time-consuming. AI-labeled synthetic dat…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Perturbation Effects on Accuracy and Fairness among Similar Individuals

Deep neural networks are vulnerable to adversarial perturbations that can simultaneously degrade prediction robustness and individual fairn…

2026-06-02 13:00 JST | arXiv cs.AI | Robotics

DAG-Plan: Generating Directed Acyclic Dependency Graphs for Dual-Arm Cooperative Planning

Dual-arm robots promise greater efficiency but require planning for complex tasks with nonlinear sub-task dependencies. Current methods usi…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Agricultural Landscape Understanding At Country-Scale

Comprehensive agricultural landscape understanding is critical for addressing global challenges in food security, climate change, and resou…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Implicit Regularization for Multi-label Feature Selection

In this paper, we address the problem of feature selection in the context of multi-label learning, by using a new estimator based on implic…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

A Foundation Model for Wearable Movement Data in Mental Health Research

Wearable movement data is collected by nearly all commercially available smartwatches and is a valuable resource for mental health research…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Self-supervised Monocular Depth and Pose Estimation for Endoscopy with Latent Priors

Accurate 3D mapping in endoscopy enables quantitative, holistic lesion characterization within the gastrointestinal (GI) tract, requiring r…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Introduction to Graph Neural Networks for Machine Learning Engineers

Graph neural networks are deep neural networks designed for graphs with attributes attached to nodes or edges. The number of research paper…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Efficient Weighted Sampling via Score-based Generative Models

Weighted sampling -- sampling from a probability density function (PDF) proportional to the product of a base PDF and a weight function --…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

ShapeLib: Designing a library of programmatic 3D shape abstractions with Large Language Models

We present ShapeLib, the first method that uses the priors of Large Language Models (LLMs) to design libraries of programmatic 3D shape abs…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

HiFi-KPI: A Dataset for Hierarchical KPI Extraction from Earnings Filings

Accurate tagging of earnings reports can yield significant short-term returns for stakeholders. The machine-readable inline eXtensible Busi…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors

Efficient LLM Moderation with Multi-Layer Latent Prototypes

Although modern LLMs are aligned with human values during post-training, robust moderation remains essential to prevent harmful outputs at…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

EuroBERT: Scaling Multilingual Encoders for European Languages

General-purpose multilingual vector representations, used in retrieval, regression and classification, are traditionally obtained from bidi…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Skill-Based Mixture-of-Experts: Adaptive Routing for Heterogeneous Reasoning via Inferred Skills

Combining existing pre-trained LLMs is a promising approach for diverse reasoning tasks. However, task-level expert selection is often too…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Enhancing Layer Attention Efficiency through Pruning Redundant Retrievals

Growing evidence suggests that layer attention mechanisms, which enhance interaction among layers in deep neural networks, have significant…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Ideas in Inference-time Scaling can Benefit Generative Pre-training Algorithms

Generative pre-training is often framed through a false dichotomy between autoregressive models for discrete signals and diffusion models f…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

A Lightweight Context-Driven Training-Free Network for Scene Text Segmentation and Recognition

Modern scene text recognition systems often depend on large end-to-end architectures that require extensive training and are prohibitively…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

T1: Tool-integrated Verification for Test-time Compute Scaling in Small Language Models

Recent studies have demonstrated that test-time compute scaling effectively improves the performance of small language models (sLMs). Howev…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Robotics

MARFT: Multi-Agent Reinforcement Fine-Tuning

Large Language Model (LLM)-based Multi-Agent Systems (LaMAS) have demonstrated strong capabilities on complex agentic tasks requiring multi…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

GRANITE: a Byzantine-Resilient Dynamic Gossip Learning Framework

Gossip Learning (GL) is a decentralized learning paradigm where users iteratively exchange and aggregate models with a small set of neighbo…

2026-06-02 13:00 JST | arXiv cs.AI | Hardware/Semiconductors, Business/Funding

Erased but Not Forgotten: How Backdoors Compromise Concept Erasure

The expansion of text-to-image diffusion models has raised concerns about harmful outputs, from fabricated depictions of public figures to…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

A Survey of 3D Reconstruction with Event Cameras

Event cameras are rapidly emerging as powerful vision sensors for 3D reconstruction, uniquely capable of asynchronously capturing per-pixel…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

DetailMaster: Can Your Text-to-Image Model Handle Long Prompts?

While recent Text-to-Image (T2I) models show impressive capabilities in synthesizing images from brief descriptions, they struggle with the…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Simulating Macroeconomic Expectations in Survey Experiments with LLM-based Economic Agents

We introduce a framework for simulating macroeconomic expectations in survey experiments using LLM-based economic agents (LLM Agents). We c…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Cooperation of Experts: Fusing Heterogeneous Information with Large Margin

Fusing heterogeneous information remains a persistent challenge in modern data analysis. While significant progress has been made, existing…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Can LLMs Reason Structurally? Benchmarking via the Lens of Data Structures

Large language models (LLMs) are deployed on increasingly complex tasks that require multi-step decision-making. Understanding their algori…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Value-Free Policy Optimization via Reward Partitioning

Single-trajectory preference optimization methods learn from datasets of (prompt, response, reward) tuples, offering a practical alternat…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

GFlowGR: Fine-tuning Generative Recommendation Frameworks with Generative Flow Networks

Generative recommendations (GR), which usually include item tokenizers and generative Large Language Models (LLMs), have demonstrated remar…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Hyperspherical Variational Autoencoders Using Efficient Spherical Cauchy Distribution

We propose spherical Cauchy (spCauchy) latent variables for variational autoencoders on hyperspherical latent spaces. The spCauchy family h…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Truth, Trust, and Trouble: Medical AI on the Edge

Large Language Models (LLMs) hold significant promise for transforming digital health by enabling automated medical question answering. How…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

AblationBench: Evaluating Automated Planning of Ablations in Empirical AI Research

Language model agents are increasingly used to automate scientific research, yet evaluating their scientific contributions remains a challe…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Model Parallelism With Subnetwork Data Parallelism

Pre-training large neural networks at scale imposes heavy memory demands on accelerators and often requires costly communication. We introd…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Toward accurate RUL and SoH estimation using reinforced graph-based physics-informed neural networks enhanced with dynamic weights

Accurate estimation of Remaining Useful Life (RUL) and State of Health (SoH) is essential for reliable Prognostics and Health Management (P…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Beyond Model Base Retrieval: Weaving Knowledge to Master Fine-grained Neural Network Design

Designing high-performance neural networks for new tasks requires balancing optimization quality with search efficiency. Current methods fa…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents

FedS2R: One-Shot Federated Domain Generalization for Synthetic-to-Real Semantic Segmentation in Autonomous Driving

Federated domain generalization has shown promising progress in image classification by enabling collaborative training across multiple cli…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

From Graph Retrieval to Schema Realization: Counterfactual Validation for Text-to-SPARQL over Heterogeneous Knowledge Graphs

Text-to-SPARQL maps natural-language questions to executable SPARQL queries over RDF knowledge graphs. While standard evaluations often fix…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Graph is a Natural Regularization: Revisiting Vector Quantization for Graph Representation Learning

Vector Quantization (VQ) has recently emerged as a promising approach for learning compressed and discrete representations for graph-struct…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Position: Beyond Sensitive Attributes, ML Fairness Should Quantify Structural Injustice via Social Determinants

Algorithmic fairness research has largely framed unfairness as discrimination along sensitive attributes. However, this approach limits vis…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

TuneAgent: Agentic Operating System Kernel Tuning with Reinforcement Learning

Linux kernel tuning is essential for optimizing operating system (OS) performance, yet remains challenging due to the complex kernel space,…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Between a Rock and a Hard Place: The Tension Between Ethical Reasoning and Safety Alignment in LLMs

Large Language Model safety alignment predominantly operates on a binary assumption that requests are either safe or unsafe. This classific…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Language-Native Materials Processing Design by Lightly Structured Text Database and Reasoning Large Language Model

Materials synthesis procedures are predominantly documented as narrative text in papers, protocols, and laboratory records, placing them be…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Towards a Physics Foundation Model

Foundation models have revolutionized natural language processing through a "train once, deploy anywhere" paradigm, where a single pre-tr…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Deep Learning as the Disciplined Construction of Tame Objects

One can see deep-learning models as compositions of functions within the so-called tame geometry. In this expository note, we give an overv…

2026-06-02 13:00 JST | arXiv cs.AI | Hardware/Semiconductors

End-to-End Deep Learning for Predicting Metric Space-Valued Outputs

Many modern applications involve predicting structured, non-Euclidean outputs such as probability distributions, networks, and symmetric po…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

T-POP: Test-Time Personalization with Online Preference Feedback

Personalizing large language models (LLMs) to individual user preferences is a critical step beyond generating generically helpful response…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Research/Papers

v-HUB: A Benchmark for Video Humor Understanding from Vision and Sound

AI models capable of comprehending humor hold real-world promise -- for example, enhancing engagement in human-machine interactions. To gau…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Distillation of Large Language Models via Concrete Score Matching

Large language models (LLMs) deliver remarkable performance but are costly to deploy, motivating knowledge distillation (KD) for efficient…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Make a Video Call with LLM: A Measurement Campaign over Six Mainstream Apps

In 2025, Large Language Model (LLM) services have launched a new feature -- AI video chat -- allowing users to interact with AI agents via…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Simultaneous Multi-objective Alignment Across Verifiable and Non-verifiable Rewards

Aligning large language models to human preferences is inherently multidimensional, yet most pipelines collapse heterogeneous signals into…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

HRTFformer: A Spatially-Aware Transformer for Individual HRTF Upsampling in Immersive Audio Rendering

Individual Head-Related Transfer Functions (HRTFs) are starting to be introduced in many commercial immersive audio applications and are cr…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Verifying Meta-Awareness via Predictive Rewards in Reasoning Models

Recent research on reasoning models explores the meta-awareness of language models, including their ability to determine optimal thinking d…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Margin Adaptive DPO: Leveraging Reward Model for Granular Control in Preference Optimization

Direct Preference Optimization (DPO) has emerged as a simple and effective method for aligning large language models. However, its reliance…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Domain-Shift-Aware Conformal Prediction for Large Language Models

Large language models have achieved impressive performance across diverse tasks. However, their tendency to produce overconfident and factu…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Value Flows

While most reinforcement learning methods today flatten the distribution of future returns to a single scalar value, distributional RL meth…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

SHERLOCK: Towards Dynamic Knowledge Adaptation in LLM-enhanced E-commerce Risk Management

Effective e-commerce risk management requires in-depth case investigations to identify emerging fraud patterns in highly adversarial enviro…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Agents

StreamingVLM: Real-Time Understanding for Infinite Video Streams

Vision-language models (VLMs) could power real-time assistants and autonomous agents, but they face a critical challenge: understanding nea…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding, Research/Papers

Rethinking RL Evaluation: Can Benchmarks Truly Reveal Failures of RL Methods?

Current benchmarks are inadequate for evaluating progress in reinforcement learning (RL) for large language models (LLMs). Despite recent be…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Catch-Only-One: Non-Transferable Examples for Model-Specific Authorization

Recent AI regulations increasingly emphasize the need for mechanisms that preserve the utility of data for AI innovation while preventing m…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Characterizing Web Search in The Age of Generative AI

The advent of LLMs has given rise to generative search, a new search paradigm in which LLMs retrieve information from the web related to a…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Generative AI and Sales Productivity: Field Experiments in Online Retail

We quantify the short-term impact of Generative Artificial Intelligence (GenAI) on sales performance through a series of large-scale random…

2026-06-02 13:00 JST | arXiv cs.AI | Business/Funding

Learning-To-Measure: In-Context Active Feature Acquisition

Active feature acquisition (AFA) is a sequential decision-making problem where the goal is to improve model performance for test instances…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Video Reasoning without Training

Video reasoning using Large Multimodal Models (LMMs) relies on costly reinforcement learning (RL) and verbose chain-of-thought, resulting i…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

CARES: Context-Aware Resolution Selector for VLMs

Large vision-language models (VLMs) commonly process images at native or high resolution to remain effective across tasks. This inflates vi…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Symbolic Neural Generation with Applications to Lead Discovery in Drug Design

We investigate a relatively under-explored class of hybrid neurosymbolic models that integrate symbolic learning with neural reasoning to c…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

The Geometry of Grokking: Norm Minimization on the Zero-Loss Manifold

Grokking is a puzzling phenomenon in neural networks where full generalization occurs only after a substantial delay following the complete…

2026-06-02 13:00 JST | arXiv cs.AI | Business/Funding

Who Evaluates AI's Social Impacts? Mapping Coverage and Gaps in First and Third Party Evaluations

Foundation models are increasingly central to high-stakes AI systems, and governance frameworks now depend on evaluations to assess their r…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors

Optimizing Diversity and Quality through Base-Aligned Model Collaboration

Alignment has greatly improved large language models (LLMs)' output quality at the cost of diversity, yielding highly similar outputs acros…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

NILC: Discovering New Intents with LLM-assisted Clustering

New intent discovery (NID) seeks to recognize both new and known intents from unlabeled user utterances, which finds prevalent use in pract…

2026-06-02 13:00 JST | arXiv cs.AI | Robotics, Research/Papers

RoboBenchMart: Benchmarking Robots in Retail Environment

Most existing robotic manipulation benchmarks focus on tabletop or household scenarios. While these setups have driven impressive progress,…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Latent Reasoning in TRMs is Secretly a Policy Improvement Operator

Recently, small models with latent recursion have obtained promising results on complex reasoning tasks. These results are typically explai…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Evaluating the Performance of Deep Learning Models in Whole-body Dynamic 3D Posture Prediction During Load-reaching Activities

This study aimed to explore the application of deep neural networks for whole-body human posture prediction during dynamic load-reaching ac…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Latent Collaboration in Multi-Agent Systems

Multi-agent systems (MAS) extend large language models (LLMs) from independent single-model reasoning to coordinative system-level intellig…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

Understanding the Effects of Distractors on Reasoning Vision-Language Models

How does irrelevant information (i.e., distractors) affect test-time scaling in vision-language models (VLMs)? Prior work on text-only lang…

2026-06-02 13:00 JST | arXiv cs.AI | Robotics

SpeedAug: Policy Acceleration via Tempo-Enriched Policy and RL Fine-Tuning

Robotic policy learning for complex real-world manipulation tasks has seen rapid recent progress, enabled in large part by the ability to c…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents

From Segments to Scenes: Temporal Understanding in Autonomous Driving via Vision-Language Model

Vision-Language Models (VLMs) are increasingly deployed as the perception and reasoning backbone of autonomous agents acting in the wild, w…

2026-06-02 13:00 JST | arXiv cs.AI | Robotics

ShelfAware: Real-Time Semantic Localization in Quasi-Static Environments with Low-Cost Sensors

Many indoor workspaces are quasi-static: their global geometric layout is stable, but local semantics change continually, producing repetit…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

VocSim: A Training-free Benchmark for Zero-shot Content Identity in Single-source Audio

General-purpose audio representations aim to map acoustically variable instances of the same event to nearby points, resolving content iden…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

InFerActive: Interactive Tree-Based Exploration of LLM Sampling for Safety Evaluation

Even LLMs that appear safe during evaluation can still produce harmful responses in deployment. Because stochastic sampling yields differen…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Calibrating Uncertainty for Zero-Shot Adversarial CLIP

CLIP delivers strong zero-shot classification but remains highly vulnerable to adversarial attacks. Prior adversarial fine-tuning work prim…

2026-06-02 13:00 JST | arXiv cs.AI | Robotics, Research/Papers

Control of a Twin Rotor using Twin Delayed Deep Deterministic Policy Gradient (TD3)

This paper proposes a reinforcement learning (RL) framework for controlling and stabilizing the Twin Rotor Aerodynamic System (TRAS) at spe…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Ev-Trust: An Evolutionarily Stable Trust Mechanism for Decentralized LLM-Based Multi-Agent Service Economies

Decentralized LLM-based multi-agent service economies face three vulnerabilities that undermine traditional trust mechanisms: reduced cost…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

Agent Tools Orchestration Leaks More: Dataset, Benchmark, and Mitigation

LLM-based agents increasingly use multiple external tools to complete complex tasks. We study Tools Orchestration Privacy Risk (TOP-R): an…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

MGRegBench: A Novel Benchmark Dataset with Anatomical Landmarks for Mammography Image Registration

Robust mammography registration is essential for clinically relevant applications like tracking disease progression in breast tissue. Howev…

2026-06-02 13:00 JST | arXiv cs.AI | Robotics, Research/Papers

Reinforcement Learning Position Control of a Quadrotor Using Soft Actor-Critic (SAC)

This paper proposes a new Reinforcement Learning (RL) based control architecture for quadrotors. With the literature focusing on controllin…

2026-06-02 13:00 JST | arXiv cs.AI | Robotics, Research/Papers

Dynamic Entropy Tuning in Reinforcement Learning Low-Level Quadcopter Control: Stochasticity vs Determinism

This paper explores the impact of dynamic entropy tuning in Reinforcement Learning (RL) algorithms that train a stochastic policy. Its perf…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding, Research/Papers

Uncovering Competency Gaps in Large Language Models and Their Benchmarks

The evaluation of large language models relies heavily on standardized benchmarks. These benchmarks provide useful aggregated metrics, but…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation

Talking head generation creates lifelike avatars from static portraits for virtual communication and content creation. However, current mod…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

VLM4VLA: Revisiting Vision-Language-Models in Vision-Language-Action Models

Vision-Language-Action (VLA) models, which integrate pretrained large Vision-Language Models (VLM) into their policy backbone, are gaining…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Paradoxical noise preference in RNNs

In recurrent neural networks (RNNs) used to model biological neural networks, noise is typically introduced during training to emulate biol…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Business/Funding, Research/Papers

Prototypicality Bias Reveals Blindspots in Multimodal Evaluation Metrics

Automatic metrics are widely used to evaluate text-to-image models, often replacing human judgment in benchmarking, model selection, and la…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

FastSLM: Hierarchical Temporal Abstraction for Efficient Long-Form Speech Adaptation

Scaling Multimodal Large Language Models (MLLMs) to long-form speech is bottlenecked by the explosive growth of input tokens. Unlike images…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Safe-FedLLM: Delving into the Safety of Federated Large Language Models

Federated learning (FL) addresses privacy and data-silo issues in the training of large language models (LLMs). Most prior work focuses on…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

DSA-Tokenizer: Disentangled Semantic-Acoustic Tokenization via Flow Matching-based Hierarchical Fusion

Speech tokenizers are a key building block of fully discrete Speech LLMs. Existing tokenizers either prioritize semantic encoding, fuse sem…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

Hot-Start Chinese Language Modeling: Visual Glyphs Accelerate Sample-Efficient Learning

In this work, we study whether rendering Chinese characters as visual glyph images, rather than discrete token IDs as mainstream LLMs do, p…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

MASCOT: Towards Multi-Agent Socio-Collaborative Companion Systems

Multi-agent systems (MAS) are emerging as promising socio-collaborative companions for emotional and cognitive support. However, existing s…

2026-06-02 13:00 JST | arXiv cs.AI | Robotics

SilentDrift: Exploiting Action Chunking for Stealthy Backdoor Attacks on Vision-Language-Action Models

Vision-Language-Action (VLA) models are increasingly deployed in safety-critical robotic applications, yet their security vulnerabilities r…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Physics-Encoded Inverse Modeling for Arctic Snow Depth Prediction

Accurate estimation in time-varying inverse problems under limited and sparse observations remains a fundamental challenge across scientifi…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

A Monosemantic Attribution Framework for Stable Interpretability in Clinical Neuroscience Transformer-Based Language Models

Interpretability remains a key challenge for deploying language models (LM) in clinical settings such as progression diagnosis of Alzheimer…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Multi-Objective Reinforcement Learning for Tactical Decision Making for Trucks in Highway Traffic

Balancing safety, efficiency, and operational costs in highway driving poses a challenging decision-making problem for heavy-duty vehicles.…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

ELF: A Family of Encoder-Free ECG-Language Models

ECG-Language Models (ELMs) extend recent advances in Multimodal Large Language Models (MLLMs) to automated ECG interpretation. However, mos…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

ASKD-Whisper: Adaptive Self-knowledge Distillation for Efficient and Low-Latency Automatic Speech Recognition

Knowledge distillation (KD) is one of the most effective paradigms for compressing large-scale foundation models into deployable architectu…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Demystifying Multi-Agent Debate: The Role of Confidence and Diversity

Multi-agent debate (MAD) is widely used to improve large language model (LLM) performance through test-time scaling, yet recent work shows…

2026-06-02 13:00 JST | arXiv cs.AI | Hardware/Semiconductors

How Much Progress Has There Been in NVIDIA Datacenter GPUs?

As the role of modern Graphics Processing Units (GPUs) becomes increasingly essential for several computing tasks, analyzing their past and…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

APB-V: Accelerating Long-Video Understanding via Sequence-Parallelism-aware Approximate Attention

The efficiency of long-video inference remains a critical bottleneck, mainly due to the dense computation in the prefill stage of Large Mul…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

When Does Predictive Inverse Dynamics Outperform Behavior Cloning?

Behavior cloning (BC) is a practical offline imitation learning method, but it often fails when expert demonstrations are limited. Recent w…

2026-06-02 13:00 JST | arXiv cs.AI | Hardware/Semiconductors

GUDA: Counterfactual Group-wise Training Data Attribution for Diffusion Models via Unlearning

Training-data attribution for vision generative models aims to identify which training data influenced a given output. While most methods s…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

Med-Scout: Curing MLLMs' Geometric Blindness in Medical Perception via Geometry-Aware RL Post-Training

Despite recent Multimodal Large Language Models (MLLMs)' linguistic prowess in medical diagnosis, we find even state-of-the-art MLLMs suffe…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Zero-Shot Off-Policy Learning

Off-policy learning methods seek to derive an optimal policy directly from a fixed dataset of prior interactions. This objective presents s…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Probabilistic Performance Guarantees for Multi-Task Reinforcement Learning

Multi-task reinforcement learning trains generalist policies that can execute multiple tasks. While recent years have seen significant prog…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

naPINN: Noise-Adaptive Physics-Informed Neural Networks for Recovering Physics from Corrupted Measurement

Physics-Informed Neural Networks (PINNs) are effective methods for solving inverse problems and discovering governing equations from observ…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

The Alignment Curse: Modality Alignment Supercharges Audio Attacks via Text Transfer

Recent advances in end-to-end trained omni-models have substantially improved audio capabilities by strengthening text-audio modality align…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Consistency Deep Equilibrium Models

Deep Equilibrium Models (DEQs) have emerged as a powerful paradigm in deep learning, offering the ability to model infinite-depth networks…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Lookahead Sample Reward Guidance for Test-Time Scaling of Diffusion Models

Diffusion models have demonstrated strong generative performance; however, generated samples often fail to fully align with human intent. T…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Global Geometry Is Not Enough for Vision Representations

A common assumption in representation learning is that globally well-distributed embeddings support robust and generalizable representation…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

When Single Answer Is Not Enough: Rethinking Single-Step Retrosynthesis Benchmarks for LLMs

Recent progress has expanded the use of large language models (LLMs) in drug discovery, including synthesis planning. However, objective ev…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Equilibrium Propagation for Non-Conservative Systems

Equilibrium Propagation (EP) is a physics-inspired learning algorithm that uses stationary states of a dynamical system both for inference…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Universal One-third Time Scaling in Learning Peaked Distributions

Training large language models (LLMs) is computationally expensive, partly because the loss exhibits slow power-law convergence whose origi…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Fixed Budget is No Harder Than Fixed Confidence in Best-Arm Identification up to Logarithmic Factors

The best-arm identification (BAI) problem is one of the most fundamental problems in interactive machine learning, which has two flavors: t…

2026-06-02 13:00 JST | arXiv cs.AI | Business/Funding

From Evaluation to Design: Using Potential Energy Surface Smoothness Metrics to Guide Machine Learning Interatomic Potential Architectures

Machine Learning Interatomic Potentials (MLIPs) sometimes fail to reproduce the physical smoothness of the quantum potential energy surface…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Optimal Bayesian Stopping for Efficient Inference of Consistent LLM Answers

A simple strategy for improving LLM accuracy, especially in math and reasoning problems, is to sample multiple responses and submit the ans…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Better Source, Better Flow: Learning Condition-Dependent Source Distribution for Flow Matching

Flow matching has recently emerged as a promising alternative to diffusion-based generative models, particularly for text-to-image generati…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Inverse Depth Scaling From Most Layers Being Similar

Neural scaling laws relate loss to model size in large language models (LLMs), yet depth and width may contribute to performance differentl…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Principle-Evolvable Scientific Discovery via Uncertainty Minimization

Large Language Model (LLM)-based scientific agents have accelerated scientific discovery, yet they often suffer from significant inefficien…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

"Do Not Mention This to the User": Detecting and Understanding Malicious Agent Skills

LLM-based coding agents increasingly rely on third-party extensions called skills, which bundle natural language instructions and helper sc…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Rethinking Scientific Modeling: Toward Physically Consistent and Simulation-Executable Programmatic Generation

Structural modeling is a fundamental component of computational engineering science, in which even minor physical inconsistencies or specif…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Collaborative and Efficient Fine-tuning: Leveraging Task Similarity

Adaptability has been regarded as a central feature in the foundation models, enabling them to effectively acclimate to unseen downstream t…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Principled Synthetic Data Enables the First Scaling Laws for LLMs in Recommendation

Large Language Models (LLMs) represent a promising frontier for recommender systems, yet their development has been impeded by the absence…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

SoK: DARPA's AI Cyber Challenge (AIxCC): Competition Design, Architectures, and Lessons Learned

DARPA's AI Cyber Challenge (AIxCC, 2023–2025) is the largest competition to date for building fully autonomous cyber reasoning systems (CR…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Robotics

Picasso: Holistic Scene Reconstruction with Physics-Constrained Sampling

In the presence of occlusions and measurement noise, geometrically accurate scene reconstructions (which fit the sensor data) can still…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning

Despite rapid progress in MLLMs, visual spatial reasoning remains unreliable when correct answers depend on how a scene would appear under…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Predicting Future Utility: Global Combinatorial Optimization for Task-Agnostic KV Cache Eviction

Given the quadratic complexity of attention, KV cache eviction is vital to accelerate model inference. Current KV cache eviction methods ty…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

AnomSeer: Reinforcing Multimodal LLMs to Reason for Time-Series Anomaly Detection

Time-series anomaly detection (TSAD) with multimodal large language models (MLLMs) is an emerging area, yet a persistent challenge remains:…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Learning to Remember, Learn, and Forget in Attention-Based Models

In-Context Learning (ICL) in transformers acts as an online associative memory and is believed to underpin their high performance on comple…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents, Robotics

SceneSmith: Agentic Generation of Simulation-Ready Indoor Scenes

Simulation has become a key tool for training and evaluating home robots at scale, yet existing environments fail to capture the diversity…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Beware of the Batch Size: Hyperparameter Bias in Evaluating LoRA

Low-rank adaptation (LoRA) is a standard approach for fine-tuning large language models, yet its many variants report conflicting empirical…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Mitigating Reward Hacking in RLHF via Bayesian Non-negative Reward Modeling

Reward models learned from human preferences are central to aligning large language models (LLMs) via reinforcement learning from human fee…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

What Do LLMs Know About Alzheimer's Disease? Multi-loss Fine-Tuning and Probing for AD Detection

Reliable early detection of Alzheimer's disease (AD) is challenging, particularly due to the limited availability of labeled data. While la…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

SWE-MiniSandbox: Container-Free Reinforcement Learning for Building Software Engineering Agents

Reinforcement learning (RL) has become a key paradigm for training software engineering (SWE) agents, but existing pipelines typically rely…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

From Noise to Order: Learning to Rank via Denoising Diffusion

In information retrieval (IR), learning-to-rank (LTR) methods have traditionally limited themselves to discriminative machine learning appr…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

You Can Learn Tokenization End-to-End with Reinforcement Learning

Tokenization is a hardcoded compression step which remains in the training pipeline of Large Language Models (LLMs), despite a general tren…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

DenseMLLM: Standard Multimodal LLMs for Dense Prediction

Multimodal Large Language Models (MLLMs) have demonstrated exceptional capabilities in high-level visual understanding. However, extending…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Atomix: Timely, Transactional Tool Use for Reliable Agentic Workflows

LLM agents execute multi-step workflows that mutate external state through tools. Common orchestrators treat tool return as the settlement…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Knowing Isn't Understanding: Re-grounding Generative Proactivity with Epistemic and Behavioral Insight

Generative AI agents equate understanding with resolving explicit queries, an assumption that confines interaction to what users can articu…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents

Visual Persuasion: What Influences Decisions of Vision-Language Models?

The web is littered with images, once created for human consumption and now increasingly interpreted by agents using vision-language models…

2026-06-02 13:00 JST | arXiv cs.AI | Agents, Research/Papers

APEX-SQL: Talking to the data via Agentic Exploration for Text-to-SQL

Text-to-SQL systems powered by Large Language Models have excelled on academic benchmarks but struggle in complex enterprise environments.…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

PETS: A Principled Framework Towards Optimal Trajectory Allocation for Efficient Test-Time Self-Consistency

Test-time scaling can improve model performance by aggregating stochastic reasoning trajectories. However, achieving sample-efficient test-…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Business/Funding, Research/Papers

Are LLMs Ready for Neural-integrated Mechanistic Modeling? A Benchmark and Agentic Framework

Large language models (LLMs) have shown promise in constructing mechanistic models from data. However, existing evaluations largely focus o…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

LERD: Latent Event-Relational Dynamics for Neurodegenerative Classification

Alzheimer's disease (AD) alters brain electrophysiology and disrupts multichannel EEG dynamics, making accurate and clinically useful EEG-b…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

IDLM: Inverse-distilled Diffusion Language Models

Diffusion Language Models (DLMs) have recently achieved strong results in text generation. However, their multi-step sampling leads to slow…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Learning Discriminative and Generalizable Anomaly Detector for Dynamic Graph with Limited Supervision

Dynamic graph anomaly detection is critical for many real-world applications but remains challenging due to the scarcity of labeled anomali…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

On Imbalanced Regression with Hoeffding Trees

Many real-world applications generate continuous data streams for regression. Hoeffding trees and their variants have a long-standing tradi…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Evaluating Reliability Asymmetries in Chinese Factual Search and AI Answers

Search engines and AI-powered systems increasingly mediate access to factual information, yet their reliability remains difficult to evalua…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Scaling Search Relevance: Augmenting App Store Ranking with LLM-Generated Judgments

Large-scale commercial search systems optimize for relevance to drive successful sessions that help users find what they are looking for. T…

2026-06-02 13:00 JST | arXiv cs.AI | Robotics

Interpretable Multimodal Gesture Recognition for Drone and Mobile Robot Teleoperation via Log-Likelihood Ratio Fusion

Human operators are still frequently exposed to hazardous environments such as disaster zones and industrial facilities, where intuitive an…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

You Don't Need All That Attention: Surgical Memorization Mitigation in Text-to-Image Diffusion Models

Generative models have been shown to "memorize" certain training data, leading to verbatim or near-verbatim generating images, which may ca…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

LookWise: Knowing When and Where to Look for Fine-Grained Visual Reasoning in Multimodal Large Language Models

Multimodal Large Language Models (MLLMs) are shifting towards "Thinking with Images" by actively exploring image details. While effective,…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Constitutional Black-Box Monitoring for Scheming in LLM Agents

Safe deployment of Large Language Model (LLM) agents in autonomous settings requires reliable oversight mechanisms. A central challenge is…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Concept Heterogeneity-aware Representation Steering

Representation steering offers a lightweight mechanism for controlling the behavior of large language models (LLMs) by intervening on inter…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Large Electron Model: A Universal Ground State Predictor

We introduce Large Electron Model, a single neural network model that produces variational wavefunctions of interacting electrons over the…

2026-06-02 13:00 JST | arXiv cs.AI | Robotics

Improving Diffusion Planners by Self-Supervised Action Gating with Energies

Diffusion planners are a strong approach for offline reinforcement learning, but they can fail when value-guided selection favours trajecto…

2026-06-02 13:00 JST | arXiv cs.AI | Agents, Robotics

SPARC: Spatial-Aware Path Planning via Attentive Agent Communication

Efficient communication is critical for decentralized Multi-Robot Path Planning (MRPP), yet existing learned communication methods treat al…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

One Bias After Another: Mechanistic Reward Shaping and Persistent Biases in Language Reward Models

Reward Models (RMs) are crucial for online alignment of language models (LMs) with human preferences. However, RM-based preference-tuning i…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Escaping the BLEU Trap: A Signal-Grounded Framework with Decoupled Semantic Guidance for EEG-to-Text Decoding

Decoding natural language from non-invasive EEG signals is a promising yet challenging task. However, current state-of-the-art models remai…

2026-06-02 13:00 JST | arXiv cs.AI | Agents, Robotics

HALO: Learning Human-Robot Collaboration via Heterogeneous-Agent Lyapunov Policy Optimization

To improve generalization and resilience in human-robot collaboration (HRC), robots must contend with diverse combinations of human behavio…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

Med-V1: Small Language Models for Zero-shot and Scalable Biomedical Evidence Attribution

Assessing whether an article supports an assertion is essential for hallucination detection and claim verification. While large language mo…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

Heterogeneous Decentralized Diffusion Models

Training frontier-scale diffusion models often requires substantial computational resources concentrated in tightly-coupled clusters, limit…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

DyLLM: Efficient Diffusion LLM Inference via Saliency-based Token Selection and Partial Attention

Masked diffusion language models enable parallel token decoding, providing a promising alternative to the sequential nature of autoregressi…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

ActiveUltraFeedback: Efficient Preference Data Generation using Active Learning

Reinforcement Learning from Human Feedback (RLHF) has become the standard for aligning Large Language Models (LLMs), yet its efficacy is bo…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Geometry-Aware Probabilistic Circuits via Voronoi Tessellations

Probabilistic circuits (PCs) enable exact and tractable inference but employ data independent mixture weights that limit their ability to c…

2026-06-02 13:00 JST | arXiv cs.AI | Research/Papers

Ethical Fairness in Ubiquitous Health Sensing without Known Attributes

In ubiquitous and mobile health systems, computational models infer human states from wearable, behavioral, and physiological sensing data.…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

ES-Merging: Biological MLLM Merging via Embedding Space Signals

Biological multimodal large language models (MLLMs) have emerged as powerful foundation models for scientific discovery. However, existing…

2026-06-02 13:00 JST | arXiv cs.AI | Robotics

ExpertGen: Scalable Sim-to-Real Expert Policy Learning from Imperfect Behavior Priors

Learning generalizable and robust behavior cloning policies requires large volumes of high-quality robotics data. While human demonstration…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Context Matters: Repository-Aware Security Analysis of the Agent Skill Ecosystem

Agent skills extend local AI agents, such as Claude Code and OpenClaw, with additional functionality. Their growing popularity has led to d…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

scicode-lint: Detecting Methodology Bugs in Scientific Python Code with LLM-Generated Patterns

Methodology bugs in scientific Python code produce plausible but incorrect results that traditional linters and static analysis tools canno…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI

MineDraft: A Framework for Batch Parallel Speculative Decoding

Speculative decoding (SD) accelerates large language model inference by using a smaller draft model to propose draft tokens that are subseq…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

To See or To Please: Uncovering Visual Sycophancy and Split Beliefs in VLMs

When VLMs answer correctly, do they genuinely rely on visual information? We introduce a Tri-Layer Diagnostic Framework with three per-samp…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation, Business/Funding

Beyond String Matching: Semantic Evaluation of PDF Table Extraction

Reliably extracting tables from PDFs is essential for large-scale scientific data mining and knowledge base construction, yet existing eval…

2026-06-02 13:00 JST | arXiv cs.AI | Agents, Research/Papers

AgentDS Technical Report: Benchmarking the Future of Human-AI Collaboration in Domain-Specific Data Science

Data science plays a critical role in transforming complex data into actionable insights across numerous domains. Recent developments in la…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors, Business/Funding

Failure of contextual invariance in large language models

Standard evaluation practices assume that large language model (LLM) outputs are stable when prompts are embedded in contextually equivalen…

2026-06-02 13:00 JST | arXiv cs.AI | Image/Video Generation

λSplit: Self-Supervised Content-Aware Spectral Unmixing for Fluorescence Microscopy

In fluorescence microscopy, spectral unmixing aims to recover individual fluorophore concentrations from spectral images that capture mixed…

2026-06-02 13:00 JST | arXiv cs.AI | Agents

Large Language Model Guided Incentive Aware Reward Design for Cooperative Multi-Agent Reinforcement Learning

Designing effective auxiliary rewards for cooperative multi-agent systems remains challenging, as misaligned incentives can induce suboptim…

2026-06-02 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs

We show that AI agents are capable of discovering novel algorithms for adversarial attacks against LLMs, advancing the state of the art on…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

Limits of Spatial Imagery Reasoning in Frontier LLM Models

Large Language Models (LLMs) have demonstrated impressive reasoning capabilities, yet they struggle with spatial tasks that require mental…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成研究/論文

EuraGovExam: A Multilingual Multimodal Benchmark from Real-World Civil Service Exams

We present EuraGovExam, a multilingual and multimodal benchmark sourced from real-world civil service examinations across five representati…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Incentives, Equilibria, and the Limits of Healthcare AI: A Game-Theoretic Perspective

Using a stylised coordination problem drawn from inpatient capacity management, three archetypal forms of AI deployment are described: effo…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Understand and Accelerate Memory Processing Pipeline for Large Language Model Inference

Modern large language models (LLMs) increasingly depend on efficient long-context processing and generation mechanisms, including sparse a…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Acoustic and perceptual differences between standard and accented speech and their voice clones

Voice cloning is often evaluated in terms of overall quality, but less is known about accent preservation and its perceptual consequences.…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Automated Conjecture Resolution with Formal Verification

Recent advances in large language models have significantly improved their ability to perform mathematical reasoning, extending from elemen…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

CalM: A Self-Supervised Foundation Model for Population Dynamics in Calcium Imaging Data

Recent work suggests that large-scale, multi-animal modeling can significantly improve neural recording analysis. However, for functional c…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達

Distributional Open-Ended Evaluation of LLM Cultural Value Alignment Based on Value Codebook

As LLMs are globally deployed, aligning their cultural value orientations is critical for safety and user engagement. However, existing ben…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達

AtomEval: Validity-Aware Atomic Evaluation of Adversarial Claim Rewriting in Fact Verification

Large language models (LLMs) can rewrite refuted claims to evade evidence-based fact verifiers, but conventional attack success rate (ASR)…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Multi-Modal Learning meets Genetic Programming: Analyzing Alignment in Latent Space Optimization

Symbolic regression (SR) aims to discover mathematical expressions from data, a task traditionally tackled using Genetic Programming (GP) t…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

U-Cast: A Surprisingly Simple and Efficient Frontier Probabilistic AI Weather Forecaster

AI-based weather forecasting now rivals traditional physics-based ensembles, but state-of-the-art (SOTA) models rely on specialized archite…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成ロボティクス

Frequency-Enhanced Diffusion Models: Curriculum-Guided Semantic Alignment for Zero-Shot Skeleton Action Recognition

Human action recognition is pivotal in computer vision, with applications ranging from surveillance to human-robot interaction. Despite the…

2026-06-02 13:00 JSTarXiv cs.AIエージェントビジネス/資金調達

Beyond Offline A/B Testing: Context-Aware Agent Simulation for Recommender System Evaluation

Recommender systems are central to online services, enabling users to navigate through massive amounts of content across various domains. H…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成ロボティクス

Genie 4D: Semantic-Prior-Guided 4D Dynamic Scene Reconstruction

At the intersection of computer vision and robotic perception, 4D reconstruction of dynamic scenes connects low-level geometric sensing wit…

2026-06-02 13:00 JSTarXiv cs.AIロボティクス

AffordGen: Generating Diverse Demonstrations for Generalizable Object Manipulation with Afford Correspondence

Despite the recent success of modern imitation learning methods in robot manipulation, their performance is often constrained by geometric…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Vibe-driven model-based engineering

There is a pressing need for better development methods and tools to keep up with the growing demand and increasing complexity of new softw…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

SCOPE: Signal-Calibrated On-Policy Distillation Enhancement with Dual-Path Adaptive Weighting

On-policy reinforcement learning has become the dominant paradigm for reasoning alignment in large language models, yet its sparse, outcome…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

TInR: Exploring Tool-Internalized Reasoning in Large Language Models

Tool-Integrated Reasoning (TIR) has emerged as a promising direction by extending Large Language Models' (LLMs) capabilities with external…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Representation over Routing: Diagnosing Temporal Routing Pathologies in Multi-Timescale PPO

Temporal credit assignment in reinforcement learning is often approached by introducing value estimates at multiple discount factors. A nat…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

Just Type It in Isabelle! AI Agents Drafting, Mechanizing, and Generalizing from Human Hints

Type annotations are essential when printing terms in a way that preserves their meaning under reparsing and type inference. We study the p…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Reward Score Matching: Unifying Reward-based Fine-tuning for Flow and Diffusion Models

Reward-based fine-tuning steers a pretrained diffusion or flow-based generative model toward higher-reward samples while remaining close to…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成エージェント

Dual-Anchoring: Addressing State Drift in Vision-Language Navigation

Vision-Language Navigation (VLN) requires an agent to navigate through 3D environments by following natural language instructions. While rec…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

BEAT: Tokenizing and Generating Symbolic Music by Uniform Temporal Steps

Tokenizing music to fit the general framework of language models is a compelling challenge, especially considering the diverse symbolic str…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Deep Interest Mining for Intent-Enriched Semantic IDs in Multimodal Generative Recommendation

Semantic IDs (SIDs) provide the discrete item vocabulary used by generative recommendation, but their quality depends on what item evidence…

2026-06-02 13:00 JSTarXiv cs.AIハードウェア/半導体

FlowPlace: Flow Matching for Chip Placement

Chip placement plays an important role in physical design. While generative models like diffusion models offer promising learning-based sol…

2026-06-02 13:00 JSTarXiv cs.AIハードウェア/半導体

How Can Reinforcement Learning Achieve Expert-level Placement?

Chip placement is a critical step in physical design. While reinforcement learning (RL)-based methods have recently emerged, their training…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

MedSynapse-V: Bridging Visual Perception and Clinical Intuition via Latent Memory Evolution

High-precision medical diagnosis relies not only on static imaging features but also on the implicit diagnostic memory experts instantly in…

2026-06-02 13:00 JSTarXiv cs.AIビジネス/資金調達研究/論文

Defeasible Conditional Obligation in a Two-tiered Preference-based Semantics (Extended Version)

In response to a concern raised by Horty, this paper develops a two-tiered, preference-based semantic framework for modeling defeasible con…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成研究/論文

Beyond Visual Fidelity: Benchmarking Super-Resolution Models for Large-Scale Remote Sensing Imagery via Downstream Task Integration

Super-resolution (SR) techniques have made major advances in reconstructing high-resolution images from low-resolution inputs. The increase…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Possibilistic Predictive Uncertainty for Deep Learning

Deep neural networks achieve impressive results across diverse applications, yet their overconfidence on unseen inputs necessitates reliabl…

2026-06-02 13:00 JSTarXiv cs.AIビジネス/資金調達

STABLEVAL: Disagreement-Aware and Stable Evaluation of AI Systems

Human evaluation remains the primary standard for assessing modern AI systems, yet annotator disagreement, bias, and variability make syste…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Gradients with Respect to Semantics Preserving Embeddings Tell the Uncertainty of Large Language Models

Uncertainty quantification (UQ) is an important technique for ensuring the trustworthiness of LLMs, given their tendency to hallucinate. Ex…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

MidSteer: Optimal Affine Framework for Steering Generative Models

Steering intermediate representations has emerged as a powerful strategy for controlling generative models, particularly in post-deployment…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Do Joint Audio-Video Generation Models Understand Physics?

Joint audio-video generation models are rapidly approaching professional production quality, raising a central question: do they understand…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Why Self-Inconsistency Arises in GNN Explanations and How to Exploit It

Recent work has observed that explanations produced by Self-Interpretable Graph Neural Networks (SI-GNNs) can be self-inconsistent: when th…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Prune-OPD: Efficient and Reliable On-Policy Distillation for Long-Horizon Reasoning

On-policy distillation (OPD) leverages dense teacher rewards to enhance reasoning models. However, scaling OPD to long-horizon tasks expose…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Normalization Equivariance for Arbitrary Backbones, with Application to Image Denoising

Normalization Equivariance (NE) is a structural prior that improves robustness to distribution shift in image-to-image tasks. A function $f…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Cornerstones or Stumbling Blocks? Deciphering the Rock Tokens in On-Policy Distillation

While recent work in Reinforcement Learning with Verifiable Rewards (RLVR) has shown that a small subset of critical tokens disproportionat…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成研究/論文

The Cartesian Shortcut: Re-evaluate Vision Reasoning in Polar Coordinate Space

As current Multimodal Large Language Models rapidly saturate canonical visual reasoning benchmarks, a key question emerges: do these strong…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

OGLS-SD: On-Policy Self-Distillation with Outcome-Guided Logit Steering for LLM Reasoning

We study on-policy self-distillation (OPSD), where a language model improves its reasoning ability by distilling privileged teacher distrib…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Multi-Rollout On-Policy Distillation via Peer Successes and Failures

Large language models are often post-trained with sparse verifier rewards, which indicate whether a sampled trajectory succeeds but provide…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

REALISTA: Realistic Latent Adversarial Attacks that Elicit LLM Hallucinations

Large language models (LLMs) achieve strong performance across many tasks but remain vulnerable to hallucinations, making it important to s…

2026-06-02 13:00 JSTarXiv cs.AIビジネス/資金調達

RISED: A Pre-Deployment Evaluation Framework for High-Stakes AI Decision-Support Systems, with Application to Healthcare

Clinical decision-support systems are expert systems whose recommendations clinicians act on directly, yet they are usually cleared on one…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Revisiting Reinforcement Learning with Verifiable Rewards from a Contrastive Perspective

Group Relative Policy Optimization (GRPO) is one of the most widely adopted RLVR algorithms for post-training large language models on reas…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

CLIP Tricks You: Training-free Token Pruning for Efficient Pixel Grounding in Large Vision-Language Models

In large vision-language models, visual tokens typically constitute the majority of input tokens, leading to substantial computational over…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Many-Shot CoT-ICL: Making In-Context Learning Truly Learn

While many-shot ICL achieves remarkable performance, prior studies of its scaling behavior have mainly focused on non-reasoning tasks. In t…

2026-06-02 13:00 JSTarXiv cs.AIロボティクス

AttenA+: Rectifying Action Inequality in Robotic Foundation Models

Existing robotic foundation models, while powerful, are predicated on an implicit assumption of temporal homogeneity: treating all actions…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Topology-Preserving Neural Operator Learning via Hodge Decomposition

In this paper, we study solution operators of physical field equations on geometric meshes from a function-space perspective. We reveal tha…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

Beyond AI as Assistants: Toward Autonomous Discovery in Cosmology

Recent advances in artificial intelligence (AI) agents are pushing AI beyond tools toward autonomous scientific discovery. We discuss two c…

2026-06-02 13:00 JSTarXiv cs.AIエージェント研究/論文

PBT-Bench: Benchmarking AI Agents on Property-Based Testing

Existing code benchmarks measure whether an agent can produce any test that reproduces a known bug, or whether it can produce a patch that…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Avoiding Structural Failure Modes in Tabular Fair SSL: Online Primal-Dual Allocation under Confidence Gating

Semi-supervised learning (SSL) enables prediction with limited labels, but high-stakes tabular applications (medical, credit, recidivism) r…

2026-06-02 13:00 JSTarXiv cs.AIハードウェア/半導体

Physics-Guided Geometric Diffusion for Macro Placement Generation

Macro placement is a pivotal stage in VLSI physical design, fundamentally determining the overall chip performance. Recent data-driven plac…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Privacy Policy Enforcement Guardrails for Data-Sensitive Retrieval-Augmented Generation

Standard PII filters often miss contextual data leakage in RAG systems, such as non-regulated attribute clusters that collectively identify…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

DynMuon: A Dynamic Spectral Shaping View of Muon

In recent years, Muon has emerged as the dominant method for training large language models, and transformers more broadly. The essential d…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Balancing Knowledge Distillation for Imbalance Learning with Bilevel Optimization

Knowledge distillation transfers knowledge from a high capacity teacher to a compact student using a mixture of hard and soft losses. On im…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Lying Is Just a Phase: The Hidden Alignment Transition in Language Model Scaling

Scaling laws predict loss from compute but not how capabilities interact. We measure the coupling between reasoning and truthfulness across…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Can Vision Models Truly Forget? Mirage: Representation-Level Certification of Visual Unlearning

Machine unlearning in Vertical Federated Learning (VFL) has attracted growing interest, yet existing methods certify forgetting solely usin…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成エージェント

Co-Fusion4D: Spatio-temporal Collaborative Fusion for Robust 3D Object Detection

In autonomous driving, 3D object detection is essential for accurate perception and reliable decision-making. However, object motion and eg…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Task-Aligned Self-Supervised Learning for Medical Image Analysis: A Systematic Review and Practical Design Guidelines

Self-supervised learning (SSL) has emerged as a promising paradigm for addressing the annotation bottleneck in medical imaging by learning…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Attested Tool-Server Admission: A Security Extension to the Model Context Protocol

The Model Context Protocol (MCP) standardizes how a large-language-model (LLM) agent and an external tool server exchange messages, but not…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

CRISP -- Clustering-Based Redundancy-Reduced Instance Sampling for Pathology Case Representation and Retrieval

Digital pathology archives increasingly contain multiple whole-slide images (WSIs) per case, capturing spatially distinct tumour regions an…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Treatment Effect Estimation with Differentiated Networked Effect on Graph Data

Estimating individual treatment effect (ITE) from observational graph data is crucial for decision-making in fields such as commerce an…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Rethinking Weak Supervision in Anomaly Detection: A Comprehensive Benchmark

Weakly supervised anomaly detection (WSAD) has developed in three primary directions: incomplete, inexact, and inaccurate supervision. Howe…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

Channel-wise Vector Quantization

We present Channel-wise Vector Quantization (CVQ), a novel image tokenization paradigm that replaces patch-wise tokens with channel-wise to…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

GoQuant: Geometric Orthogonal Residual Projection for Multiplier-Free Power-of-Two Transformer Quantization

The deployment of Large Language Models (LLMs) and Vision Transformers (ViTs) on edge devices is significantly constrained by memory limita…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIハードウェア/半導体

Algorithmic Fragility and Persona Bias in LLM-Generated Autistic Communication

Safety alignment reduces explicitly harmful outputs but inadvertently encodes a sanitized, neuronormative representation of marginalized co…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Targeted Remasking: Replacing Token Editing with Token-to-Mask Refinement in Discrete Diffusion Language Models

Discrete masked diffusion language models such as LLaDA generate text through iterative denoising, where mask tokens are progressively repl…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Beyond Trajectory-Level Attribution: Graph-Based Credit Assignment for Agentic Reinforcement Learning

Group-based reinforcement learning (RL) methods have achieved remarkable success in improving the performance of large language models (LLM…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Knowledge Graphs as the Missing Data Layer for LLM-Based Industrial Asset Operations

LLM-based agents for industrial asset operations show limited accuracy when reasoning over flat document stores. AssetOpsBench (KDD 2026) e…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Cast a Wider Net: Coordinated Pass@K Policy Optimization for Code Reasoning

Repeated sampling with a verifier is the standard way to allocate test-time compute for code generation, with pass@$K$ as the canonical met…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成エージェント

Generic Interpretation Approach for Transformer Models Incorporating Heterogenous Attention Structures

The Transformer has significantly propelled the development of artificial intelligence, and certainly the development of agents as well. We cat…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

BenGER: Benchmarking LLM Systems on Subsumption-Based Legal Reasoning in German Law

We introduce the BenGER (Benchmark for German Law) dataset for evaluating LLM systems on subsumption-based legal reasoning in German law. T…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Hallucination Detection-Guided Preference Optimization for Clinical Summarization

Large language models (LLMs) have shown promise on summarization tasks, but they often produce hallucinations, which are unsupported or inc…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

GEO-Bench: Benchmarking Ranking Manipulation in Generative Engine Optimization

Large language models (LLMs) increasingly rank products, documents, and recommendations for user queries, which makes manipulating these ra…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

TIMEGATE: Sustainable Time-Boxed Promotion Gates for Continual ML Adaptation Under Resource Constraints

As machine learning (ML) systems evolve toward continual adaptation, each re-training cycle uses compute, annotation, and energy. We introduce T…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

BlockBatch: Multi-Scale Consensus Decoding for Efficient Diffusion Language Model Inference

Diffusion language models (dLLMs) generate text by iteratively denoising multiple token positions in parallel, offering an attractive alter…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

Honest Lying: Understanding Memory Confabulation in Reflexive Agents

Reflexion-style agents rely on self-generated reflections as memory, implicitly assuming that agents can accurately diagnose their own fail…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成ロボティクス

AnyMo: Scaling Any-Modality Conditional Motion Generation with Masked Modeling

Conditional human motion generation remains a fundamental challenge in computer vision and robotics. Despite significant progress, current…

2026-06-02 13:00 JSTarXiv cs.AI画像/動画生成

GiPL: Generative augmented iterative Pseudo-Labeling for Cross-Domain Few-Shot Object Detection

Vision-language foundation models have shown promising zero-shot generalization for Cross-Domain Few-Shot Object Detection (CD-FSOD). Howev…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

HoliTok: A Continuous Holistic Tokenization with Robust Dual Capabilities of Speech Generation and Understanding

Unified speech foundation models require a holistic tokenization space that is both learnable by language models and decodable into high-qu…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

Beyond MSE: Improving Precipitation Nowcasting with Multi-Quantile Regression

Deep-learning precipitation nowcasting models are often optimized using pointwise losses such as mean squared error or mean absolute error,…

2026-06-02 13:00 JSTarXiv cs.AIエージェント

Dissociative Identity: Language Model Agents Lack Grounding for Reputation Mechanisms

As autonomous language model agents proliferate, forming an emerging agentic web with real-world consequences, what credibility signals can…

2026-06-02 13:00 JSTarXiv cs.AI研究/論文

CalArena: A Large-Scale Post-Hoc Calibration Benchmark

Reliable probability estimates are critical in many machine learning applications, yet modern classifiers are often poorly calibrated. Post…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AIロボティクス

Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments

Embodied intelligence is often studied through specialized models for individual tasks such as manipulation or navigation, resulting in fra…

2026-06-02 13:00 JSTarXiv cs.AILLM/生成AI

Self-Trained Verification for Training- and Test-Time Self-Improvement

Self-improvement at scale has been a longstanding goal for reasoning models, and there are two natural places to do it: at test time, throu…

2026-06-02 10:31 JSTITmedia AI+LLM/生成AI

Teenage girls who use generative AI for "personal advice": ripples from the "by-the-book AI advice" behind the case of former manager Abe

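Shinnosuke Abe (47), former manager of the professional baseball team Yomiuri Giants, was arrested on suspicion of assaulting his eldest daughter (18). It began when the daughter consulted the conversational generative AI "ChatGPT" about the abuse and, based on its answer, contacted a child consultation center. Whether or not the daughter's actions were right, generative AI has taken hold as a confidant for young people, and young wom…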

2026-06-02 07:55 JSTTechCrunch AIビジネス/資金調達

Alphabet plans to raise $80B to pay for AI buildout

"The company is experiencing strong demand for its AI solutions and services from enterprises and consumers, at levels that are exceeding t…

2026-06-02 07:45 JSTITmedia AI+エージェントハードウェア/半導体

NVIDIA's "fox" is an AI agent for autonomous factory management; Taiwanese manufacturers confirm the benefits of adoption

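NVIDIA has announced "NVIDIA Factory Operations Blueprint (FOX)", a reference design for AI agents that autonomously manage factories. FOX makes it possible to monitor and analyze a wide range of factory data in real time and to coordinate multiple AI agents and equipment…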

2026-06-02 07:00 JSTITmedia AI+LLM/生成AI

How to dispel the "AI allergy" on the factory floor? A "three-month struggle diary" of Hitachi's new-graduate digital talent

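Hitachi is sending new data scientists with specialist skills in AI and data analysis to its manufacturing sites. How did they dispel the "AI allergy" on the floor and, through communication with floor staff, get a generative AI tool that shortens working hours to take root? A young female data scientist who took part in the training…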

2026-06-02 06:45 JSTITmedia AI+ハードウェア/半導体

Unified management of edge AI with NVIDIA's "NemoClaw": Advantech announces "WEDA"

At its partner event "2026 Advantech World Partner Conference (WPC)", Advantech presented "WEDA", a solution for unified management of edge AI from development through deployment and operation.

2026-06-02 06:35 JSTTechCrunch AIエージェントハードウェア/半導体

Nvidia chases $200B CPU market with AI agent PCs from Microsoft, Dell, and HP

If Nvidia has cracked a way to bring AI agents easily, safely, and usefully to the masses, it could — and should — be big.

2026-06-02 05:03 JSTTechCrunch AILLM/生成AI規制/政策

Florida sues OpenAI, Sam Altman, in first-of-its-kind lawsuit over violent incidents

The lawsuit partially revolves around a shooting at Florida State University last year, and ChatGPT's alleged role in the incident.

2026-06-02 03:19 JSTTechCrunch AIビジネス/資金調達

Water access is now a risk factor in SpaceX’s IPO

The company says it needs "significant" water resources to cool its data centers, and that access to abundant, affordable water is a challe…

2026-06-02 02:50 JSTITmedia AI+その他

AI concierge comes to "Rakuten Super SALE": conversational product search, plus "kaimawari" (multi-shop buying) strategy tips

Users describe their intended use and budget by text or voice, and the concierge finds matching products among the items on sale.

2026-06-02 02:27 JSTITmedia AI+LLM/生成AIビジネス/資金調達

Anthropic prepares to go public; latest valuation is about 154 trillion yen

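Anthropic has confidentially submitted a draft of its "S-1" registration statement to the SEC ahead of an IPO. Its valuation in the most recent Series H funding round reached about $965 billion (about 154 trillion yen).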

2026-06-02 02:07 JSTITmedia AI+LLM/生成AI

Claude rate limits get an "apology reset" for Pro and Max plans; some users saw "usage consumed faster than expected"

U.S.-based Anthropic announced that it has reset the 5-hour and weekly rate limits for users of the paid "Pro" and "Max" plans of its chat AI "Claude".

2026-06-02 02:00 JSTOpenAIその他

Our views on AI policy and political advocacy

Our approach to AI policy and political advocacy, transparency, support for thoughtful regulation and AI safety, and that no outside politi…

2026-06-02 01:36 JSTTechCrunch AILLM/生成AI

Anthropic files to go public

Anthropic, now an AI powerhouse that has landed top-tier enterprise customers, was once considered an underdog in the emerging world of lar…

2026-06-02 01:00 JSTTechCrunch AIその他

This AI weather startup is out-forecasting government agencies

WindBorne benefits from its unique combination of model-building and data collection. The company now has about 400 balloons in flight gath…

2026-06-01 (415 items)

2026-06-01 23:49 JSTTechCrunch AIその他

DuckDuckGo makes its ‘no-AI’ search engine easier to access as its traffic booms

Alternative search engine DuckDuckGo launches 'no AI' web extensions for Chrome and Firefox users.

2026-06-01 23:27 JSTITmedia AI+その他

Isn't "FDE" ultimately just a rebranding of the client-site resident systems engineer? We asked Accenture

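Is "FDE" (forward deployed engineer), the new profession championed by AI platform companies, just a rehash of the client-site resident systems engineer? We put the question to Gakuse Hoshina (保科学世) and Toshiyuki Kataoka (片岡俊行) of Accenture, which has launched an FDE organization jointly with Microsoft and is also promoting its own "RDE".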

2026-06-01 21:00 JSTOpenAILLM/生成AI

Building the infrastructure for the Intelligence Age in Michigan

OpenAI breaks ground on a 1GW data center project in Michigan as part of Stargate, building AI infrastructure to expand access, create jobs…

2026-06-01 19:00 JSTOpenAILLM/生成AIエージェント

OpenAI frontier models and Codex are now available on AWS

OpenAI frontier models and Codex are now generally available on AWS, giving enterprises a new path to build with OpenAI through the AWS env…

2026-06-01 16:00 JSTITmedia AI+エージェント

What is Salesforce's "far-sighted strategy"? Open systems in the AI agent era and where the battle for leadership is heading

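Salesforce has launched a new solution aimed at enterprise business systems in which many AI agents operate. A closer look at the aims of that solution suggests a far-sighted strategy on the company's part.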

2026-06-01 13:00 JSTITmedia AI+その他

The Chiba Bank Group IT company that made VB.NET migration "blazingly fast with AI": how did it go from "12.5 person-months to 2.0 person-months"?

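Chibagin Computer Service built an AI-driven development framework and cut the migration effort for an existing VB.NET system from 12.5 person-months to 2.0 person-months. How did it do it?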

2026-06-01 13:00 JSTarXiv cs.AI画像/動画生成

PhyDrawGen: Physically Grounded Diagram Generation from Natural Language

Generating physics diagrams from text requires strict adherence to physical laws. While current generative models produce visually plausibl…

2026-06-01 13:00 JSTarXiv cs.AI研究/論文

Physically Viable World Models: A Case for Query-Conditioned Embodied AI

World models for embodied AI must be physically viable: constructed to answer intervention queries by representing the physical structure g…

2026-06-01 13:00 JSTarXiv cs.AI研究/論文

Transforming and Encoding FTS for SAT Solving: What Helps, What Hurts (Extended Version)

Factored tasks are a classical planning representation that extends SAS+ with limited forms of disjunctive preconditions, conditional effec…

2026-06-01 13:00 JSTarXiv cs.AI研究/論文

Procedural Generation of First Person Shooter Maps using Map-Elites

We investigate the application of MAP-Elites (a well-known quality diversity algorithm) to design levels for First-Person Shooter (FPS) gam…

2026-06-01 13:00 JSTarXiv cs.AIエージェント

Uncertainty-Aware and Temporally Regulated Expert Advice in Reinforcement Learning for Autonomous Driving

Exploration in reinforcement learning for autonomous driving is inherently unsafe: agents must experience novel behaviors to learn, yet exp…

2026-06-01 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Harness Updating Is Not Harness Benefit: Disentangling Evolution Capabilities in Self-Evolving LLM Agents

LLM agents are increasingly deployed as systems built around editable external harnesses, including prompts, skills, memories and tools, th…

2026-06-01 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

EHRBench: An Automated and Reliable EHR-based Benchmark for Clinical Decision Making with LLMs

Clinical decision-making (CDM) is central to real-world clinical workflows, where clinicians infer diagnoses, select treatments, or anticip…

2026-06-01 13:00 JSTarXiv cs.AIエージェント

Structure-Induced Information for Rerooting Levin Tree Search

Subgoal-based policy tree search, which uses a policy to guide search, is effective for complex single-agent deterministic problems but oft…

2026-06-01 13:00 JSTarXiv cs.AI研究/論文

Healthcare Mechanisms from Policy-as-Code Search under Strategic Provider Response

Healthcare mechanisms are inseparable from the strategic provider response they induce: existing healthcare AI benchmarks hold this respons…

2026-06-01 13:00 JSTarXiv cs.AIエージェント

MAVEN: Improving Generalization in Agentic Tool Calling

Generalization across agentic tool-calling environments remains a central challenge for reliable agentic reasoning systems. Although large…

2026-06-01 13:00 JSTarXiv cs.AI研究/論文

Generating Graph-like Rules for Knowledge Graph Reasoning via Diffusion Models

Logical rules constitute a cornerstone of knowledge graph (KG) reasoning, valued for their interpretability and ability to model relational…

2026-06-01 13:00 JSTarXiv cs.AILLM/生成AIエージェント研究/論文

Learning Agent-Compatible Context Management for Long-Horizon Tasks

LLM agents increasingly face long-horizon tasks such as web search and deep research in real-world applications, where accumulated context…

2026-06-01 13:00 JSTarXiv cs.AILLM/生成AI

PReMISE: Policy Rubrics as Measurement Specifications for LLM Judges

LLM judges are increasingly used to evaluate open-ended responses, but their scores depend strongly on the rubrics that condition them. A v…

2026-06-01 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

Planner-Centric Reinforcement Learning for Deep Research with Structure-Aware Reward

Deep research tasks require LLMs to plan what to investigate, retrieve evidence, and synthesize long-form answers across multiple branches…

2026-06-01 13:00 JSTarXiv cs.AI研究/論文

SLAT: Segment-Level Adaptive Trimming for Efficient CoT Reasoning

Recent advances in Large Reasoning Models have significantly improved chain-of-thought (CoT) capabilities via reinforcement learning (RL).…

2026-06-01 13:00 JSTarXiv cs.AILLM/生成AIエージェント

COMPASS: Cognitive MCTS-Guided Process Alignment for Safe Search Agents

LLM-powered search agents enable multi-step reasoning and tool use. However, these capabilities introduce retrieval-induced safety degradat…

2026-06-01 13:00 JSTarXiv cs.AILLM/生成AI

Distilling LLM Feedback for Lean Theorem Proving

Post-training for reasoning models typically combines supervised fine-tuning with reinforcement learning from verifiable rewards, most comm…

2026-06-01 13:00 JSTarXiv cs.AILLM/生成AI

UniScale: Adaptive Unified Inference Scaling via Online Joint Optimization of Model Routing and Test-Time Scaling

In real-world deployments of large language models (LLMs), balancing inference quality and computational cost has become a central challeng…

2026-06-01 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

BilliardPhys-Bench: Benchmarking Physical Reasoning and Visual Dynamics of Multimodal LLMs

Current multimodal models handle static image recognition well, but intuitive physical reasoning remains a weakness. Predicting how objects…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Business/Funding · Research/Papers

A Persona-Based Evaluation Framework for Pluralistic Alignment in Generative AI

Current alignment paradigms for generative artificial intelligence rely predominantly on monolithic benchmarking frameworks that reduce the…

2026-06-01 13:00 JST · arXiv cs.AI · Agents

HADT: A Heterogeneous Multi-Agent Differential Transformer for Autonomous Earth Observation Satellite Cluster

This work addresses the problem of autonomous resource management in a heterogeneous satellite cluster conducting Earth Observation (EO) miss…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

GraphARC: A Comprehensive Benchmark for Graph-Based Abstract Reasoning

Relational reasoning lies at the heart of intelligence, but existing benchmarks are typically confined to formats such as grids or text. We…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Vector Linking via Cross-Model Local Isometric Consistency

We study Vector Linking: given two embedding clouds produced by different black-box encoders over partially overlapping datasets, recover c…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Hardware/Semiconductors

LLM-FACETS: A Privacy-Preserving Framework for Evaluating LLM Transparency and Accountability

Assessing whether Large Language Model outputs are factually grounded, epistemically calibrated, and methodologically reproducible is a pr…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Formalizing and falsifying causal pathways of rare events

Building on recent formalizations of root cause analysis for rare events ("outliers") in structural equation models, we propose a formal…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents

COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation

LLM agents are increasingly expected not only to complete isolated tasks, but also to carry bounded representations of human expertise, jud…

2026-06-01 13:00 JST · arXiv cs.AI · Agents · Business/Funding

Industrializing Prediction-Powered Inference: The GLIDE Library for Reliable GenAI and Agentic Systems Evaluation

Reliable evaluation of agentic systems requires unbiased estimates with valid uncertainty, but standard practice navigates between costly h…

2026-06-01 13:00 JST · arXiv cs.AI · Agents · Business/Funding · Research/Papers

TraceGraph: Shared Decision Landscapes for Diagnosing and Improving Agent Trajectories

Agent benchmarks increasingly record rich interaction trajectories, yet evaluation often reduces each rollout to a pass rate or reward scor…

2026-06-01 13:00 JST · arXiv cs.AI · Agents

Diagnosing Failure Modes of Shared-State Collaboration in Resource-Constrained Visual Agents

Modular visual reasoning systems increasingly rely on shared working memory for multi-step collaboration, yet the failure dynamics of inter…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents

Learning to Adapt: Self-Improving Web Agent via Cognitive-Aware Exploration

Recent advances in Multimodal Large Language Models (MLLMs) have led to promising progress in web agents. However, existing web agents ofte…

2026-06-01 13:00 JST · arXiv cs.AI · Agents

HypoAgent: An Agentic Framework for Interactive Abductive Hypothesis Generation over Knowledge Graphs

Abductive reasoning over knowledge graphs aims to generate logical hypotheses that explain observed entities or facts. Existing controllabl…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

FAM-Bench: A Multimodal Benchmark for Condition-Aware Food-as-Medicine Reasoning

Food-as-Medicine requires models to reason beyond what a dish is or what nutrition it contains: they must decide whether a concrete food ch…

2026-06-01 13:00 JST · arXiv cs.AI · Agents

Answer-Set-Programming-based Abstractions for Reinforcement Learning

Reinforcement Learning (RL) enables autonomous agents to learn policies from experience, but realistic problems often involve enormous stat…

2026-06-01 13:00 JST · arXiv cs.AI · Agents · Research/Papers

AutoSci: A Memory-Centric Agentic System for the Full Scientific Research Lifecycle

Scientific research has traditionally been human-intensive, requiring researchers to coordinate literature, ideas, experiments, manuscripts…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

LinTree: Improving LLM Reasoning with Explicitly Structured Search Histories

Large language models (LLMs) often solve reasoning problems by generating intermediate traces that explore and revise partial solutions. Fr…

2026-06-01 13:00 JST · arXiv cs.AI · Agents

Choosing the Lens: Strategic Perspective Activation in Context-Dependent Argumentation

The same arguments often need to be evaluated under different external regimes. An agent with influence over the regime has a strategic lev…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

TRINE: A Token-Aware, Runtime-Adaptive FPGA Inference Engine for Multimodal AI

Multimodal stacks that mix ViTs, CNNs, GNNs, and transformer NLP strain embedded platforms because their compute/memory patterns diverge an…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

When LLM Reward Design Fails: Diagnostic-Driven Refinement for Sparse Structured RL

For sparse, structured reinforcement-learning tasks with semantic reward-function interfaces, LLM-generated reward shaping is better framed…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Gradient-Free Training of Spiking Neural Networks via Low-Rank Evolution Strategies

Spiking Neural Networks (SNNs) offer compelling energy efficiency on neuromorphic hardware, yet their training remains challenging because…

2026-06-01 13:00 JST · arXiv cs.AI · Image/Video Generation

XOResNet: Exclusive-OR Meta-Residuals Facilitate Deep Spiking Neural Networks Learning

Spiking neural networks (SNNs) hold promise for demonstrating superior learning and representation capabilities in deep models. Given the t…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Enhancing Regime Shift Detection Using Unstructured Data: A Study on the Treasury Market

Regime shifts in financial markets reorganise the joint dynamics of asset prices and macro variables, breaking any single-regime calibratio…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Hamiltonian-Inspired Attention Mechanism for Scalable RF Transmitter Fingerprinting

Radio-frequency (RF) fingerprinting identifies wireless transmitters using hardware-induced imperfections present in baseband I/Q signals.…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

Mental Damage: Caption Poisoning Attacks on Retrieval-Augmented Text-to-Music Generation

Retrieval-augmented text-to-music (TTM) systems augment underspecified user prompts using captions retrieved from a music caption dataset.…

2026-06-01 13:00 JST · arXiv cs.AI · Robotics · Business/Funding

Reinterpreting Safety Thresholds as Neuron Spiking Thresholds

Surrogate Safety Measures (SSMs) are extensively utilised in the evaluation of traffic risk in automated driving contexts. However, the maj…

2026-06-01 13:00 JST · arXiv cs.AI · Image/Video Generation

Updating the standard neuron model in artificial neural networks

From their inception in the 1950s, artificial neural networks (ANNs) started using the so-called point neuron model then prevalent in neuro…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Evolutionary Algorithm for Reservoir Learning and Yielding

Reservoir computing, a type of recurrent neural network, is a promising approach for temporal learning as it separates dynamic processing f…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Full-field prediction for engineering-scale three-dimensional aircraft with multigrid-hierarchical learning

High-fidelity computational fluid dynamics is essential for aerospace design, but engineering-scale simulations of practical three-dimensio…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Unicorn: Scaling High-Dimensional Time Series Forecasting via Universal Correlation Modeling

Modern time series architectures face a fundamental trade-off: channel-independent models scale well with increasing data volume but ignore…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Hardware/Semiconductors

When LLMs Learn to Be Consistently Wrong: A Multi-Model Study of Linear Representations of Synthetic Deception

Deceptive alignment, in which models maintain accurate internal representations while deliberately producing false outputs, remains a centr…

2026-06-01 13:00 JST · arXiv cs.AI · Robotics

Structured interactions improve distributed coordination beyond model scaling in a real-world multi-robot system

Scaling individual robot capabilities is common but costly. Here we investigate a system-level design question in real-world multi-robot co…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

LLMs Without Deep Neural Networks: New Architecture, Benefits and Case Study

The purpose of this article is to provide validation to my deep neural network alternative in the context of LLMs. Very recently, there has…

2026-06-01 13:00 JST · arXiv cs.AI · Image/Video Generation

Functional MRI Time Series Generation via Wavelet-Based Image Transform and Spectral Flow Matching for Brain Disorder Identification

Functional Magnetic Resonance Imaging (fMRI) provides non-invasive access to dynamic brain activity by measuring blood oxygen level-depende…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

Social Reasoning in Machines: Investigating Collective Truth-Seeking Dynamics in Large Language Model Debate

Human reasoning has long been theorised to operate socially, not through isolated individual cognition, but through collective adversarial…

2026-06-01 13:00 JST · arXiv cs.AI · Business/Funding · Research/Papers

NumLeak: Public Numeric Benchmarks as Latent Labels in Foundation Models

Public numeric benchmarks appear in pretraining, so an evaluation that conditions on a date may be measuring memorized recall rather than o…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Research/Papers

CodeGolf Bench: A Multi-Language Benchmark for Evaluating Concise Code Generation Capabilities of Large Language Models

This paper introduces Code Bench, a benchmark capable of evaluating Large Language Models' (LLMs) concise code generation abilities in 60 pr…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

AI Loss of Control Incident Management: Response & Resilience

Recent research demonstrating AI systems exhibiting deception and shutdown resistance suggests that AI loss of control (LOC) is an urgent p…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents

Exploring Autonomous Agentic Data Engineering for Model Specialization

Large Language Models (LLMs) have demonstrated strong performance on general tasks, while often struggling to adapt to specialized domains…

2026-06-01 13:00 JST · arXiv cs.AI · Image/Video Generation

SANA-Streaming: Real-time Streaming Video Editing with Hybrid Diffusion Transformer

Real-time streaming video-to-video editing (V2V) is critical for interactive applications such as live broadcasting and gaming, yet it rema…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

Domain Adaptation and Reasoning Frameworks in Language Models: A Controlled Experiment with Historical Cosmology

We investigate how domain adaptation reshapes explanatory behavior in language models using historical cosmology as a controlled setting. I…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents · Research/Papers

LongDS-Bench: On the Failure of Long-Horizon Agentic Data Analysis

Real-world data analysis is inherently iterative, yet existing benchmarks mostly evaluate isolated or short interactive tasks, leaving agen…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Calibrated Preference Learning: The Case of Label Ranking

Calibration, the alignment of predicted probabilities with true outcome frequencies, is essential for reliable decision-making. While exten…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

A Unified Framework for Gradient Aggregation in Multi-Objective Optimization

Many machine learning problems involve multiple inherent trade-offs that are best addressed by gradient-based multi-objective optimization…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents

The Surface You Test Is Not the Surface That Breaks

Tool-augmented LLM agents are vulnerable to prompt injection: a third party who controls part of the agent's context can plant instructions…

2026-06-01 13:00 JST · arXiv cs.AI · Agents

Scalable Constrained Multi-Agent Reinforcement Learning via State Augmentation and Consensus for Separable Dynamics

We present a distributed approach for constrained Multi-Agent Reinforcement Learning (MARL) that combines state-augmented policy learning w…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

idSCD: Identifying Training Datasets through Semantic Correlation Descriptors

Can a dataset be recognized from the spurious correlations it induces during training? We argue that datasets leave dataset-specific traces…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Graph-Conditioned Mixture of Graph Neural Network Experts for Traffic Forecasting

Spatio-temporal forecasting on sensor graphs is commonly tackled with a single backbone architecture applied uniformly across all nodes, al…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Improved Distribution Estimation in $\ell_\infty$

We present improved bounds for estimating discrete probability distributions under the $\ell_\infty$ norm. These include minimax bounds in…

2026-06-01 13:00 JST · arXiv cs.AI · Image/Video Generation

A Novel Global Context-aware Deep Neural Network for Enhanced Brain Tumor Segmentation using Magnetic Resonance Images

Brain cancer's severity necessitates precise brain tumor segmentation, which is crucial for effective brain tumor diagnosis. Manual identif…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

Revisiting Padded Transformer Expressivity: Which Architectural Choices Matter and Which Don't

Recent work describes what transformers can and cannot compute through connections to boolean circuits, but existing results lack exact cha…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

Generalistic or Specific Embeddings, Which is Better? An Empirical Study on Search for Clinical Coding in Non-English Languages

Sentence-embedding models for semantic search are overwhelmingly developed and evaluated on English corpora. When applied to clinical retri…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Image/Video Generation

Seeing Isn't Knowing: Do VLMs Know When Not to Answer Spatial Questions (and Why)?

Spatial reasoning is a fundamental capability for vision-language models (VLMs) deployed in real-world environments. However, visual observ…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Image/Video Generation

VLM3: Vision Language Models Are Native 3D Learners

Vision Language Models (VLMs) enable a unified model to solve various vision tasks through prompting. They have shown promising performance…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents · Robotics

Memory-Bound but Not Bandwidth-Limited: The Physical AI Inference Gap in Batch-1 LLM Decode

Physical AI systems, including robots, autonomous vehicles, embodied agents and edge copilots, often run a different inference workload fro…

2026-06-01 13:00 JST · arXiv cs.AI · Image/Video Generation · Robotics

Prior Availability in Industrial Visual Sim-to-Real: A Review of CAD-Guided and CAD-Unavailable Regimes

Industrial visual sim-to-real is often described as transferring from synthetic images to real images, but industrial deployment usually in…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Benchmarking Machine Learning Uncertainty Quantification Methodologies for Predicting Turbine Gas Temperature Degradation

Effective prognostics and health management of modern engines relies on accurate turbine gas temperature predictions and robust uncertainty…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

ImmigrationQA: A Source-Grounded Dataset and Small-Model Adaptation for U.S. Immigration Law

U.S. immigration law spans thousands of pages of official policy, federal regulations, and procedural guidance that change frequently and c…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents · Business/Funding

Counterfactual Evaluation Reveals Hidden Capability Profiles in Clinical LLMs and Agents

Two clinical AI systems can score nearly identically on coverage-based rubrics yet behave radically differently when their patient inputs c…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Scientific Machine Learning for Engine Health Management and Remaining Useful Life Prediction

Engine Health Management (EHM) depends on reliable forecasting of Remaining Useful Life (RUL) and on tracking thermal indicators such as tu…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents

An Organization-Scoped LLM Agent Runtime Architecture for Regulated Cybersecurity Operations

Regulated cybersecurity workflows lack a runtime substrate that enforces organization-level scope across retrieval, tool calls, memory, fin…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Image/Video Generation · Agents · Research/Papers

Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs

Scientific figures are among the most effective means of communicating complex research ideas, yet producing publication-quality illustrati…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Reward Learning from Best-of-$N$ Preference Data: Targets, Tradeoffs, and Design Principles

Best-of-$N$ sampling is widely used to construct pairwise preference data: $N$ candidates are drawn from a base distribution, and the best…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Active Timepoint Selection for Learning Measure-Valued Trajectories

Inferring continuous probability paths from sparse snapshots is a fundamental challenge in domains like single-cell biology, where high-fid…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

The Architecture of Errors: From Universal Impossibility to Patch-Local LLM Reliability

Universal LLM reliability is not a finite-library problem: across all possible tasks, tools, schemas, knowledge sources, and evaluator expe…

2026-06-01 13:00 JST · arXiv cs.AI · Image/Video Generation

Controllable Lung Nodule Synthesis via Histogram-Regularized Latent Diffusion Models

While automated diagnosis systems have achieved remarkable success in computed tomography (CT)-based lung cancer screening, their developme…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Rationalize: Shared Semantic Reasoning for Human-AI Alignment

We introduce Rationalize, a role-pair framework for shared semantic reasoning between humans and AI models in data-driven sensemaking. Buil…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Score Broadcast and Decorrelation: A General Framework for Broadcast-Based Credit Assignment

We introduce Score Broadcast and Decorrelation (SBD), a principled framework for broadcast-based credit assignment for general families of…

2026-06-01 13:00 JST · arXiv cs.AI · Image/Video Generation · Agents · Robotics · Research/Papers

PInVerify: An Offline Embodied Benchmark for Active Instance Verification

Embodied agents have made strong progress in navigating to target objects, but reaching the goal vicinity does not guarantee that the agent…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

COFT: Counterfactual-Conformal Decoding for Fair Chain-of-Thought Reasoning in Large Language Models

Large language models (LLMs) can reveal and amplify societal biases during chain-of-thought (CoT) generation. We present COFT (Chain of Fai…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

Same Patient, Different Words, Different Diagnosis? Evaluating Semantic Stability in Clinical LLMs

Large Language Models (LLMs) are increasingly used in clinical applications. However, their behavior remains highly sensitive to subtle lin…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

LARK: Learnability-Grounded Trajectory Selection for Efficient Reasoning Distillation

We study trajectory selection for reasoning distillation, where teacher-generated reasoning trajectories are selectively used as supervisio…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

EUDAIMONIA: Evaluating Undesirable Dynamics in AI

Large language models (LLMs) are increasingly used as conversational partners for companionship, emotional disclosure, and interpersonal ad…

2026-06-01 13:00 JST · arXiv cs.AI · Agents

Automatically Attacking Software Reverse Engineering AI Agents

Software tools for reverse engineering executable binary files, such as Ghidra, enable malware analysts to safely conduct robust static ana…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

CobSeg: Coherence Boundary Modeling for Dialogue Topic Segmentation

Dialogue topic segmentation is critical in many human-AI collaborative applications which requires identifying heterogeneous boundary cues,…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

Human-Alignment, Calibration, and Activation Patterns in Large Language Model Uncertainty

Uncertainty Quantification is a large and growing subfield of large language model behavioral analysis. Primarily to recognize and combat h…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents

Investigating Detection and Obfuscation of Prompt Injection Attacks Against Software Reverse Engineering AI Agents

Agentic software reverse engineering systems are vulnerable to prompt injection attacks placed into the source code of executable binary fi…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

How Early Adopters Used Generative AI Worldwide: Variation by Country Income and Language

AI is being used by people globally, but not everyone is using it in the same ways. Using a large-scale dataset of anonymized, de-identifie…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents

Depth-Dependent Indirect Prompt Injection in Tool-Calling ReAct Agents: Injection Depth, Payload Framing, and Turn-Budget Sensitivity

ReAct agents that interleave chain-of-thought reasoning with tool calls are increasingly deployed for real tasks such as scheduling, file r…

2026-06-01 13:00 JST · arXiv cs.AI · Image/Video Generation

ConTrans: Learning Text-enhanced Local-global Temporal Representations for Zero-shot Temporal Action Localization

Zero-shot Temporal Action Localization (ZS-TAL) aims to detect and locate previously unseen actions in untrimmed videos. However, existing…

2026-06-01 13:00 JST · arXiv cs.AI · Image/Video Generation · Agents

Seeing Before Agreeing: Aligning Multi-Agent Consensus with Visual Evidence

Vision-language models (VLMs) have achieved strong performance on visual question answering (VQA). To mitigate individual hallucinations an…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents

SAGE: A Novelty Gate for Efficient Memory Evolution in Agentic LLMs

Agentic LLMs must continuously decide whether newly extracted facts should be added, merged with existing memories, or ignored, yet prior w…

2026-06-01 13:00 JST · arXiv cs.AI · Image/Video Generation

Simple Token-Efficient Vision-Language Model for Case-level Pathology Synoptic Report Generation

Generating clinically useful pathology reports for pathology cases from whole-slide images (WSIs) is challenging due to gigapixel resolutio…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

When are LLMs Sufficient Policy Optimizers for Sequential RL Tasks?

We study when large language models (LLMs) can serve as effective black-box policy optimizers for reinforcement learning (RL) tasks, i.e.,…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Kalimati Vegetable Price Index Forecasting with a Momentum Corrected Online Stacking Ensemble

Forecasting agricultural commodity prices in emerging economies is difficult due to high volatility, frequent supply disruptions, and stron…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Business/Funding

OrcaRouter: A Production-Oriented LLM Router with Hybrid Offline-Online Learning

The rapid development of large language models, each with distinct capabilities and inference costs, raises a practical deployment question…

2026-06-01 13:00 JST · arXiv cs.AI · Robotics

GSAM: A Generalizable and Safe Robotic Framework for Articulated Object Manipulation

Articulated object manipulation is a unique challenge for service robots. Existing methods employ end-to-end policy learning, visionmotion…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Chatterbox-Flash: Prior-Calibrated Block Diffusion for Streaming Zero-Shot TTS

We present Chatterbox-Flash, a zero-shot text-to-speech model obtained by fine-tuning a pretrained autoregressive TTS decoder into a block-…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Research/Papers

XLGoBench: Detecting cross-lingual skill gaps with algorithmic tasks

We introduce a set of synthetic algorithmic tasks to detect cross-lingual gaps in the abilities of large language models. Our benchmark is…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

Smaller Models are Natural Explorers for Policy-Level Diversity in GRPO

We identify a new dimension for enhancing rollout diversity in Group Relative Policy Optimization (GRPO) for LLMs. While GRPO relies on div…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

On the impact of retrieved content representations in RAG Pipelines

Retrieval-Augmented Generation (RAG) supplements a language model's input with retrieved documents, yet most RAG pipelines inherit retrieva…

2026-06-01 13:00 JST · arXiv cs.AI · Business/Funding

OpenSTBench: Beyond Semantic Evaluation for Speech Translation

Speech translation systems increasingly span speech-to-text translation (S2TT), speech-to-speech translation (S2ST), offline translation, a…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Image/Video Generation · Research/Papers

MechVQA: Benchmarking and Enhancing Multimodal LLMs on Comprehensive Mechanical Drawing Understanding

Multimodal Large Language Models (MLLMs) have demonstrated significant achievements in general visual question answering (VQA) tasks. Howev…

2026-06-01 13:00 JST · arXiv cs.AI · Agents · Business/Funding

Design and Evaluation of Multi-Agent AI Oracle Systems for Prediction Market Resolution

Prediction markets aggregate collective intelligence to forecast uncertain events, but their utility depends on reliable outcome resolution…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Hardware/Semiconductors

Differentially Private Preference Data Synthesis for Large Language Model Alignment

Preference alignment is a crucial post-training step for large language models (LLMs) to ensure their outputs align with human values. Howe…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

GaMi: Geometry-Agnostic Material Identification via Cross-Modal Subtractive Disentanglement

Non-contact material identification enables adaptive interaction for embodied intelligence yet faces challenges from geometry-induced varia…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Unlearning in Diffusion Models: A Unified Framework with KL Divergence and Likelihood Constraints

Unlearning in diffusion models aims to remove undesirable data or concepts while preserving the utility of pretrained models -- two fundame…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

Beyond Agreement: Scoring Panel-Surfaced Biomedical Entity Candidates for Curator Triage

Biomedical NER is deceptively simple for modern LLMs: plausible biomedical mentions are easy to surface, but corpus-convention correctness…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

Your Teacher Can't Help You Here: Combating Supervision Fidelity Decay in On-Policy Distillation

On-policy distillation transfers reasoning capabilities by training a student model on its own generated trajectories using token-level fee…

2026-06-01 13:00 JST · arXiv cs.AI · Robotics

Hide-and-Seek in Trajectories: Discovering Failure Signals for VLA Runtime Monitoring

Vision-Language-Action (VLA) models enable robots to follow natural language instructions and generalize across diverse tasks, but they rem…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Hardware/Semiconductors

Fine-Tuning Improves Information Conveyance in Language Models

Fine-tuning is often believed to reduce uncertainty and diversity in large language models, but existing analyses overlook output length, a…

2026-06-01 13:00 JST · arXiv cs.AI · Agents

Safe Equilibrium Policy Optimization for Strategic Agent Policies

Language models fine-tuned with reinforcement learning typically optimize for task reward, ignoring multi-agent strategic structure. Becaus…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

DARTS: Distribution-Aware Active Rollout Trajectory Shaping for Accelerating LLM Reinforcement Learning

Reinforcement Learning (RL) has become pivotal for improving model capabilities yet suffers from rollout efficiency bottlenecks due to the…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents

Sophrosyne: Agentic Exploration of Relational Data Systems Needs Moderation

Text2SQL agents powered by LLMs translate natural language intent into SQL by exploring the data system through tool calls before formulati…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

Federated Variational Preference Alignment with Gumbel-Softmax Prior for Personalized User Preferences

Federated Learning (FL) offers a privacy-preserving pathway for aligning Large Language Models (LLMs); however, existing frameworks typical…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents

PatchWorld: Gradient-Free Optimization of Executable World Models

Text-agent environments are typically modeled as partially observable Markov decision processes (POMDPs), assuming that the simulator's lat…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

A Unified and Reproducible Experimentation Framework for Speech Understanding

Speech foundation models and Speech LLMs have advanced speech understanding, yet deployment-oriented model selection is hindered by non-com…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

Inverse Reinforcement Learning without an Optimal Demonstrator: A Feasible Reward Set Approach

Inverse reinforcement learning (IRL) typically assumes demonstrations from a single optimal demonstrator, but in many applications data com…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents · Research/Papers

BlueFin: Benchmarking LLM Agents on Financial Spreadsheets

We present BlueFin, a benchmark that tasks large language model (LLM) agents with synthesis, manipulation, and comprehension tasks over spr…

2026-06-01 13:00 JST · arXiv cs.AI · Image/Video Generation

What Makes LVLMs Hallucinate Less? Unveiling the Architectural Factors Behind Hallucination Robustness

Hallucination remains one of the key challenges undermining the reliability of Large Vision-Language Models (LVLMs). But what makes an LVLM…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

Toxic HallucinAItions: Perturbing Prompts and Tracing LLM Circuits

Large language models (LLMs) are increasingly deployed in conversational settings where user tone ranges from polite to adversarial or toxi…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Business/Funding

De-attribute to Forget for LLM Unlearning

The rapid development of large language models (LLMs) has raised concerns on the use of inappropriate data for training, which has led to a…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

TUX: Measuring Human-AI Tacit Understanding

As large language models (LLMs) increasingly act as collaborative partners, human-AI alignment is often evaluated through explicit task su…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

Do Large Language Models Encode Institutional Experience? Evidence from Cross-Linguistic Moral Reasoning Under Ambiguity

Large language models (LLMs) exhibit systematic differences in moral reasoning across languages, yet the source of this variation remains u…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

AMix-2: Establishing Protein as a Native Modality in Large Language Models

We present AMix-2, a protein-text foundation model that establishes protein as a native modality in large language models (LLMs), unifying…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI

ImmersiveTTS: Environment-Aware Text-to-Speech with Multimodal Diffusion Transformer and Domain-Specific Representation Alignment

Recent advancements in text-guided audio generation have yielded promising results in diverse domains, including sound effects, speech, and…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Research/Papers

Reading Between the Citations: A Typed Claim Network for Scientific Literature

Knowledge graphs over corpora of inter-referencing documents - scholarly papers, legal opinions, policy briefs - encode the topology of ref…

2026-06-01 13:00 JST · arXiv cs.AI · Image/Video Generation

Variational Adapter for Cross-modal Similarity Representation

The core of vision-language models lies in measuring cross-modal similarity within a unified representation space. However, most image-text…

2026-06-01 13:00 JST · arXiv cs.AI · LLM/Generative AI · Image/Video Generation

Generating Reports or Repeating Templates? Measuring and Mitigating Template Collapse in 3D CT Report Generation

Modern 3D medical vision-language models (VLMs) can generate fluent radiology-style text while exhibiting critically low pathology detection a…

2026-06-01 13:00 JST · arXiv cs.AI · Research/Papers

DEM: A Distilled Explanation Model for Interpretable Anomaly Detection in Physiological Sensor Networks

Anomaly detection in physiological sensor data from Wireless Body Area Networks (WBANs) can be caused by sensor faults, network disruptions…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Annealed Softmax Greedy in Many-Armed Bayesian Bandits

Reinforcement learning with verifiable rewards (RLVR) and group-based policy optimization methods such as GRPO update a stochastic policy b…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents

Does Visual Information Play a Decisive Role in Vision-Language-Action Model Driving Behavior?

Vision-Language-Action (VLA) models have demonstrated promising capability in autonomous driving, highlighting the potential of unified mul…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors

LLM agents are evolving from conversational chatbots to operational tools in real-world workspaces. In local agentic harnesses, an LLM can…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Routing on the Stiefel Manifold: When Does Adaptive Subspace Selection Help for Cross-Domain EEG Decoding?

Cross-domain EEG decoding remains challenging despite advances in Riemannian deep learning: covariance matrices from different subjects occ…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Learning to Solve and Optimize by Evolving Code

Combinatorial and optimization problems are fundamental to many industrial AI applications. Solving large-scale real-world instances of suc…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Linear Ordering Problem: Time for a Change

The Linear Ordering Problem (LOP) is a fundamental combinatorial optimization problem with important applications in areas such as economic…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

AnchorSteer: Self-Discovered Concept Injection for Structure-Preserving Music Editing

Controllable music editing aims to modify high-level attributes while strictly preserving rhythmic and melodic structures. However, this task…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

STEP: Learning STructured Embeddings for Progressive Time Series

We present a novel method for learning interpretable representations of progressive time series, that is, data capturing irreversible state…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Fighting Numerical Hallucinations via Data-centric Compilation for Online Financial QA

Large Language Models (LLMs) have significantly advanced online data services, particularly in the domain of financial question answering (…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

DRIFT: Joint Channel Estimation and Prediction Towards Pilotless 6G Non-Terrestrial Networks

Non-terrestrial networks (NTNs) are expected to play a pivotal role in sixth-generation (6G) systems by enabling ubiquitous connectivity an…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

A Pilot Study on Curator-Guided Multilingual Art Description for Blind and Low-Vision Audiences with Small Vision-Language Models

Blind and low-vision (BLV) audiences remain underserved by visual art descriptions, particularly across languages and in museum settings wh…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation

On Revisiting Entropy for Identifying Mislabeled Images

Mislabeled samples in training datasets severely degrade the performance of deep networks, as overparameterized models tend to memorize err…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation, Business/Funding

Redefining Instance Matching: A Unified Framework for Part-Aware Matching in Panoptic Segmentation Evaluation

The Panoptic Quality (PQ) metric is the standard for jointly evaluating instance and semantic segmentation. However, its original definitio…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

SpecDB: LLM-Generated Customized Databases via Feature-Oriented Decomposition

Mainstream relational databases ship a uniform feature set across deployments, although individual workloads exercise only a fraction of th…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

KnowledgeGain: Evaluating and Optimizing Science News Generation for Reader Learning

Science news is an important medium to communicate discoveries between the research communities and the public. Yet, most metrics for gener…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

SWIM: Single-Instance Whole-Body Imitation for swiMming

We propose a new method for synthesizing physically-based swimming motions. Physically-based character animation aims to generate physicall…

2026-06-01 13:00 JST | arXiv cs.AI | Robotics

TARIC: Memory-Augmented Traversability-Aware Outdoor VLN under Interrupted Semantic Cues

Outdoor vision-language navigation (VLN) in long-range, open-world environments is frequently disrupted by semantic-cue interruptions, wher…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Not All Synthetic Data Is Yours to Learn From

Can a language model improve from plain text sampled from itself, with no prompts, no teacher, no verifier, and no reward model? Yes, but o…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

UXR PoV for Neuroinclusive Emotion Regulation

Attention-deficit/hyperactivity disorder (ADHD) is a psychiatric disorder which presents itself in individuals through patterns of developm…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Developing an AI-Powered UX Research Point of View for Digital Health in A Regulatory Context: An Exemplar Case from MSM and Transgender HIV Care in Nigeria

User Experience Research (UXR) in legal and regulatory contexts presents unique challenges that require specialised approaches to protect…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

On the Robustness of Multilingual Text Embedding Rankings Across Learning Tasks, Languages, and Benchmark Datasets

Large-scale multilingual text embedding models play a crucial role in both research and industry, yet their behavior in language-specific, mu…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Extending the UXR Point of View Pyramid: A Generative AI-Augmented Methodology for Human-Centred AI Systems

Rising household debt and cost-of-living pressures in the United Kingdom have intensified the role of AI-driven financial technologies in m…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation

FOCUS: Forcing In-Context Object Localization through Visual Support Constraints and Policy Optimization

In-context localization (ICL) seeks to localize a target object specified by a small set of support examples in a query image, operating on…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

From Evidence to Design: Developing an AI-Augmented UX Research Point of View for Digital Wellbeing in Emergency and Public Safety Contexts

This paper investigates how User Experience Research (UXR) methods can be combined with AI-supported analysis to develop clearer design dir…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Developing a Culturally Grounded, AI-Augmented UX Research Point of View (POV): An Exemplar Case Study from Telemedicine Dementia Care

User Experience Research (UXR) Points of View (POVs) distil complex and often fragmented research evidence into actionable perspectives tha…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Agents

SpatialAct: Probing Spatial Reasoning-to-Action Capabilities of VLM Agents in 3D Scenes

Humans can effortlessly perceive spatial layouts, form cognitive representations, reason about spatial relations, and translate such reason…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Developing a UXR Point of View for Cognitive Accessibility in Mobile Learning with Generative AI

This study investigates how UX research (UXR) principles, combined with Large Language Model (LLM)-supported analysis, can be used to impro…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Trust-Region Behavior Blending for On-Policy Distillation

On-policy distillation (OPD) trains a student on prefixes sampled from its own policy while matching a stronger teacher. This addresses the…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

D$^3$: Dynamic Directional Graph-Constrained Data Scheduling for LLM Training

Training data plays a central role in large language models (LLMs) optimization, motivating extensive research on data scheduling strategie…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Emergent Languages in Populations of Language Model Agents: From Token Efficiency to Oversight Evasion

Monitoring autonomous language model agents currently relies mostly on surface behavior. But what happens when agent populations invent new…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

MIMO: Multilingual Information Retrieval via Monolingual Objectives

Multilingual Information Retrieval (MLIR) reflects real-world search environments in which queries and relevant documents may appear in dif…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

MindVoice: Reconstructing Intelligible Speech from Non-invasive Neural Signals with Pretrained Priors

Reconstructing continuous speech from non-invasive neural recordings is a fundamental problem for probing human auditory perception and bui…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Steering LLMs? Actually, Sparse Autoencoders can outperform simple baselines

Sparse Autoencoders (SAEs) have been seen as a promising avenue for exploring the internals of Large Language Models (LLMs) and for steerin…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Robotics

Probing Collision Grounding in Vision-Language Models for Safe Human-Robot Collaboration

Safe human-robot collaboration requires more than visual description: a monitor must determine whether the robot body is safely separated,…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

MAECO-Lite: Modular Ontology for Dynamic Malware Analysis

Capturing dynamic malware behavior in a practical but still semantically precise manner remains a significant challenge in cyber threat int…

2026-06-01 13:00 JST | arXiv cs.AI | Robotics

Simulation of collision avoidance behavior in crowd movement by data-driven approach

Crowd movement simulation is essential for pedestrian safety management and facility layout optimization. Data-driven models enhance trajec…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Hardware/Semiconductors, Research/Papers

Benchmarking and Enhancing Text-to-Image Models for Generating Visual Representations in Early Arithmetic Education

AI systems are increasingly used to support educational content creation, yet it remains unclear whether they can generate outputs that fai…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Shared Doubt: Zero-shot Cross-Lingual Confidence Estimation for Language Models

Confidence estimation (CE), i.e. quantifying the reliability of a model's prediction, has attracted great interest in the context of large…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Comparing LLM-Based Conversational and Graphical Interfaces for Industrial Decision Tasks: An Exploratory Mixed-Methods Study

The use of Generative AI Conversational User Interfaces (CUI) as a new way to access and analyze data is growing in all sectors, and the in…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

What changes after deployment? A survey on On-device Learning in TinyML

Machine learning models on microcontroller-class devices (TinyML) face a fundamental challenge: post-deployment distribution change undermi…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

EchoRL: Reinforcement Learning via Rollout Echoing

Reinforcement Learning with Verifiable Rewards is an effective route for post-training to strengthen the reasoning capability of large lang…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation

Beyond Classification: Dynamic Adapter Routing for Continual Multimodal Retrieval

While retrieval is a core function of vision-language models, continually updating these models for retrieval tasks remains critically unde…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Correcting Split Selection in Online Decision Trees via Anytime-Valid Inference

Bagging-based ensembles, most notably Adaptive Random Forests, are among the strongest performers for learning from data streams. A common…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Learning Cardiac Latent Representations in Vectorcardiogram Space

Electrocardiography (ECG) is a cornerstone of cardiac assessment, making the learning of informative ECG representations fundamental to tas…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Entropic Projection Alignment: Estimating, Explaining, and Improving Model Performance Under Distribution Shift

We propose a unified framework for addressing three key challenges of distribution shift: (1) estimating a model's performance on an unlabe…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Agents, Research/Papers

ERGeoBench: A Comprehensive Benchmark for Embodied Reasoning and Geo-localization in Multimodal Large Language Models

Multimodal large language models (MLLMs) have shown strong potential as embodied agents, yet embodied geo-localization remains underexplore…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Why Linear Recurrent Memory Works in Partially Observable Reinforcement Learning

The family of linear recurrent neural networks has shown strong performance as recurrent memory units in partially observable reinforcement…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation

Envisioning Beyond the Few: Disentangled Semantics and Primitives for Few-Shot Atypical Layout-to-Image Generation

The layout-to-image (L2I) task enables fine-grained control over image generation via object categories and spatial layouts. However, exist…

2026-06-01 13:00 JST | arXiv cs.AI | Agents

Personalized to Persuade: The Effects of Contextualization and Warmth on Trust and Reliance in Conversational AI

Artificial Intelligence (AI) agents personalize their responses by tailoring explanations to users' backgrounds, interests, and prior inter…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Practical Cross-Band Channel Prediction for AI-RAN via Physics-Guided Deep Unfolding

To make cross-band channel prediction practical for AI-native RAN, algorithms must generalize across diverse environments and support real-…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation

SAM for Robust Mitochondria Instance Segmentation in Fluorescence Microscopy

The morphological analysis of mitochondria in fluorescence microscopy (FM) is crucial for understanding cellular health, energy production,…

2026-06-01 13:00 JST | arXiv cs.AI | Robotics

DeMaVLA: A Vision-Language-Action Foundation Model for Generalizable Deformable Manipulation

Real-world household robots require Vision-Language-Action (VLA) foundation models that can acquire reusable manipulation skills across div…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Neither Replacement nor Panacea: Comparing LLM-Based Conversational and Graphical Decision Support in Industrial Tasks

Managers in manufacturing settings rely on digital interfaces to interpret operational data for decision-making, but growing data volume an…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

The Terminal Representation in Reinforcement Learning

Representation learning is a powerful tool for spatio-temporal abstraction within reinforcement learning (RL). Two well established approac…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Latent Space Disentanglement via Activation Steering for Interpretable Attribute Control in Symbolic Music Generation

Transformer-based architectures have significantly advanced the generation of complex symbolic sequences, yet a significant gap remains in…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Inconsistency-Aware Minimization: Improving Generalization with Unlabeled Data

Estimating the generalization gap and developing optimization methods that improve generalization are crucial for deep learning models, for…

2026-06-01 13:00 JST | arXiv cs.AI | Agents

Social welfare optimisation under institutional reward and punishment

Institutional incentives are widely used to promote cooperation among autonomous, self-regarding agents, from human societies to multi-agen…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Appropriateness of Empathy in AI: A Signal-Cost Perspective

The appropriateness of empathy in AI has emerged as a critical concern, as excessive empathy risks seeming manipulative while insufficient…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Research/Papers

FBHM: Functional Benchmarking and Steering of VLMs for Hateful Meme Detection

Hateful meme detection remains a formidable challenge for vision-language models, as existing benchmarks are structurally observational - c…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

dashi: A Python library for Dataset Shift Characterization to Support Trustworthy AI Development and Deployment

The Artificial Intelligence (AI) life cycle requires a thorough understanding of the underlying data dynamics for robust, safe and cost-eff…

2026-06-01 13:00 JST | arXiv cs.AI | Agents

Dreaming Of Others: Latent Teammate Modeling In World Models For Multi-Agent Reinforcement Learning

In cooperative multi-agent reinforcement learning (MARL), agents must coordinate with partners whose internal policies and intentions are n…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Scaling Higher-Order Graph Learning with Maximal Clique Complexes

Graph neural networks (GNNs) are limited to modeling pairwise interactions, while higher-order models based on cell complexes achieve great…

2026-06-01 13:00 JST | arXiv cs.AI | Agents

DynaTree: Dynamic Agentic Retrieval Tree for Time-Sensitive News Retrieval

Agentic Retrieval-Augmented Generation improves retrieval by integrating planning, tool use, and iterative reasoning, but existing agentic…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Target-Side Paraphrase Augmentation for Sign Language Translation with Large Language Models

Sign language translation (SLT) remains constrained by limited paired sign-video/text corpora and heavy-tailed target vocabularies. We stud…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

The Sword, Shield, and Achilles' Heel: Characterizing the Linguistic Inductive Bias of Large Language Models for Spatial Reasoning in Navigation Planning

Large Language Model (LLM)-based navigation systems commonly construct explicit spatial representations (e.g., topological graphs, semantic…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Skill Availability and Presentation Granularity in Large-Language-Model Agents: A Controlled SkillsBench Study

Skill documents provide procedural knowledge to large-language-model agents at inference time. This article studies whether the presentatio…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Neuro-symbolic Syntactic Parsing: Shaping a Neural Network with the CYK Algorithm

In this paper, we show the possibility of a direct injection of algorithms into neural network architecture. We focus on a complex algorith…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

DOA: Training-Free Decoder-Only Attention Policy for Long-Form Simultaneous Translation with SpeechLLMs

Simultaneous speech-to-text translation (SimulST) generates translations while speech is still unfolding, requiring a streaming policy that…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Used Car Salesbots? Honesty and Credulity of LLMs as Bargaining Agents under Partial Information

In this work we study agents in simulated bargaining scenarios, where a buyer and a seller communicate through a text channel and attempt t…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Fine-grained Verification via Diagnostic Reasoning Supervision for Aspect Sentiment Triplet Extraction

Aspect Sentiment Triplet Extraction (ASTE) aims to identify aspect terms, opinion terms, and sentiment polarities as structured triplets, p…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

PithTrain: A Compact and Agent-Native MoE Training System

Mixture-of-Experts (MoE) has become the dominant architecture for frontier language models. To meet this demand, production frameworks have…

2026-06-01 13:00 JST | arXiv cs.AI | Agents, Hardware/Semiconductors

GPU Forecasters: Language Models as Selective Surrogates for Kernel Runtime Optimization

GPU kernels are the workhorse of modern deep learning, and optimizing them (via evolutionary search or coding agents) usually requires repe…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Scaling Conversational Hungarian ASR: The BEA-Dialogue+ Corpus

Conversational automatic speech recognition in Hungarian is constrained by the limited amount of publicly available dialogue-style training…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

On Efficient Scaling of GNNs via IO-Aware Layers Implementations

Graph Neural Networks (GNNs) are bottlenecked by sparse, irregular memory access. Popular frameworks such as DGL and PyTorch Geometric supp…

2026-06-01 13:00 JST | arXiv cs.AI | Agents

Skill Reuse as Compression in Agentic RL

Large language model agents trained with reinforcement learning (RL) often learn brittle, task-specific shortcuts. We hypothesize that agen…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

If LLMs Have Human-Like Attributes, Then So Does Age of Empires II

Much research has been carried out on large language models (LLMs) and LLM-powered agentic workflows. However, many works within the field…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Separating Secrets from Placeholders: A Hybrid CNN-CodeBERT Framework for Three-Class Credential Leakage Detection

Credential leakage in public source code repositories poses a critical security threat, with over 23.8 million secrets exposed in 2024 alon…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation

Feature-Optimized Vision for Adaptive 3D Scene Reconstruction

Three-dimensional scene reconstruction depends on local image evidence that is both visually discriminative and geometrically useful. Fixed…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation

RayDer: Scalable Self-Supervised Novel View Synthesis from Real-World Video

Self-supervised novel view synthesis (NVS) remains challenging to scale, despite the abundance of video data, largely due to the brittlenes…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

Vision-Language Models Suppress Female Representations Under Ambiguous Input

Alignment teaches vision-language models (VLMs) to avoid expressing demographic biases, and when gender is clearly visible they largely suc…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Positional versus Symbolic Attention Heads: Learning Dynamics, RoPE Geometry, and Length Generalization

Transformer-based language models are widespread in today's society. As such, understanding the mechanisms by which they solve structured t…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

What Gets Unmasked First? Trajectory Analysis of Diffusion Models for Graph-to-Text Generation

We present the first systematic study of masked diffusion language models (MDLMs) for graph-to-text generation. We analyze MDLM generation…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

SPECTRA: Synthetic IR Test Collections with Relevance Oracles and Controlled Distractor Diagnostics

Scalable information retrieval testing needs corpora that are large enough to stress index construction, ranking latency, query routing, an…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards

Long-context reasoning remains a central challenge for large language models, which often fail to locate and integrate key information in e…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Language Models Learn Constructional Semantics, Not To Mention Syntax: Investigating LM Understanding of Paired-Focus Constructions

Grasping the semantics of rare constructions (form-meaning pairings) has been shown to be a challenging problem that has currently only bee…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation

TunerDiT: Training-free Progressive Steering of Diffusion Transformer for Multi-Event Video Generation

Text-to-video (T2V) generation faces challenging questions when generating videos with long horizons containing multiple events. Inspired b…

2026-06-01 13:00 JST | arXiv cs.AI | Agents

Stateful Online Monitoring Catches Distributed Agent Attacks

Language models can find thousands of severe software vulnerabilities, and agents are increasingly being misused for cyberattacks. To avoid…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation

Lumos-Nexus: Efficient Frequency Bridging with Homogeneous Latent Space for Video Unified Models

Connector-based video unified models have demonstrated strong capability in instruction-grounded video synthesis, but integrating a large h…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding, Research/Papers

LLM Bias Evaluation: Gender, Racial, and Age Disparities in Occupational and Crime Scenarios

LLM bias evaluation is critical as large language models (LLMs) increasingly influence high-stakes decisions. This paper provides a compreh…

2026-06-01 13:00 JST | arXiv cs.AI | Business/Funding

Unifying and Optimizing Data Values for Selection via Sequential Decision-Making

Data selection has emerged as a crucial downstream application of data valuation, yet the theoretical foundations for using data values in…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

ProofWala: A Framework for Multilingual Proof Data Synthesis and Theorem-Proving

Neural approaches to theorem proving require robust infrastructure for interfacing with interactive theorem provers (ITPs), extracting stru…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Chain-of-Thought Reasoning In The Wild Is Not Always Faithful

Recent studies indicate that when faced with explicit biases in prompts, models often omit mentioning these biases in their Chain-of-Though…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Inferring Events from Time Series using Language Models

A common goal in analyzing time series data is to understand how events cause observed variations. We study whether Large Language Models (…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Symbolic Intermediaries as a Linguistic-Numerical Interface for LLM-Driven Geometric Reasoning

Large Language Models (LLMs) display reasoning capabilities over linguistic and symbolic objects but have limited capabilities to directly…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

OLG++: A Semantic Extension of Obligation Logic Graph

We present OLG++, a semantic extension of the Obligation Logic Graph (OLG) for modeling regulatory and legal rules in municipal and interju…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Neuro-Symbolic Predictive Process Monitoring

This paper addresses the problem of suffix prediction in Business Process Management (BPM) by proposing a Neuro-Symbolic Predictive Process…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

ReTabAD: A Benchmark for Restoring Semantic Context in Tabular Anomaly Detection

In tabular anomaly detection (AD), textual semantics often carry critical signals, as the definition of an anomaly is closely tied to domai…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

SAC-Opt: Semantic Anchors for Iterative Correction in Optimization Modeling

Large language models (LLMs) have opened new paradigms in optimization modeling by enabling the generation of executable solver code from n…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Post-Training LLMs as Better Decision-Making Agents: A Regret-Minimization Approach

Large language models (LLMs) are increasingly deployed as "agents" for decision-making (DM) in interactive and dynamic environments. Yet, s…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

HERMES: Towards Efficient and Verifiable Mathematical Reasoning in LLMs

Informal mathematics has been central to modern large language model (LLM) reasoning, offering flexibility and efficient construction of ar…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Debate with Images: Detecting Deceptive Behaviors in Multimodal Large Language Models

Are frontier AI systems becoming more capable? Certainly. Yet such progress is not an unalloyed blessing but rather a Trojan horse: behind…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

DTop-p MoE: Sparsity-Controlled Dynamic Top-p MoE for Foundation Model Pre-training

Sparse Mixture-of-Experts architectures are essential for scaling model capacity efficiently, yet the standard Top-$k$ routing imposes a ri…

2026-06-01 13:00 JST | arXiv cs.AI | Agents

Agentic Physical AI toward a Domain-Specific Foundation Model for Energy Systems: A Case Study on Nuclear Reactor Control

The prevailing paradigm in AI for physical systems: scaling general-purpose foundation models toward universal multimodal reasoning, confro…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Regret-Based Federated Causal Discovery with Unknown Interventions

Most causal discovery methods recover a completed partially directed acyclic graph representing a Markov equivalence class from observation…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

ConSensus: Multi-Agent Collaboration for Multimodal Sensing

Large language models (LLMs) are increasingly grounded in sensor data to perceive and reason about human physiology and the physical world.…

2026-06-01 13:00 JST | arXiv cs.AI | Agents

NEMO: Execution-Aware Optimization Modeling via Autonomous Coding Agents

We present NEMO, a system that translates Natural-language descriptions of decision problems into formal Executable Mathematical Optimizati…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors, Business/Funding

Diagnosing the Reliability of LLM-as-a-Judge via Item Response Theory

While LLM-as-a-Judge is widely used in automated evaluation, existing validation practices primarily operate at the level of observed outpu…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

From Out-of-Distribution Detection to Hallucination Detection: A Geometric View

Detecting hallucinations in large language models is a critical open problem with significant implications for safety and reliability. Whil…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

MedCoG: Maximizing LLM Inference Density in Medical Reasoning via Meta-Cognitive Regulation

Large Language Models (LLMs) have shown strong potential in complex medical reasoning yet face diminishing gains under inference scaling la…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Discovering Differences in Strategic Behavior Between Humans and LLMs

As Large Language Models (LLMs) are increasingly deployed in social and strategic scenarios, it becomes critical to understand where and wh…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation

Certified Circuits: Stability Guarantees for Mechanistic Circuits

Understanding how neural networks arrive at their predictions is essential for debugging, auditing, and deployment. Mechanistic interpretab…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

SPM-Bench: Benchmarking Large Language Models for Scanning Probe Microscopy

As LLMs achieved breakthroughs in general reasoning, their proficiency in specialized scientific domains reveals pronounced gaps in existin…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

From Weak Cues to Real Identities: Evaluating Inference-Driven De-Anonymization in LLM Agents

Anonymization is often assumed to protect privacy once explicit identifiers are removed, because re-identification has historically require…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Reliable Self-Improvement Training by Verifying Reasoning, Not Just Answers

Self-improvement training, where models learn from self-generated solutions, promises sustained capability gains but suffers from a pervasi…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Counterfactual Credit Policy Optimization for Multi-Agent Collaboration

Collaborative multi-agent large language models (LLMs) can solve complex reasoning tasks by decomposing roles, but reinforcement learning f…

2026-06-01 13:00 JST | arXiv cs.AI | Agents, Business/Funding

LH-Bench: Skill-Grounded Evaluation of Long-Horizon Agents on Subjective Enterprise Tasks

Large language models excel on objectively verifiable tasks such as math and programming, where evaluation reduces to unit tests or a singl…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Learning to Reason with Insight for Informal Theorem Proving

Although most of the automated theorem-proving approaches depend on formal proof systems, informal theorem proving can align better with la…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

ClimAgent: LLM as Agents for Autonomous Open-ended Climate Science Analysis

Climate research is pivotal for mitigating global environmental crises, yet the accelerating volume of multi-scale datasets and the complex…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

To Use AI as Dice of Possibilities with Timing Computation

The dominant noun-based modeling paradigm has fundamentally constrained AI development, precluding any adequate representation of the futur…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Business/Funding

Counterfactual Trace Auditing of LLM Agent Skills

Large Language Model agents are increasingly augmented with agent skills. Current evaluation methods for skills remain limited. Most deploy…

2026-06-01 13:00 JST | arXiv cs.AI | Agents

ASH: Agents that Self-Hone via Embodied Learning

Long-horizon embodied tasks remain a fundamental challenge in AI, as current methods rely on hand-engineered rewards or action-labeled demo…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Fully Open Meditron: An Auditable Pipeline for Clinical LLMs

Clinical decision support systems (CDSS) require scrutable, auditable pipelines that enable rigorous, reproducible validation. Yet current…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

PlanningBench: Generating Scalable and Verifiable Planning Data for Evaluating and Training Large Language Models

Planning is a fundamental capability for large language models (LLMs) because such complex tasks require models to coordinate goals, constr…

2026-06-01 13:00 JST | arXiv cs.AI | Agents

ScenePilot: Controllable Boundary-Driven Critical Scenario Generation for Autonomous Driving

Safety-critical scenarios are central to evaluating autonomous driving systems, yet their rarity in naturalistic logs makes simulation-base…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

BoxLitE: A Faithful Knowledge Base Embedding Based on Convex Optimization

Knowledge base (KB) embeddings aim at combining the capability of classical knowledge graph embeddings to generalize the information presen…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

MuCRASP: Multimodal Chain-of-thought Reasoning aware Structured Pruning

Vision-language models (VLMs) increasingly rely on chain-of-thought (CoT) reasoning to solve complex multimodal tasks, but their large para…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Exploiting Local Dynamics Regularity for Reusable Skills in Offline Hierarchical RL

Hierarchical Reinforcement Learning (HRL) promises to solve long-horizon Reinforcement Learning (RL) tasks more efficiently than non-hierar…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Advancing Creative Physical Intelligence in Large Multimodal Models

Large multimodal models (LMMs) have rapidly advanced in perception and reasoning; however, it remains unclear whether these capabilities ge…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors

Neuro-Symbolic Verification of LLM Outputs for Data-Sensitive Domains (extended preprint)

LLMs deployed in high-stakes domains face fundamental reliability challenges: hallucinations, inconsistencies, and privacy vulnerabilities…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Alignment Tampering: How Reinforcement Learning from Human Feedback Is Exploited to Optimize Misaligned Biases

Reinforcement Learning from Human Feedback (RLHF) is the standard method to align Large Language Models (LLMs) with human preferences. In t…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Reward Bias Substitution: Single-Axis Bias Mitigations Redirect Optimization Pressure

Single-axis mitigations of reward-model biases (e.g., reducing proxy reliance on length, sycophancy, or style) can rotate optimization pres…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

VikingMem: A Memory Base Management System for Stateful LLM-based Applications

Large Language Models have revolutionized interactive applications; however, their finite context windows pose a critical data management c…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

SAAS: Self-Aware Reinforcement Learning for Over-Search Mitigation in Agentic Search

Agentic search enables LLMs to solve complex multi-hop questions through iterative reasoning and external search. Despite the effectiveness…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

OmniMatBench: A Human-Calibrated Multimodal Reasoning Benchmark Across 19 Materials Science Subfields

As multimodal language models play an increasingly important role in scientific research, materials science offers a critical testbed due t…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Domain-Specific Data Synthesis for LLMs via Minimal Sufficient Representation Learning

Large Language Models have demonstrated remarkable progress in general-purpose capabilities and can achieve strong performance in specific…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection

Mid-training has become an important stage in modern LLM development, using large-scale curated mixtures to strengthen capabilities before…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Graph Machine Learning in the Era of Large Language Models (LLMs)

Graphs play an important role in representing complex relationships in various domains like social networks, knowledge graphs, and molecula…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Breaking Information Cocoons: A Hyperbolic Framework for Balancing Exploration and Exploitation in Recommender Systems

Modern recommender systems often create information cocoons, restricting users' exposure to diverse content. The central challenge is to ba…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Understanding the Fundamental Design Decisions of Retrieval-Augmented Generation Systems

Retrieval-Augmented Generation (RAG) has emerged as a critical technique for enhancing large language model (LLM) capabilities. However, pr…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation

Cross-Modal Attention Calibration for LVLM Hallucination Mitigation

Large vision-language models (LVLMs) have shown remarkable capabilities in visual-language understanding. Despite their success, LVLMs stil…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

Beyond Memorization: Assessing Semantic Generalization in Large Language Models Using Phrasal Constructions

The web-scale of pretraining data has created an important evaluation challenge: to disentangle linguistic competence on cases well-represe…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

PRISM: Self-Pruning Intrinsic Selection Method for Training-Free Multimodal Data Selection

Visual instruction tuning adapts pre-trained Multimodal Large Language Models (MLLMs) to follow human instructions for real-world applicati…

2026-06-01 13:00 JST | arXiv cs.AI | Agents

Auto-Discovery-Bench: Diagnosing Structured State Tracking in Oracle-Guided Discovery

Interactive discovery requires agents to maintain and update structured beliefs over many rounds of feedback. Before evaluating agents in n…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

EMCEE: Improving Multilingual Capability of LLMs via Bridging Knowledge and Reasoning with Extracted Synthetic Multilingual Context

Large Language Models (LLMs) have achieved impressive progress across a wide range of tasks, yet their heavy reliance on English-centric tr…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

How does Bayesian Sampling help Membership Inference Attacks?

Membership Inference Attacks (MIAs) aim to estimate whether a specific data point was used in the training of a given model. Existing state…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Unraveling LoRA Interference: Orthogonal Subspaces for Robust Model Merging

Fine-tuning large language models (LMs) for individual tasks yields strong performance but is expensive for deployment and storage. Recent…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Who Gets Credit or Blame? Attributing Accountability in Modern AI Systems

Modern AI systems are typically developed through multiple stages: pretraining, fine-tuning rounds, and subsequent adaptation or alignment,…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Unlearning's Blind Spots: Over-Unlearning and Prototypical Relearning Attack

Machine unlearning (MU) aims to expunge a designated forget set from a trained model without costly retraining, yet the existing techniques…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

SHIELD: Secure Hypernetworks for Incremental Expansion Learning Defense

Continual learning under adversarial conditions remains an open problem, as existing methods often compromise either robustness, scalabilit…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation

DISCO: Mitigating Bias in Deep Learning with Conditional Distance Correlation

Dataset bias often leads deep learning models to exploit spurious correlations instead of task-relevant signals. We introduce the Standard…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Organizational Adaptation to Generative AI in Cybersecurity

Cybersecurity organizations are adapting to GenAI integration through modified frameworks and hybrid operational processes, with success in…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation

PictSure: Pretraining Embeddings Matters for In-Context Learning Image Classifiers

Building image classification models remains cumbersome in data-scarce domains, where collecting large labeled datasets is impractical. In-…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation

Joint angle based learning to refine kinematic human pose estimation

Marker-free human pose estimation (HPE) has found increasing applications in various fields. Current HPE suffers from occasional errors in…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Human-Alignment and Calibration of Inference-Time Uncertainty in Large Language Models

There has been much recent interest in evaluating large language models for uncertainty calibration to facilitate model control and modulat…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Residual Reservoir Memory Networks

We introduce a novel class of untrained Recurrent Neural Networks (RNNs) within the Reservoir Computing (RC) paradigm, called Residual Rese…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation

Target-Agnostic Calibration under Distribution Shift with Frequency-Aware Gradient Rectification

Real-world model deployments inevitably encounter distribution shifts, rendering the confidence estimates of deep neural networks highly un…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Reasoning-Intensive Regression

AI researchers and practitioners increasingly apply large language models (LLMs) to what we call reasoning-intensive regression (RiR), i.e.…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Human Psychometric Questionnaires Mischaracterize LLM Behavior

We examine whether human psychometric questionnaires can serve as reliable tools for characterizing and predicting LLM behavior in everyday…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

MedFact: Benchmarking the Fact-Checking Capabilities of Large Language Models on Chinese Medical Texts

Deploying Large Language Models (LLMs) in medical applications requires fact-checking capabilities to ensure patient safety and regulatory…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Towards Atoms of Large Language Models

The fundamental representational units (FRUs) of large language models (LLMs) remain undefined, limiting further understanding of their und…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Towards Foundation Models for Zero-Shot Time Series Anomaly Detection: Leveraging Synthetic Data and Relative Context Discrepancy

Time series anomaly detection (TSAD) is a critical task, but developing models that generalize to unseen data in a zero-shot manner remains…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation

SAEmnesia: Erasing Concepts in Diffusion Models with Supervised Sparse Autoencoders

Concept unlearning in diffusion models is hampered by feature splitting, where concepts are distributed across many latent features, making…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Spectral Collapse Drives Loss of Plasticity in Deep Continual Learning

We investigate why deep neural networks suffer from loss of plasticity in continual learning, and thus fail to learn new tasks without rein…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Dual Mechanisms of Value Expression: Intrinsic vs. Prompted Values in Large Language Models

Large language models can express values in two main ways: (1) intrinsic expression, reflecting the model's inherent values learned during…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Mechanistic Interpretability as Statistical Estimation: A Variance Analysis

Mechanistic Interpretability (MI) aims to reverse-engineer model behaviors by identifying functional sub-networks. Yet, the scientific vali…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

LLMs Lean on Priors, Not Programming Language Semantics

Recent work asks whether large language models (LLMs) condition their reasoning on explicit rules rather than statistical regularities from…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

OBCache: Optimal Brain KV Cache Pruning for Efficient Long-Context LLM Inference

Large language models (LLMs) with extended context windows enable powerful applications but impose significant memory overhead, as caching…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

PAC-Bayesian Reinforcement Learning Trains Generalizable Policies

We derive a novel PAC-Bayesian generalization bound for reinforcement learning that explicitly accounts for Markov dependencies in the data…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Boundary-Guided Policy Optimization for Memory-efficient RL of Diffusion Large Language Models

A key challenge in applying reinforcement learning (RL) to diffusion large language models (dLLMs) is the intractability of their likelihoo…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation

CaptionFormer: Unified Segmentation, Tracking, and Captioning for Spatio-Temporal Objects

Dense Video Object Captioning (DVOC) is the task of jointly detecting, tracking, and captioning object trajectories in a video, requiring t…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

InfiMed-ORBIT: Aligning LLMs on Open-Ended Complex Tasks via Rubric-Based Incremental Training

Reinforcement learning (RL) has powered many recent breakthroughs in large language models (LLMs), especially for tasks where rewards can b…

2026-06-01 13:00 JST | arXiv cs.AI | Agents

Scaling Multi-Agent Environment Co-Design with Diffusion Models

The agent-environment co-design paradigm jointly optimises agent policies and environment configurations in search of improved system perfo…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation

SpectralTrain: A Universal Framework for Hyperspectral Image Classification

Hyperspectral image (HSI) classification typically involves large-scale data and computationally intensive training, which limits the pract…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation, Robotics

Mixture of Horizons in Action Chunking

Vision-language-action (VLA) models have shown remarkable capabilities in robotic manipulation, but their performance is sensitive to the $…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation

Reasoning-Aware Multimodal Fusion for Hateful Video Detection

Hate speech in online videos is posing an increasingly serious threat to digital platforms, especially as video content becomes increasingl…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Conditional Coverage Diagnostics for Conformal Prediction

Evaluating conditional coverage remains one of the most persistent challenges in assessing the reliability of predictive systems. Although…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

Existing reinforcement learning (RL) approaches treat large language models (LLMs) as a unified policy, overlooking their internal mechanis…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

FEM-Bench: A Structured Scientific Reasoning Benchmark for Evaluating Code-Generating LLMs

As LLMs advance their reasoning capabilities about the physical world, the absence of rigorous benchmarks for evaluating their ability to g…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation

Flow Equivariant World Models: Memory for Partially Observed Dynamic Environments

Embodied systems experience the world as 'a symphony of flows': a combination of many continuous streams of sensory input coupled to self-m…

2026-06-01 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

Rethinking Multimodal Few-Shot 3D Point Cloud Segmentation: From Fused Refinement to Decoupled Arbitration

In this paper, we revisit multimodal few-shot 3D point cloud semantic segmentation (FS-PCS), identifying a conflict in "Fuse-then-Refine" p…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

The Refutability Gap: Challenges in Validating Reasoning by Large Language Models

Recent reports claim that Large Language Models (LLMs) have achieved the ability to derive new science and exhibit human-level general inte…

2026-06-01 13:00 JST | arXiv cs.AI | Business/Funding

PASTA: A Scalable Framework for Multi-Policy AI Compliance Evaluation

AI compliance is becoming increasingly critical as AI systems grow more powerful and pervasive. Yet the rapid expansion of AI policies crea…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Performance and Complexity Trade-off Optimization of Speech Models During Training

In speech machine learning, neural network models are typically designed by choosing an architecture with fixed layer sizes and structure.…

2026-06-01 13:00 JST | arXiv cs.AI | Robotics

SKETCH: Semantic Key-Point Conditioning for Long-Horizon Vessel Trajectory Prediction

Accurate long-horizon vessel trajectory prediction remains challenging due to compounded uncertainty from complex navigation behaviors and…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding, Regulation/Policy

Gap-K%: Measuring Top-1 Prediction Gap for Detecting Pretraining Data

The opacity of massive pretraining corpora in Large Language Models (LLMs) raises significant privacy and copyright concerns, making pretra…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

ParalESN: Enabling parallel information processing in Reservoir Computing

Reservoir Computing (RC) has established itself as an efficient paradigm for temporal processing. However, its scalability remains severely…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Decouple Searching from Training: Scaling Data Mixing via Model Merging for Large Language Model Pre-training

Determining an effective data mixture is a key factor in Large Language Model (LLM) pre-training, where models must balance general compete…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Multi-Agent Teams Hold Experts Back

Multi-agent LLM systems are increasingly deployed as autonomous collaborators, where agents interact freely rather than execute fixed, pre-…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

The Gaussian-Head OFL Family: One-Shot Federated Learning from Client Global Statistics

Classical Federated Learning relies on a multi-round iterative process of model exchange and aggregation between server and clients, with h…

2026-06-01 13:00 JST | arXiv cs.AI | Business/Funding

An Odd Estimator for Shapley Values

The Shapley value is a ubiquitous framework for attribution in machine learning, encompassing feature importance, data valuation, and causa…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Plain Transformers are Surprisingly Powerful Link Predictors

Link prediction is a core challenge in graph machine learning, demanding models that capture rich and complex topological dependencies. Whi…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Mixture of Concept Bottleneck Experts

Concept Bottleneck Models (CBMs) promote interpretability by grounding predictions in human-understandable concepts. However, existing CBMs…

2026-06-01 13:00 JST | arXiv cs.AI | Agents

CVE-Factory: Scaling Expert-Level Agentic Tasks for Code Security Vulnerability

Evaluating and improving the security capabilities of code agents requires high-quality, executable vulnerability tasks. However, existing…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Stop the Flip-Flop: Context-Preserving Verification for Fast Revocable Diffusion Decoding

Parallel diffusion decoding can accelerate diffusion language model inference by unmasking multiple tokens per step, but aggressive paralle…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Pull Requests as a Training Signal for Repo-Level Code Editing

Repository-level code editing requires models to understand complex dependencies and execute precise multi-file modifications across a larg…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

A Kinetic Energy Perspective of Flow Matching

Flow-based generative models can be viewed through a physics lens: sampling transports a particle from noise to data by integrating a learn…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Inverting Data Transformations via Diffusion Sampling

We study the problem of transformation inversion on general Lie groups: a datum is transformed by an unknown group element, and the goal is…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

Breaking the Simplification Bottleneck in Amortized Neural Symbolic Regression

Symbolic regression (SR) aims to discover interpretable analytical expressions that accurately describe observed data. Amortized SR promise…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Business/Funding

A Behavioural and Representational Evaluation of Goal-Directedness in Language Model Agents

Understanding an agent's goals helps explain and predict its behaviour, yet there is no established methodology for reliably attributing go…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Effective Reasoning Chains Reduce Intrinsic Dimensionality

Chain-of-thought (CoT) reasoning and its variants have substantially improved the performance of language models on complex reasoning tasks…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Biases in the Blind Spot: Detecting What LLMs Fail to Mention

Large Language Models (LLMs) often provide chain-of-thought (CoT) reasoning traces that appear plausible, but may hide internal biases. We…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Less is Enough: Synthesizing Diverse Data in LLM Feature Space with Sparse Autoencoders

The diversity of post-training data is critical for effective downstream performance in large language models (LLMs). Many existing approac…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI

Weight Decay Improves Language Model Plasticity

Large language models are typically trained in two broad phases: pretraining to produce a base model, followed by further training to impro…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

SCOPE: Selective Conformal Optimized Pairwise LLM Judging

Large language models (LLMs) are increasingly used as scalable judges in pairwise evaluation, but they remain prone to miscalibration and b…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

DTBench: A Synthetic Benchmark for Document-to-Table Extraction

Document-to-table (Doc2Table) extraction derives structured tables from unstructured documents under a target schema, enabling reliable and…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

The Information Geometry of Softmax: Probing and Steering

This paper concerns the question of how AI systems encode semantic structure into the geometric structure of their representation spaces. T…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

HiPER: Hierarchical Reinforcement Learning with Explicit Credit Assignment for Large Language Model Agents

Training LLMs as interactive agents for multi-turn decision-making remains challenging, particularly in long-horizon tasks with sparse and…

2026-06-01 13:00 JST | arXiv cs.AI | Business/Funding, Research/Papers

Position: Evaluation of ECG Representations Must Be Fixed

This position paper argues that current benchmarking practice in 12-lead ECG representation learning must be fixed to ensure progress is re…

2026-06-01 13:00 JST | arXiv cs.AI | Research/Papers

HistCAD: A Constraint-Aware Parametric History-Based CAD Representation, Dataset, and Benchmark with Industrial Complexity

Parametric CAD sequences are reusable because dimensional and geometric constraints govern how parameter changes propagate. Existing CAD ge…

2026-06-01 13:00 JST | arXiv cs.AI | LLM/Generative AI