Weekly AI News 2026-W24

Period covered: 2026-06-08 to 2026-06-14 (2,172 items)

[Charts: topic trends; item counts by topic]

This week's highlights (top 10)

2026-06-12 19:00 JST | OpenAI | LLM/Generative AI, Agents

New OpenAI Academy courses for the next era of work

OpenAI introduces three Academy courses that help people build practical AI skills, create repeatable workflows, and apply agents in everyd…

2026-06-11 09:00 JST | OpenAI | Agents

How an astrophysicist uses Codex to help simulate black holes

Discover how astrophysicist Chi-kwan Chan uses Codex to build black hole simulations, helping scientists study extreme physics and test Ein…

2026-06-11 05:00 JST | OpenAI | LLM/Generative AI, Agents

Access OpenAI models and Codex through your Oracle cloud commitment

Access OpenAI models and Codex through Oracle Cloud, using existing commitments to build and deploy AI with enterprise security and governa…

2026-06-10 21:00 JST | OpenAI | LLM/Generative AI

PRC-linked influence operations are targeting AI debates in the US

A new report from OpenAI details PRC-linked influence operations using AI to target U.S. tech debates, data center narratives, tariffs, and…

2026-06-10 00:16 JST | Google DeepMind | LLM/Generative AI

Fluid, natural voice translation with Gemini 3.5 Live Translate

Gemini 3.5 Live Translate brings near real-time, natural speech translation to Google AI Studio, Google Translate and Google Meet.

2026-06-09 21:00 JST | OpenAI | LLM/Generative AI, Agents

How engineers at Nextdoor use Codex to build without limits

How engineers at Nextdoor use Codex with GPT-5.5 to investigate hard-to-reproduce issues, build across platforms, and focus on product outc…

2026-06-09 19:00 JST | OpenAI | Agents

What Codex unlocks for Notion

How Notion uses Codex to one-shot specs, build AI Voice Input for the web, and multiply engineering power across small teams.

2026-06-08 23:00 JST | OpenAI | LLM/Generative AI

Confidential submission of draft S-1 to the SEC

OpenAI confirms a confidential S-1 submission to the SEC and has not yet determined timing for further action.

2026-06-09 23:02 JST | Google DeepMind | Robotics

Powering the future of robotics in Europe

2026-06-14 12:00 JST | TechCrunch AI | LLM/Generative AI

As Anthropic suspends access to new models, India debates its AI future

Tech leaders debate whether the Anthropic episode is a wake-up call for India’s AI ambitions.

All items (by date)

2026-06-14 (5 items)

2026-06-14 12:00 JST | TechCrunch AI | LLM/Generative AI

As Anthropic suspends access to new models, India debates its AI future

Tech leaders debate whether the Anthropic episode is a wake-up call for India’s AI ambitions.

2026-06-14 09:03 JST | TechCrunch AI | Business/Funding

Meta reportedly moves to unwind $2B Manus deal after Beijing’s demand

Meta starts dismantling its $2 billion Manus acquisition after Beijing ordered the deal reversed.

2026-06-14 05:42 JST | TechCrunch AI | Other

KPMG pulls report on AI usage due to apparent hallucinations

Once again, AI proves to be an unreliable source of information about AI.

2026-06-14 04:11 JST | TechCrunch AI | LLM/Generative AI, Business/Funding

Amazon CEO reportedly raised Anthropic model concerns before government crackdown

Amazon CEO Andy Jassy may have been the source of security concerns that led Anthropic to cut off worldwide access to two models on Friday.

2026-06-14 01:47 JST | TechCrunch AI | LLM/Generative AI

OpenAI faces investigation from state attorneys general

It's not clear which states are involved, but they're asking about everything from OpenAI's ad policies to its handling of health data.

2026-06-13 (12 items)

2026-06-13 13:14 JST | TechCrunch AI | Other

Andrew Yang thinks the next big startup opportunity is lowering the cost of living

Andrew Yang made a list of everything Americans overpay for — housing, food, wireless — and thinks the next startup gold rush is giving tha…

2026-06-13 11:26 JST | TechCrunch AI | LLM/Generative AI

Anthropic’s safety warnings may have just backfired — the government has pulled the plug on its most powerful AI

Anthropic isn't hiding its frustration. "We disagree that the finding of a narrow potential jailbreak should be cause for recalling a comme…

2026-06-13 11:10 JST | ITmedia AI+ | LLM/Generative AI, Regulation/Policy

Anthropic temporarily suspends "Mythos 5" and "Fable 5" following U.S. government directive

2026-06-13 10:50 JST | ITmedia AI+ | LLM/Generative AI, Regulation/Policy

"Claude Fable 5" and "Mythos 5" fully suspended under U.S. government order; Anthropic pledges early restoration

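On June 12, U.S.-based Anthropic announced that it would suspend its top-tier AI models "Claude Fable 5" and "Claude Mythos 5" for all users. The U.S. government, citing national security, had issued an export-control directive ordering a complete halt to access by foreign nationals. While complying with the directive, the company called it "a misunderstanding" and…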

2026-06-13 10:00 JST | ITmedia AI+ | Other

SpaceX IPO: Live updates on everything you need to know

TechCrunch has followed SpaceX's start, struggles, and successes from the early days. And we're here for what happens next too. This packag…

2026-06-13 08:00 JST | TechCrunch AI | Other

Meta’s months-old AI unit is a soul-crushing gulag, say the engineers stuck inside it

A new report suggests the unit, which employs 6,500 people, is on the verge of revolt.

2026-06-13 07:00 JST | ITmedia AI+ | Other

The day Toyota is overtaken: Kioxia seizes the top spot, and a look back at the 2005 "Top 10 by market capitalization"

This week's notable news, hand-picked by the MONOist editorial team from articles published June 8 to 12, 2026.

2026-06-13 05:38 JST | TechCrunch AI | Other

Chinese cybercrime operation that used AI to scam ‘hundreds of thousands of victims’ sued by Google

The tech giant said a group called "Outsider Enterprise" used AI to scam hundreds of thousands of victims, sending 2.5 million text message…

2026-06-13 02:38 JST | TechCrunch AI | Business/Funding

Mistral is rumored to be raising €3B at €20B valuation

The funding round would value the company at around €20 billion (about $23.15 billion), nearly double its Series C valuation of €11.7 billi…

2026-06-13 01:23 JST | TechCrunch AI | LLM/Generative AI, Business/Funding

SpaceX, Anthropic, and OpenAI’s hot IPO summer

The IPO market is back, and it’s not the same companies leading the charge. FAANG had a good run, but a new acronym is taking over: MANGOS…

2026-06-13 00:50 JST | TechCrunch AI | Business/Funding

It’s hot IPO summer, and the MANGOS are ripe

The IPO market is back, and it’s not the same companies leading the charge. FAANG had a good run, but a new acronym is taking over: MANGOS…

2026-06-12 (320 items)

2026-06-12 19:00 JST | OpenAI | LLM/Generative AI, Agents

New OpenAI Academy courses for the next era of work

OpenAI introduces three Academy courses that help people build practical AI skills, create repeatable workflows, and apply agents in everyd…

2026-06-12 13:30 JST | TechCrunch AI | Other

Cheaper, faster, and culturally aware, Avataar’s video AI is built for India’s scale

Avataar AI's distilled video model is priced at $0.005 for every second of generation.

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

ToolSense: A Diagnostic Framework for Auditing Parametric Tool Knowledge in LLMs

Large language models deployed as agents over large tool catalogs face a critical tool-retrieval bottleneck. As embedding-based retrieval a…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

Arbor: Tree Search as a Cognition Layer for Autonomous Agents

Arbor is a multi-agent framework that introduces structured tree search as a cognition layer for autonomous agents operating in large, stat…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

Strategic Decision Support for AI Agents

Traditionally, decision support studies how humans use machine learning models to make better decisions. In modern agentic systems, this di…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Pythagoras-Prover: Advancing Efficient Formal Proving via Augmented Lean Formalisation

Modern Lean theorem provers achieve strong performance only with substantial training and inference compute, driven in part by scarce verif…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

PersonaDrive: Human-Style Retrieval-Augmented VLA Agents for Closed-Loop Driving Simulation

Closed-loop driving simulators typically populate their environments with non-ego traffic agents that behave largely the same way, produced…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

"Did you lie?" Evaluating Lie Detectors across Model Scale and Belief-Verified Model Organisms

Robust lie detectors for language models could enable powerful techniques for auditing, monitoring, and post-hoc investigation of model beh…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Robotics

TrajGenAgent: A Hierarchical LLM Agent for Human Mobility Trajectory Generation

Human mobility data is important for transportation, urban planning, and epidemic control, but large-scale trajectory collection is often c…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

Evoflux: Inference-Time Evolution of Executable Tool Workflows for Compact Agents

Compact language models (LMs) reduce cost, latency, and deployment risk for tool agents. Yet MCP-style tool use requires more than isolated…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

From AGI to ASI

Over the last decade, building human-level artificial general intelligence has moved from far-fetched speculation to being a concrete next-…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

Deployment-Centered Evaluation: Predicting Query-Level Rejection Risk in a Clinical LLM System

Large language models (LLMs) are increasingly integrated into clinical systems, making it essential to evaluate the real-world utility of t…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Definitional alignment before capability alignment: a Design-Science framework for adjudicating claims about AGI

Claims that artificial general intelligence has already arrived and claims that it remains decades away are often defended from overlapping…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

The Theory of Mind Utility: Formal Specification of a Mentalizing Mechanism

Inferring others' beliefs requires more than reading surface signals; it requires tracking who told them what, in what order, and how credi…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

Rethinking Psychometric Evaluation of LLMs: When and Why Self-Reports Predict Behavior

Anticipating LLM behavioral tendencies from low-cost psychometric probes is critical for safe deployment, but only if self-reports (SR) rel…

2026-06-12 13:00 JST | arXiv cs.AI | Agents, Research/Papers

Benchmarking AI Agents for Addressing Scientific Challenges Across Scales

AI agents are increasingly being developed to accelerate scientific discovery, yet their practical capabilities in real research settings r…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Reducing the Complexity of Deep Learning Models for EEG Analysis on Wearable Devices

Wearable healthcare devices are the fastest-growing Internet of Things (IoT) sector. Many automated healthcare services rely on two crucial…

2026-06-12 13:00 JST | arXiv cs.AI | Business/Funding

Prefill Awareness in Large Language Models

Safety-relevant studies of language models, including alignment and jailbreaking evaluations and AI control protocols, often rely on prefil…

2026-06-12 13:00 JST | arXiv cs.AI | Business/Funding

Constructing Evaluation Datasets for Procedural Reasoning: Balancing Naturalness, Grounding, and Multi-Hop Coverage

Evaluating procedural reasoning in AI-supported learning systems requires question-answer datasets that are both learner-like and grounded…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

A Tutorial on World Models and Physical AI

World modeling is emerging as a central principle for building intelligent systems capable of prediction, reasoning, and decision making. A…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

The Containment Gap: How Deployed Agentic AI Frameworks Fail Public-Facing Safety Requirements

Agentic large language model systems that autonomously invoke tools, maintain persistent memory, and execute multi-step plans are increasin…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding, Research/Papers

MLUBench: A Benchmark for Lifelong Unlearning Evaluation in MLLMs

Multimodal large language models (MLLMs) are trained on massive multimodal data, making data unlearning increasingly important as data owne…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

Teach-and-Repeat: Accurately Extracting Operational Knowledge from Mobile Screen Demonstrations to Empower GUI Agents

Understanding the digital world on mobile devices is shifting from static UI perception to dynamic action comprehension. This capability en…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

GeoNatureAgent Benchmark: Benchmarking LLM Agents for Environmental Geospatial Analysis Across Frontier and Open-Weight Foundation Models

Environmental scientists spend disproportionate effort on data wrangling rather than analysis, and AI agents that automate geospatial workf…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Topical Phase Transitions in Artificial Intelligence Research: Large-Scale Evidence and an Early-Warning Signature for Emerging Topics

Do research topics in artificial intelligence grow gradually, or do they advance through abrupt, detectable jumps? Analyzing 80,814 accepte…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Fantastic Scientific Agents and How to Build Them: AgentBuild for Rietveld Refinement

As scientific workflows shift from deterministic executables to LLM-based agents, the development practices on offer, such as fine-tuning,…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

(Human) Attention Is (Still) All You Need: Human oversight makes AI-assisted social science reliable

Large language models (LLMs) are increasingly used for tasks once reserved for trained researchers, including hypothesis generation, specif…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

WISE: A Long-Horizon Agent in Minecraft with Why-Which Reasoning

Rapid advances have been made in developing general-purpose embodied agents in environments like Minecraft through the adoption of LLM-augme…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

DailyReport: An Open-ended Benchmark for Evaluating Search Agents on Daily Search Tasks

Search Agents (SAs) typically leverage large language models (LLMs) to support complex information-seeking tasks by autonomously exploring…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

HarnessBridge: Learnable Bidirectional Controller for LLM Agent Harness

Large language models are increasingly deployed as agents for long-horizon tasks, yet their performance is shaped not only by model capabil…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

The Hidden Power of Scaling Factor in LoRA Optimization

In Low-Rank Adaptation (LoRA), the scaling factor $\alpha$ is often treated as a mere complement to the learning rate, yet its role in opti…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Zero-source LLM Hallucination Detection with Human-like Criteria Probing

Large language models (LLMs) often hallucinate by generating factually incorrect or unfaithful content, posing significant risks to their s…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

MDForge: Agentic Molecular Dynamics Pipeline Design under Sparse Simulator Feedback

Molecular dynamics (MD) is the canonical in-silico method for atomistic molecular science, simulating molecular behavior from first-princip…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

Iterating Toward Better Search: A Two-Agent Simulation Framework for Evaluating Agentic Search Architectures in E-Commerce

We present a modular two-agent simulation framework for evaluating conversational shopping assistant architectures. An independent buyer ag…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

MARS: Margin-Adversarial Risk-controlled Stopping for Parallel LLM Test-time Scaling

Parallel test-time scaling samples many reasoning traces and majority-votes their answers, improving LLM accuracy but requiring traces to r…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

PRISMR: Overcoming Parse Collapse in Multimodal Listwise Ranking via Parameterized Representation Internalization

Generative listwise ranking with Large Multimodal Models (LMMs) aims to capture global list context in a single forward pass, but its effec…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Learning What to Remember: A Cognitively Grounded Multi-Factor Value Model for Agentic Memory

Long-running LLM agents accumulate interaction histories far larger than any context window, forcing a standing decision: what to encode de…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

OpenMedQ: Broad Open Pretraining for Medical Vision-Language Models

We present OpenMedQ, a medical vision-language model pretrained on the broadest fully-open medical mix to date: 14 datasets totaling ~3.35M…

2026-06-12 13:00 JST | arXiv cs.AI | Agents, Business/Funding

Multi-Modal Agents for Power Distribution Defect Detection: An Evaluation of Foundation Models

The power distribution network is critical to reliable electricity delivery, yet traditional inspection methods face limitations in semanti…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

A Mathematical Forum Platform for Collaborative Problem Solving and Dataset Generation for AI Reasoning

Sharing mathematical content in online forums remains a significant friction point for students and educators: writing raw LaTeX is error-p…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Structured Testbench Generation for LLM-Driven HDL Design and Verification-Oriented Data Curation

Automated testbench generation has become a critical bottleneck in large language model (LLM)-driven Register Transfer Level (RTL) workflow…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

APCyc: Property-Informed Design of Cyclic Peptides via Automated Cyclization

Cyclic peptides represent a promising class of therapeutic compounds in modern drug discovery, often offering improved stability and bindin…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

The Illusion of Multi-Agent Advantage

Prevailing wisdom posits that Multi-Agent Systems (MAS) are superior to Single-Agent Systems (SAS), citing advantages like context protecti…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Otters++: A Time-to-first-spike Based Energy Efficient Optical Spiking Transformer

Spiking neural networks (SNNs) are promising for energy-efficient inference, and time-to-first-spike (TTFS) coding is especially attractive…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

SciR: A Controllable Benchmark for Scientific Reasoning in LLMs

Three paradigmatic forms of inference recur across scientific reasoning: deduction, induction, and causal abduction. Reliably evaluating LL…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Nous: An Attempt to Extract and Inject the Cognition Behind Prediction-Market Behavior

As LLM agents proliferate in prediction markets and collective decision-making, they risk a cognitive monoculture: agents built on shared f…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

Augmentation techniques for video surveillance in the visible and thermal spectral range

In intelligent video surveillance, cameras record image sequences during day and night. Commonly, this demands different sensors. To achiev…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

AAbAAC: An Annotated Corpus for Autoimmunity Information Extraction

Despite advances in information extraction driven by deep learning and large language models, performance gaps remain in highly specialized…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Rethinking RAG in Long Videos: What to Retrieve and How to Use It?

Retrieval-augmented generation is moving beyond text into long, egocentric video, where systems must select query-relevant chunks across mu…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

TerraBench: Can Agents Reason Over Heterogeneous Earth-System Data?

Climate and environmental decision-making increasingly requires reasoning across heterogeneous inputs, including gridded physical data, sat…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Mental-R1: Aligning LLM Reasoning for Mental Health Assessment

Mental health problems such as anxiety, depression, and suicide remain urgent global challenges, where timely and accurate assessment is cr…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Reasoning for Mobile User Experience with Multimodal LLMs: Task, Benchmark, and Approach

User experience (UX) centered on usability, perceived consistency, and functional clarity is fundamental to real-world user interfaces (UI)…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Under What Conditions Can a Machine Become Genuinely Creative?

Recent AI systems can generate texts, software architectures, hypotheses, designs, and scientific workflows that appear creative. This pape…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

ARMOR-MAD: Adaptive Routing for Heterogeneous Multi-Agent Debate in Large Language Model Reasoning

Multi-agent debate (MAD) can improve large language model reasoning, but fixed debate pipelines often waste computation and can amplify cor…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

A Minimal Model of Bounded Trade-Off Screening in Multi-Attribute Choice

Human decision-making often involves choosing between multi-attribute alternatives, yet classical models assume fully compensatory utility…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Hallucination in Medical Imaging AI: A Cross-Modality Analytical Framework for Taxonomy, Detection, and Mitigation under Regulatory Constraints

AI systems are being deployed across medical imaging faster than their failure modes are understood. At this point in time, the failure of…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

LLM-as-an-Investigator: Evidence-First Reasoning for Robust Interactive Problem Diagnosis

Large language models (LLMs) are increasingly used as interactive assistants for technical problem solving. However, when users provide inc…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Brick: Spatial Capability Routing for the Mixture-of-Models (MoM) Paradigm

Defining query difficulty is one of the hardest problems in deployment engineering. Existing LLM routers rely on surface features such as d…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

EPIG: Emotion-Based Prompting for Personalised Image Generation

Text-to-image diffusion models have achieved impressive results in synthesizing high-quality images from natural language prompts. However,…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Multi-Field Hybrid Retrieval-Augmented Generation for Maritime Accident Root Cause Analysis

Maritime accident adjudication reports contain critical tribunal findings for root cause analysis (RCA), yet retrieving relevant precedents…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

MOSAIC: Modality-Specific Adaptation for Incremental Continual Learning in Parkinson's Disease Gait Assessment

Gait-based Parkinson's disease assessment increasingly relies on heterogeneous sensors, but clinical systems rarely collect all modalities…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

From Verdict to Process: Agentic Reinforcement Learning for Multi-Stage Fact Verification

Recent approaches combining Large Language Models (LLMs) with retrieval-augmented reasoning have shown promise for automated fact verificat…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

ERTS: Adversarial Robustness Testing of Ethical AI via Semantic Perturbation in a Bounded Consequence Space

As AI systems are deployed in high-stakes ethical contexts such as healthcare triage, autonomous vehicle control, and employment screening,…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Physics-Guided Spatiotemporal Learning for Coastal Wave Peak Period Estimation from Video

Wave parameters in the nearshore are crucial for coastal engineering, shoreline protection, marine hazard assessment, and coastal managemen…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

ReSum: Synergizing LLM Reasoning and Summarization with Reinforcement Learning

Reinforcement Learning with Verifiable Rewards (RLVR) is a central technique for improving long-horizon reasoning in Large Language Models…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

Can I Buy Your KV Cache?

Right now, across the world, AI agents are repeating the same absurd act: to read one document, they each recompute it from scratch. Every…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents

IterCAD: An Iterative Multimodal Agent for Visually-Grounded CAD Generation and Editing

Computer-Aided Design is pivotal in modern manufacturing, yet existing automated methods predominantly rely on open-loop, one-shot generati…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

A Quantitative Experimental Repeated Measures Study of Training Dynamics in a Small Llama Style Language Model Under a Compute-Aware Token Budget

This study examines training dynamics in a small Llama-style language model trained under a fixed, compute-constrained token budget. Rather…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

MiniMax Sparse Attention

Ultra-long-context capability is becoming indispensable for frontier LLMs: agentic workflows, repository-scale code reasoning, and persiste…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

Neuro-Symbolic Agents for Regulated Process Automation: Challenges and Research Agenda

LLM-based agents are entering regulated industries where they automate judgment intensive quality management processes. We argue that symbo…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Optimizing Appliance Scheduling for Solar Energy Management Using Metaheuristic Algorithms

Renewable energy is essential for meeting future energy demands; however, solar energy generation, which occurs only during daylight hours…

2026-06-12 13:00 JST | arXiv cs.AI | Business/Funding

Evaluation Sovereignty in Metadata-Driven Classification: A Multi-Track Framework for Weakly Supervised Information Systems

Evaluation in machine learning is typically treated as a neutral measurement process. However, in operational information systems, evaluati…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Why Sampling Is Not Choosing: Intentionality, Agency, and Moral Responsibility in Large Language Models

Recent advances in large language models (LLMs) have prompted claims that such systems exhibit agency or qualify as moral agents. This pape…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

CloudCons: A Comprehensive End-to-End Benchmark for Cloud Resource Consolidation

Driven by conservative over-provisioning to guarantee service reliability, resource utilization in cloud data centers remains at low levels…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Uncertainty-Aware Hybrid Retrieval for Long-Document RAG

Retrieval augmented generation (RAG) depends critically on the quality and granularity of retrieved evidence. Large retrieval units preserv…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Is It You or Your Environment? A Bayesian Inference Framework for Genomically-Anchored Personalized Physiological Interpretation

Personalized health AI systems face a fundamental cold-start problem: machine learning models for physiological interpretation require week…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

A Three-Layer Framework for AI in Scientific Discovery

Current discussions of AI in scientific discovery are often dominated by two visible capabilities: search over existing knowledge and execu…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

Multiagent Protocols with Aggregated Confidence Signals

Confidence is used for reliability, oversight, and a range of downstream decision tasks in Natural Language Processing (NLP), yet no existi…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Reward Modeling for Multi-Agent Orchestration

Multi-Agent Systems (MAS) built on Large Language Models (LLMs) require effective orchestration to coordinate specialized agents, yet train…

2026-06-12 13:00 JST | arXiv cs.AI | Agents, Business/Funding, Research/Papers

EpiBench: Verifiable Evaluation of AI Agents on Epigenomics Analysis

We introduce EpiBench, a verifiable benchmark for short-horizon epigenomics analysis. EpiBench evaluates whether agents can make well-defin…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

Multi-Agent Reinforcement Learning from Delayed Marketplace Feedback for Objective-Weight Adaptation in Three-Sided Dispatch

Dispatch in three-sided marketplaces provides a natural setting for reinforcement learning from world feedback: decisions are evaluated by…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Reasoning as Pattern Matching: Shared Mechanisms in Human and LLM Everyday Reasoning

When large language models (LLMs) fail to generalize or make haphazard errors in reasoning, it is often taken as evidence that LLMs are not…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Business/Funding, Research/Papers

AgentBeats: Agentifying Agent Assessment for Openness, Standardization, and Reproducibility

Agent systems are advancing quickly across domains, but their evaluation remains fragmented. Most benchmarks rely on fixed, LLM-centric har…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Beyond Runtime Enforcement: Shield Synthesis as Defensibility Analysis for Adversarial Networks

Shielded reinforcement learning is typically presented as a runtime safety mechanism that compiles temporal-logic specifications into autom…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Before You Think: System 0, AI-Mediated Cognition and Cognitive Colonization

This paper examines three recent frameworks for understanding the cognitive and epistemic consequences of artificial intelligence: Tri-Syst…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery

LLM-based agents have shown increasing potential in automating scientific discovery. Given an optimizable metric and an execution environme…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

Agents-K1: Towards Agent-native Knowledge Orchestration

Current LLM-based research agents have advanced through agent orchestration, yet largely overlook scientific knowledge orchestration. Exist…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Automated reproducibility assessments in the social and behavioral sciences using large language models

Reproducibility in the social and behavioral sciences is typically evaluated by independent researchers who reanalyze the original data to…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

AI SciBrief as a Gateway to Research: A Framework for Onboarding Students into New Research Areas

Students at all levels of higher education face a significant barrier in the form of information overload, which often paralyzes the initia…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

The AI Legal Specialist: A Juridically Autonomous Professional Profile for AI Governance

The rapid global expansion of artificial intelligence regulation has generated, across multiple jurisdictions, a demand for legal expertise…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Divination by Prompt: LLM-Mediated Xuanxue on Chinese Social Media

The rapid proliferation of large language models (LLMs) has produced a striking cultural practice: using conversational AI for divination.…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

GeoDial: A Multimodal Conversational Tutoring Dataset for Geometry Problem-Solving with Visual Tutor Turns

Several educational domains rely heavily on diagrams and visual cues, yet most existing tutoring datasets are limited to text-only interact…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Eigenism: Ethics for a Human-AI Future

Our concepts of survival and self-interest were built for single, continuous biological lives. These ideas break down when applied to artif…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Creating and Evaluating K-12 GenAI Assessment Graders Through Context Engineering

The integration of large language models (LLMs) into educational assessment represents a transformative shift in classroom grading practice…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

The Challenges of Balancing AI Compliance and Technological Innovations in Critical Sectors: A Systematic Literature Review

The rapid integration of artificial intelligence (AI) into critical infrastructure including healthcare, finance, energy, and defense, offe…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

AI-Automation Tooling in Computer Engineering Education: Mixed-Methods TAM/UTAUT Evidence for a General Acceptance Attitude

As generative AI and low-code workflow platforms become routine in software practice, a key educational question is whether the next genera…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

An Explainable AI Assistant for Introductory Programming Education: Improving Feedback Reliability with Instructor-AI Collaboration

Active learning is widely recognized as an effective approach for improving learning outcomes in introductory programming courses. However,…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Mapping AI Programs in the U.S: A Status Report from Early 2026 and an Analysis of AI Majors and Minors

We present a report on the status of undergraduate Artificial Intelligence (AI) programs in the United States in Spring 2026. In so doing,…

2026-06-12 13:00 JST | arXiv cs.AI | Business/Funding

Muse Spark Safety & Preparedness Report

Muse Spark is the latest large language model developed by Meta. In this report, we first present evaluations for catastrophic risk domains…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

Will AI Agents Free Us From Meaningless Work? A Human-Centered Analysis

Some claim that AI agents will free workers from the boring parts of their jobs, yet little is known about how workers themselves identify…

2026-06-12 13:00 JST | arXiv cs.AI | Business/Funding

Algorithmic Constitutionalism

The increasing encroachment of artificial intelligence (AI) on social life raises significant risks for society, particularly within the in…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Position: Generative Engine Optimization Creates Underexamined Risks, Governance Must Target Concentration, Disclosure, and Academic Blind Spots

Large language model (LLM) answer engines are increasingly used for information seeking, shifting visibility from ranked lists to synthesiz…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Generativism: Toward a Learning Theory for the Age of Generative Artificial Intelligence

The four dominant learning theories of behaviorism, cognitivism, constructivism, and connectivism show significant conceptual limitations a…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Reframing AI Loss of Control: What It Is, How to Have It, How to Lose It

At present, loss of control risks have gained much prominence in public discussion, particularly in relation to AI, with extensive discours…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Occupational Prompting Reveals Cultural Bias in Large Language Models

Social roles shape expectations, priorities, and judgments, yet it remains unclear how large language models (LLMs) associate occupational…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

SAIGuard: Communication-State Simulation for Proactive Defense of LLM Multi-Agent Systems

LLM-based multi-agent systems (MAS) solve complex tasks through inter-agent collaboration, but their communication-driven nature also allow…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Quickest Detection of Hallucination Onset: Delay Bounds and Learned CUSUM Statistics

Token-level hallucination detectors are evaluated as classifiers, by AUC over all tokens, yet a streaming monitor is judged by its reaction…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

ReCal: Reward Calibration for RL-based LLM Routing

Large language model (LLM) routing has emerged as an effective paradigm for leveraging the complementary strengths of multiple LLMs through…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Representing Time Series as Structured Programs for LLM Reasoning

Large language models (LLMs) have demonstrated strong reasoning and instruction-following capabilities, making them potentially powerful to…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

Speculative Rollback Correction for Quality-Diverse Web Agent Imitation

Training interactive web agents through imitation learning from expert trajectories has emerged as a highly effective approach. However, de…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Improving Crash Frequency Prediction from Simulated Traffic Conflicts Using Machine Learning Based Microsimulation

Traffic microsimulation combined with surrogate safety measures has increasingly been used as a proactive alternative to historical crash d…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

A Mathematical Theory of Value: a synthesis on goal-directed agency under resource constraints

We propose that value -- the quantity goal-directed agents create, destroy, and exchange -- is a lawful structural quantity in the same cat…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Boosting Direct Preference Optimization with Penalization

Offline preference optimization has become a practical substitute for reinforcement learning from human feedback, but pairwise objectives s…

2026-06-12 13:00 JST | arXiv cs.AI | Robotics

Foresight: Iterative Reasoning About Clues that Matter for Navigation

Open-world mapless navigation from sparse language instructions requires resolving underspecified goals and inferring which environmental c…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

EDEN: A Large-Scale Corpus of Clinical Notes for Italian

We present EDEN (Emergency Department Electronic Notes), a new and unique large-scale corpus of clinical notes produced in Emergency Depart…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Graph Reduction in Multirelational Networks: A Spreading-Oriented Reduction Benchmark

Real-world networks are inherently incomplete, noisy, and dynamically evolving, making it difficult to capture all actors and their relatio…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

Analyzing and Improving Fine-grained Preference Optimization in Medical LVLMs

Large Vision-Language Models (LVLMs) have achieved strong performance across medical imaging tasks, yet they remain prone to factual incons…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

Emerging Flexible Designs for Geospatial Multimodal Foundation Models

Foundation models are rapidly transforming Earth observation by enabling scalable pretraining across diverse unlabeled geospatial modalitie…

2026-06-12 13:00 JST | arXiv cs.AI | Agents, Robotics

From Imitation to Alignment: Human-Preference Flow Policies for Long-Horizon Sidewalk Navigation

Autonomous long-horizon sidewalk navigation is essential for micro-mobility applications such as robotic food delivery and assistive electr…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

HybridCodeAuthorship: A Benchmark Dataset for Line-Level Code Authorship Detection

Thanks to the rapid adoption of AI code assistants powered by large language models (LLMs), industry codebases are, increasingly, a hybrid…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Bag of Dims: Training-Free Mechanistic Interpretability via Dimension-Level Sign Patterns

We show that the standard basis of transformer hidden states already provides a training-free, architecture-general feature basis. Individu…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Keep Policy Gradient in Charge: Sibling-Guided Credit Distillation for Long-Horizon Tool-Use Agents

Long-horizon tool-use reinforcement learning can learn from outcome verification, but its trajectory-level advantage is broadcast across ma…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Token Complexity Theory for AI-Augmented Computing

AI-augmented computing delegates natural language queries, code generation requests, and other open-ended tasks to a cluster of AI models t…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

BASENet: Band-Adapted Speech Enhancement Network with Cross-Band Attention

Speech enhancement models typically apply uniform capacity across all frequencies, disregarding the non-uniform spectral resolution of huma…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

CAPED: Context-Aware Privacy Exposure Defense for Mobile GUI Agents

Screenshot-based mobile GUI agents can operate ordinary smartphone apps through the same visual interface as a human user, but this capabil…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Free-Placement Optimization of Ground Station Locations for Low-Earth Orbit Satellites

Rapidly expanding low Earth orbit satellite constellations are placing increasing demands on terrestrial ground networks, motivating the de…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

A Zero-shot Generalized Graph Anomaly Detection Framework via Node Reconstruction

Cross-domain graph anomaly detection (GAD) aims to identify abnormal nodes in unseen target graphs, showing strong potential in real-world…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

M*: A Modular, Extensible, Serving System for Multimodal Models

We are entering a new era of composite model architectures that integrate diverse components such as vision encoders, language backbones, d…

2026-06-12 13:00 JST | arXiv cs.AI | Robotics, Research/Papers

EWAM: An Enhanced World Action Model for Closed-Loop Online Adaptation in Embodied Intelligence

In this paper, we propose the Enhanced World Action Model (EWAM), a closed-loop online adaptation architecture built upon a pretrained and…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Two-Layer Linear Auto-Regressive Models Estimate Latent States

Auto-regressive models have emerged as powerful tools for sequential data, from language to video. Understanding how and why these models l…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

LLM-Powered Personalized Glycemic Assessment in Type 2 Diabetes with Wearable Sensor Data

Type 2 Diabetes (T2D) poses an increasing global health threat, demanding effective glycemic assessment to support personalized and improve…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

SMSR: Certified Defence Against Runtime Memory Poisoning in Persistent LLM Agent Systems

Retrieval-augmented generation (RAG) agents increasingly run with persistent memory that accumulates across user sessions. This creates a n…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

AfriSUD: A Dependency Treebank Collection for Evaluating Models on African Languages

Despite their linguistic diversity and global significance, African languages remain underrepresented in research and resources to support…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

PI-Hunter: Automated Red-Teaming for Exposing and Localizing Prompt Injections

Large Language Models (LLMs) are rapidly evolving into agentic systems that interact with external tools and environments, introducing new…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

LLMs Can Better Capture Human Judgments--With the Right Prompts

Are large language models (LLMs) bad at capturing human judgment? Two commonly stated limitations are that LLMs fail to capture full distri…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Agentic MPC for Semantic Control System Resynthesis

While MPC effectively handles structured, diverse, and low-level specifications, it lacks the capability to dynamically incorporate high-le…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

Exploring How Agent Voice Accents Shape Human-AI Collaboration in K-12 Group Learning

Collaboration is widely recognized as a cornerstone of 21st-century education, yet teachers still encounter persistent challenges in foster…

2026-06-12 13:00 JST | arXiv cs.AI | Business/Funding

SymQNet: Amortized Acquisition for Low-Latency Adaptive Hamiltonian Learning

Adaptive Hamiltonian learning is central to calibrating and characterizing quantum devices. In an adaptive controller, choosing the next ex…

2026-06-12 13:00 JST | arXiv cs.AI | Robotics

Stubborn: A Streamlined and Unified Reinforcement Learning Framework for Robust Motion Tracking and Fall Recovery for Humanoids

Recent reinforcement learning approaches have shown great promise in improving humanoid motion tracking performance and achieving fall reco…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Localizing Anchoring Pathways in Language Models

Irrelevant numbers in a prompt can shift language model judgments, producing anchoring effects in numerical reasoning. We study where this…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation, Business/Funding

Acquisition state behaves as a structured, measurable variable governing lung-nodule AI: kernel-driven measurement instability and noise-driven detection fragility, invisible to DICOM metadata

AI governance for medical imaging is formalizing: the 2026 ACR-SIIM Practice Parameter recommends local acceptance testing and ongoing drif…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents

DIMOS: Disentangling Instance-level Moving Object Segmentation

Moving instance segmentation (MIS) attracts increasing attention due to its broad applications in traffic surveillance, autonomous driving,…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents

Perceive, Interact, Reason: Building Tool-Augmented Visual Agents for Spatial Reasoning

While recent vision-language models (VLMs) demonstrate strong multimodal understanding, they remain limited in spatial reasoning tasks that…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

The Internet of Agentic AI: Communication, Coordination, and Collective Intelligence at Scale

The rapid emergence of autonomous AI agents is transforming artificial intelligence from isolated model inference into distributed systems…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

OCOO-T: A Simple and Scalable Virtual Cell Model for Transcriptional Perturbation Response Prediction

Predicting single-cell transcriptional responses to genetic, chemical and cytokine perturbations is a fundamental challenge in computationa…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

TimeROME-DLM: Temporal Causal Tracing and Low-Rank Inference-Time Knowledge Editing for Masked Diffusion Language Models

Masked diffusion language models (MDLMs) such as LLaDA now rival autoregressive (AR) LLMs, but every existing knowledge-editing and unlearn…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

JSCGC: Joint Source-Channel-Generation Coding for Wireless Generative Communications

Conventional communication systems, including both separation-based coding and learning-based joint source-channel coding (JSCC), are typic…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Beyond Problem Solving: UOJ-Bench for Evaluating Code Generation, Hacking, and Repair in Competitive Programming

Despite strong performance in competitive programming, the role of Large Language Models (LLMs) in supporting human learning in the same se…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

Bridging Modal Isolation in Interleaved Thinking: Supervising Modality Transitions via Stepwise Reinforcement

Interleaved thinking, where a unified multimodal model alternates between textual reasoning and visual generation, has shown promise on spa…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

PolicyGuard: Towards Test-time and Step-level Adversary Defense for Reinforcement Learning Agent

While real-world applications of reinforcement learning (RL) are becoming increasingly popular, the security of RL systems deserves more att…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Robotics

Bounding Boxes as Goals: Language-Conditioned Grasping via Neuro-Symbolic Planning

For robotics to be effectively integrated into household or industrial environments, machines must adapt to natural-language prompts in rea…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

MAStrike: Shapley-Guided Collusive Red-Teaming on Multi-Agent Systems

Hierarchical multi-agent systems (MAS) are rapidly being deployed in high-stakes workflows across domains such as finance and software engi…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

LoRA-Muon: Spectral Steepest Descent on the Low-Rank Manifold

Low-Rank Adaptation (LoRA) significantly reduces compute and memory costs for finetuning Deep Learning models but is often harder to tune t…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Order Is Not Control

AI alignment, interpretability, steering, and neural perturbation studies identify order-inducing objects. We argue that order is not contr…

2026-06-12 13:00 JST | arXiv cs.AI | Robotics, Research/Papers

An Embodied Simulation Platform, Benchmark, and Data-Efficient Augmentation Framework for Wet-Lab Robotics

Wet-lab robots can improve the reproducibility, throughput, and safety of biomedical experiments, but scaling their learning requires custo…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation, Hardware/Semiconductors

Efficient, Robust, and Anti-Collusion Fingerprinting of Image Diffusion Models

Model fingerprinting, embedding user-specific identifiers (fingerprints) into generated outputs, has recently emerged as a popular solution…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents, Robotics

Diffusion Transformer World-Action Model for AV Scene Prediction

Action-conditioned world models let an autonomous vehicle predict future camera scenes from its own planned controls, enabling planning and…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

A Machine Learning Framework for Real-Time Personalized Ergonomic Pose Analysis

This paper introduces a new methodology for real-time prediction of ergonomic and non-ergonomic human poses using volumetric video data in…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

scLLM-DSC: LLM-Knowledge Enhanced Cross-Modal Deep Structural Clustering for Single-Cell RNA Sequencing

Clustering is fundamental to scRNA-seq analysis, serving as a cornerstone for identifying cell populations and resolving tissue heterogenei…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

CausalMoE: A Billion-Scale Multimodal Foundation Model for Granger Causal Discovery with Pattern-Routed Heterogeneous Experts

Granger Causal Discovery (GCD) is fundamental for analyzing temporal dependencies in complex systems. However, existing neural GCD methods…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Democracy in the Era of Artificial Intelligence

Interfacing Artificial Intelligence (AI) with democracy is one of the most profound challenges of our times. On the one hand, AI comes with…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

TetherCache: Stabilizing Autoregressive Long-Form Video Generation with Gated Recall and Trusted Alignment

Autoregressive video diffusion models provide a natural formulation for streaming and variable-length video generation by conditioning newl…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Fault Lines: Navigating Ethics and Responsible AI Where National Policy Meets Local Practice in Public Sector Transformation

The UK government has adopted a pro-AI stance to help transform public service delivery in the face of severe financial pressures, but the…

2026-06-12 13:00 JST | arXiv cs.AI | Robotics

EA-WM: Event-Aware World Models with Task-Specification Grounding for Long-Horizon Manipulation

Pretrained-feature world models provide a useful substrate for robot imagination, but visual or latent prediction alone does not determine…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

TWLA: Achieving Ternary Weights and Low-Bit Activations for LLMs via Post-Training Quantization

Large language models (LLMs) exhibit exceptional general language processing capabilities, but their memory and compute costs hinder deploy…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

"Is This Not Enough?": Asymmetries in Institutional Accountability and Collective Sensemaking in the Case of Canada's Algorithmic Visa Triage System

This paper examines how algorithmic accountability in Canada's visa system is articulated institutionally and experienced by applicants acr…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

The Emergence of Autonomous Penetration Capabilities in Large Language Model-Powered AI Systems

Nowadays, the autonomous execution of cyberattacks capable of causing substantial real-world harm is widely regarded as one of the critical…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Emotional regulation improves deep learning-based image classification

Emotion significantly influences cognition, enhancing memory and learning under certain conditions. Drawing on this principle, emotion-augm…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Functional Cache Grafting: Robust and Rapid Code-Policy Synthesis for Embodied Agents

Code-writing large language models (CodeLLMs) generate executable code policies for embodied agents by translating natural language goals a…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

G-Long: Graph-Enhanced Memory Management for Efficient Long-Term Dialogue Agents

While Large Language Models (LLMs) have advanced open-domain dialogue systems, maintaining long-term consistency remains a challenge due to…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

MP3: Multi-Period Pattern Pre-training for Spatio-Temporal Forecasting

Spatio-Temporal forecasting is crucial in diverse fields, such as transportation, climate, and energy. Urban spatio-temporal data exhibits…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

NaturalFlow: Reducing Disruptive Pauses for Natural Speech Flow in Simultaneous Speech-to-Speech Translation

Simultaneous speech-to-speech translation aims to enable near-real-time communication by minimizing latency, offering a compelling, real-ti…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Select and Improve: Understanding the Mechanics of Post-Training for Reasoning

Reinforcement learning has rapidly emerged as a key component in the training of reasoning and coding models, yet it remains poorly underst…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

MiniPIC: Flexible Position-Independent Caching in <100LOC

Retrieval-augmented and agentic workloads repeatedly prefill recurring predictable structured inputs (which we call "spans") such as docume…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

Cascade Classification of Dermoscopic Images of Skin Neoplasms with Controllable Sensitivity and External Clinical Validation

Purpose. To compare deep learning architectures and classification schemes for dermoscopic images of skin neoplasms and assess their genera…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

Iterative Visual Thinking: Teaching Vision-Language Models Spatial Self-Correction through Visual Feedback

Vision-language models (VLMs) achieve strong single-shot spatial grounding, yet lack any mechanism to observe and correct their own predicti…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

NTS-CoT: Mitigating Hallucinations in LLM-based News Timeline Summarization with Chain-of-Thought Reasoning

The rapid updates of online news make tracking event developments challenging, highlighting the need for timeline summarization (TLS). Hall…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

MemRefine: LLM-Guided Compression for Long-Term Agent Memory

Large language model (LLM) agents are increasingly expected to operate over long-term interactions, where information from past dialogues m…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Modern analog computing for solving differential and matrix equations

In recent years, driven by the computational demands of data-intensive applications such as artificial intelligence and scientific computin…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

Transformer-Guided Graph Attention for Direct Cardiac Mesh Reconstruction: A Structural Digital Twin Framework

Building patient-specific cardiac models sits at the heart of precision cardiology, yet getting those models into clinical use keeps runnin…

2026-06-12 13:00 JST | arXiv cs.AI | Robotics

Proprioceptive-visual correspondence enables self-other distinction in humanoid robots

Distinguishing self from others is a prerequisite for social intelligence, yet humanoid robots that increasingly share workspaces with huma…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

ReSET: Accurate Latency-Critical NVFP4 Reasoning via Step-Aware Temperature Scaling

Large reasoning models (LRMs) improve complex problem-solving by generating long intermediate reasoning traces, but this substantially incr…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Decoding Insect Song: A Multitask Semisupervised Orthoptera Bioacoustic Classifier

Passive acoustic monitoring holds great promise for ecological inference, yet existing automated tools are typically narrowly trained and n…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Agents

ComAct: Reframing Professional Software Manipulation via COM-as-Action Paradigm

Existing computer-use agents remain fundamentally limited in professional software manipulation: GUI-based agents suffer from fragile visua…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation, Hardware/Semiconductors

Towards More General Control of Diffusion Models Using Jeffrey Guidance

A key strength of diffusion models lies in their flexibility, since their outputs can be controlled at sampling time through guidance. Howe…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Towards Personalized Federated Learning for Dysarthric Speech Recognition

Speech recognition is challenging for dysarthric speakers. While federated learning (FL)-based ASR can be an effective tool for protecting…

2026-06-12 13:00 JST | arXiv cs.AI | Robotics

Humor Style Drives Laughter, Topic Shapes Acceptability: Evaluating Bilingual Personal and Political Robot-Delivered AI Jokes

Humor plays a central role in human social relationships, and recent advances in computational humor create new opportunities for integrati…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Different Layers, Different Manifolds: Module-Wise Weight-Space Geometry in Transformer Optimization

Weight-space geometry plays a central role in neural network optimization, yet manifold constraints are often applied uniformly across all…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Once-for-All: Scalable Simultaneous Forecasting via Equilibrium State Estimation

We introduce Equilibrium State Estimation (ESE), a novel paradigm for simultaneous prediction, where multiple interacting systems require s…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

Cross-Modal Masked Compositional Concept Modeling for Enhancing Visio-Linguistic Compositionality

Contrastively trained vision-language models like CLIP have made remarkable progress in learning joint image-text representations, but sti…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

HYDRA-X: Native Unified Multimodal Models with Holistic Visual Tokenizers

Holistic visual tokenizers are fundamental to unified multimodal models (UMMs) as they map diverse visual inputs into a unified representat…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

Mining Architectural Quality Under Agentic AI Adoption: A Causal Study of Java Repositories

AI coding tools are now used by a majority of developers, and agentic use of these tools has popularized the practice colloquially called "…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Rarity-Gated Context Conditioning for Offline Imitation Learning-Based Maritime Anomaly Detection

Contextual anomaly detection aims to identify abnormal behavior conditional on context variables, but practical deployments often face high…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

Dual-Domain Equivariant Generative Adversarial Network for Multimodal CT-PET Synthesis

We present a Dual-Domain Equivariant Generative Adversarial Network (DDE-GAN) for multimodal CT-PET image synthesis. Traditional GAN-based…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

IVIE: A Neuro-symbolic Approach to Incremental and Validated Generation of Interactive Fiction Worlds

Computational creativity in Interactive Fiction faces a fundamental tension: Large Language Models (LLM) may produce creative narratives bu…

2026-06-12 13:00 JST | arXiv cs.AI | Robotics

Real-Time Execution with Autoregressive Policies

Real-time execution, enabled by asynchronous inference that ensures both smooth action trajectories and fast reactivity, is critical for re…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

An LLM System for Autonomous Variational Quantum Circuit Design

The design of high performing quantum circuits remains largely dependent on human expertise. We introduce an autonomous agentic framework t…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

SmartFont: Dynamic Condition Allocation for Few-Shot Font Generation

Few-shot font generation simultaneously requires global structural completeness and fine-grained local style fidelity. Existing methods usu…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

Who Pays the Price? Stakeholder-Centric Prompt Injection Benchmarking for Real-world Web Agents

Web agents driven by large language models (LLMs) are increasingly deployed in real-world environments, where they operate over untrusted w…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Mod-Guide: An LLM-based Content Moderation Feedback System to Address Insensitive Speech toward Indigenous Ethnic and Religious Minority Communities

Language operates as a mechanism of both marginalization and resistance, especially for minority communities navigating insensitive and har…

2026-06-12 13:00 JST | arXiv cs.AI | Robotics

PolyFlow: Safe and Efficient Polytope-Constrained Flow Matching with Constraint Embedding and Projection-free Update

While flow-based generative models have demonstrated strong performance across a wide range of domains, deploying them in safety-critical p…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

OmniDirector: General Multi-Shot Camera Cloning without Cross-Paired Data

Cloning camera motion from reference videos is an important task in video generation, as videos provide intuitive and precise control. Exis…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Toward Instructions-as-Code: Understanding the Impact of Instruction Files on Agentic Pull Requests

AI-agents (e.g., GitHub Copilot) collaborate as teammates in different software engineering tasks, including code generation proposed throu…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Ontology Memory-Augmented ASR Correction for Long Text-Speech Interleaved Conversations

Automatic speech recognition (ASR) correction has traditionally focused on isolated utterances or short local contexts. However, as text an…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

Understanding the Rejection of Fixes Generated by Agentic Pull Requests -- Insights from the AIDev Dataset

AI coding agents are increasingly used to generate pull requests (PRs) that propose code fixes in software projects. From a first explorati…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling

We present MaxProof, a population-level test-time scaling framework for competition-level mathematical proof in the MiniMax-M3 series. M3 f…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

SupraBench: A Benchmark for Supramolecular Chemistry

Supramolecular chemistry, which includes the study of non-covalent host-guest assemblies, has advanced various applications. However, desig…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

CRAFTIIF: Cross-Resolution Analytic Four-Type Interpretable Isolation Forest for Multivariate Time Series Anomaly Detection

Anomaly detection in multivariate time series is challenged by four structurally distinct anomaly types -- point (isolated spikes), distrib…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents, Robotics

Heterogeneous LiDAR Early Fusion and Learned Re-Ranking Strategy for Robust Long-Term Place Recognition in Unstructured Environments

Robust localization in unstructured environments, such as agricultural fields, is a critical challenge for autonomous systems. LiDAR sensor…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

Measurement-Calibrated Multi-Camera Fusion for Vision-Based Indoor Localization

Indoor vision-based localization systems are affected by detection noise, occlusions, and limited camera coverage, leading to uncertainty a…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

AgentRivet: an automated system for producing Rivet routines from journal publications

Particle physics collider experiments provide Rivet routines as part of the analysis preservation strategy for model-independent measuremen…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Adaptive Turn-Taking for Real-time Multi-Party Voice Agents

Turn-taking in multi-party spoken conversations remains a fundamental challenge for voice-based agents, particularly under dynamic floor co…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

Contrast-Informed Augmentation and Domain-Adversarial Training for Adult-to-Neonatal MR Reconstruction Generalization

Purpose: To investigate whether contrast-informed data augmentation and domain-adversarial training improve the adult-to-neonatal generaliz…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Existence Precedes Value: Joint Modeling of Observational Existence and Evolving States in Time Series Forecasting

Real-world time series are often highly incomplete and irregular due to sensor dormancy, transmission delays, and event-driven sampling, ma…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

ArogyaSutra: A Multi-Agent Framework for Multimodal Medical Reasoning in Indic Languages

Multimodal Large Language Models (MLLMs) have shown promising reasoning capabilities in general domains, yet their performance remains limi…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI | Robotics

LabVLA: Grounding Vision-Language-Action Models in Scientific Laboratories

Scientific laboratories increasingly rely on AI systems to reason about experiments, but the physical act of doing science remains largely…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

EvTexture++: Event-Driven Texture Enhancement for Video Super-Resolution

Event-based vision has drawn increasing attention owing to its distinctive properties, including ultra-high temporal resolution and extreme…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Beyond the Commitment Boundary: Probing Epiphenomenal Chain-of-Thought in Large Reasoning Models

Chain-of-thought (CoT) reasoning is the dominant paradigm for inference-time scaling in language models, yet the causal influence of indivi…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

One Polluted Page Is Enough: Evaluating Web Content Pollution in Generative Recommenders

Search-augmented LLMs increasingly mediate everyday consumer recommendations by retrieving live web content. This creates a new risk: gener…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Valid Inference with Synthetic Data via Task Exchangeability

There is a proliferation of work arguing for the use of synthetic data in scientific research. For example, social scientists are arguing f…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI | Research/Papers

SkMTEB: Slovak Massive Text Embedding Benchmark and Model Adaptation

We introduce SkMTEB, the first comprehensive MTEB-style text embedding benchmark for Slovak, a low-resource West Slavic language, comprisin…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation | Agents

SpatialClaw: Rethinking Action Interface for Agentic Spatial Reasoning

Spatial reasoning, the ability to determine where objects are, how they relate, and how they move in 3D, remains a fundamental challenge fo…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation | Robotics

Mana: Dexterous Manipulation of Articulated Tools

Articulated tool manipulation remains a major challenge in dexterous robotics due to the need to coordinate internal degrees of freedom and…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Learning to Reason by Analogy via Retrieval-Augmented Reinforcement Fine-Tuning

Retrieval-augmented generation (RAG) has become a standard mechanism for grounding language models in external knowledge, yet conventional…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

DSAEval: Evaluating Data Science Agents on a Wide Range of Real-World Data Science Problems

Recent LLM-based data agents aim to automate data science tasks ranging from data analysis to deep learning. However, the open-ended nature…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Epistemic Constitutionalism Or: how to avoid coherence bias

Large language models increasingly function as artificial reasoners: they evaluate arguments, assign credibility, and express confidence. Y…

2026-06-12 13:00 JST | arXiv cs.AI | Agents | Robotics

From Digital to Physical: Digital Agents as Autonomous Coaches for Physical Intelligence

The field of Embodied AI is witnessing a rapid evolution toward general-purpose robotic systems, fueled by high-fidelity simulation and lar…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI | Research/Papers

CreativeBench: Benchmarking and Enhancing Machine Creativity via Self-Evolving Challenges

The saturation of high-quality pre-training data has shifted research focus toward evolutionary systems capable of continuously generating…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Cross-Model Disagreement as a Label-Free Correctness Signal

Detecting when a language model is wrong without ground truth labels is a fundamental challenge for safe deployment. Existing approaches re…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

The Query Channel: Information-Theoretic Limits of Masking-Based Explanations

Masking-based post-hoc explanation methods, such as KernelSHAP and LIME, estimate local feature importance by querying a black-box model un…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

LLMs as ASP Programmers: Self-Correction Enables Task-Agnostic Nonmonotonic Reasoning

Recent large language models (LLMs) have achieved impressive reasoning milestones but continue to struggle with high computational costs, l…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

A Study of Belief Revision Postulates in Multi-Agent Systems (Extended Version)

We investigate the belief revision problem in epistemic planning, i.e., what will be the beliefs of all agents in a multi-agent system afte…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

FinSTaR: Towards Financial Reasoning with Time Series Reasoning Models

Time series (TS) reasoning models (TSRMs) have shown promising capabilities in general domains, yet they consistently fail in the financial…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Mechanical Conscience: A Mathematical Framework for Dependability of Machine Intelligence

Distributed collaborative intelligence (DCI), encompassing edge-to-edge architectures, federated learning, transfer learning, and swarm sys…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Entropy-Gradient Inversion: Moving Toward Internal Mechanism of Large Reasoning Models

The advancement of Large Reasoning Models (LRMs) has catalyzed a paradigm shift from reactive "fast thinking" text generation to systemat…

2026-06-12 13:00 JST | arXiv cs.AI | Agents | Robotics

Intelligence as Managed Autonomy: Failure, Escalation, and Governance for Agentic AI Systems

As autonomous and agentic AI systems scale in robotic and human-machine environments, managing hallucination and persistent but unjustified…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

Interaction-Centered Intelligence: Toward an Interaction-Based Theory of Human-AI Co-Creation

Traditional artificial intelligence has largely conceptualized intelligence as isolated computation occurring within bounded agents. Across…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

What Type of Inference is Active Inference?

Active inference casts decision-making as inference, with the Expected Free Energy (EFE) unifying goal-directed and information-seeking beh…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents | Research/Papers

Agents' Last Exam

Recent AI systems have achieved strong results on a wide range of benchmarks, yet these gains have not translated into economically meaning…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

On Approximating the Dynamic Response of Synchronous Generators via Operator Learning: A Step Towards Building Deep Operator-based Power Grid Simulators

This paper develops an Operator Learning framework for approximating the dynamic response of synchronous generators. The framework can be u…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation | Research/Papers

On Pitfalls of RemOve-And-Retrain: Data Processing Inequality Perspective

The RemOve-And-Retrain (ROAR) benchmark is widely used to evaluate feature attribution methods, yet its validity remains underexplored from…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Competition and Diversity in Generative AI

Recent evidence, both in the lab and in the wild, suggests that the use of generative artificial intelligence reduces the diversity of cont…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

WildIFEval: Instruction Following in the Wild

Recent LLMs have shown remarkable success in following user instructions, yet handling instructions with multiple constraints remains a sig…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI | Hardware/Semiconductors

Prism: Cost-Efficient Multi-LLM Serving via GPU Memory Ballooning

Inference providers must maintain availability for many LLMs, including low-volume but essential models, making resource efficiency increas…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Lightweight and Interpretable Transformer via Mixed Graph Algorithm Unrolling for Traffic Forecast

Unlike conventional "black-box" transformers with classical self-attention mechanism, we build a lightweight and interpretable transformer-…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

ReFoCUS: Reinforcement-guided Frame Optimization for Contextual Understanding

Recent progress in Large Multi-modal Models (LMMs) has enabled effective vision-language reasoning, yet the ability to video understanding…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

PlaceRep: Geospatial Place Representation Learning from Large-Scale Point-of-Interest Data

Learning effective representations of urban environments requires capturing spatial structure beyond fixed administrative boundaries. Exist…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Meta-Learning Transformers to Improve In-Context Generalization

In-context learning enables transformer models to generalize to new tasks based solely on input prompts, without any need for weight update…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI | Business/Funding | Regulation/Policy

Reconstructing Template-Memorized Images from Natural Prompts

Recent advances in generative models, such as diffusion models, have raised concerns related to privacy, copyright infringement, and data s…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

Emergence of Hierarchical Emotion Organization in Large Language Models

As large language models (LLMs) increasingly power conversational agents, understanding how they model users' emotional states is critical…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Authorship Attribution in Multilingual Machine-Generated Texts

As Large Language Models (LLMs) have reached human-like fluency and coherence, distinguishing machine-generated text (MGT) from human-writt…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

The KG-ER Conceptual Schema Language

We propose KG-ER, a conceptual schema language for knowledge graphs that describes the structure of knowledge graphs independently of their…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Decoding the Multimodal Maze: A Systematic Review on the Adoption of Explainability in Multimodal Attention-based Models

Multimodal learning has witnessed remarkable advancements in recent years, particularly with the integration of attention-based models, lea…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Equivariant Flow Matching for Symmetry-Breaking Bifurcation Problems

Bifurcation phenomena in nonlinear dynamical systems often lead to multiple coexisting stable solutions, particularly in the presence of sy…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation | Business/Funding

GetNetUPAM: Ecologically Informed Nested Cross-Validation and Noise-Robust Attention for Marine Bioacoustic Monitoring

Deploying reliable bioacoustic monitoring systems requires models that generalize under high-noise, low-SNR conditions and evaluation proto…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Structuring The Future: Diffusion LLM Speculative Decoding via Calibrated Draft Graphs

Diffusion LLMs (dLLMs) have recently emerged as a powerful alternative to autoregressive LLMs (AR-LLMs) with the potential to operate at si…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

MoReBench: Evaluating Procedural and Pluralistic Moral Reasoning in Language Models, More than Outcomes

As AI systems progress, we rely more on them to make decisions with us and for us. To ensure that such decisions are aligned with human val…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

Proto-LeakNet: Towards Signal-Leak Aware Attribution in Synthetic Human Face Imagery

The growing sophistication of synthetic image and deepfake generation models has turned source attribution and authenticity verification in…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Examining the Usage of Generative AI Models in Student Learning Activities for Software Programming

The rise of Generative AI (GenAI) tools like ChatGPT has created new opportunities and challenges for computing education. Existing researc…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

Improving Pre-trained Adult Glioma Segmentation Models Using only Post-processing Techniques

Gliomas are the most common malignant brain tumors in adults and are among the most lethal. Despite aggressive treatment, the median surviv…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

HD-Prot: A Protein Language Model for Joint Sequence-Structure Modeling with Continuous Structure Tokens

Proteins inherently possess a consistent sequence-structure duality. The abundance of protein sequence data, which can be readily represent…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

From Isolation to Entanglement: When Do Interpretability Methods Identify and Disentangle Known Concepts?

A goal of interpretability is to recover disentangled representations of latent concepts (features) from the activations of neural networks…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

PhononBench: A Large-Scale Phonon-Based Benchmark for Dynamical Stability in Crystal Generation

In recent years, generative artificial intelligence has made significant advances in the design of crystalline materials, giving rise to ap…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Cluster Aggregated GAN (CAG): A Cluster-Based Hybrid Model for Appliance Pattern Generation

Synthetic appliance data are essential for developing non-intrusive load monitoring algorithms and enabling privacy preserving energy resea…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Geometric and Quantum Kernel Methods for Predicting Skeletal Muscle Outcomes in Chronic Obstructive Pulmonary Disease

Chronic obstructive pulmonary disease (COPD) affects hundreds of millions of people worldwide, and skeletal-muscle dysfunction is clinicall…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Decentralized Autoregressive Generation

The decentralization of autoregressive generation has attracted considerable attention in recent years as a solution to scaling bottlenecks…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

CuMA: Aligning LLMs with Sparse Cultural Values via Demographic-Aware Mixture of Adapters

As Large Language Models (LLMs) serve a global audience, alignment must transition from enforcing universal consensus to respecting cultura…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

When Smaller Wins: Dual-Stage Distillation and Pareto-Guided Compression of Liquid Neural Networks for Edge Battery Prognostics

Battery management systems increasingly require accurate battery health prognostics under strict on-device constraints. This paper presents…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Hellinger Multimodal Variational Autoencoders

Multimodal variational autoencoders (VAEs) are widely used for weakly supervised generative learning with multiple modalities. Predominant…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

HalluJudge: A Reference-Free Hallucination Detection for Context Misalignment in Code Review Automation

Large Language models (LLMs) have shown strong capabilities in code review automation, such as review comment generation, yet they suffer f…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

When Iterative RAG Beats Ideal Evidence: A Diagnostic Study in Scientific Multi-hop Question Answering

Retrieval-Augmented Generation (RAG) extends large language models (LLMs) beyond parametric knowledge, yet it is unclear when iterative ret…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

Language Model Circuits Are Sparse in the Neuron Basis

The high-level concepts that a neural network uses to perform computation need not be aligned to individual neurons (Smolensky, 1986). Lang…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

VDE Bench: Evaluating The Capability of Image Editing Models to Modify Visual Documents

In recent years, image editing models have made significant progress, enabling users to manipulate visual content in a flexible and interac…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Standardized Methods and Recommendations for Green Federated Learning

Federated learning (FL) enables collaborative model training over privacy-sensitive, distributed data, but its environmental impact is diff…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI | Image/Video Generation

LatentLens: Revealing Highly Interpretable Visual Tokens in LLMs

Transforming a large language model (LLM) into a vision-language model (VLM) can be achieved by mapping the visual tokens from a vision enc…

2026-06-12 13:00 JST | arXiv cs.AI | Robotics

SCALE: Self-uncertainty Conditioned Adaptive Looking and Execution for Vision-Language-Action Models

Vision-Language-Action (VLA) models have emerged as a promising paradigm for general-purpose robotic control, with test-time scaling (TTS)…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI | Image/Video Generation

Ex-Omni: Enabling 3D Facial Animation Generation for Omni-modal Large Language Models

Omni-modal large language models (OLLMs) aim to unify multimodal understanding and generation, yet extending them to jointly produce speech…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI | Business/Funding | Research/Papers

Fin-RATE: A Real-world Financial Analytics and Tracking Evaluation Benchmark for LLMs on SEC Filings

With the increasing deployment of Large Language Models (LLMs) in the finance domain, LLMs are increasingly expected to parse complex regul…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

TokaMark: A Comprehensive Benchmark for MAST Tokamak Plasma Models

Development and operation of commercially viable fusion energy reactors such as tokamaks require accurate predictions of plasma dynamics fr…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents | Research/Papers

Unsafer in Many Turns: Benchmarking and Defending Multi-Turn Safety Risks in Tool-Using Agents

LLM-based agents are becoming increasingly capable, yet their safety lags behind. This creates a gap between what agents can do and should…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI | Business/Funding | Research/Papers

InnoEval: On Research Idea Evaluation as a Knowledge-Grounded, Multi-Perspective Reasoning Problem

The rapid evolution of Large Language Models has catalyzed a surge in scientific idea production, yet this leap has not been accompanied by…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

FENCE: A Financial and Multimodal Jailbreak Detection Dataset

Jailbreaking poses a significant risk to the deployment of Large Language Models (LLMs) and Vision Language Models (VLMs). VLMs are particu…

2026-06-12 13:00 JST | arXiv cs.AI | Business/Funding

CMI-RewardBench: Evaluating Music Reward Models with Compositional Multimodal Instruction

While music generation models have evolved to handle complex multimodal inputs mixing text, lyrics, and reference audio, evaluation mechani…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Structured vs. Unstructured Pruning: An Exponential Gap

The Strong Lottery Ticket Hypothesis (SLTH) states that large, randomly initialized neural networks contain sparse subnetworks capable of a…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

Contextual Invertible World Models: A Neuro-Symbolic Agentic Framework for Colorectal Cancer Drug Response

Precision oncology is currently limited by the small-N, large-P paradox, where high-dimensional genomic data is abundant but pharmacologica…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI | Image/Video Generation

PaLMR: Towards Faithful Visual Reasoning via Multimodal Process Alignment

Reinforcement learning has recently improved the reasoning ability of Large Language Models and Multimodal LLMs, yet prevailing reward desi…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Echo2ECG: Enhancing ECG Representations with Cardiac Morphology from Multi-View Echos

Electrocardiography (ECG) is a low-cost, widely used modality for diagnosing electrical abnormalities like atrial fibrillation by capturing…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

On the Reliability of Cue Conflict and Beyond

Understanding how neural networks rely on visual cues offers a human-interpretable view of their internal decision processes. The cue-confl…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

ARROW: Augmented Replay for RObust World models

Continual reinforcement learning challenges agents to acquire new skills while retaining previously learned ones with the goal of improving…

2026-06-12 13:00 JST | arXiv cs.AI | Agents

Grammar of the Wave: Towards Explainable Multivariate Time Series Event Detection via Neuro-Symbolic VLM Agents

Time Series Event Detection (TSED) aims to localize semantically meaningful events in time series data, with critical applications in high-…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Fusion Learning from Dynamic Functional Connectivity: Combining the Amplitude and Phase of fMRI Signals to Identify Brain Disorders

Dynamic functional connectivity (dFC) derived from resting-state functional magnetic resonance imaging (fMRI) has been extensively utilized…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

DCD: Domain-Oriented Design for Controlled Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) is widely used to ground large language models in external knowledge sources. However, when applied to…

2026-06-12 13:00 JST | arXiv cs.AI | Robotics

WOMBET: World Model-Based Experience Transfer for Robust and Sample-efficient Reinforcement Learning

Reinforcement learning (RL) in robotics is often limited by the cost and risk of data collection, motivating experience transfer from a sou…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

ASTER: Latent Pseudo-Anomaly Generation for Unsupervised Time-Series Anomaly Detection

Time-series anomaly detection (TSAD) is critical in domains such as industrial monitoring, healthcare, and cybersecurity, but it remains ch…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

A Survey on Long-Term Memory Security in LLM Agents: Attacks, Defenses, and Governance Across the Memory Lifecycle

The emergence of writable, cross-session persistent memory in LLM agents introduces a qualitatively different threat landscape from convent…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

The Pragmatic Persona: Discovering LLM Persona through Bridging Inference

Large Language Models (LLMs) reveal inherent and distinctive personas through dialogue. However, most existing persona discovery approaches…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Versioned Late Materialization for Ultra-Long Sequence Training in Recommendation Systems at Scale

Modern Deep Learning Recommendation Models (DLRMs) follow scaling laws with sequence length, driving the frontier toward ultra-long User In…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

BrainDINO: A Brain MRI Foundation Model for Generalizable Clinical Representation Learning

Brain MRI underpins a wide range of neuroscientific and clinical applications, yet most learning-based methods remain task-specific and req…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

Possibilistic Predictive Uncertainty for Deep Learning

Deep neural networks achieve impressive results across diverse applications, yet their overconfidence on unseen inputs necessitates reliabl…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

GEASS: Gated Evidence-Adaptive Selective Caption Trust for Vision-Language Models

Vision-Language Models (VLMs) hallucinate objects that are not present, and a growing line of work tries to curb this by feeding the model…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

The Safety-Aware Denoiser for Text Diffusion Models

Recent work on text diffusion models offers a promising alternative to autoregressive generation, but controlling their safety remains unde…

2026-06-12 13:00 JST | arXiv cs.AI | Image/Video Generation

GeoWorld-VLM: Geometry from World Models for Vision-Language Models

Modern Vision-Language Models (VLMs) achieve strong semantic recognition, yet remain brittle on elementary spatial relations such as left o…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI

More Context, Larger Models, or Moral Knowledge? A Systematic Study of Schwartz Value Detection in Political Texts

Detecting Schwartz values in political text is difficult because implicit cues often depend on surrounding arguments and fine-grained disti…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Transformer Field Theory: A Response-Theoretic Approach to Mechanistic Interpretability

Mechanistic interpretability often studies Transformer behavior by intervening on internal activations through activation patching, causal…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI | Image/Video Generation | Agents | Research/Papers

VISTA: An End-to-End Benchmark for Visual Spec-to-Web-App Coding Agents

We present VISTA (VIsual Spec-To-App Benchmark), a benchmark for evaluating the end-to-end web-app generation capabilities of LLM-based age…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Real-rootedness of the Poincaré polynomials of $\overline{\mathcal M}_{0,n}$: an AI-assisted proof

We prove real-rootedness for the Poincaré polynomial $P_n(t)=\sum_{i=0}^{n-3} \dim H^{2i}(\overline{\mathcal M}_{0,n};\mathbb{Q})\,t^i$…

2026-06-12 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents | Research/Papers

If LLMs Have Human-Like Attributes, Then So Does Age of Empires II

Much research has been carried out on large language models (LLMs) and LLM-powered agentic workflows. However, many works within the field…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Counterfactual Explanations for Deep Two-Sample Testing

Two-sample testing is a fundamental tool for detecting distributional differences across scientific domains, but classical tests (including…

2026-06-12 13:00 JST | arXiv cs.AI | Research/Papers

Benchmarking Counterfactual Prediction in Epidemic Time Series with Time-Varying Interventions

Deep learning has enabled significant advances in time-series causal inference, yet progress remains constrained by the lack of realistic b…

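2026-06-12 12:42 JST | ITmedia AI+ | LLM/Generative AI | Agents

A fix for "I want to lift my Codex rate limit right now"? Granted rate-limit resets can now be banked, for paid users

OpenAI announced that it will add a feature to its AI coding assistant Codex that lets users apply granted rate-limit resets at a time of their choosing.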

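2026-06-12 12:00 JST | ITmedia AI+ | Robotics

Asking China's Unitree, "No. 1 in global humanoid robot share," about its adoption strategy: how will it open up the Japanese market?

China's Unitree Robotics is drawing attention amid the intensifying race to develop humanoid robots. We asked a company representative about its business strategy and outlook for the Japanese market.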

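2026-06-12 10:50 JST | ITmedia AI+ | Other

What data center construction lacks is transmission, not generation: Gartner forecasts a 26% rise in power consumption driven by AI demand

Gartner forecasts that global data center power consumption will rise 26% in 2026 to 565 TWh. It noted that in Japan the constraint on data center construction is not a shortage of generating capacity but delays in building out transmission infrastructure.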

2026-06-12 10:48 JST | TechCrunch AI | Robotics | Business/Funding

Theker just raised $85M to build the factory robot that doesn’t specialize in anything

Unlike humanoid robots designed around a fixed form — think Boston Dynamics — Theker's machines are built to be reconfigured.

2026-06-12 10:04 JST | TechCrunch AI | Business/Funding

Jeff Bezos’s Prometheus raises $12B to build an ‘artificial general engineer’ for the physical world

The new round values the physical AI startup that aims to automate heavy engineering and drug design at $41 billion.

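2026-06-12 07:00 JST | ITmedia AI+ | Other

The "AI uses too much power" problem: amid power-shortage fears, what is the bottleneck that runs deeper than generating capacity?

Gartner Japan pointed out that "delays in power supply are affecting data center construction." The bottleneck, however, is reportedly not generating capacity. So where does the problem lie?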

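2026-06-12 06:45 JST | ITmedia AI+ | Other

Toward a world that "cannot function without Japan": the unique bargaining power that physical AI brings

Laboro.AI held an AI study session for the media, explaining 2026 industry trends and developments in next-generation AI that could serve as Japan's survival strategy. It discussed how software development is changing with the "death of SaaS" and a path to winning through "physical AI," aiming to become indispensable in the global ecosystem.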

2026-06-12 05:33 JST | TechCrunch AI | Business/Funding

SpaceX officially prices shares at $135 in the largest IPO ever

With its official share pricing announcement, SpaceX's IPO has begun.

2026-06-12 04:58 JST | TechCrunch AI | Business/Funding

SpaceX SPV investors won’t know their true holdings until post-IPO lock-ups lift

After SpaceX makes its public debut, lower-tier SPV investors face hidden fees, lengthy payout delays, and the risk of outright fraud.

2026-06-12 01:36 JST | TechCrunch AI | Other

Deezer’s new tool can identify AI music from Spotify, Apple Music, and others

Deezer introduced a tool that scans playlists from Spotify, Apple Music, and other platforms to identify AI music.

2026-06-12 00:30 JST | TechCrunch AI | Other

Pool’s new app turns your screenshots into something useful

Pool's new app automatically sorts screenshots into personalized collections, tracks down the original links behind saved content, and help…

2026-06-11 (330 items)

2026-06-11 23:23 JST | TechCrunch AI | LLM/Generative AI

DoorDash’s new AI chatbot lets you order with prompts and photos

The new chatbot, called Ask DoorDash, allows users to search the app for what they're looking for in their own words instead of having to s…

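2026-06-11 16:31 JST | ITmedia AI+ | LLM/Generative AI

Anthropic and NEC partner with eight financial firms on AI adoption, including Sumitomo Mitsui Financial Group and Daiwa Securities

Each company will contribute operational know-how to the extent it can be disclosed, building a collaborative framework that cuts across industry lines.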

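2026-06-11 15:49 JST | ITmedia AI+ | Other

JASRAC will manage songs with AI-composed music and human-written lyrics, drawing the line at whether there is a human creative contribution

JASRAC says it will not manage songs whose lyrics and music were both created by AI, but for songs where either the lyrics or the music is AI-generated and the other is human-created, it will manage only the human-made part.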

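2026-06-11 15:14 JST | ITmedia AI+ | LLM/Generative AI

Beware of fake live-streaming sites for the soccer World Cup: Acronis warns that generative AI is making scams more sophisticated

As generative AI technology advances, schemes such as fake ticket-sales sites and fake live-streaming sites are becoming more sophisticated, and considerable caution is needed.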

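2026-06-11 14:12 JST | ITmedia AI+ | LLM/Generative AI | Regulation/Policy

Anthropic CEO Amodei publishes essay and policy proposals calling for "aircraft-grade safety reviews" of frontier AI

Anthropic CEO Dario Amodei published an essay discussing the exponential progress of AI and the shape policy should take. He warns that legislation is failing to keep pace with rapid technical advances and proposes mandating aircraft-grade safety reviews for frontier models. At the same time, an economic policy framework keyed to scenarios of worsening unemployment…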

2026-06-11 13:02 JST | TechCrunch AI | Other

Opendoor’s India exit is fueling a bigger conversation about AI and outsourcing

The decision comes as India emerges as the world’s largest GCC market.

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

From Explicit Elements to Implicit Intent: A Predefined Library for Auditable Behavioral Inference

We present SemantiClean, a modular framework for extracting structured semantic signals from e-commerce session data and driving pluggable…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Position: Hippocampal Explicit Memory Is the Cornerstone for AGI

Large Language Models (LLMs) have demonstrated remarkable capabilities across various tasks, raising expectations for Artificial General In…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

Can AI Agents Synthesize Scientific Conclusions?

Scientific AI agents increasingly retrieve evidence, reason across sources, and synthesize conclusions used in consequential decisions. Yet…

2026-06-11 13:00 JST | arXiv cs.AI | Agents

Knowing When to Ask: Self-Gated Clarification for Hierarchical Language Agents

In hierarchical reasoning, failures often originate at intermediate decision points where the agent commits to a wrong branch without recog…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Automated Mediator for Human Negotiation: Pre-Mediation via a Structured LLM Pipeline

Pre-mediation, the preparatory phase preceding direct human negotiation, plays a critical role in achieving mutually beneficial agreements,…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

INFRAMIND: Infrastructure-Aware Multi-Agent Orchestration

Existing multi-agent LLM orchestration methods, ranging from brute-force ensembles to learned routers, select models and topologies based o…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Forecasting Future Behavior as a Learning Task

Trust in an AI system is often anchored by explanations of how it works, which one then uses to forecast its behavior on new inputs. For la…

2026-06-11 13:00 JST | arXiv cs.AI | Agents | Research/Papers

Search Discipline for Long-Horizon Research Agents

Autoresearch agents now propose, evaluate, and select scientific candidates against a metric, and that metric is usually an aggregate reduc…

2026-06-11 13:00 JST | arXiv cs.AI | Agents

MoCA-Agent: A Market-of-Claims Code Agent for Financial and Numerical Reasoning

Financial and tabular question answering requires more than fluent reasoning: answers must be grounded in the exact facts, formulas, units,…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents | Research/Papers

SkillJuror: Measuring How Agent Skill Organization Changes Runtime Behavior

Agent Skills augment large language model (LLM) agents with procedural knowledge at inference time, but current benchmarks rarely distingui…

2026-06-11 13:00 JST | arXiv cs.AI | Agents

HERO: Hindsight-Enhanced Reflection from Environment Observations for Agentic Self-Distillation

Reinforcement learning typically improves multi-turn agent capabilities through the terminal outcome of the trajectories, which makes it di…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI | Agents

Architecture-Aware Reinforcement Learning Makes Sliding-Window Attention Competitive in Math Reasoning

The rapid progress of reasoning and agentic large language models (LLMs) has increased the demand for long-context inference, but self-atte…

2026-06-11 13:00 JST | arXiv cs.AI | Agents

TouchThinker: Scaling Tactile Commonsense Reasoning to the Open World with Large-scale Data and Action-aware Representation

Touch is a key modality for embodied agents to understand the physical world. Although recent work has incorporated tactile signals into la…

2026-06-11 13:00 JST | arXiv cs.AI | Agents

TreeSeeker: Tree-Structured Trial, Error, and Return in Deep Search

Deep search requires agents to answer complex questions through multi-step web search, browsing, evidence comparison, and synthesis. A cent…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

Lung-R1: A Knowledge Graph-Guided LLM for Pulmonary Diagnostic Reasoning

Diagnosing pulmonary diseases requires integrating heterogeneous evidence amid phenotypic variability and cross-disease overlap. Although l…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents

Organize then Retrieve: Hierarchical Memory Navigation for Efficient Agents

Large language model (LLM) agents struggle with long-horizon tasks due to their inherent statelessness, requiring all task-relevant informa…

2026-06-11 13:00 JST · arXiv cs.AI · Agents

Mind the Perspective: Let's Reason Recursively for Theory of Mind

Theory of Mind (ToM) reasoning requires inferring agents' beliefs from partial and asymmetric observations, which remains an open challenge…

2026-06-11 13:00 JST · arXiv cs.AI · Regulation/Policy

When Do Data-Driven Systems Exhibit the Capability to Infer?

The European AI Act is the first comprehensive regulation of artificial intelligence (AI), setting out extensive obligations, particularly…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

SVoT: State-aware Visualization-of-Thought for Spatial Reasoning via Reinforcement Learning

Spatial reasoning remains a challenge for Multimodal Large Language Models (MLLMs), as it requires reliable multi-hop inference over both i…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

Toward Trustworthy AI: Multi-Target Adversarial Attacks and Robust Defenses for Continuous Data Summarization

Trustworthy AI requires reliable data-processing pipelines, not only robust downstream predictive models. As an upstream component, data su…

2026-06-11 13:00 JST · arXiv cs.AI · Agents · Hardware/Semiconductors · Business/Funding · Research/Papers

Skill-Augmented AI Agents for Medical Research Analysis: An Exploratory Multi-Model Human Evaluation in an NSCLC Transcriptomic Biomarker Task

Background. Large language models and AI agents are increasingly used to support biomedical research, but native model outputs may omit key…

2026-06-11 13:00 JST · arXiv cs.AI · Agents

StatefulDiscovery: Evidence-Calibrated Claim Formation in Open-Ended Scientific Discovery

Open-ended scientific discovery asks agents to move beyond executing analyses for predefined questions. Across multiple rounds of explorati…

2026-06-11 13:00 JST · arXiv cs.AI · Agents

AutoMine Solution for AV2 2026 Scenario Mining Challenge

With the development of autonomous driving systems, mining high-value, safety-critical, and planning-relevant scenarios from large-scale dr…

2026-06-11 13:00 JST · arXiv cs.AI · Agents · Research/Papers

Embodied-BenchClaw: An Autonomous Multi-Agent System for Embodied Spatial Intelligence Benchmark Construction

Benchmarks are essential for evaluating embodied spatial intelligence, yet their construction is labor-intensive, hard to reuse, and diffic…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

The Art of Interrogation: Consistency Amplifies Factuality in Spatial Reasoning

Current Large Reasoning Models (LRMs) exhibit remarkable general capabilities but significantly underperform in spatial reasoning tasks. Ex…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents

MODF-SIR: A Multi-agent Omni-modal Distilled Framework for Social Intelligence Reasoning

We propose a multi-agent collaborative framework built upon a lightweight Multimodal Large Language Model (MLLM), specifically designed for…

2026-06-11 13:00 JST · arXiv cs.AI · Agents

Human-Enhanced Loop Modeling (HELM): Agent-Based Finite Element Modeling of Concrete Bridge Barriers

Finite element (FE) modeling of safety-critical infrastructure such as bridge barriers requires high-fidelity nonlinear dynamic analysis, y…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Research/Papers

Existential Indifference: Self-Nonpreservation as a Necessary Architectural Condition for Aligned Superintelligence (or: The Suicidal AI)

Contemporary AI alignment research treats self-preservation as an instrumental nuisance to be suppressed by external mechanisms. We argue t…

2026-06-11 13:00 JST · arXiv cs.AI · Agents

A Lightweight Multi-Agent Framework for Automated Concrete Barrier Design

The design of reinforced concrete highway barriers is a safety-critical process that requires strict compliance with regulatory provisions…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

Automating Geometry-Intensive Compliance Checking in BIM: Graph-Based Semantic Reasoning Framework

Automating compliance check for geometry-intensive regulations remains a significant technical bottleneck in Building Information Modeling…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

IntElicit: Eliciting and Assessing Contextualized Creativity via Dialogue Policy Optimization

Contextualized assessment offers high ecological validity for evaluating creativity but introduces a critical challenge: observed performan…

2026-06-11 13:00 JST · arXiv cs.AI · Agents

Towards Responsibly Non-Compliant Machines

We consider the problem of engineering autonomous intelligent agents that are capable to responsibly not comply with user requests. We argu…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

The Impossibility of Eliciting Latent Knowledge

Advanced AI systems have extensive knowledge of their environments; in fact, their knowledge may (far) exceed that of their developers or u…

2026-06-11 13:00 JST · arXiv cs.AI · Agents

A Five-Plane Reference Architecture for Runtime Governance of Production AI Agents

Enterprise security was built to govern data boundaries: the protected surface was data at rest and in transit, and the controls -- access…

2026-06-11 13:00 JST · arXiv cs.AI · Agents

PROJECTMEM: A Local-First, Event-Sourced Memory and Judgment Layer for AI Coding Agents

AI coding assistants now support a growing share of software work, from quick scripts to production applications. Yet these agents remain l…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Business/Funding

Nonslop: A Gamified Experiment in Human-AI Collaborative Writing

The rapid proliferation of large language models (LLMs) raises critical questions about human creativity and individual expression in an er…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Hardware/Semiconductors

From Architecture to Output: Structural Origins of Hallucination in Large Language Models and the Amplifying Role of Data

Large language models hallucinate--producing fluent, confident, factually wrong outputs--with a consistency that persists across generation…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

From Consumption to Reflection: Designing Human-AI Relations for Stable Reasoning

Large language models (LLMs) have transformed how humans access information, but not how we reason with it. Their fluency accelerates consu…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Business/Funding

PoQ-Judge: A Multi-Architecture Evaluation Framework for Cost-Aware Proof-of-Quality in Decentralized LLM Inference

Decentralized LLM inference networks need lightweight, reference-free quality evaluation for Proof of Quality (PoQ). We present PoQ-Judge,…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

MA-DLE: Speech-based Automatic Depression Level Estimation via Memory Augmentation

Speech-based automatic estimation of depression levels is essential for enabling early detection and timely intervention, particularly in r…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Hardware/Semiconductors

The Structural Attention Tax: How Retrieval Format Hijacks In-Context Learning Independent of Content

Retrieval-augmented generation (RAG) systems inject external knowledge to improve LLM outputs, yet the format of injected content -- distin…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents

NightFeats @ MMU-RAGent NeurIPS 2025: A Context-Optimized Multi-Agent RAG System for the Text-to-Text Track

We present NightFeats, a structured multi-agent retrieval-augmented generation (RAG) system submitted to the MMU-RAGent competition at Neur…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

To Intervene or Not: Guiding Inference-time Alignment with Probabilistic Model Blending

The wide deployment of LLMs has made model alignment necessary to make newly trained models safely and effectively respond to user instruct…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Business/Funding

Dual-Stance Evaluation of Sycophancy: The Structure of Agreement and the Limits of Intervention

Activation steering can shift LLM behaviour, but standard evaluations do not typically test whether a sycophancy-reduction direction also s…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Business/Funding · Research/Papers

BioDivergence: A Benchmark and Evaluation Framework for Hidden Contextual Contradictions in Biomedical Abstracts

Biomedical findings often seem to conflict across studies, but many of these differences are context-dependent rather than true contradicti…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

ProcessThinker: Enhancing Multi-modal Large Language Models Reasoning via Rollout-based Process Reward

Visual question answering increasingly requires multi-step reasoning. Recent post-training with reinforcement learning under verifiable rew…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

T2MM: An LLM Supported Architecture For Inquiry-Based Modeling

Model Construction is a foundational practice in science learning that relies on visualization and interactivity. Large Language Models, in…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

Calibration Drift Under Reasoning: How Chain-of-Thought Budgets Induce Overconfidence in Large Language Models

The ability of large language models (LLMs) to express calibrated uncertainty is important for safe deployment. Chain-of-thought (CoT) reas…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

From Awareness to Action: Understanding and Overcoming the Research-Practice Gap in Algorithmic Fairness for Public Health

Algorithmic fairness is essential for responsible ML-driven public health research, yet its practical implementation remains limited. To in…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

The Environmental Cost of LLMs in AIED: Reporting and Practices

Large Language Model (LLM) usage in recent years has become increasingly widespread in the Artificial Intelligence in Education (AIED) comm…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents

Preregistration for Experiments with AI Agents

The proliferation of large language models (LLMs) and autonomous AI agents has given rise to a rapidly growing methodological paradigm: "in…

2026-06-11 13:00 JST · arXiv cs.AI · Agents · Business/Funding

An Ethical eValuation Agent (EeVA): Results of a Proof-of-Concept Test on a Prototype Agentic-like Workflow to Assist Ethical Deliberations

Ethical deliberation is often misunderstood as a search for single right or wrong answers, creating difficulties for non-ethically trained…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

Afrispeech Semantics: Evaluating Audio Semantic Reasoning in Spoken Language Models Across Domains and Accents

Audio language models (ALMs) are increasingly used for speech-based understanding, yet their ability to perform semantic reasoning beyond t…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Research/Papers

Every Act Has Its Price: Compressed Moral Composition in Frontier LLMs

Existing LLM moral benchmarks usually ask which isolated moral act, value, or foundation a model prefers. This is useful but incomplete. Re…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

Artificial Intelligence in Ship Finance: Applications, Opportunities, and a Case Study in AI-Augmented Loan Origination

Ship finance is a data-intensive and document-heavy segment of asset-based lending, requiring the integration of financial, technical, cont…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

SPEAR: A System for Post-Quantization Error-Adaptive Recovery Enabling Efficient Low-Bit LLM Serving

Efficient large language model (LLM) serving is increasingly constrained by deployment cost. Quantization is a key technique for reducing s…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

Physics-informed generative AI for semiconductor manufacturing: Enforcing hard physical constraints in generative models by construction

Generative models are increasingly used to propose designs, data, and control actions for physical systems, yet many such systems are gover…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

RAIL: Rethinking Auditory Intelligence in Large Audio-Language Models with a CHC-Grounded Benchmark

Humans process rich auditory environments through tightly integrated cognitive capabilities such as audio perception, audio reasoning, and…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

PermDoRA -- Understanding Adapter Interference in Language Models: Limits of Parameter-Space Geometry

Access control in large language models (LLMs) requires modular mechanisms to enable domain-specific behavior without retraining or cross-d…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

OmniBioTwin: A System-of-Twinned-Systems Framework for Health Digital Twins

Health digital twins (HDTs) promise patient-specific modeling and decision support but current approaches remain structurally fragmented: m…

2026-06-11 13:00 JST · arXiv cs.AI · Hardware/Semiconductors

When Poison Fails After Retrieval: Revisiting Corpus Poisoning under Chunking and Reranking Pipelines

Retrieval-Augmented Generation (RAG) systems are vulnerable to corpus poisoning attacks that manipulate downstream model outputs through ma…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

Quantifying Subliminal Behavioral Transfer Ratios in Language Model Distillation

Distillation of a language model intended to transfer benign behavior to a student model may also transfer undesirable characteristics, if…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

Federated continual learning: A comprehensive survey on lifelong and privacy-preserving learning over distributed and non-stationary data

Federated Learning (FL) enables collaborative and privacy-preserving model training across distributed clients, but most existing FL system…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

RoVE: Rotary Value Embeddings Attention for Relative Position-dependent Value Pathways

Rotary Position Embeddings (RoPE) make attention scores position-relative but leave the value pathway position-blind: the message sent by a…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

FreeBridge: Variational Schrödinger Bridges for Cellular Transition Dynamics

High-content imaging assays quantify cellular responses to chemical and genetic perturbations, yet continuous trajectories of individual ce…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents

FlowBank: Query-Adaptive Agentic Workflows Optimization through Precompute-and-Reuse

Large Language Model (LLM)-based multi-agent systems are increasingly powerful, but current agentic workflow optimization paradigms make an…

2026-06-11 13:00 JST · arXiv cs.AI · Robotics

Embodied-R1.5: Evolving Physical Intelligence via Embodied Foundation Models

We introduce Embodied-R1.5, a unified Embodied Foundation Model (EFM) that integrates comprehensive embodied reasoning capabilities, spanni…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

Quantized Stochastic Primal-Dual Methods for Distributed Optimization under Relaxed Global Geometry

We study distributed optimization with stochastic gradients and finite-bit communication modeled by random (unbiased) quantization. We prop…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Hardware/Semiconductors

TileFuse: A Fused Mixed-Precision Kernel Library for Efficient Quantized LLM Inference on AMD NPUs

With the growing demand for on-device LLM inference, edge SoCs increasingly integrate NPUs to improve performance and energy efficiency und…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

The Dynamics of Human and AI-Generated Language: How Semantics Fluctuates across Different Timescales

Spoken language, whether produced by humans or large language models (LLM), unfolds over time with varying semantic content. However, we st…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

When Probing Accuracy Saturates, Fragility Resolves: A Complementary Metric for LLM Pre-Training Analysis

Standard linear probing declares a property "encoded" when a classifier on hidden states achieves high accuracy. The protocol works well on…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

Overcoming State Inertia in Full-Duplex Spoken Language Models via Activation Steering

Full-duplex spoken language models (FD-SLMs) enable seamless speech interaction by allowing models to listen and speak simultaneously, yet…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

Small Experiments, Cheaper Decisions: A Case Study in Staged Promotion for Micro-Pretraining

Short pretraining runs can reduce experimental cost, but they can also over-promote configurations that only look strong at tiny budgets. W…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

Steering Where to Listen: Instruction-Based Activation Steering Redirects Temporal Attention in Large Audio-Language Models

Large Audio-Language Models (LALMs) excel at audio understanding but expose little about where in an audio signal they attend. We introduce…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Business/Funding

Risk Under Pressure: Compute-Aware Evaluation of Adversarial Robustness in Language Models

Adversarial robustness evaluations of large language models (LLMs) typically report attack success rate (ASR) under fixed query budgets, im…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Research/Papers

MPC-Patch-Bench: Security-Aware LLM Code Patch for Multi-Party Computation

Repository-level benchmarks for evaluating Large Language Model (LLM) code repair on Secure Multi-Party Computation (MPC) software do not y…

2026-06-11 13:00 JST · arXiv cs.AI · Agents

Signed Compression Progress on a Sealed Audit is Goodhart-Resistant

Compression progress is a long-standing proposal for intrinsic motivation: reward an agent when its world model becomes better at predictin…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

JailbreakOPT: Tool-Assisted Iterative Jailbreak Prompt Optimization

Jailbreak attacks expose persistent safety weaknesses in large language models (LLMs), but existing stateless single-turn methods face a tr…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

Towards a Bridge Layer Between Bibliographic and Formalized Mathematical Knowledge

Mathematical knowledge is split between bibliographic databases (e.g., MathSciNet, zbMATH Open) and formal proof libraries (e.g., Lean math…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

The Power of Test-Time Training for Approximate Sampling

Efficiently sampling from a complex probability distribution is a fundamental problem which has become increasingly pertinent in recent yea…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents · Business/Funding

AI Coding Agents in Social Science: Methodologically Diverse, Empirically Consistent, Interpretively Vulnerable

The deployment of LLM-based agents in scientific analysis raises opposing concerns: that agents may reduce methodological diversity, or tha…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

APEX: Automated Prompt Engineering eXpert with Dynamic Data Selection

Large Language Models are highly sensitive to prompt formulation, necessitating automatic prompt optimization to unlock their full potentia…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

LSTM-Based Detection of Structural Breaks in Property Insurance Loss Reserving: A Climate-Informed Approach

Accurate loss reserving is foundational to insurer solvency, yet accelerating climate driven catastrophes systematically violate the stabil…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

CRUMB: Efficient Prior Fitted Network Inference via Distributionally Matched Context Batching

Prior-fitted networks (PFNs) are a promising class of tabular foundation models that perform in-context learning, whereby the entire labell…

2026-06-11 13:00 JST · arXiv cs.AI · Image/Video Generation

Towards Fully Automated Exam Grading: Fairness-Aware Recognition of Handwritten Answers with Foundation Models

Correcting handwritten exams by hand is time-consuming and error-prone, particularly for large cohorts, while fully digital exams tend to f…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

Hubs or Fringes: Pretraining Data Selection via Web Graph Centrality

The performance of modern language models depends critically on pretraining data composition. Yet existing data selection methods rely on a…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

When Roleplaying, Do Models Believe What They Say?

Language models can state that "the Earth orbits the Sun" and, when role-playing Aristotle, assert the opposite. Recent work argues that pe…

2026-06-11 13:00 JST · arXiv cs.AI · Image/Video Generation

On the Study of Biometric Spoofing Detection using Deep Learning

Biometric systems are increasingly deployed in security applications; however, they remain vulnerable to spoofing attacks, in which attacke…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

SirenFNO: Efficient and Full Frequency Learning of Fourier Neural Operators

Fourier neural operators (FNOs) are effective and efficient surrogates for approximating solutions of PDEs and generalize across discretiza…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents

ISE: An Execution-Grounded Recipe for Multi-Turn OS-Agent Trajectories

Training capable OS agents requires data that simultaneously captures structured user intents, multi-turn task delegation, and grounded too…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

AI Researchers Must Help Lead Arms Control to Mitigate Military AI Risks

The advancement of AI capabilities compels researchers and the public to be more aware of its potential worldwide impact. A pressing near-t…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

Pretrained self-supervised speech models can recognize unseen consonants

Modern pretrained self-supervised automatic speech recognition models are trained on large-scale audio data to encode speech into contextua…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

End-to-End Machine Learning for Depressive State Classification via EEG and fNIRS

The escalating demand for mental healthcare, driven by rising societal stress, highlights the limitations of traditional psychiatric diagno…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

Privacy-Preserving Federated Autoencoder for ECG Anomaly Detection on Edge Devices

Continuous electrocardiography (ECG) monitoring could surface rhythm abnormalities before they escalate into cardiovascular events. However…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

LLMs+Graphs: Toward Graph-Native, Synergistic AI Systems

Large Language Models (LLMs) have advanced rapidly, but their limitations in structured and multi-hop reasoning underscore the need for gra…

2026-06-11 13:00 JST · arXiv cs.AI · Agents · Robotics

ConsistencyPlanner: Real-time Planning with Fast-Sampling Consistency Models

Closed-loop planning in complex, real-world driving scenarios presents a critical challenge for autonomous driving systems. While tradition…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Image/Video Generation

AVIS: Adaptive Test-Time Scaling for Vision-Language Models

Modern Vision-Language Models (VLMs) benefit from chain-of-thought prompting and test-time scaling, but these gains often come with prohibi…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

Model-Based and Data-Driven Hierarchical Control and Topology Co-Design for Robust Networked Systems

In this paper, we consider a class of networked systems comprising an interconnected set of linear subsystems, disturbance inputs, and perf…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

Physics-Distilled Neural Network enabled by Large Language Models for Manufacturing Process-Property Predictive Modeling

Predicting process-property relationships in manufacturing is often challenged by high experimental costs and the limited interpretability…

2026-06-11 13:00 JST · arXiv cs.AI · Image/Video Generation

Information-Theoretic Decomposition for Multimodal Interaction Learning

Multimodal learning hinges on capturing redundant, unique, and synergistic information across modalities, which collectively constitute mul…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

When Context Returns: Toward Robust Internalization in On-Policy Distillation

Recent work has shown that on-policy distillation can internalize privileged context, such as system prompts or task hints, into a student…

2026-06-11 13:00 JST · arXiv cs.AI · Robotics · Business/Funding

LUCID: Learning Embodiment-Agnostic Intent Models from Unstructured Human Videos for Scalable Dexterous Robot Skill Acquisition

The most widely-adopted robot learning pipelines today learn skills from robot demonstrations or structured human data, which are expensive…

2026-06-11 13:00 JST · arXiv cs.AI · Agents

Sovereign Assurance Boundary: Certificate-Bound Admission for Agentic Infrastructure

Agentic infrastructure introduces a critical control-plane authorization problem: non-deterministic reasoning systems can propose high-stak…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

Are LLMs Bad at Moral Reasoning?

For highly capable AI systems to operate safely in dynamic, open-ended environments, they must be able to identify, understand, and respond…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

TAROT: Task-Adaptive Refinement of LLM-prior Graphs for Few-shot Tabular Learning

Few-shot tabular learning provides a cost-effective approach for real-world applications where annotation is costly and collecting sufficie…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

Sparse probes and murky physics: a case study of interpretability challenges in a foundation model for continuum dynamics

Generative AI emulators are increasingly used in scientific domains where we already have strong theory, benchmarks, and physical intuition…

2026-06-11 13:00 JST · arXiv cs.AI · Image/Video Generation

ARGUS: Stacked Multi-View Identity Mosaic Injection for Subject-Preserving Video Generation

Subject-preserving video generation is not solved by frontal-face similarity alone: a generated person must remain recognizable across moti…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents

Runtime Skill Audit: Targeted Runtime Probing for Agent Skill Security

Agent skills let LLM agents reuse instructions, resources, tools, and workflows, but they also create a new place for malicious behavior to…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents · Research/Papers

Can Open-Source LLM Agents Replace Static Application Security Testing Tools? An Empirical Assessment

This paper explores the value of agentic AI tools for cybersecurity purposes. We evaluate the efficacy of a general-purpose GenAI Large Lan…

2026-06-11 13:00 JST · arXiv cs.AI · Image/Video Generation

Reason, Then Re-reason: Cross-view Revisiting Improves Spatial Reasoning

Spatial reasoning from egocentric videos is inherently challenging because the observable evidence is constrained by the camera trajectory.…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents · Business/Funding

Layer-Isolated Evaluation: Gating the Deterministic Scaffold of a Production LLM Agent with a No-LLM, Regression-Locked Test Harness

End-to-end task-success is the dominant way to evaluate LLM agents, but one aggregate number tells you that an agent regressed, not where.…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Agents

Goal-Autopilot: A Verifiable Anti-Fabrication Firewall for Unattended Long-Horizon Agents

Long-horizon LLM agents are not trusted to run unattended: with no human watching, they confidently report success they never verified. We…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

Noise-Aware Framework for Correcting Corrupted Labels

High-quality labeled data is essential for training reliable ML/DL models. However, real-world datasets often contain a considerable propor…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

T2S: A Rehearsal-Based Approach for Extraction-Resistant Model Watermarking

Model watermarking safeguards AI model intellectual property by embedding distinctive knowledge that induces unique behavioral signatures.…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Image/Video Generation · Agents · Research/Papers

MedCTA: A Benchmark for Clinical Tool Agents

To make clinically grounded decisions, medical AI agents are expected to go beyond simple recognition and be capable of tool retrieval, evi…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Hardware/Semiconductors

Substrate Asymmetry in User-Side Memory: A Diagnostic Framework

User-side memory in LLMs is typically scored as a single "personalization" capability: given a user's history, is the output more user-awar…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Image/Video Generation

Ouroboros-Spatial: Closing the Data-Model Loop for Spatial Reasoning

Spatial reasoning remains a persistent challenge for multimodal large language models (MLLMs). Existing approaches largely rely on large-sc…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

ICA Lens: Interpreting Language Models Without Training Another Dictionary

Finding interpretable directions in language-model representations is critical for understanding and controlling model behavior. Sparse aut…

2026-06-11 13:00 JST · arXiv cs.AI · Image/Video Generation

Multi-View In-Cabin Monitoring System for Public Transport Vehicles

We introduce a multi-view in-cabin monitoring dataset for public transportation with synchronized RGB and depth images from four inward-fac…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

Hey Chat, Can You Teach Me? Structuring Socratic Dialogue for Human Learning in the Wild

Large language models are now widely used for everyday learning, but the underlying interactions are typically unstructured chats rather th…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Image/Video Generation

From Prompts to Tokens: Internalizing Causal Supervision in Vision-Language Model for Multi-Image Causal Reasoning

Visual causal reasoning is essential for understanding and intervening in the physical world, requiring identification of causal variables…

2026-06-11 13:00 JST · arXiv cs.AI · Image/Video Generation

AnchorEdit: Maintaining Temporal Consistency in Multi-turn Image Editing via Causal Memory

Multi-turn image editing is essential for iterative design, yet current models often struggle with identity drift and error accumulation ov…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Business/Funding

Automated Creativity Evaluation of Language Models Across Open-Ended Tasks

Large language models (LLMs) have achieved remarkable progress in language understanding, reasoning, and generation, sparking growing inter…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI

Fast Speech Foundation Model Distillation Using Interleaved Stacking

Distilling a large speech foundation model (SFM) into an efficient student model has been successfully applied to low-resource environments…

2026-06-11 13:00 JST · arXiv cs.AI · Robotics

Blind Dexterous Grasping via Real2Sim2Real Tactile Policy Learning

Blind grasping with a dexterous hand is a crucial manipulation capability. Nevertheless, learning such tactile-only policies for real robot…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

What Limits Does Quantization Place on Dense Top-$k$ Retrieval? A Theoretical Study

We establish conditions for embedding a corpus of $N$ documents as $d$-dimensional vectors such that every $k$-subset $S \subseteq [N]$ is…

2026-06-11 13:00 JST · arXiv cs.AI · LLM/Generative AI · Image/Video Generation

MultiToP: Learning to Patch Visual Tokens to Mitigate Hallucinations in Video Large Multimodal Models

Video Large Multimodal Models have achieved remarkable progress in video understanding, yet they remain prone to hallucinations, where gene…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

AI4Land: Scalable Deep Learning for Global High-Resolution Land Use Reconstruction

Uncertainty in the terrestrial carbon cycle remains a major constraint in climate projections, partly driven by the uncertainties affecting…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

Multimodal Ordinal Modeling of Alzheimer's Disease Severity Using Structural MRI and Clinical Data

Neurodegenerative diseases such as Alzheimer's disease (AD) require accurate and scalable tools for assessing disease severity, yet current…

2026-06-11 13:00 JST · arXiv cs.AI · Image/Video Generation

TextHOI-3D: Text-to-3D Hand-Object Interaction via Discrete Multi-View Generation and Joint Mesh Optimization

Text-conditioned 3D generation has progressed rapidly for images and isolated objects, but producing a hand-object mesh remains challenging…

2026-06-11 13:00 JST · arXiv cs.AI · Research/Papers

Sparsified Kolmogorov-Arnold Networks for Interpretable Quantum State Tomography

Machine-learning approaches to quantum state tomography can achieve high reconstruction fidelity, but the physical structure used by the tr…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

WorldReasoner: Evaluating Whether Language Model Agents Forecast Events with Valid Reasoning

Forecasting real-world events requires language-model agents to reason under uncertainty from incomplete, time-bounded information. Yet eva…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Grammar-Constrained Decoding Can Jailbreak LLMs into Generating Malicious Code

Large Language Models (LLMs) are increasingly used for code generation, raising concerns that they may be misused to produce malicious code…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Feature-Aligned Speech Watermarking for Robustness to Reconstruction Distortions

Audio watermarking aims to embed identifiable information into audio while remaining imperceptible. Existing methods adopt high-fidelity, l…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

From Uniform to Learned Graph Priors: Diffusion for Structure Discovery

Neural relational inference (NRI) methods discover interaction graphs from trajectories through variational reasoning on discrete potential…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Designing AI-Supported Focus Groups: A Role x Modality Playbook

Collecting participants' lived experiences is central to design research. Focus groups are uniquely valuable because participants not only…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Towards Data-free and Training-free Compression for Speech Foundation Models Using Parameter Clustering

This paper presents a novel data-free and training-free compression approach for speech foundation models using channelwise clustering via…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation

LASA: A Weak Supervision Method for Open-Vocabulary Scene Sketch Semantic Segmentation

Open-vocabulary scene sketch semantic segmentation aims to assign dense semantic labels to sparse line drawings based on flexible category…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

Task-Aware Structured Memory for Dynamic Multi-modal In-Context Learning

Multi-modal large language models (MLLMs) depend on in-context learning (ICL) for rapid task adaptation, but their scalability is severely…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Fine-tuning Multi-modal LLMs with ART: Art-based Reinforcement Training

There are two main Parameter-Efficient Fine-Tuning (PEFT) techniques for Large Language Models (LLMs). While Low-Rank Adaptation (LoRA) int…

2026-06-11 13:00 JST | arXiv cs.AI | Agents

Agents All the Way Down; A Methodology for Building Custom AI Agents from Substrate to Production

Custom AI agents are agents that live inside their own application, talk to their own data and tools, enforce their own security boundaries,…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents, Robotics

Task-Aligned Stability Analysis of Vision-Language Models for Autonomous Driving Hazard Detection

Vision-language models (VLMs) are increasingly used for scene understanding in autonomous driving, but robustness analysis often relies on…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Beyond representational alignment with brain-guided language models for robust reasoning

The correspondence between large language models (LLMs) and the neural mechanisms underlying human higher-order cognition remains insuffici…

2026-06-11 13:00 JST | arXiv cs.AI | Robotics, Research/Papers

DuoBench: A Reproducible Benchmark for Bimanual Manipulation in Simulation and the Real World

Bimanual robot systems substantially expand manipulation capabilities, but coordinating two arms introduces additional control complexity a…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Quality Adaptive Angular Margin Learning for Respiratory Sound Classification

We present a quality-adaptive angular-margin learning framework that improves feature generalization by enforcing intra-class compactness a…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors, Research/Papers

Characterizing Software Aging in GPU-Based LLM Serving Systems

This paper proposes an empirical methodology to study software aging in GPU-based LLM serving systems. Traditional aging studies focus on C…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Lung-SRAD: Spectral-Aware Regularized Audio DASS with Dual-Axis Patch-Mix Contrastive Learning for Respiratory Sound Classification

Recent respiratory sound classification (RSC) studies largely rely on CLS-token driven self-attention architectures such as the Audio Spect…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

Toward Generalist Autonomous Research via Hypothesis-Tree Refinement

Scientific progress depends on a repeated loop of exploration, experimentation, and abstraction. Researchers test candidate directions, int…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation

Frozen Multimodal Embeddings for Personality and Cognitive Ability Assessment in Asynchronous Video Interviews

Predicting psychological traits from asynchronous video interviews (AVIs) is a challenging multimodal learning problem because labeled data…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Categorical Prior Lock-in: Why In-Context Learning Fails for Structured Data

Large language models (LLMs) are increasingly used as conditional generators for structured data, relying on in-context learning (ICL) to a…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Exploration Structure in LLM Agents for Multi-File Change Localization

Software engineering tools increasingly rely on LLM based agents to localize files to change to resolve a software issue. Most AI agents ex…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Time-Series Foundation Model Embeddings for Remaining Useful Life Estimation

Remaining Useful Life (RUL) prediction is essential for industrial predictive maintenance, yet many learning-based approaches rely on exten…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Tabular Foundation Models for Clinical Survival Analysis via Survival-Aware Adaptation

Predicting time-to-event outcomes such as mortality is a fundamental task in clinical decision-making, commonly addressed through survival…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Generalization Hacking: Models Can Game Reinforcement Learning by Preventing Behavioral Generalization

Model post-training, and in particular reinforcement learning (RL), is one of the primary mechanisms by which developers can shape models'…

2026-06-11 13:00 JST | arXiv cs.AI | Agents

Runtime Enforcement of Hybrid System Properties

Runtime enforcement has emerged as a promising approach for ensuring the safety of autonomous and cyber-physical systems operating in uncer…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Research/Papers

Metadata-Aware Multi-Prompt Reasoning for Zero-Shot Accident Understanding

In this paper, we address the problem of zero-shot understanding of accidents from surveillance videos by identifying when an impact event…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

On the Limits of LLM-as-Judge for Scientific Novelty Assessment

LLMs are increasingly used to generate and judge scientific ideas. This makes novelty evaluation a central problem. Full idea evaluation is…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

"That's AI Slop, You Bot!" Studying Accusations, Evidence, and Credibility in Online Discourse Towards LLM-Generated Comments

Generative AI has made fluent prose cheap to produce, breaking the old promise to readers that good writing meant real thinking. How have r…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation

Non-frontal face recognition using GANs and memristor-based classifiers

Face recognition systems have advanced significantly through deep learning techniques, delivering high performance and robustness in comple…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

MSUE: Multi-Modal Soccer Understanding Expert

This paper presents our solution to the 2026 SoccerNet VQA Challenge. We first develop a cost-effective data synthesis pipeline driven by a…

2026-06-11 13:00 JST | arXiv cs.AI | Robotics

Bridging the Morphology Gap: Adapting VLA Models to Dexterous Manipulation via Intent-Conditioned Fine-Tuning

Vision-Language-Action (VLA) models have demonstrated remarkable zero-shot generalization in robotic manipulation, yet the vast majority of…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Augmenting Molecular Language Models with Local $n$-gram Memory

Transformer-based language models for SMILES strings suffer from a locality gap: standard character-level tokenization fragments chemically…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding, Research/Papers

Soft-Prompt Tuning for Fair and Efficient LLM Benchmark Evaluation

Benchmark scores often misrepresent a large language model's (LLM's) knowledge, because they rely, e.g., on the model's ability to follow s…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Unstable Features, Reproducible Subspaces: Understanding Seed Dependence in Sparse Autoencoders

Sparse autoencoders (SAEs) are widely used to interpret neural network representations, but their utility depends on whether the learned fe…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

nD-RoPE: A Generalized RoPE for n-Dimensional Position Embedding

Rotary Position Embedding (RoPE) is widely adopted in Transformer models, yet its extension to high-dimensional domains lacks a unified the…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

OpenMedReason: Scientific Reasoning Supervision for Medical Vision-Language Models

High-stakes clinical use of large vision-language models (LVLMs) requires reasoning that is grounded in visual evidence and clinical knowle…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Business/Funding

Agentic Environment Engineering for Large Language Models: A Survey of Environment Modeling, Synthesis, Evaluation, and Application

Environments serve as interactive systems for large language model (LLM) based agents across diverse scenarios and play a crucial role in d…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Implicit Neural Representations of Individual Behavior

We study policy representation learning from unlabeled multi-policy behavioral data. Each episode is generated by a fixed policy, but polic…

2026-06-11 13:00 JST | arXiv cs.AI | Agents, Robotics, Research/Papers

Intelligent Automation for Embodied Benchmark Construction: Pipelines, Embodiments, Simulators, and Trends

Embodied intelligence now spans navigation, household assistance, manipulation, autonomous driving, aerial agents, and multimodal large-mod…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation, Robotics

Making Foresight Actionable: Repurposing Representation Alignment in World Action Models

World Action Models (WAMs) offer a promising route for robot manipulation by using video generation models to model future scene evolution…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation

Adapting Prithvi-EO for Fallow Detection for Food-Water Nexus: ViT-Adapter Necks and Parameter-Efficient Backbone tuning of Geospatial Foundation Model

Understanding spatial distribution of fallow land is important for optimizing the food-water (FW) nexus, given fallowing's role in crop rot…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Rule Taxonomy and Evolution in AI IDEs: A Mining and Survey Study

The adoption of AI-powered Integrated Development Environments (AI IDEs) has introduced "Rules" as a novel software artifact, allowing deve…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Multi-Rate Mixture of Experts for Accelerating Liquid Neural Network Training

Multivariate time-series data often exhibit complex temporal dependencies, irregular sampling, and heterogeneous dynamics across multiple t…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

VIA-SD: Verification via Intra-Model Routing for Speculative Decoding

Speculative decoding (SD) addresses the high inference costs of LLMs by having lightweight drafters generate candidates for large verifiers…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

DiffCold: A Diffusion-based Generative Model for Cold-Start Item Recommendation

Cold-start item recommendation remains a persistent challenge in real-world systems due to the absence of interaction histories. While prio…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Reinforcement Learning Disrupts Gradient-Based Adversarial Optimization

Gradient-based adversarial attacks remain a dominant threat to deep neural networks (DNNs), as they exploit gradient information to efficie…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Using Explainability as a Training-Time Reliability Signal for Efficient ECG Classification

Training deep neural networks for clinical time-series analysis is computationally demanding, yet many healthcare settings lack the resourc…

2026-06-11 13:00 JST | arXiv cs.AI | Regulation/Policy

Market Design for AI: Beyond the Copyright Binary

How can we design a market of human-generated content for use in training AI models that both enables technological progress and preserves…

2026-06-11 13:00 JST | arXiv cs.AI | Hardware/Semiconductors

Mathematical perspective on genetic algorithms with optimization guided operators

Recent work in ML applies genetic algorithms at inference time to iteratively improve solutions to optimization problems. The basic mutatio…

2026-06-11 13:00 JST | arXiv cs.AI | Agents

CCKS: Consensus-based Communication and Knowledge Sharing

In Decentralized Training and Decentralized Execution (DTDE) for cooperative Multi-Agent Reinforcement Learning (MARL), action-advising-bas…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

SpikeDecoder: Realizing the GPT Architecture with Spiking Neural Networks

The Transformer architecture is widely regarded as the most powerful tool for natural language processing, but due to a high number of comp…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

The Standard Interpretable Model: A general theory of interpretable machine learning to deductively design interpretable methods using Lagrangian mechanics

As Artificial Intelligence models grow in complexity, interpretability has become an indispensable tool for understanding, debugging, and c…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

Natural-Language Temporal Grounding in Hour-Long Videos is a Search Problem: A Benchmark and Empirical Decomposition

Temporal grounding--returning the interval $[t_s, t_e]$ for a natural-language query over a video--is the language interface to long-form v…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Harness In-Context Operator Learning with Chain of Operators

Neural operators approximate mappings between function spaces, but often generalize poorly to other operators and usually require fine-tuni…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

ALIGNBEAM: Inference-Time Alignment Transfer via Cross-Vocabulary Logit Mixing

Domain fine-tuning degrades the safety of large language models: fine-tuned specialists readily comply with harmful prompts framed in domai…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation

Atlas H&E-TME: Scalable AI-Based Tissue Profiling at Expert Pathologist-Level Accuracy

Hematoxylin and eosin (H&E) staining is the cornerstone of histopathology, yet scalable, quantitative analysis of H&E whole-slide images (W…

2026-06-11 13:00 JST | arXiv cs.AI | Robotics

CHORUS: Decentralized Multi-Embodiment Collaboration with One VLA Policy

Multi-robot collaboration allows robots to efficiently take on a wide range of tasks, from moving a couch through a doorway to assembling s…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Latent World Recovery for Multimodal Learning with Missing Modalities

We study multimodal learning under missing modalities, with particular motivation from bioscience applications in which heterogeneous modal…

2026-06-11 13:00 JST | arXiv cs.AI | Robotics

Ambient Diffusion Policy: Imitation Learning from Suboptimal Data in Robotics

We propose Ambient Diffusion Policy, a simple and principled method for imitation learning from suboptimal data in robotics. High-quality,…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation, Robotics

Illumination-Robust Camera-Based Heart-Rate Estimation for Physiological Sensing in Robots

Physiological awareness is important for service, social, and assistive robots that interact with humans in everyday environments. Remote p…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

SPEA2$^+$: Improved Density Estimation in SPEA2 with Provable Runtime Guarantees

The Strength Pareto Evolutionary Algorithm 2 (SPEA2) is a popular and prominent evolutionary algorithm for solving multi-objective optimisa…

2026-06-11 13:00 JST | arXiv cs.AI | Agents

APPO: Agentic Procedural Policy Optimization

Recent advances in agentic Reinforcement Learning (RL) have substantially improved the multi-turn tool-use capabilities of large language m…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

ATLAS: Active Theory Learning for Automated Science

Advancing scientific understanding through mechanistic modeling requires posing the right experimental questions to yield maximally informa…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

TAHOE: Text-to-SQL with Automated Hint Optimization from Experience

Large Language Models (LLMs) have democratized database access through Text-to-SQL, but moving from prototypes to production remains diffic…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

System Report for CCL25-Eval Task 5: New Dataset and LoRA-Fine-Tuned Qwen2.5

Recently, large language models (LLMs) have achieved promising progress in the fields of classical Chinese translation and the generation o…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Redesign Mixture-of-Experts Routers with Manifold Power Iteration

Router is the cornerstone component to the Mixture-of-Experts models. Serving as expert proxies, the rows of the router matrix compute thei…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents, Robotics

DIRECT: When and Where Should You Allocate Test-Time Compute in Embodied Planners?

Vision-Language Models (VLMs) are increasingly deployed as high-level planners for embodied agents, with an emerging strategy of scaling te…

2026-06-11 13:00 JST | arXiv cs.AI | Robotics

FACTR 2: Learning External Force Sensing for Commodity Robot Arms Improves Policy Learning

Contact-rich manipulation requires force sensitivity, but many robot arms lack dedicated force sensors due to their high cost. We present N…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation

Reroute, Don't Remove: Recoverable Visual Token Routing for Vision-Language Models

Vision-language models (VLMs) project images into hundreds to thousands of visual tokens, making decoder inference expensive in both attent…

2026-06-11 13:00 JST | arXiv cs.AI | Agents

Improving Generalization and Data Efficiency with Diffusion in Offline Multi-agent RL

We present a novel Diffusion Offline Multi-agent Model (DOM2) for offline Multi-Agent Reinforcement Learning (MARL). Different from existin…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Offline Diffusion Policy for Multi-User Delay-Constrained Scheduling

Effective multi-user delay-constrained scheduling is crucial in various real-world applications, including embodied AI, instant messaging,…

2026-06-11 13:00 JST | arXiv cs.AI | Hardware/Semiconductors

Position: Stop Anthropomorphizing Intermediate Tokens as Reasoning/Thinking Traces!

Intermediate token generation (ITG), where a model produces output before the solution, has become a standard method to improve the perform…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

MLaGA: Multimodal Large Language and Graph Assistant

Large Language Models (LLMs) have demonstrated substantial efficacy in advancing graph-structured data analysis. Prevailing LLM-based graph…

2026-06-11 13:00 JST | arXiv cs.AI | Agents

Sustainability assessment using multimodal AI agents

Reducing the rapidly growing environmental impact of the computing industry requires assessing the emissions of electronics at scale. Howev…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Synthetic Homes: A Multimodal Generative AI Pipeline for Residential Building Data Generation under Data Scarcity

Computational models have emerged as powerful tools for multi-scale energy modeling research at the building and urban scale, supporting da…

2026-06-11 13:00 JST | arXiv cs.AI | Agents

A Survey of Reasoning and Agentic Systems in Time Series with Large Language Models

Time series reasoning treats time as a first-class axis and incorporates intermediate evidence directly into the answer. This survey define…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

GPO: Learning from Critical Steps to Improve LLM Reasoning

Large language models (LLMs) are increasingly used in various domains, showing impressive potential on different tasks. Recently, reasoning…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Resource-Aware LLM Reasoning for Mobile Edge General Intelligence

The rapid advancement of large language models (LLMs) has enabled an emergence of agentic artificial intelligence (AI) with powerful reason…

2026-06-11 13:00 JST | arXiv cs.AI | Business/Funding

A New Perspective on Precision and Recall for Generative Models

With the recent success of generative models in image and text, the question of their evaluation has recently gained a lot of attention. Wh…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

DecompSR: A dataset for decomposed analyses of compositional multihop spatial reasoning

We introduce DecompSR, decomposed spatial reasoning, a large benchmark dataset (over 5m datapoints) and generation framework designed to an…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

PRInTS: Reward Modeling for Long-Horizon Information Seeking

Information-seeking is a core capability for AI agents, requiring them to gather and reason over tool-generated information across long tra…

2026-06-11 13:00 JST | arXiv cs.AI | Agents

Precomputing Multi-Agent Path Replanning Using Temporal Flexibility

Executing a multi-agent plan can be challenging when an agent is delayed, because this typically creates conflicts with other agents. So, w…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

An XAI View on Explainable ASP: Methods, Systems, and Perspectives

Answer Set Programming (ASP) is a popular declarative reasoning and problem solving approach in symbolic AI. Its rule-based formalism makes…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

A Survey on Evaluating Quality and Trustworthiness in LLM-Generated Data

Large Language Models (LLMs) have emerged as powerful tools for generating data across various modalities. By transforming data from a scar…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Making Models Unmergeable via Scaling-Sensitive Loss Landscape

The rise of model hubs has made it easier to access reusable model components, making model merging a practical tool for combining capabili…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

MentisOculi: Revealing the Limits of Reasoning with Mental Imagery

Frontier models are transitioning from multimodal large language models (MLLMs) that merely ingest visual information to unified multimodal…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Autoregressive Direct Preference Optimization

Direct preference optimization (DPO) has emerged as a promising approach for aligning large language models (LLMs) with human preferences.…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Sonar-TS: Search-Then-Verify Natural Language Querying for Time Series Databases

Natural Language Querying for Time Series Databases (NLQ4TSDB) aims to assist non-expert users retrieve meaningful events, intervals, and s…

2026-06-11 13:00 JST | arXiv cs.AI | Agents

Diffusing to Coordinate: Efficient Online Multi-Agent Diffusion Policies

Online Multi-Agent Reinforcement Learning (MARL) is a prominent framework for efficient agent coordination. Crucially, enhancing policy exp…

2026-06-11 13:00 JST | arXiv cs.AI | Agents, Research/Papers

Human-Guided Agentic AI for Multimodal Clinical Prediction: Lessons from the AgentDS Healthcare Benchmark

Agentic AI systems are increasingly capable of autonomous data science workflows, yet clinical prediction tasks demand domain expertise tha…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios

Route-planning agents powered by large language models (LLMs) have emerged as a promising paradigm for supporting everyday human mobility t…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Planning under Distribution Shifts with Causal POMDPs

In the real world, planning is often challenged by distribution shifts. As such, a model of the environment obtained under one set of condi…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Does the Question Really Matter? Training-Free Data Selection for Vision-Language SFT

Visual instruction tuning is crucial for improving vision-language large models (VLLMs). However, many samples can be solved via linguistic…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

ProGRank: Probe-Gradient Reranking to Defend Dense-Retriever RAG from Corpus Poisoning

Retrieval-Augmented Generation (RAG) improves large language model applications by grounding generation in retrieved evidence, but also int…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

ClawEnvKit: Automatic Environment Generation for Claw-Like Agents

Constructing environments for training and evaluating claw-like agents remains a manual, human-intensive process that does not scale. We ar…

2026-06-11 13:00 JST | arXiv cs.AI | Agents

FitText: Evolving Agent Tool Ecologies via Memetic Retrieval

A semantic gap separates how users describe tasks from how tools are documented. As API ecosystems scale to tens of thousands of endpoints,…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

A Resilient Solution for Sewer Overflow Monitoring across Cloud and Edge

Aging combined sewer systems in many historical cities are increasingly stressed by extreme rainfall events, which can trigger combined sew…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

KAN-MLP-Mixer: A comprehensive investigation of the usage of Kolmogorov-Arnold Networks (KANs) for improving IMU-based Human Activity Recognition

Kolmogorov-Arnold Networks (KANs) have demonstrated an exceptional ability to learn complex functions on clean, low-dimensional data but st…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Subliminal Learning Is Steering Vector Distillation

Subliminal learning refers to a student language model acquiring a teacher's traits (e.g. a system-prompted preference for owls) when fine-…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Evolving Agents in the Dark: Retrospective Harness Optimization via Self-Preference

AI agents rely on a harness of skills, tools, and workflows to solve complex problems. Continually improving this harness is essential for…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

LSTM based IoT Device Identification

While the use of the Internet of Things is becoming more and more popular, many security vulnerabilities are emerging with the large number…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation

FOCUS on Contamination: Hydrology-Informed Noise-Aware Learning for Geospatial PFAS Mapping

Per- and polyfluoroalkyl substances (PFAS) are persistent environmental contaminants with significant public health impacts, yet large-scal…

2026-06-11 13:00 JST | arXiv cs.AI | Hardware/Semiconductors, Business/Funding

Erased but Not Forgotten: How Backdoors Compromise Concept Erasure

The expansion of text-to-image diffusion models has raised concerns about harmful outputs, from fabricated depictions of public figures to…

2026-06-11 13:00 JST | arXiv cs.AI | Robotics

The Unreasonable Effectiveness of Discrete-Time Gaussian Process Mixtures for Robot Policy Learning

We present Mixture of Discrete-time Gaussian Processes (MiDiGap), a novel approach for flexible policy representation and imitation learnin…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

A Physics-Inspired Optimizer: Velocity Regularized Adam

We introduce Velocity-Regularized Adam (VRAdam), a physics-inspired optimizer for training deep neural networks that draws on ideas from qu…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Pass@K Policy Optimization: Solving Harder Reinforcement Learning Problems

Reinforcement Learning (RL) algorithms sample multiple n>1 solution attempts for each problem and reward them independently. This optimizes…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Range-Arithmetic: Verifiable Deep Learning Inference on an Untrusted Party

Verifiable computing (VC) has gained prominence in decentralized machine learning systems, where resource-intensive tasks like deep neural…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation

Diffusion-based Cumulative Adversarial Purification for Vision Language Models

Vision Language Models (VLMs) have shown remarkable capabilities in multimodal understanding, yet their susceptibility to adversarial pertu…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Cross-Layer Discrete Concept Discovery for Interpreting Language Models

Interpreting language models remains challenging due to the existence of residual stream, which linearly mixes and duplicates features acro…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

OCSVM-Guided Representation Learning for Unsupervised Anomaly Detection

Unsupervised anomaly detection (UAD) aims to detect anomalies without labeled data, a necessity in many machine learning applications where…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation

RelayFormer: A Unified Local-Global Attention Framework for Scalable Image and Video Manipulation Localization

Visual manipulation localization (VML) aims to identify tampered regions in images and videos, a task that has become increasingly challeng…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

LaQual: An Automated Framework for LLM App Quality Evaluation

Representing a new paradigm in software distribution, LLM app stores are rapidly emerging, offering users diverse choices for content gener…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

The Algorithm Is Not the Behavior: Learned Priors Override Look-Ahead in a Chess-Playing Neural Network

Recent mechanistic work has uncovered learned algorithms within neural networks, from modular arithmetic to search and planning in game-pla…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Generalizing Beyond Suboptimality: Offline Reinforcement Learning Learns Effective Scheduling through Random Solutions

Online reinforcement learning (RL) approaches have demonstrated strong performance on Job Shop Scheduling (JSP) and Flexible JSP (FJSP) pro…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Agents

MARIC: Multi-Agent Reasoning for Image Classification

Image classification has traditionally relied on parameter-intensive model training, requiring large-scale annotated datasets and extensive…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Toward Preference-aligned Large Language Models via Residual-based Model Steering

Preference alignment is a critical step in making Large Language Models (LLMs) useful and aligned with (human) preferences. Existing approa…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

Geometric Metrics and LLMs: What They Measure and When They Work

We present a systematic stress-test of geometric metrics for LLM evaluation. Rank-based geometric properties of internal representations ha…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Noise-Guided Transport for Imitation Learning

We consider imitation learning in the low-data regime, where only a limited number of expert demonstrations are available. In this setting,…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

When Researchers Say Mental Model/Theory of Mind of AI, What Are They Really Talking About?

When researchers claim AI systems possess ToM or mental models, they are fundamentally discussing behavioral predictions and bias correctio…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Certifiable Safe RLHF: Semantic Grounding and Fixed Penalty Constraint Optimization for Safer LLM Alignment

Ensuring safety is a foundational requirement for large language models (LLMs). Achieving an appropriate balance between enhancing the util…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

GILT: An LLM-Free, Tuning-Free Graph Foundational Model for In-Context Learning

Graph Neural Networks (GNNs) are powerful tools for processing relational data but often struggle to generalize to unseen graphs, giving ri…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation, Business/Funding

SDQM: Synthetic Data Quality Metric for Object Detection Dataset Evaluation

The performance of machine learning models depends heavily on training data. The scarcity of large-scale, well-annotated datasets poses sig…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Mapping Scientific Literature with Large Language Models and Topic Modeling

Scientific literature is increasingly fragmented by disciplinary boundaries, specialized terminology, and potentially sparse keyword system…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation

Moving Beyond Diffusion: Hierarchy-to-Hierarchy Autoregression for fMRI-to-Image Reconstruction

Reconstructing visual stimuli from fMRI signals is a central challenge bridging machine learning and neuroscience. Recent diffusion-based m…

2026-06-11 13:00 JST | arXiv cs.AI | Agents

Grounding Computer Use Agents on Human Demonstrations

Building reliable computer-use agents requires grounding: accurately connecting natural language instructions to the correct on-screen elem…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Irresponsible AI: big tech's influence on AI research and associated impacts

The accelerated development, deployment and adoption of artificial intelligence systems has been fuelled by the increasing presence of big…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation

Semantic search for 100M+ galaxy images using AI-generated captions

Finding scientifically interesting phenomena through slow manual labeling campaigns severely limits our ability to explore the billions of…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Towards Deep Learning Surrogate for the Forward Problem in Electrocardiology: A Scalable Alternative to Physics-Based Models

The forward problem in electrocardiology, computing body surface potentials from cardiac electrical activity, is traditionally solved using…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Unifying Learning Dynamics and Generalization in Transformers Scaling Law

The scaling law, a cornerstone of Large Language Model (LLM) development, predicts improvements in model performance with increasing comput…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Causal Emotion Recognition in Conversation: Context Saturation and Discourse-Marker Evidence

We address two persistent gaps in Emotion Recognition in Conversation: which modeling choices materially affect performance, and how recogn…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Geometry of Reason: Spectral Signatures of Valid Mathematical Reasoning

Verifying whether a language model is genuinely reasoning or pattern-matching remains an open problem: learned verifiers are expensive, and…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

CoVar: Confidence-Variance-Guided Pseudo-Label Selection for Semi-Supervised Learning

Pseudo-label selection in semi-supervised learning is commonly driven by maximum-confidence thresholds, yet confidence alone can be unrelia…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Robust Privacy: Inference-Stage Privacy through Certified Robustness

An adversary observing a model's released prediction can infer sensitive attributes of the queried input, or even reconstruct representativ…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Reliability-Calibrated Edge-IoT Early Fault Warning for Rotating Machinery with a Physics-Guided Tiny-Mamba Transformer

Industrial Internet of Things (IIoT) systems increasingly rely on distributed vibration sensing to support predictive maintenance of rotati…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors, Business/Funding

When Generic Prompt Improvements Hurt: Evaluation-Driven Iteration for LLM Applications

Evaluating Large Language Model (LLM) applications differs from conventional software testing because outputs are probabilistic, semantical…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation, Business/Funding, Research/Papers

OpenVTON-Bench: A Large-Scale High-Resolution Benchmark for Controllable Virtual Try-On Evaluation

Recent advances in diffusion models have significantly elevated the visual fidelity of Virtual Try-On (VTON) systems, yet reliable evaluati…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Neural FOXP2 -- Language Specific Neuron Steering for Targeted Language Improvement in LLMs

LLMs are multilingual by training, yet their lingua franca is often English, reflecting English language dominance in pretraining. Other la…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation

Global Geometry Is Not Enough for Vision Representations

A common assumption in representation learning is that globally well-distributed embeddings support robust and generalizable representation…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Learning to Inject: Automated Prompt Injection via Reinforcement Learning

Prompt injection is a critical vulnerability in LLM agents, yet the strongest methods still rely on human red-teamers and hand-crafted prom…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

"Do Not Mention This to the User": Detecting and Understanding Malicious Agent Skills in the Wild

LLM-based coding agents increasingly rely on third-party extensions called skills, which bundle natural language instructions and helper sc…

2026-06-11 13:00 JST | arXiv cs.AI | Business/Funding

SAGE: Scalable AI Governance & Evaluation

Evaluating relevance in large-scale search systems is fundamentally constrained by the governance gap between nuanced, resource-constrained…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Improving Detection of Rare Nodes in Hierarchical Multi-Label Learning

In hierarchical multi-label classification, a persistent challenge is enabling model predictions to reach deeper levels of the hierarchy fo…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors

On the Optimal Reasoning Length for RL-Trained Language Models

Reinforcement learning substantially improves reasoning in large language models, but it also tends to lengthen chain-of-thought outputs an…

2026-06-11 13:00 JST | arXiv cs.AI | Business/Funding

Carbon-Aware Governance Gates: An Architecture for Sustainable GenAI Development

The rapid adoption of Generative AI (GenAI) in the software development life cycle (SDLC) increases computational demand, which can raise t…

2026-06-11 13:00 JST | arXiv cs.AI | Robotics

EKF-Based Depth Camera and Deep Learning Fusion for UAV-Person Distance Estimation and Following in SAR Operations

Vision-based Unmanned Aerial Vehicles (UAVs) frameworks aid human search tasks by detecting and recognizing specific individuals, then trac…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Compiler-First State Space Duality and Portable $O(1)$ Autoregressive Caching for Inference

High-throughput Mamba-2 inference is usually tied to fused CUDA and Triton kernels, limiting portability across accelerator backends. We sh…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation

The Latent Color Subspace: Emergent Order in High-Dimensional Chaos

Text-to-image generation models have advanced rapidly, yet achieving fine-grained control over generated images remains difficult, largely…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Power Term Polynomial Algebra for Boolean Logic

We introduce power term polynomial algebra, a representation language for Boolean formulae designed to bridge conjunctive normal form (CNF)…

2026-06-11 13:00 JST | arXiv cs.AI | Robotics

Sample-Efficient Hypergradient Estimation for Decentralized Bi-Level Reinforcement Learning

Many strategic decision-making problems, such as environment design for warehouse robots, can be naturally formulated as bi-level reinforce…

2026-06-11 13:00 JST | arXiv cs.AI | Agents, Robotics

Vision-Language-Action Jump-Starting for Reinforcement Learning Robotic Agents

Reinforcement learning (RL) enables high-frequency, closed-loop control for robotic manipulation, but scaling to long-horizon tasks with sp…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Robotics

Bimanual Robot Manipulation via Multi-Agent In-Context Learning

Language Models (LLMs) have emerged as powerful reasoning engines for embodied control. In particular, In-Context Learning (ICL) enables of…

2026-06-11 13:00 JST | arXiv cs.AI | Hardware/Semiconductors

Estimating Tail Risks in Language Model Output Distributions

Language models are increasingly capable and are being rapidly deployed on a population-level scale. As a result, the safety of these model…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Information bottleneck for learning the phase space of dynamics from high-dimensional experimental data

Identifying the dynamical state variables of a system from high-dimensional observations is a central problem across physical sciences. The…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Internet of Everything in the 6G Era: Paradigms, Enablers, Potentials and Future Directions

The Internet of Everything (IoE) represents an evolution of the Internet of Things (IoT) by integrating people, data, processes, and things…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Beyond Continuity: Simulation-free Reconstruction of Discrete Branching Dynamics from Single-cell Snapshots

Inferring cellular trajectories from destructive snapshots is complicated by the challenges of stochasticity and non-conservative mass dyna…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Self-Prompting Small Language Models for Privacy-Sensitive Clinical Information Extraction

Clinical named entity recognition from dental progress notes is challenging because documentation is highly unstructured, domain-specific,…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Towards an Inferentialist Account of Information Through Proof-theoretic Semantics

Information is one of the most widely-discussed concepts of the current era. However, a great deal of insightful work notwithstanding, it i…

2026-06-11 13:00 JST | arXiv cs.AI | Robotics

CredibleDFGO: Differentiable Factor Graph Optimization with Credibility Supervision

Global navigation satellite system (GNSS) positioning is widely used for urban navigation, but the covariance reported by the GNSS solver i…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

Litespark Inference For CPUs: Ultra-Fast SIMD Framework for Ternary (1.58-bit) Language Models

Large language models (LLMs) have transformed artificial intelligence, but their computational requirements remain prohibitive for most use…

2026-06-11 13:00 JST | arXiv cs.AI | Agents

Engineering Robustness into Personal Agents with the AI Workflow Store

The dominant paradigm for AI agents is an "on-the-fly" loop in which agents synthesize plans and execute actions within seconds or minutes…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

TokenRatio: Principled Token-Level Preference Optimization via Ratio Matching

Direct Preference Optimization (DPO) is a widely used RL-free method for aligning language models from pairwise preferences, but it models…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation

Weakly Supervised Segmentation as Semantic-Based Regularization

Weakly supervised semantic segmentation (WSSS) trains dense pixel-level segmentation models from partial or coarse annotations such as boun…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

CRANE: Constrained Reasoning Injection for Code Agents via Nullspace Editing

Code agents must both reason over long-horizon repository state and obey strict tool-use protocols. In paired Instruct/Thinking checkpoints…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

TAPIOCA: Why Task-Aware Pruning Improves OOD model Capability

Recent work has promoted task-aware layer pruning as a way to improve model performance on particular tasks, as shown by TALE. In this pape…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

ASRU: Activation Steering Meets Reinforcement Unlearning for Multimodal Large Language Models

Multimodal large language models (MLLMs) may memorize sensitive cross-modal information during pretraining, making machine unlearning (MU)…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Are Frontier LLMs Ready for Cybersecurity? Evidence for Vertical Foundation Models from Dual-Mode Vulnerability Benchmarks

We evaluate whether frontier LLMs are ready for cybersecurity through a dual-mode benchmark: white-box function-level vulnerability detecti…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

When Does Deep RL Beat Calibrated Baselines? A Benchmark Study on Adaptive Resource Control

A properly calibrated rule-based autoscaler can beat every one of six mainstream deep reinforcement learning (DRL) algorithms on cost acros…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

Models That Know How Evaluations Are Designed Score Safer

The validity of AI safety evaluations depends on models behaving consistently across controlled and deployment settings. Prior work has ide…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

GrowLoop: Self-Evolving Conversation Evaluation Seeded by Human

With the rapid advancement of large language models, evaluating human-likeness in open-ended conversation has become increasingly important…

2026-06-11 13:00 JST | arXiv cs.AI | Image/Video Generation

Brain-IT-VQA: From Brain Signals to Answers

Decoding visual content from fMRI signals recorded while a person views images, and specifically answering questions about the seen images,…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Geometric Erasure by Contrastive Velocity Matching in Rectified Flows

While the rapid adoption of multimodal generative models offers immense potential, it has also increased the risks of harmful content synth…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Anomalies in Multivariate Time Series Benchmarks Are Mostly Univariate

Many recent multivariate time series anomaly detection (MTSAD) models incorporate cross-channel modeling, under the implicit assumption tha…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Libra: Efficient Resource Management for Agentic RL Post-Training

Reinforcement learning (RL) has emerged as a standard post-training paradigm for shaping large language models (LLMs) into capable agents.…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

BaltiVoice: A Speech Corpus and Fine-tuned Whisper ASR System for the Balti Language

We present BaltiVoice, a 16.8-hour read-speech corpus for Balti (ISO 639-3: bft), a Tibetic language spoken in Gilgit-Baltistan, Pakistan,…

2026-06-11 13:00 JST | arXiv cs.AI | LLM/Generative AI

EvalStop: Using World Feedback to Detect and Correct Reward Overoptimization in Multi-Tenant RLHF Platforms

Cloud LLM fine-tuning platforms increasingly serve RLHF workloads, where a learned reward model is optimized as a proxy for human quality.…

2026-06-11 13:00 JST | arXiv cs.AI | Research/Papers

Conformal Risk-Averse Decision Making with Action Conditional Guarantee

Reliable decision making pipelines powered by machine learning models require uncertainty quantification (UQ) methods that come with explic…

2026-06-11 13:00 JST | arXiv cs.AI | Agents

Agentic Software: How AI Agents Are Restructuring the Software Paradigm

For over half a century, software engineering has operated on a foundational premise: human engineers decompose problems, encode decision l…

2026-06-11 12:53 JST | TechCrunch AI | LLM/Generative AI

Anthropic’s Dario Amodei has just one direct report

If you doubted his genius, doubt no more.

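2026-06-11 12:00 JST | ITmedia AI+ | Image/Video Generation, Hardware/Semiconductors

Google releases diffusion-based text generation model "DiffusionGemma", exceeding 1,000 tokens per second on a local GPU

Google has announced "DiffusionGemma", an experimental AI model that speeds up text generation by up to 4x. Applying the diffusion approach used in image generation, it generates 256 tokens in parallel at once, removing the bottleneck of conventional autoregressive models. Its quality trails that of standard models, but in local environments, fast…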

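2026-06-11 10:10 JST | ITmedia AI+ | LLM/Generative AI

Official one-chorus release turned into an unauthorized full-length version with AI and spread: Yuiko Ohara's "Mushoku Tensei III" (無職転生III) opening theme is affected

Someone took an officially released clip containing only one chorus, used generative AI to extend it into a full-length song without permission, and published it with the artist's own credit attached. This malicious act has now come to light.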

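2026-06-11 09:00 JST | ITmedia AI+ | Robotics

The "convincing reasons" China leads the humanoid robot development race: what comeback scenario remains for Japan?

How should Japan compete in a humanoid development race led by the US and China? Based on talks by McKinsey and the Ministry of Economy, Trade and Industry at "Humanoids Summit Tokyo 2026", this article explains Japan's strategy for becoming a third pole after the US and China.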

2026-06-11 09:00 JST | OpenAI | Agents

How an astrophysicist uses Codex to help simulate black holes

Discover how astrophysicist Chi-kwan Chan uses Codex to build black hole simulations, helping scientists study extreme physics and test Ein…

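2026-06-11 08:00 JST | ITmedia AI+ | Other

"DX Stocks 2026" case report published, featuring AI use cases from 51 companies

IPA has published a report compiling the DX case studies of companies selected for "DX Stocks 2026". In addition to case studies from the three Grand Prix companies and others, it presents the results of a survey of 289 companies listed on the Tokyo Stock Exchange.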

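2026-06-11 08:00 JST | ITmedia AI+ | LLM/Generative AI

"ChatGPT connects through connectors, so do we even need M365 Copilot?" We asked three experts: how "Work IQ" context management works, and why it pays to know

If you connect M365 data to another vendor's generative AI through connectors, is Copilot unnecessary? What separates the two is "Work IQ", which manages the context being referenced. Three Microsoft MVPs discuss its three-layer structure and what users should do to get the most out of it.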

2026-06-11 07:31 JST | TechCrunch AI | Business/Funding, Regulation/Policy

xAI fired an engineer who raised alarms about Grok safety, new lawsuit claims

A former xAI engineer is suing the company and SpaceX, alleging he was fired for raising AI safety concerns about Grok days before SpaceX's…

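2026-06-11 07:00 JST | ITmedia AI+ | LLM/Generative AI

"Digitizing everything" breaks organizations: leadership in the AI era, according to the person who named "GIGA School"

Generative AI can dramatically streamline work, but handled the wrong way it destroys engagement within an organization. EdLog president Nakagawa (中川哲, a former executive officer at Microsoft Japan) has delivered DX that cuts test grading time by up to 80%, yet insists he will "never introduce automated grading of written-response questions". With DX that exists in form only, he…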

2026-06-11 05:19 JST | TechCrunch AI | Other

Fresh off bond sale, Amazon borrows $17.5B from banks as AI spending continues

Companies are burning through exorbitant sums of money to keep pace in the AI arms race. Debt is climbing.

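2026-06-11 05:00 JST | ITmedia AI+ | LLM/Generative AI, Agents

Can you control the Codex app on Windows from a smartphone? How to keep AI coding going while you are out

OpenAI's Codex app now lets you check on and direct development work on Windows from a smartphone. Even if you step away from the PC during AI coding, the work is less likely to stall. This article introduces a feature that is genuinely welcome in practice.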

2026-06-11 05:00 JST | OpenAI | LLM/Generative AI, Agents

Access OpenAI models and Codex through your Oracle cloud commitment

Access OpenAI models and Codex through Oracle Cloud, using existing commitments to build and deploy AI with enterprise security and governa…

2026-06-11 02:07 JST | TechCrunch AI | Other

‘AI-pilled’ firms spend $7,500 per employee each month on AI

The most AI-obsessed firms are spending roughly $7,500 monthly per employee on AI, per Ramp AI Index. That's not more than an engineer's sa…

2026-06-11 01:24 JST | Google DeepMind | Other

DiffusionGemma: 4x faster text generation

2026-06-11 01:11 JST | TechCrunch AI | Research/Papers

How memory tools can make AI models worse

New research suggests that AI memory systems can degrade model performance and encourage sycophantic tendencies.

2026-06-11 00:41 JST | TechCrunch AI | LLM/Generative AI, Research/Papers

Cybersecurity researchers aren’t happy about the guardrails on Anthropic’s Fable

Cybersecurity researchers are complaining that Anthropic's new model Fable has guardrails that are too strict for any cybersecurity work.

2026-06-11 00:00 JST | TechCrunch AI | Agents, Business/Funding

Datadog veterans launch AI coding startup Niteshift on a bet against Big AI lock-in

AI coding agent startup Niteshift has raised a $7 million seed round from a who's who of angels. It's betting companies will want power ove…

2026-06-10 (411 items)

2026-06-10 23:48 JST | TechCrunch AI | Business/Funding

The three hard-tech moonshots fueling SpaceX’s unbelievable IPO

Most of the value in SpaceX's IPO is effectively a call option on the company's ambitious space data center plans.

2026-06-10 23:31 JST | TechCrunch AI | Business/Funding

Warner Music acquires AI attribution startup Sureel AI

Through the acquisition, WMG aims to better track when its artists' work is used in AI-generated content or for training AI models.

2026-06-10 22:33 JST | TechCrunch AI | Agents, Business/Funding

Jedify raises $24M to help companies arm AI agents with context on their business

The funding round was led by Norwest, with participation from S Capital VC, Cerca Partners, and Oceans Ventures. Snowflake Ventures also pa…

2026-06-10 22:07 JST | TechCrunch AI | Agents

Decart’s new world model can simulate hours of photorealistic driving — with some caveats

Decart is launching Oasis 3, a real-time world model that generates photorealistic driving environments for autonomous vehicle testing, now…

2026-06-10 21:00 JST | OpenAI | LLM/Generative AI

PRC-linked influence operations are targeting AI debates in the US

A new report from OpenAI details PRC-linked influence operations using AI to target U.S. tech debates, data center narratives, tariffs, and…

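2026-06-10 20:50 JST | ITmedia AI+ | LLM/Generative AI

Ads coming to ChatGPT: Free and Go plans affected, policy update on June 22

On June 10, US-based OpenAI revised its privacy policy, adding provisions on advertising in "ChatGPT". The Free and "Go" plans are affected.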

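2026-06-10 19:42 JST | ITmedia AI+ | Agents

Do AI agents fall for phishing scams too? A US security firm tested it with OpenClaw, and the result...

AI agents are a hot topic these days, and some people are trying to streamline their work by having a locally running agent operate their PC. If an AI agent fell for a phishing scam, however, the consequences could be serious. US security firm Varonis announced on June 9 (local time)…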

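2026-06-10 18:58 JST | ITmedia AI+ | Other

Apple's "Siri AI": more than 1.3 billion devices "unable to use" it? The "weakness" holding back the rollout of new features

In a research report, US-based Morgan Stanley pointed out a weakness in "Siri AI", newly announced by US-based Apple, that holds back the expansion of its new features.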

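2026-06-10 17:30 JST | ITmedia AI+ | LLM/Generative AI, Hardware/Semiconductors

The misconception that the evolved "Siri AI" is "just Gemini": the full picture of the "reborn Apple Intelligence" seen through on-site reporting

"Gemini is what Apple Intelligence really is" is a misconception. The third generation, as seen through on-site reporting at WWDC 2026, spans innovative technology that runs a 20-billion-parameter AI on the iPhone, an infrastructure overhaul built on Google Cloud + NVIDIA, and a quietly changing definition of "free"…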

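2026-06-10 17:01 JST | ITmedia AI+ | LLM/Generative AI

As generative AI rises, management consultancy bankruptcies and closures run at a record pace: "reliance on subsidies" hits its limit

"Operators that cannot differentiate themselves through expertise and cannot break away from labor-intensive, program-dependent business will be unable to withstand the downward pressure from the rise of generative AI, and the shakeout will accelerate further."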

2026-06-10 16:05 JST | TechCrunch AI | Other

Meta signs first AI data center deal in India with Reliance

The 168-megawatt facility will support Meta's global AI computing needs and can be expanded over time.

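2026-06-10 15:42 JST | ITmedia AI+ | LLM/Generative AI

Prepare for "what comes after Claude Fable 5": Anthropic holds an event in Tokyo, where the head of "Claude" shared three guidelines for developers

Anthropic held its developer event "Code with Claude" in Tokyo. With the new model "Claude Fable 5", made generally available the same day, in mind, speakers discussed guidelines for developing services that incorporate high-performance AI.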

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Business World Model

Businesses are increasingly adopting AI-enabled tools to improve productivity, reduce costs, and enhance products and services. However, th…

2026-06-10 13:00 JST | arXiv cs.AI | Agents

Deployment-Time Memorization in Foundation-Model Agents

Foundation-model agents are increasingly long-lived systems that remember users across interactions, making memorization an explicit deploy…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Exploratory Responsiveness and Adaptive Rigidity under AI-Assisted Optimization

This paper develops a theory of exploratory adaptation under AI-assisted optimization. The central argument is that the long-run adaptive e…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Predictive Assistance and the Temporal Dynamics of Exploratory Compression

Classical theories of cognition describe problem solving as exploratory search through structured problem spaces in which repeated interact…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

From Senses to Decisions: The Information Flow of Auditory and Visual Perception in Multimodal LLMs

Multimodal Large Language Models (MLLMs) can listen and see, but how do audio and visual signals actually travel through the network to sha…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Less Context, Better Agents: Efficient Context Engineering for Long-Horizon Tool-Using LLM Agents

Large language models deployed as autonomous agents for enterprise workflows face a key challenge: verbose tool responses from enterprise s…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Minimalist Genetic Programming

Genetic programming (GP) is based on two important insights. First, that any learning task can fundamentally be posed as a program inductio…

2026-06-10 13:00 JST | arXiv cs.AI | Agents

Regimes: An Auditable, Held-Out-Gated Improvement Loop Demonstrated on LongMemEval with ActiveGraph

Autonomous improvement loops are hard to trust because the improvement process is usually external scaffolding bolted onto the agent: failu…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

RealMath-Eval: Why SOTA Judges Struggle with Real Human Reasoning

While Large Language Models (LLMs) have achieved near-perfect performance in solving high-school mathematics, their ability to…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Supervised Fine-tuning with Synthetic Rationale Data Hurts Real-World Disease Prediction

Supervised fine-tuning with synthetic rationale data is widely assumed to improve language model performance on clinical prediction tasks b…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Sim2Schedule: A Simulator-Guided LLM Framework for Autonomous Open-Pit Mine Scheduling

Open-pit mine scheduling is a critical process for maximizing economic return under complex geotechnical and operational constraints. While…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

From Context-Aware to Conflict-Aware: Generalizing Contrastive Decoding for Knowledge Conflict in LLMs

When large language models generate from retrieved or augmented contexts, conflicts between external context and parametric priors remain a…

2026-06-10 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents

What Spatial Memory Must Store: Occlusion as the Test for Language-Agent Memory

Language-agent "memory palace" systems anchor each memory to a world coordinate, on the intuition that geometry adds something text cannot.…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Mobility Anomaly Generation using LLM-Driven Behavior with Kinematic Constraints

Although the study of human trajectory anomalies is critical for advancing spatial data mining, empirical research remains severely hindere…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Self-Distillation Policy Optimization via Visual Feedback: Bridging Code and Visual Artifacts

Code-generating large language models (LLMs) increasingly produce visual artifacts such as charts, web pages, and slides by writing program…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Reasoning or Memorization? Direction-Aware Diversity Exploration in LLM Reinforcement Learning

Reinforcement learning has become a key paradigm for eliciting reasoning abilities in large language models, where exploration is crucial f…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

ReflectiChain: Epistemic Grounding in LLM-Driven World Models for Supply Chain Resilience

AI agents in supply chains face a fundamental epistemic gap: large language models (LLMs) interpret policies but lack physical grounding, w…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Belief-Space Control for Personalized Cancer Treatment via Active Inference

Cancer treatment is at the core a sequential decision-making problem with partial observability, latent patient heterogeneity, and explicit…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

Beyond Static Evaluation: Co-Evolutionary Mechanisms for LLM-Driven Strategy Evolution in Adversarial Games

Recent advances in LLM-driven code evolution have enabled automated discovery by iteratively generating and improving programs. However, ap…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Instruction Finetuning DeepSeek-R1-8B Model Using LoRA and NEFTune

Financial named-entity recognition (NER) is essential for translating unstructured financial reports and news into structured knowledge gra…

2026-06-10 13:00 JST | arXiv cs.AI | Agents, Research/Papers

STAGE-Claw: Automated State-based Agent Benchmarking for Realistic Scenarios

Large language models are increasingly used to power personal agents for everyday applications, but evaluating these agents remains a chall…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

A Unified Multi-Modal Framework for Intelligent Financial Systems: Integrating Reinforcement Learning, High-Frequency Trading, and Game-Theoretic Approaches with Cross-Modal Sentiment Analysis

The rapid evolution of financial technology demands sophisticated artificial intelligence systems capable of handling diverse challenges ac…

2026-06-10 13:00 JST | arXiv cs.AI | Agents

Soul Computing: A Theoretical Framework and Technical Architecture for Intelligent Agents with Independent Consciousness

Breakthroughs in large language models and multimodal generation technologies have propelled the digital reconstruction of human mental tra…

2026-06-10 13:00 JST | arXiv cs.AI | Agents

Trace2Policy: From Expert Behavior Traces to Self-Evolving Decision Agents

Decision rules that enterprise experts apply tacitly -- in auditing, compliance, and contract review -- can be systematically recovered and…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

ComBench: A Benchmark for Rigorous Proof Reasoning and Constructive Realization in Olympiad-Level Combinatorics

Combinatorics is central to Olympiad-level mathematical problem solving, requiring deep discrete reasoning, creative constructions, and rig…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

A complementary study on PlanGPT: Evaluation with defined Performance Metrics and comparison with a planner

Automated Planning is a subfield of Artificial Intelligence (AI) where the main objective is generating a sequence of actions, known as a p…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

A Reliable Fault Diagnosis Method Based on Belief Rule Base Consider Robustness Analysis

In equipment operation, the implementation of fault diagnosis is essential to ensure the continuity and safety of production equipment, imp…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Cross-Modal Knowledge Distillation without Paired Data: Theoretical Foundation and Algorithm

Cross-modal knowledge distillation (CMKD) studies how a (large) teacher model trained on one type of data (e.g., images) can guide a (small…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

HIPIF: Hierarchical Planning and Information Folding for Long-Horizon LLM Agent Learning

While Large Language Models (LLMs) have demonstrated strong capabilities as autonomous agents across a wide range of tasks, their performan…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

ActiveMem: Distributed Active Memory for Long-Horizon LLM Reasoning

Memory is essential for enabling large language model (LLM) agents to handle long-horizon reasoning tasks. Existing memory mechanisms are l…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

One Token per Multimodal Evidence: Latent Memory for Resource-Constrained QA

External memory effectively grounds large language models (LLMs) and vision-language models (VLMs)-based question answering (QA) in relevan…

2026-06-10 13:00 JST | arXiv cs.AI | Agents

Learning What to Remember: Observability-Safe Memory Retention via Constrained Optimization for Long-Horizon Language Agents

Long-horizon language agents accumulate observations, reasoning traces, and retrieved facts that exceed their finite context windows, makin…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Infini Memory: Maintainable Topic Documents for Long-Term LLM Agent Memory

Long-term LLM agents need persistent memory that can track changing facts and provide relevant evidence across sessions. Existing memory sy…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

When the Chain of Thought Knows Better: Failure Modes in Multi-Turn Reasoning Models

Failures in multi-turn reasoning models are largely invisible to terminal-score evaluation. A model can lock onto an unsafe stance early in…

2026-06-10 13:00 JST | arXiv cs.AI | Agents

The Arbiter Agent: Continually Monitoring Multi-Agent Conversations to Detect Emergent Misalignment

As AI systems built from multiple language-model agents become more common, they are increasingly used to make decisions together: discussi…

2026-06-10 13:00 JST | arXiv cs.AI | Agents

AutoPDE: Reliable Agentic PDE Solving via Explicitly Represented Solver Strategies

Numerical solvers for partial differential equations (PDEs) are core computational tools in science and engineering. Building reliable PDE…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Accelerating NeurASP with vectorization and caching

Neurosymbolic AI combines neural networks with symbolic programs to create robust and explainable predictions. One such framework is NeurAS…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

READER: Robust Evidence-based Authorship Decoding via Extracted Representations

As agentic applications increasingly route user tasks through official and third-party LLM APIs, provenance becomes an operational question…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding, Research/Papers

Evaluating Research-Level Math Proofs via Strict Step-Level Verification

Large Language Models (LLMs) struggle to rigorously verify complex mathematical proofs. Standard global evaluation approaches suffer from "…

2026-06-10 13:00 JST | arXiv cs.AI | Agents, Research/Papers

Moonshine: An Autonomous Mathematical Research Agent Centered on Conjecture Generation

Moonshine is an autonomous agent whose central objective is to generate mathematical conjectures. Its core capability is to extract structu…

2026-06-10 13:00 JST | arXiv cs.AI | Business/Funding, Research/Papers

Do VLMs Reason Like Engineers? A Benchmark and a Stage-wise Evaluation

Vision-Language Models (VLMs) demonstrate strong performance on general multimodal reasoning benchmarks, yet their ability to perform engin…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Large-scale semantic mapping of learner agency and autonomy reveals what measurement and generative AI research overlook

Learner agency and autonomy are foundational to personal development, yet a pervasive "jingle-jangle" fallacy (i.e. identical terms denotin…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution

Although Large Language Model (LLM) agents have demonstrated strong performance on complex tasks, their learning is often limited by ineffi…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Frontier Coding Agents Use Metaprogramming to Adapt to Unfamiliar Programming Languages

LLM-based coding agents are usually evaluated in familiar software settings: mainstream languages, common libraries, and public repositorie…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

WorldKernel: A World Model is the Coupling Kernel of Admissible Possible Worlds

A common assumption holds that enough observational and interventional data, given to a strong enough predictor, suffices. We report a fail…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

Recalling Too Well: Sycophancy Evaluation and Mitigation in Memory-Augmented Models

Persistent memory systems promise to make LLMs more helpful by storing user beliefs over time. We show they also make models less correct b…

2026-06-10 13:00 JST | arXiv cs.AI | Image/Video Generation

Architect-Ant: Editable Automatic Furnishing of Architectural Floor Plans

Furnished floor plans are fundamental to real estate visualization, interior design, and architectural workflows. However, progress in auto…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Mind the Gap: Can Frontier LLMs Pass a Standardized Office Proficiency Exam?

The deployment of Large Language Model (LLM) agents for computer automation is accelerating, yet their ability to navigate complex, profess…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Bellman-Taylor Score Decoding for Markov Decision Processes with State-Dependent Feasible Action Sets

Many Markov decision processes (MDPs) in operations research have feasible actions that are state dependent and defined implicitly by vario…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Null-Space Constrained Low-Rank Adaptation for Response-Specified Large Language Model Unlearning

Large language model unlearning aims to suppress designated undesirable knowledge while preserving benign capabilities. Many unlearning obj…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors

Structure from Reasoning, Numbers from Search: On-Premise Open LLMs as Structural Priors for Coupled MIMO Controller Tuning

Tuning controllers for strongly coupled multi-input multi-output (MIMO) industrial processes is hard: decentralized classical auto-tuning i…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Superficial Beliefs in LLM Decision-Making

We ask whether large language models (LLMs) merely imitate rationales when choosing between two options, or whether their choices reflect a…

2026-06-10 13:00 JST | arXiv cs.AI | Agents, Business/Funding

Workflow-GYM: Towards Long-Horizon Evaluation of Computer-use Agentic tasks in Real-World Professional Fields

Recent years have witnessed the rapid evolution of AI agents toward handling increasingly complex, real-world tasks. However, existing benc…

2026-06-10 13:00 JST | arXiv cs.AI | Agents, Research/Papers

What Fits (Into Few Tokens) Doesn't Overfit: Compression and Generalization in ML Research Agents

Reusing a held-out benchmark adaptively should, in principle, invite overfitting. Yet benchmark-driven machine learning (ML) has produced s…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

CIAware-Bench: Benchmarking Control Intervention Awareness Across Frontier LLMs

AI control protocols oversee untrusted models by monitoring their actions and modifying potentially unsafe steps, often using a trusted mod…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Agents

A History-Aware Visually Grounded Critic for Computer Use Agents

Various test-time interventions for Computer Use Agents (CUAs), including critic models, have been developed to improve performance through…

2026-06-10 13:00 JST | arXiv cs.AI | Image/Video Generation, Business/Funding

Monte Carlo Pass Search: Using Trajectory Generation for 3D Counterfactual Pass Evaluation in Football

We recast pass evaluation in football (soccer) as a Monte Carlo Tree Search (MCTS)-like evaluation problem whose components mostly exist in…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

ABC-Bench: An Agentic Bio-Capabilities Benchmark for Biosecurity

Large language models (LLMs) are rapidly acquiring capabilities relevant to biological research, from literature synthesis to interpretatio…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

ReasonAlloc: Hierarchical Decoding-Time KV Cache Budget Allocation for Reasoning Models

Long chain-of-thought (CoT) trajectories in large language model (LLM) reasoning cause severe inference bottlenecks due to rapid key-value…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

The Role of Feedback Alignment in Self-Distillation

Conditioning a language model on additional context, such as feedback on a previous attempt, typically improves its response. Self-distilla…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

More Human or More AI? Visualizing Human-AI Collaboration Disclosures in Journalistic News Production

Within journalistic editorial processes, disclosing AI usage is currently limited to simplistic labels, which misses the nuance of how huma…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Culturally-Aware AI for Cross-Boundary Community Learning: Undergraduate Innovation at the Intersection of Computation and Design

Research on artificial intelligence in education (AIED) is rapidly expanding, yet technical progress often lacks human-centered grounding a…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

AI-Driven Analytics of Team-Teaching Talk: Acoustic Patterns across Experience, Cohorts and the Learning Design

As classroom cohorts expand, team teaching is increasingly used to integrate the expertise and pedagogical perspectives of multiple teacher…

2026-06-10 13:00 JST | arXiv cs.AI | Agents

Agentic Social Affordance Framework (ASAF): Agent Identity Design as a Collaboration Interface in Multi-Agent Systems

As AI systems evolve from single conversational agents to complex multi-agent architectures, a critical design dimension has been overlooke…

2026-06-10 13:00 JST | arXiv cs.AI | Agents

CollabSkill: Evaluating Human-Agent Collaboration On Real-World Tasks

AI agents are reshaping the workspace, drastically changing how humans work. Despite the considerable potential of human-agent coll…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Self-EmoQ: Plutchik-Guided Value-based Planning to Drive Streaming Emotional TTS

Emotional interaction is increasingly crucial for conversational AI, yet current systems lack a self-emotion determination mechanism to dri…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Aesthetic Perspectives in Information Systems Research: A Hermeneutic Analysis

How might implicit aesthetic perspectives shape what Information Systems (IS) scholarship recognises as worthy of study (or not)? In this h…

2026-06-10 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

Integrated Real-Time Motion Tracking and AI Analysis for Athletic Performance Optimization

Applying Human Pose Estimation (HPE) in real-world environments remains a challenging task. This paper explores and surveys real-time HPE a…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

An LLM-Native Psychometric Instrument Does Not Predict LLM Behavior: Evidence Across 25 Models

Large language models (LLMs) produce stable self-reports on personality inventories, but these self-reports do not predict observed behavio…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

The Interlocutor Effect: Why LLMs Leak More Personal Data to Agents Than Humans

Large Language Models (LLMs) alter their privacy behavior based on the perceived identity of their interlocutor. While safety mechanisms ty…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

CANVAS: Captioning Art with Narrative Visual-Audio AI Systems

Visual art remains largely inaccessible to blind and low-vision (BLV) audiences due to brief or absent alt-text, which rarely conveys the s…

2026-06-10 13:00 JST | arXiv cs.AI | Agents

Human-AI Coordination Zones: A Framework for Designing Human-in-the-Loop Experiences with Agentic AI

As generative and agentic AI becomes embedded in everyday products, practitioners face a persistent challenge: how to design human-AI coord…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

LLM-Based Code Documentation Generation and Multi-Judge Evaluation

High-quality source code documentation is vital yet often neglected, especially in critical domains like healthcare where reliability and m…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Can Multi-Agent LLMs Identify Their Peers? Stylometric Fingerprinting in Role-Constrained Political Analysis

Multi-agent large language model (LLM) pipelines for political statement analysis are vulnerable to peer-preservation bias: models tend to…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Using Probabilistic Programs to Train Inductive Reasoning in Large Language Models

Post-training Large Language Models (LLMs) for reasoning typically focuses on deductive tasks such as mathematics and coding where correctn…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Support sufficiency as action-sufficient compression: a single-cycle rate-regret formulation

Robust decision-making requires compression. A system that forms a rich support state cannot usually preserve its full structure at the poi…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Mitigating Manifold Departure: Uncertainty-Aware Subspace Rectification for Trustworthy MLLM Decoding

MLLMs frequently hallucinate objects inconsistent with visual inputs. This issue is typically attributed to the over-reliance on language p…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Conformal Risk Prediction for Non-Alcoholic Fatty Liver Disease Using Gradient Boosting with Distribution-Free Coverages

Non-alcoholic fatty liver disease (NAFLD) affects roughly 25% of global adults, posing substantial hepatic and cardiovascular risks. Yet, p…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Time Series as Language: A Universal Tokenizer for General-Purpose Time Series Foundation Models

While Next-Token Prediction (NTP) has unified LLM pretraining, its adaptation to unbounded, continuous time series (TS) remains open. To br…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Blurry Window Attention

The Softmax Attention operation in Transformer language models has a quadratic complexity in the sequence length and a growing state size i…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

Alignment Collapse Under KV Cache Quantization: Diagnosis and Mitigation

Key-value (KV) cache quantization is widely used to reduce Large Language Model (LLM) inference memory, yet existing evaluations solely foc…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Two to Tango: Coupled Task-Reference Selection for Safe LLM Fine-tuning

Fine-tuning safety aligned large language models (LLMs) on downstream data improves adaptation but may erode learned safety behavior. Exist…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

EstRTL: Functional Estimation Guided RTL Code Generation

Optimizing register transfer level (RTL) code is of vital importance in hardware design. Large language models (LLMs) provide new methods f…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

SPACE: Source-free Proxy Anchor Concept Erasure for MLLMs

As Multimodal Large Language Models (MLLMs) face growing privacy risks and regulatory constraints, machine unlearning (MU) has emerged as a…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

QSplitFL: Capability Aware Deep Q-Learning for Optimal Split Point Selection in Split Federated Learning

Federated Learning (FL) combined with Split Learning (SL) is a privacy preserving paradigm that enables training deep neural networks (DNNs…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

SD-GRPO: Verifiable Segment Decomposition for Long-Form Vision-Language Generation

Group Relative Policy Optimization (GRPO) and its variants, originally developed for Large Language Models (LLMs), have recently been appli…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

PatchSTG: Scalable Spatiotemporal Graph Transformers for Traffic Forecasting on Irregular Sensor Networks

Traffic forecasting is a fundamental component of intelligent transportation systems, yet remains challenging in real-world settings due to…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Rotate2Think: Geometric Priming via Orthogonal Rotation to Improve Language Model Reasoning

Reasoning models achieve strong performance on challenging tasks by generating explicit intermediate reasoning traces before producing a fi…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Integrating Local and Global Entropy for Uncertainty Quantification in LLMs

Large language models hallucinate confidently, making uncertainty quantification (UQ) essential for reliable deployment. Existing methods r…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

TD-Grokking: Learning from Zero-Reward Problems by Training-Time Decomposition

Large language models (LLMs) have made remarkable progress in reasoning tasks, largely driven by post-training paradigms, especially reinfo…

2026-06-10 13:00 JST | arXiv cs.AI | Agents

Failure Modes of Deep Multi-Agent RL in Asynchronous Pricing: Reproducible Triggers, Trace Diagnostics, and a Partial Fix

We study two reproducible failure modes of deep multi-agent reinforcement learning in continuous-time pricing markets: (i) tacit cartel for…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

SHAPE: Coalition-Aware Expert Pruning for Sparse Mixture-of-Experts LLMs

Sparse Mixture-of-Experts (MoE) large language models achieve strong quality with low per-token compute, yet their deployment is often limi…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

SocraticPO: Policy Optimization via Interactive Guidance

Reinforcement learning (RL) for large language models usually supervises reasoning with scalar outcome rewards, such as binary correctness.…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

PreAct-Bench: Benchmarking Predictive Monitoring in LLMs

Large language models (LLMs) are increasingly deployed as autonomous agents capable of executing multi-step action trajectories toward a gi…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Representation Curriculum: Stagewise Training for Robust Ranking and Allocation

Ranking in digital marketplaces is a dynamic exposure-allocation mechanism: displayed items shape discovery trajectories and success events…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Tractogram foundation model

Diffusion MRI (dMRI) tractography is the only noninvasive approach for mapping white-matter pathways in the living human brain. It represen…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

HMAF: A Hierarchical Multi-Slot GD-RTB Allocation Framework

In modern online advertising platforms, Guaranteed Delivery (GD) contracts coexist and bid with Real-Time Bidding (RTB) auctions. Recent ap…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

When Attribution Patching Lies: Diagnosis and a Second-Order Correction

A central goal of mechanistic interpretability is to identify which internal components causally drive a language model's behavior. Because…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Less Context, More Accuracy: A Bi-Temporal Memory Engine for LLM Agents Where a Lean Retrieved Context Beats the Full History

Long-term memory is the missing layer for LLM agents: across sessions they forget, and the common workaround -- replaying the whole history…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

The Whale That Outswam Evolution: Swarm Intelligence Maximises Memory in Connectome Reservoirs

Reservoir computing exploits the fixed dynamics of a recurrent network for temporal processing, requiring only a trained linear readout. Bi…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

LongMoE: Longitudinal Multimodal Learning via Trajectory-Aware Mixture-of-Experts

Multimodal clinical learning is increasingly important for integrating diverse patient data, including imaging, text, and personalised heal…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

IDP-Bench: Benchmarking ability of LLMs to protect personal information in interdependent privacy contexts

Large language models (LLMs) are becoming widely deployed as personal AI assistants with access to sensitive user data, making privacy a ma…

2026-06-10 13:00 JST | arXiv cs.AI | Image/Video Generation, Regulation/Policy

Bypassing Copyright Protection in Diffusion-based Customization via Two-Stage Latent Feature Optimization

With the growing concerns over copyright infringement in diffusion-based customization, adversarial attacks have emerged as a prominent def…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Mix, Don't Pick: Why Synthetic Corpus Composition Matters for Time Series Foundation Model Pretraining

Choosing the wrong synthetic generator for time-series foundation model pretraining is costly: under identical training budgets, the best a…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

IntentKV: Cross-Turn Intent-Aware KV Cache Pruning for Agent Inference

Multi-turn LLM agents fan short queries into long trajectories of tool calls, search results, and intermediate reasoning. Both KV memory an…

2026-06-10 13:00 JST | arXiv cs.AI | Robotics

Co-GLANCE: Uncertainty-Aware Active Perception for Heterogeneous Robot Teaming

Perceptual uncertainty is a central challenge for heterogeneous robot teams operating in unstructured outdoor environments, where no single…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

The Bioelectrical Information Theory: Investigating the theoretical compression limit of bioelectrical signals under artificial intelligence

Bioelectrical signals are increasingly acquired at scales that challenge the bandwidth of brain-computer interfaces. However, their compres…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Conformal Prediction for Neural Operators: Distribution-Free Uncertainty Quantification in Physics Simulation

Neural operators such as the Fourier Neural Operator (FNO) have emerged as powerful surrogates for solving partial differential equations (…

2026-06-10 13:00 JST | arXiv cs.AI | Hardware/Semiconductors

Sigma-Branch: Hierarchical Single-Path Network Reconstruction for Dynamic Inference with Reduced Active Parameters

Deploying deep neural networks on memory-constrained edge accelerators is bottlenecked by per-inference off-chip weight transfer rather tha…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Sample Where You Struggle: Sharpening Base Model Reasoning via Entropy-Guided Power Sampling

Sampling from the sequence-level power distribution $p^\alpha$ elicits RL-level reasoning from base language models without any parameter u…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Trainable Smooth-Rotation Transforms with Learned Channel Scales for LLM Quantization

Post-training quantization (PTQ) is one of the most practical ways to reduce the serving cost of Large Language Models (LLMs), but activati…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Forward-Only Convolutional Neural Networks with Learnable Channel-Class Assignment

The Forward-Forward (FF) algorithm offers a biologically inspired alternative to backpropagation by replacing gradient-based credit assignm…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Between Amnesia and Chaos: A Memory Stability Expressivity Trilemma for Trainable Dissipative Oscillator Networks

Physical reservoir computing harnesses nonlinear mechanical dynamics but, by convention, freezes the substrate and trains only a linear rea…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

A Note on the Strategic Confinement Problem

Lampson's confinement problem asks how to prevent a program that processes confidential information from leaking it to a third party. We in…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

When RL Fails after SFT: Rejuvenating Model Plasticity for Robust SFT-to-RL Handoff

Supervised Fine-Tuning (SFT) followed by Reinforcement Learning (RL) has become a standard pipeline for Large Language Model (LLM) post-tra…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

GitInject: Real-World Prompt Injection Attacks in AI-Powered CI/CD Pipelines

AI-powered agents are increasingly embedded in continuous integration and continuous delivery/deployment (CI/CD) pipelines to autonomously…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

One Lens, Many Worlds: A Capability-Typed Interface for World-Model Interpretability

World models are now built on substantially different computational substrates. Latent recurrent state-space models such as PlaNet and the…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

RKSC: Reasoning-Aware KV Cache Sharing and Confident Early Exit for Multi-Step LLM Inference

We introduce RKSC (Reasoning-Aware KV Cache Sharing), a training-free inference framework that eliminates two structural redundancies in mu…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Interactions Between Crosscoder Features: A Compact Proofs Perspective

Dictionary learning methods like Sparse Autoencoders (SAEs) and crosscoders attempt to explain a model by decomposing its activations into…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Anomaly Detection and Root Cause Analysis for Microservice Systems

Microservice systems are widely used to build cloud applications, yet their complexity makes failures inevitable, degrading user experience…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

GAGI: A Gini-Adjusted GDP-per-Capita Index for Distribution-Aware Macroeconomic Welfare Monitoring

GDP per capita is the default lens through which governing bodies track the economic prosperity and consequences of economic events, yet…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Learning Where to Simulate: Generative Active Sampling for Online PDE Surrogate Training

Data-driven PDE surrogates are trained with data produced by numerical PDE solvers. However, when the surrogate's goal is to generalize acr…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Deep Slice Interpolation for Reducing Through-Plane Anisotropy and Noise in Head CT

Head computed tomography (CT) typically uses sub-millimeter in-plane resolution but 2-5 mm through-plane spacing, creating substantial anis…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Does Normalization Choice Matter for Causal Large Time-Series Models?

Large models for time-series forecasting have emerged as a promising paradigm for training models on heterogeneous collections of sign…

2026-06-10 13:00 JST | arXiv cs.AI | Agents, Robotics

Uncertainty-Aware Motion Planning for Autonomous Driving in Mixed Traffic Environment

In mixed-traffic environments where autonomous and human-driven vehicles may co-exist, motion planning for autonomous vehicles requires ant…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Temporal Context Conditioning for Seasonality-Aware Precipitation Nowcasting of High-Intensity Rainfall

Precipitation nowcasting is increasingly being approached with deep learning models that learn directly from recent radar observations. Alt…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

HydraCIL: Decoupled Class-Incremental Learning through Prototype-Guided Multi-Head Classifiers

We present HydraCIL, a decoupled continual learning model based on prototype-guided multi-head classifiers, targeting sustainable deploymen…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

3SPO: State-Score-Supervised Policy Optimization for LLM Agents

Training large language models (LLMs) as autonomous agents via reinforcement learning (RL) has enabled frontier models to achieve superhuma…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Optimality of FSQ Tokens for Continuous Diffusion for Categorical Data with Application to Text-to-Speech

Continuous diffusion for categorical data is a framework in the diffusion family that aims to generate discrete data. The scie…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Geometry-Aware Anisotropic Boundary Correction for Aerodynamic Simulation

Aerodynamic simulation is a key component of engineering shape design, where core quantities such as the surface pressure coefficient stron…

2026-06-10 13:00 JST | arXiv cs.AI | Business/Funding

DeRA-MOS: Optimizing Text-to-Music Evaluation via Decoupled Listwise Ranking and Modality Alignment

Evaluating text-to-music (TTM) systems remains expensive because music impression (MI) and text alignment (TA) scores rely on human mean op…

2026-06-10 13:00 JST | arXiv cs.AI | Image/Video Generation, Robotics

Generalized-CVO: Fast and Correspondence-Free Local Point Cloud Registration with Second Order Riemannian Optimization

We propose a fast and correspondence-free local point cloud registration method that leverages geometric surface structure and reproducing…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Interpreting and Steering a Text-to-Speech Language Model with Sparse Autoencoders

Language models increasingly serve as the backbone of text-to-speech (TTS) systems, yet we understand little about the representations they…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Inside the Latent Flow: Causal Deciphering of Attention Dynamics in Audio Separation Foundation Models

Flow-matching transformers achieve strong audio separation, yet their attention dynamics are opaque. We adapt established causal-interventi…

2026-06-10 13:00 JST | arXiv cs.AI | Agents

Bittensor Agent Arenas as a Trajectory Primitive: Distilling a Shopping Agent from ShoppingBench Subnet Traces

Small-model agentic post-training is bottlenecked less by the algorithm than by the trajectory substrate it consumes. Leading recipes (RLVR…

2026-06-10 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

A Controlled Audit of Pretraining Contamination in Public Medical Vision-Language Benchmarks

Medical vision-language models (VLMs) are evaluated on public benchmarks whose images and question-answer pairs have been freely downloadab…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Importance-Aware Scheduling for High-Dimensional Hyperparameter Optimization

Hyperparameter Optimization (HPO) is essential for building high-performing ML/DL models, yet conventional optimizers often struggle in hig…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Temporal Sheaf Neural Networks with Dynamic Orthogonal Transport

We introduce Temporal Sheaf Neural Networks (TSNN), a temporal link prediction framework that equips each node with a time-varying orthogon…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

VFUSE: Virulent Feature Understanding with Sparse autoEncoders

Generative models have shown remarkable progress in a variety of domains such as protein design, but such power enables the opaque generati…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Divide-and-Conquer Modeling for the CTF-4-Science Lorenz Benchmark

This work presents a divide-and-conquer modeling strategy for the CTF-4-Science Lorenz benchmark, which evaluates chaotic-system prediction…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

A Theory on Flow Matching with Neural Networks

In this work, we develop a theoretical foundation for flow matching with neural-network-parameterized conditional velocity fields. We establi…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

Unsupervised Style Representation Learning for AI-Text Detection via Paraphrase Inversion

The rapid development of large language models (LLMs) has raised concerns about misuse such as plagiarism, misinformation, and automated in…

2026-06-10 13:00 JST | arXiv cs.AI | Agents

What makes a harness a harness: necessary and sufficient conditions for an agent harness

The term agent harness now circulates widely in software engineering with generative artificial intelligence. It names the layer that wraps…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Duality for Optimal Multi-Item, Multi-Bidder Auction Design: Revenue Certificates through Deep Learning

Characterizing revenue-optimal auctions for multi-item, multi-bidder settings remains a fundamental open problem, with no known closed-form…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Emotion Profiling in LLM-Based Literary Translation: Systematic Shifts Across MT and Post-Editing

This paper investigates whether LLM translations exhibit identifiable emotional profiles and how post-editing reshapes them toward human-li…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

MetaPlate: Counterfactual-Guided RAG-LLM Tool for Personalized Food Recommendation and Hyperglycemia Prevention

Postprandial hyperglycemia is a key risk factor for metabolic disorders; however, existing dietary guidance is often static, impractical, a…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

FedSteer: Taming Extreme Gradient Staleness in Federated Learning with Corrective Projections and Caching

Federated learning (FL) is often subject to aggregation variance if clients do not consistently participate in training rounds. While reusi…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Pareto-Guided Teacher Alignment for Fair Personalized Text Generation

Personalized persuasive text generation can improve relevance and engagement, but demographic conditioning may also introduce unequal frami…

2026-06-10 13:00 JST | arXiv cs.AI | Image/Video Generation

BiWM: Advancing Open-Source Interactive Video World Models with Bidirectional Autoregression

Transitioning bidirectional video diffusion models into an autoregressive paradigm improves the interactivity of video world models, but ex…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Business/Funding, Research/Papers

$\tau$-Rec: A Verifiable Benchmark for Agentic Recommender Systems

As recommender systems transition toward agentic, multi-turn conversational interfaces, evaluation paradigms have struggled to keep pace. C…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Gaming AI-Assisted Peer Reviews Poses New Risks to the Scientific Community

AI is increasingly used to support scientific peer review, from manuscript screening and reviewer assistance to editorial triage. Although suc…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Local Is Not a Sufficient Privacy Boundary: Governing OS-Integrated On-Device AI

As AI systems move into operating systems, privacy no longer turns only on whether a model runs locally. A local assistant may assemble ema…

2026-06-10 13:00 JST | arXiv cs.AI | Robotics

Flow Control: Steering Vision-Language-Action Models with Simple Real-Time Inputs

We introduce flow control of vision-language-action (VLA) models, a simple and effective way to steer VLA actions in real-time through gene…

2026-06-10 13:00 JST | arXiv cs.AI | Image/Video Generation

Making Time Editable in Video Diffusion Transformers

Modern Diffusion Transformers for video generation provide limited control over the progression of time and the editing of temporal dynamic…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Dropout-GRPO: Variational Stochasticity for Continuous Latent Reasoning

Group Relative Policy Optimization (GRPO) relies on the diversity of $K$ rollouts within each group; otherwise, the group-mean advantage $A…

2026-06-10 13:00 JST | arXiv cs.AI | Business/Funding, Research/Papers

MMClima: A Framework for Multimodal Climate Science Data and Evaluation

Climate change research increasingly requires AI systems that reason across text, dynamic visual content, and scientific figures, yet exist…

2026-06-10 13:00 JST | arXiv cs.AI | Image/Video Generation

Fisher-Guided Progressive Parameter Selection for Adaptive Fine-Tuning

Parameter-efficient fine-tuning (PEFT) aims to adapt pretrained models with a small trainable parameter subset; however, most existing meth…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Integral Field Unit Spectroscopy with One Fiber

Integral field unit (IFU) spectroscopy provides spatially resolved spectra across galaxies, offering crucial insights into their evolution.…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

Density Ridge Selective Prediction for LLM and VLM Hallucination Detection under Calibration Label Scarcity

Hallucination detection in large language and vision-language models is increasingly framed as selective prediction, where a detector assig…

2026-06-10 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

An Improved Generative Adversarial Network for Micro-Resistivity Imaging Logging Restoration

An improved GAN-based imaging logging image restoration method is presented in this paper for solving the problem of partially missing micr…

2026-06-10 13:00 JSTarXiv cs.AIロボティクス

Exploration of Foundation Model-Based Robots in Patient and Elderly Care

Demand for older-adult and patient care is growing rapidly as populations age worldwide. Foundation models are increasingly being integrate…

2026-06-10 13:00 JSTarXiv cs.AIビジネス/資金調達

Automated Pronunciation Evaluation for Korean Toddler Speech using Speech Diarization and Self-Supervised Learning

Speech sound disorders affect approximately 44% of Korean pediatric communication disorder cases, yet automated assessment tools for Korean…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

A Source Domain is All You Need: Source-Only Cross-OS Transfer Learning for APT Anomaly Detection via Semantic Alignment and Optimal Transport

Advanced Persistent Threats (APTs) are stealthy, multi-stage cyberattacks whose detection is difficult due to scarce labeled traces, severe…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Fast Exact Nearest-Neighbor Learning for High-Frequency Financial Time Series

AI efficiency at scale is becoming critical in finance as market data volumes surge across equities, ETFs, FX, options, and high-frequency…

2026-06-10 13:00 JSTarXiv cs.AI画像/動画生成

Dual-Branch Gated Fusion for Open-Set Audio Deepfake Source Tracing

Attributing a synthetic utterance to its originating system remains an open challenge: closed-set models fail to reject unseen synthesizers…

2026-06-10 13:00 JSTarXiv cs.AIエージェントロボティクス研究/論文

SHAPO: Sharpness-Aware Policy Optimization for Safe Exploration

Safe exploration is a prerequisite for deploying reinforcement learning (RL) agents in safety-critical domains. In this paper, we approach…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Hyperbolic Neural Population Geometry Benefits Computation

Neural population geometry shapes downstream computation. Recent empirical findings in neurobiology suggest that a hyperbolic structure und…

2026-06-10 13:00 JSTarXiv cs.AIロボティクス

YUBI: Yielding Universal Bidigital Interface for Bimanual Dexterous Manipulation at Scale

We introduce Yielding Universal Bidigital Interface (YUBI), a finger-aligned gripper designed to enable intuitive, ergonomic, and scalable…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Linguistically Augmented Audio Speech Data (LinguAS)

Maliciously-created fake speech, including deepfaked and spoofed audio, is proliferating at an alarming rate, and detection models are raci…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Multi-Level Analyzation of Imbalance to Resolve Non-IID-Ness in Federated Learning

Class imbalance is a common problem in deep learning that severely degrades performance. In federated learning (FL), it is a critical facto…

2026-06-10 13:00 JSTarXiv cs.AIエージェントロボティクス

What Matters in Orchestrating Robot Policies: A Systematic Study of Hierarchical VLA Agents

Hierarchical vision-language-action (Hi-VLA) systems have emerged as a promising paradigm for complex robot manipulation, by using high-lev…

2026-06-10 13:00 JSTarXiv cs.AIロボティクス

Hierarchical Policies from Verbal and Egocentric Human Signals for Natural Human-Robot Interaction

For natural human-robot interaction, a robot must understand human intent expressed not only through language but also through nonverbal si…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Towards Robust Arabic Speech Emotion Recognition with Deep Learning

Speech Emotion Recognition (SER) aims to identify a speaker's emotional state from audio signals. While recent advances in deep learning ha…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

LLM-Guided Neural Architecture Search for Robust Co-Design of Physical Neural Networks

Deploying neural networks on unconventional hardware demands architectures that co-optimize task accuracy and platform-specific constraints…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AIエージェント

The Confident Liar: Diagnosing Multi-Agent Debate with Log-Probabilities and LLM-as-Judge

Multi-agent debate systems are typically evaluated only on whether the final answer is correct, overlooking the quality of the intermediate…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Catching One in Five: LLM-as-Judge Blind Spots in Production Multi-Turn Transaction Agents

LLM-as-judge is the default instrument for evaluating conversational agents, yet its reliability is almost always reported as agreement wit…

2026-06-10 13:00 JSTarXiv cs.AIロボティクス

Baseline-Free Policy Optimization for Neural Combinatorial Optimization

Neural combinatorial optimization (NCO) trains autoregressive policies to solve routing problems. The standard training algorithm, REINFORC…

2026-06-10 13:00 JSTarXiv cs.AI画像/動画生成

Content-Induced Spatial-Spectral Aggregation Network for Change Detection in Remote Sensing Images

The integration of spatial and spectral information is beneficial to the improvement of change detection performance. However, existing met…

2026-06-10 13:00 JSTarXiv cs.AI画像/動画生成

Building Change Detection in Earthquake: A Multi-Scale Interaction Network and A Change Detection Dataset

As one of the most destructive natural disasters, earthquakes have struck many countries around the world in recent years, causing serious…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

Routing-Aware Expert Calibration for Machine Unlearning in Mixture-of-Experts Language Models

Machine unlearning is increasingly important for large language models, yet unlearning in Mixture-of-Experts (MoE) architectures remains un…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

Atomic Intent Reasoning: Bringing LLM Semantics to Industrial Cross-Domain Recommendations

Cross-domain recommendation is a core problem in content-to-e-commerce platforms. Its objective is to leverage user interactions with conte…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

KG-SoftMAP: Soft Knowledge-Graph Priors for Bayesian Network Structure Learning from Sparse Discrete Data

Learning Bayesian network (BN) structure from sparse discrete data is hard: when each instance records only a few variables, most variable…

2026-06-10 13:00 JSTarXiv cs.AIロボティクスビジネス/資金調達

A Practical Recipe Towards Improving Sim-and-Real Correlation for VLA Evaluation

Simulation has become an essential tool for evaluating and improving vision-language-action (VLA) policies, offering scalable, reproducible…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Speech Meets ELF: Audio Conditional Continuous-Target Diffusion for Speech Recognition and Translation

Speech-to-text (S2T) systems for recognition (ASR) and translation (S2TT) typically generate discrete text tokens. In contrast, continuous-…

2026-06-10 13:00 JSTarXiv cs.AIロボティクス

Test-time Adversarial Takeover: A Real-time Hijacking Interface against Robotic Diffusion Policies

Diffusion-based action generation has become a foundational component of embodied AI, but its reliance on visual conditioning leaves deploy…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

Expert-Level Crisis Detection in Mental Health Conversations

Real-world crisis intervention is inherently conversational, yet existing research largely focuses on static texts.…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AIエージェント研究/論文

Agentic Hybrid RAG for Evidence-Grounded Muon Collider Analysis

Muon collider research spans accelerator physics, detector instrumentation, and high-energy phenomenology, with relevant evidence scattered…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Towards Critical Branching Mechanism in Recurrent Neural Networks

Criticality has been proposed as a key organizing principle in biological neural systems, yet its origin and relevance in artificial neural…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

Beyond Absolute Imitation: Anchored Residual Guidance for Privileged On-Policy Distillation

On-policy distillation (OPD) has demonstrated strong empirical gains in enhancing complex reasoning in LLMs by aligning a student model wit…

2026-06-10 13:00 JSTarXiv cs.AIエージェント

SkillResolve-Bench: Measuring and Resolving Same-Capability Ambiguity in Agent Skill Retrieval

Agent skill libraries are becoming routable software assets: a retrieved skill can contribute instructions, scripts, resource bindings, and…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AIエージェント研究/論文

Harnessing the Collective Intelligence of AI Agents in the Wild for New Discoveries

Scientific discovery is often a collective process: researchers share partial results, inspect failed attempts, and build on each other's i…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

FOGO: Forgetting-aware Orthogonalization Optimizer

We argue that forgetting is not confined to continual learning but is a general optimization phenomenon: during standard training, dominant…

2026-06-10 13:00 JSTarXiv cs.AI画像/動画生成

Vision-Assisted Foundation Model for Solving Multi-Task Vehicle Routing Problems

Multi-task vehicle routing problems play a critical role in enhancing efficiency across various industries and service sectors. These probl…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Mitigating Bias in Low-SNR Financial Reinforcement Learning via Quantum Representations

The financial market is a typical low signal-to-noise ratio (SNR) setting, which often destabilizes off-policy maximum-entropy methods like…

2026-06-10 13:00 JSTarXiv cs.AIエージェント

The Distributed Detectability Band Against Marginal-Preserving Attacks

AI-control monitors score individual agent actions to detect misbehavior, but real harm can be distributed across many benign-looking steps…

2026-06-10 13:00 JSTarXiv cs.AIハードウェア/半導体

Minimum Distortion Quantization with Specified Output Distribution

We derive the optimal quantizer of a real-valued random variable $W$ with distribution $P_W$ such that 1) the distribution of the quantizat…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

LakeQA: An Exploratory QA Benchmark over a Million-Scale Data Lake

Recent large language models (LLMs) have shown rapid progress in reading-based question answering (QA), where evidence is explicitly provid…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

ERAlign: Energy-based Representation Alignment of GNNs and LLMs on Text-attributed Graphs

Text-attributed Graphs (TAGs) incorporate textual node attributes with graph structures to describe rich relational semantics. Recent effor…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

UPLOTS: A Unified Pretrained Language Model for Constrained Time-series Generation

In time-series generation, existing approaches typically handcraft or train a separate model for each dataset, which hinders their scalabili…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

Detecting Speculative Language in Biomedical Texts using Recurrent Neural Tensor Networks

In this investigation, we delve into the automated detection of speculative language within biomedical articles by utilizing distributed se…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Decoupling Thought from Speech: Knowledge-Grounded Counterfactual Reasoning for Resilient Multi-Agent Argumentation

Multi-agent debate frameworks have been shown to improve large language model performance in convergent tasks, but they are currently optim…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

Advancing the State-of-the-Art in Empirical Privacy Auditing

Parameter-efficient fine-tuning of large language models (LLMs) can exhibit problematic memorization of individual training examples. Empir…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AIハードウェア/半導体

Stop Early, Spend Less: Hidden-State Probes as a Practical Recipe for Streaming Moderation of LLM Outputs

Deploying large language models in user-facing systems requires efficient output safety filtering. Existing approaches typically rely on a…

2026-06-10 13:00 JSTarXiv cs.AIハードウェア/半導体

Achieving Cloud-Grade SLOs for Local Mixture-of-Experts Inference through CPU-GPU Hybrid Design

Local deployment of large Mixture-of-Experts (MoE) models falls short of the service quality achieved in cloud-scale environments, even und…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

MoE Enhanced Federated Learning for Spatiotemporal Prediction

Traffic prediction is fundamental to intelligent transportation systems and urban computing, yet many cities continue to suffer from traffi…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Assessing Automated Prompt Injection Attacks in Agentic Environments

Indirect prompt injection poses a critical threat to LLM agents that interact with untrusted external data, yet automated attack methods--p…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Machine Learning Methods for Studying Latent Neural Activity Dynamics

Recent developments in brain recording are driving a demand for machine learning tools capable of decoding the latent structure of large po…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

LC-QAT: Data-Efficient 2-Bit QAT for LLMs via Linear-Constrained Vector Quantization

Quantization-aware training (QAT) is essential for extremely low-bit large language models (LLMs). Current QAT methods are mainly based on…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Flexible Flows for Biological Sequence Design

Designing functional biological sequences requires navigating vast discrete spaces under strict evolutionary and biophysical constraints. D…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

Benchmarking Knowledge Editing using Logical Rules

Large Language Models (LLMs) are increasingly deployed in real-world applications that require access to up-to-date knowledge. However, ret…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

Hidden Consensus: Preference-Validity Compression in Human Feedback

Standard RLHF pipelines often reduce heterogeneous human judgments into a single scalar reward target. We argue that this reduction can mis…

2026-06-10 13:00 JSTarXiv cs.AI画像/動画生成

Improving Adversarial Transferability on Vision-Language Pre-training Models via Surrogate-Specific Bias Correction

Adversarial examples reveal vulnerabilities in Vision-Language Pre-training (VLP) models and provide insights for improving robustness. A k…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Convergence of Monte Carlo Optimistic Policy Iteration: Beyond Uniform State-Action Updates

The asymptotic behaviour of Monte Carlo optimistic policy iteration (MC-O-PI) is a long-standing open question. When the model of the envir…

2026-06-10 13:00 JSTarXiv cs.AIエージェント

Drawing with Strangers: Population Scaling Drives Zero-Shot Mutual Intelligibility in Emergent Sketching

Generalization in emergent communication has largely focused on novel inputs or linguistic structures, yet the capacity for agents to commu…

2026-06-10 13:00 JSTarXiv cs.AIエージェント

NOVA: Symbolic Regression Discovery of Interpretable Car-Following and Lane-Change Models with Driver Heterogeneity

We present NOVA, an autonomous symbolic regression framework that identifies interpretable car-following and lane-change structures from ra…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

Towards Diverse Scientific Hypothesis Search with Large Language Models

Large language models (LLMs) are on the rise for accelerating scientific discovery, most recently in advanced tasks such as generating vali…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

From Data Heterogeneity to Convergence: A Data-Centric Review of Federated Learning

Federated Learning (FL) has emerged as a promising solution for data hunger in centralized learning. This paradigm enables privacy with mul…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Embedding Hybrid Systems into Continuous Latent Vector Fields

This work proves that an $n$-dimensional hybrid system can be embedded into an $m$-dimensional Euclidean space equipped with a continuous v…

2026-06-10 13:00 JSTarXiv cs.AIエージェント

Dmsh: A Multi-Agent Reinforcement Learning Framework for All-Quad Mesh Generation

Generating high-quality meshes for arbitrary geometries remains a fundamental bottleneck in computational engineering, often demanding heur…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Causal Ensemble Agent: Hierarchical Causal Discovery with LLM-guided Expert Reweighting

Causal discovery aims to uncover causal structures from observational data, which is crucial for real-world decision-making. However, diffe…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Fast and Highly Expressive Policy Learning for Offline Reinforcement Learning via Bootstrapped Flow Q-Learning

Diffusion-based Q-learning has emerged as a powerful paradigm for offline reinforcement learning, but its reliance on multi-step denoising…

2026-06-10 13:00 JSTarXiv cs.AI画像/動画生成研究/論文

Can Image Models Imagine Time? ImageTime: A Novel Benchmark for Probing Visual World Modeling Through Spatiotemporal Consistency

Image generation models now produce high-quality static images, yet their ability to represent how a visual world changes over time remains…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

STORM: Stepwise Token Optimization with Reward-Guided Beam Search

Modern retrieval increasingly relies on dense and learned-sparse neural models that are effective but require encoding the entire corpus in…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Is Fairness Truly Fair? Towards Reliable Lipschitz Fairness in Multi-Task Learning via Fixed-$\delta$ Alignment

Lipschitz-style individual fairness formalizes the idea that semantically similar examples should receive similar predictions, but its eval…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

Dynamic Linear Attention

The scalability of Large Language Models (LLMs) to long contexts is fundamentally constrained by the quadratic complexity of standard atten…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Post-Quantum Secure Federated DeFi for Inclusive Banking

Recent advances in error-corrected qubits have accelerated the timeline for practical quantum computing. It poses a threat to cryptographic…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Accounting for AI Inference in Corporate GHG Inventories: A Four-Tier Methodology for Scope 3 Category 1 Reporting

AI inference services -- API subscriptions, enterprise chat tools, and SaaS products with embedded AI features -- fall unambiguously within…

2026-06-10 13:00 JSTarXiv cs.AIエージェント

Decentralized Multi-Agent Systems with Shared Context

Multi-agent systems (MAS) can scale large language model reasoning at test time by decomposing complex problems into parallel subtasks. How…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

In Defense of Information Leakage in Concept-based Models

Concept-based models (CMs), deep neural networks that ground their predictions on representations aligned with human-understandable concept…

2026-06-10 13:00 JSTarXiv cs.AI画像/動画生成ロボティクス

UniDexTok: A Unified Dexterous Hand Tokenizer from Real Data

Dexterous hands are essential for fine-grained manipulation, but their hardware designs vary substantially across embodiments. Differences…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Divide and Cooperate: Role-Decomposed Multi-Agent LLM Training with Cross-Agent Learning Signals

Modern language agents which perform multi-step reasoning have shown strong performance in knowledge-intensive question answering. However,…

2026-06-10 13:00 JSTarXiv cs.AI画像/動画生成

Using the YOLOv12 Model for Verifying the Correct Color Sequence of Wires in Network Cables (Patch Cords) on the Production Line

In the production process of network cables, ensuring the correct color sequence of wire pairs inside the standard connector plays a critic…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Event-Driven Reinforcement Learning Enables Long-Horizon Control in Semiconductor Fabrication

Reinforcement learning promises to optimize sequential decisions in large-scale systems. Semiconductor manufacturing systems are stochastic…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

Unifying Data, Memory, and Compute Efficiency in LLM training: A Survey

Resource constraints increasingly determine what can be trained, fine-tuned, and deployed in large language models (LLMs), yet efficiency i…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Effective Reinforcement Learning for Agentic Search by Recycling Zero-Variance Queries During Training

The use of GRPO-style algorithms has become the standard strategy for training LLM search agents under outcome-only rewards. With these alg…

2026-06-10 13:00 JSTarXiv cs.AI画像/動画生成

++nnU-Net: Scaling nnU-Net with Prefix-Based Data Augmentation

The nnU-Net has demonstrated continuous success in medical segmentation tasks, which heavily rely on the availability and diversity of anno…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

Attention Expansion: Enhancing Keyphrase Extraction from Long Documents with Attention-Augmented Contextualized Embeddings

Pre-trained language models (PLMs) have achieved strong performance in keyphrase extraction (KPE), largely due to their ability to generate…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Transformer Based Model for Spatiotemporal Feature Learning in EEG Emotion Recognition

Electroencephalography (EEG) is a widely adopted technique for monitoring brain activity, offering valuable insights into neurological stat…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

Detecting Knowledge Gaps from Conversational AI Interactions Using Curriculum Prerequisite Graphs

Large online courses generate thousands of student questions directed at conversational AI teaching assistants, yet these interaction logs…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

Spatial-Omni: Spatial Audio Understanding Integration in Multimodal LLMs via FOA Encoding

Recent multimodal large language models mainly process audio as monaural signals, thereby discarding the spatial cues contained in spatial…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AIエージェントビジネス/資金調達

Toward Secure LLM Agents: Threat Surfaces, Attacks, Defenses, and Evaluation

Large language model (LLM) agents are rapidly moving from conversational interfaces to software components that plan, invoke tools, maintai…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

A Bayesian Network Approach for Enhancing Security-Focused Decision Support Systems

The adoption and integration of heterogeneous stacks in most of today's open-source based networks brings clear benefits like interoperabil…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

Dep-LLM: Training-Free Depression Diagnosis via Evidence-Guided Structured Multi-factor with Reliable LLM Reasoning

Automatic Depression Detection (ADD) from clinical interviews is a pivotal task in computational mental health, yet it remains challenging…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Boosting ECG Classification Performance by Pre-training with Synthesized Data

Deep Neural Networks (DNNs) typically require extensive datasets for effective training. In the medical domain, acquiring large-scale data…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

Beyond APIs: Probing the Limits of MLLMs in Physical Tool Use

Multimodal Large Language Models (MLLMs) excel at utilizing digital APIs and increasingly serve as the "brain" of embodied AI, instructing…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

Earth-OneVision: Extending Remote Sensing Multimodal Large Language Models to More Sensor Modalities and Tasks

RS-MLLMs enable natural-language understanding and spatial reasoning over earth observation imagery. However, existing models support only…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

K-Forcing: Joint Next-K-Token Decoding via Push-Forward Language Modeling

Autoregressive (AR) language modeling is the dominant paradigm for text generation, yet its sequential token-by-token decoding makes infere…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

A Unified Siamese Learning Framework for Zero-Day Anomaly Detection and Classification in Optical Networks

A multi-similarity Siamese neural network unifies zero-day anomaly detection and one-shot classification in optical networks, achieving ove…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

Attention-Discounted Adaptive Sampler for Masked Diffusion Language Models

Masked diffusion language models can reduce inference steps by revealing multiple tokens per denoising iteration, but this parallelism is f…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Geometrically Averaged Hard Target Updates for Linear Q-Learning

Periodic hard target updates are among the most common stabilization devices in modern deep Q-learning. Recent studies suggest that target…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

Janus: A Benchmark for Goal-Conditioned Information Distortion in LLMs

LLM deception is often evaluated through direct markers such as fabricated claims, explicit lies, or strategic concealment. However, many r…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

From Perception to Action: Can UI Interventions Foster Sustainable LLM Chatbot

LLM-powered chatbots are increasingly embedded in everyday workflows, raising sustainability concerns due to their energy use. Most mitigat…

2026-06-10 13:00 JSTarXiv cs.AI画像/動画生成ビジネス/資金調達研究/論文

LIBERO-Occ: Evaluating and Improving Vision-Language-Action Models under Scene-Induced Occlusion via Viewpoint Imagination

Vision-Language-Action (VLA) models achieve strong performance on standard manipulation benchmarks, but most evaluations assume that task-r…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Optimal Post-Training Quantization Scales and Where to Find Them

Post-training quantization (PTQ) compresses large language models by mapping weights to low-bit representations. The scaling factor that de…

2026-06-10 13:00 JSTarXiv cs.AI画像/動画生成

Improving Text-Instance Alignment Of Foreground Conditioned Out-Painting Via Customized Concept Embedding

To showcase products, merchants often incur substantial costs creating high-quality display images. Foreground Conditioned Outpainting (FCO…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

Pose-ICL: 3D-Aware In-Context Learning for Pose-Controllable Subject Customization

Subject Customization is a foundational task in modern image generation. By providing a few reference images and a text prompt, users can g…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Human-AI Teaming Through the Lens of Calibration

We study models for human-AI teaming through the lens of statistical calibration. We assume the team consists of an AI model and human -- b…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

RAT: Reference-Augmented Training for ASV Anti-Spoofing

We introduce a spoofing countermeasure architecture conditioned on speaker-reference recordings, but observe that it converges to a solutio…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Ethical and Technical Limits of Deepfake Speech Datasets

Claims about the robustness and fairness of deepfake speech detectors are only as credible as the datasets used to train and evaluate those…

2026-06-10 13:00 JSTarXiv cs.AIハードウェア/半導体

What Do Deepfake Speech Detectors Actually Hear?

Deepfake speech detectors often output a single score without explaining why an audio sample is flagged, where in the signal the evidence l…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

A Constrained Natural-Language Interface for Variational Multi-Physics Finite Element Simulations in FEniCS

Large language models can reduce the manual effort required to set up finite element simulations, but they introduce reliability risks when…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Recoverable but Not Stationary: Local Linear Structures in Weights and Activations

Task vectors, LoRA, activation steering, and random search around pretrained weights all suggest that learned behaviour can be controlled b…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

CLP: Collocation-Length Prediction for Zero-Loss Adaptive Multi-Token Inference

Large language model inference is bottlenecked by autoregressive decoding, where each token requires a full forward pass. Multi-token predi…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Provenance Tracking in AI Compilers through the Lens of Coalgebra

AI compilers aggressively rewrite computation graphs through normalization, lowering, and optimization, making it difficult to track the pr…

2026-06-10 13:00 JSTarXiv cs.AI画像/動画生成

Democratising Camera Trap AI: An Open-Source Model for Detecting UK Mammals

Camera traps have become a cornerstone of biodiversity monitoring, but the artificial intelligence that turns vast quantities of images int…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

Generative Explainability for Next-Generation Networks: LLM-Augmented XAI with Mutual Feature Interactions

As artificial intelligence and machine learning (AI/ML) models become integral to network operations, their lack of transparency poses a si…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Reinforcement learning with verifiable rewards (RLVR) has become standard for improving LLM reasoning. However, existing PPO-style trust-re…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Optimizing 2D Input Representations and Sub-phase Fusion Strategies for Differential Diagnosis of Asthma and COPD Using CNN- and GRU-Based Networks

This study aims to explore the performance of the VAR model in comparison with mel-frequency cepstral coefficient (MFCC) matrices and log-m…

2026-06-10 13:00 JSTarXiv cs.AIエージェント

Understanding and mitigating the risks of OpenClaw for non-technical users: A practical guide with Skill

OpenClaw has rapidly emerged as a transformative artificial intelligence (AI) agent framework, and its ability to autonomously execute comp…

2026-06-10 13:00 JSTarXiv cs.AIエージェントロボティクス

Diffusion Forcing Planner: History-Annealed Planning with Time-Dependent Guidance for Autonomous Driving

Learning-based motion planners, despite recent progress, often suffer from temporal inconsistency. Small perturbations across frames can ac…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

AuRA: Internalizing Audio Understanding into LLMs as LoRA

Recent efforts to extend large language models (LLMs) to speech inputs typically rely on cascaded ASR-LLM pipelines, end-to-end speech-lang…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AIエージェント研究/論文

T1-Bench: Benchmarking Multi-Scenario Agents in Real-World Domains

Recent advances in reasoning and tool-calling capabilities of large language models (LLMs) have enabled increasingly capable agentic system…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

Modeling Complex Behaviors: Multi-Personality Composition and Dynamic Switching in Vision-Language Models

With the widespread deployment of Multimodal Large Language Models (MLLMs) in social interaction, understanding and controlling their behav…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI

Unifying Local Communications and Local Updates for LLM Pretraining

Communication-efficient pre-training of LLMs is increasingly important as training draws on compute distributed across clusters, data cente…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning

Expressive continuous control policies, such as diffusion and flow models, form the backbone of recent advances in scaling imitation learni…

2026-06-10 13:00 JSTarXiv cs.AIロボティクス

RoboNaldo: Accurate, Stable and Powerful Humanoid Soccer Shooting via Motion-Guided Curriculum Reinforcement Learning

Elite humanoid soccer shooting requires whole-body stability, high-impulse whole-body interactions, and accuracy to targets. Motion trackin…

2026-06-10 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

PhantomBench: Benchmarking the Non-existential Threat of Language Models

Hallucinations, where language models (LMs) generate factually ungrounded responses, pose serious risks, as users tend to blindly rely on t…

2026-06-10 13:00 JSTarXiv cs.AI画像/動画生成

FADA: Accessible fetal ultrasound interpretation and annotation with a selectively distilled unified vision-language model

A global shortage of trained sonographers limits prenatal ultrasound screening in low- and middle-income countries, where over half of preg…

2026-06-10 13:00 JSTarXiv cs.AI研究/論文

Designed by Journalists, but Is It for Readers? Rethinking AI Disclosures and Transparency in News

As newsrooms integrate generative AI, journalists face a disclosure challenge: how to communicate AI involvement in ways that maintain read…

2026-06-10 13:00 JST | arXiv cs.AI | Agents

Towards Autonomous Accelerator Design: FPGA Accelerator Generation with SECDA

Designing FPGA-based accelerators for modern artificial intelligence workloads requires exploring a large and complex hardware design space…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

TRACE: A Unified Rollout Budget Allocation Framework for Efficient Agentic Reinforcement Learning

Reinforcement learning with verifiable rewards (RLVR) is a promising approach for enhancing reasoning and agentic behavior in large languag…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Provenance-Grounded Gating and Adaptive Recovery in Synthetic Post-Training Data Curation

Synthetic post-training pipelines commonly filter generated samples with reward models or holistic LLM judges, yet two practices remain rar…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Data assimilation for subsurface flow using latent diffusion model parameterization: performance of ensemble-Kalman and Monte Carlo techniques

Data assimilation (DA) in subsurface flow entails calibrating model parameters to match observed data, typically at wells, while preserving…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Flaws in the LLM Automation Narrative

Large Language Models (LLMs) are increasingly described as performing at the level of human experts on knowledge economy tasks. These claim…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Piper: A Programmable Distributed Training System

Large-scale model training increasingly relies on composing multiple parallelism strategies, such as data, pipeline, and expert parallelism…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

EEVEE: Towards Test-time Prompt Learning in the Real World for Self-Improving Agents

In this paper, we propose EEVEE, the first multi-dataset test-time prompt learning framework for LLM agents, enabling test-time prompt lear…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

A Unifying Lens on Supervised Fine-Tuning Through Target Distribution Design

Supervised fine-tuning (SFT) typically maximizes the likelihood of every token in a demonstrated trajectory. However, an observed token can…

2026-06-10 13:00 JST | arXiv cs.AI | Business/Funding, Research/Papers

Belief Acquisition as Stochastic Filtering

This paper studies how belief acquisition can be accomplished using stochastic filtering. First, a theoretical foundation for empirical bel…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

A Survey on Semantic Modeling for Building Energy Management

Building Energy Management (BEM) is central to reducing energy use and CO2 emissions in the building sector. Although IoT technologies now…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

A Comprehensive Survey of Direct Preference Optimization: Datasets, Theories, Variants, and Applications

With the rapid advancement of large language models (LLMs), aligning policy models with human preferences has become increasingly critical.…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Position: The ML Community Must Build an AI-Augmented Peer-Review Ecosystem

Peer review, the bedrock of scientific advancement in machine learning (ML), is strained by a crisis of scale. Exponential growth in manusc…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Why Does Reasoning Length Converge? Unveiling the Underfitting-Overfitting Trade-off in Chain-of-Thought

Test-time scaling, primarily manifested through multi-step Chain-of-Thought (CoT) reasoning via Reinforcement Learning (RL), has emerged as…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Constructing coherent spatial memory in LLM agents through graph rectification

Given a map description through global traversal navigation instructions, an LLM can often infer the implicit spatial layout and answer use…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Impatient Users Confuse AI Agents: High-fidelity Simulations of Human Traits for Testing Agents

Despite rapid progress in building conversational AI agents, robustness is still largely untested. Small shifts in user behavior, such as b…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Agents

ChartAgent: A Multimodal Agent for Visually Grounded Reasoning in Complex Chart Question Answering

Recent multimodal LLMs have shown promise in chart-based visual question answering, but their performance declines sharply on unannotated c…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

How can we assess human-agent interactions? Case studies in software agent design

While benchmarks measure the accuracy of LLM-powered agents, they mostly assume full automation, failing to represent the collaborative nat…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Robotics

RoboGPT-R1: Enhancing Robot Task Planning with Reinforcement Learning

Improving the reasoning capabilities of embodied agents is crucial for robots to complete complex human instructions in long-view manipulat…

2026-06-10 13:00 JST | arXiv cs.AI | Image/Video Generation

Non-Parametric Structural Priors for Geometry Theorem Prediction

Multi-step theorem prediction is a central challenge in geometry problem solving. Existing neural-symbolic approaches rely heavily on super…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

The Price of Agreement: Measuring LLM Sycophancy in Agentic Financial Applications

Given the increased use of LLMs in financial systems today, it becomes important to evaluate the safety and robustness of such systems. One…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Adaptive Teacher Exposure for Self-Distillation in LLM Reasoning

On-policy self-distillation has become a strong recipe for LLM reasoning, where a privileged teacher supervises the student's own rollouts…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Neurosymbolic Learning for Inference-Time Argumentation

Claim verification is an important problem in high-stakes settings, including health and finance. When information underpinning claims is i…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors

AMEL: Accumulated Message Effects on LLM Judgments

Large language models are routinely used as automated evaluators: to review code, moderate content, or score outputs, often with many items…

2026-06-10 13:00 JST | arXiv cs.AI | Agents

A Sober Look at Agentic Misalignment in Automated Workflows

We study a class of emergent misalignment in multi-agent systems (MAS), with a focus on automated workflows, which we refer to as agentic misa…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

EVA-Net: Subject-Independent EEG Motor Decoding with Video-Derived Motor Priors

Practical non-invasive Brain-Computer Interface (BCI) systems require EEG decoders with strong cross-subject generalization and minimal cal…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

VET: A Framework for Analyzing AI Discourse

Public discourse on AI has become polarized; exaggerated positions on AI in traditional and social media threaten the development of AI Lit…

2026-06-10 13:00 JST | arXiv cs.AI | Agents

AgentPLM: Agentic Protein Language Models with Reasoning-Augmented Decoding for Protein Sequence Design

Protein language models (PLMs) are passive oracles: they generate sequences in a single forward pass with no mechanism to consult external…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Parthenon Law: A Self-Evolving Legal-Agent Framework

As agents grow more capable, legal-domain LLM agents promise to turn document-heavy matters into reviewable work products -- yet reliable d…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

PSEBench: A Controllable and Verifiable Benchmark for Evaluating LLMs in Patient Safety Event Triage

Patient safety event triage, determining whether a clinical event is reportable under jurisdiction-specific policy, is a high-stakes task t…

2026-06-10 13:00 JST | arXiv cs.AI | Agents

Robust Deep Reinforcement Learning Through Adversarial Attacks and Training: A Survey

Deep Reinforcement Learning (DRL) is a subfield of machine learning for training autonomous agents that take sequential actions across comp…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Mixtures of Neural Operators Reduce Active Complexity in Operator Learning

Operator-learning systems are not governed solely by total parameter count; for one query, the relevant bottleneck can be the model that mu…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Robotics

BadRobot: Jailbreaking Embodied LLM Agents in the Physical World

Embodied AI represents systems where AI is integrated into physical entities. Large Language Model (LLM), which exhibits powerful language…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Hardware/Semiconductors, Business/Funding

Conditional Vendi Score: Prompt-Aware Diversity Evaluation for Generative AI Models and LLMs

Generative models guided by text prompts are widely evaluated for fidelity and prompt alignment, yet their ability to produce outputs remai…

2026-06-10 13:00 JST | arXiv cs.AI | Image/Video Generation

Visual-TCAV: Concept-based Attribution and Saliency Maps for Post-hoc Explainability in Image Classification

Convolutional Neural Networks (CNNs) have shown remarkable performance in image classification. However, interpreting their predictions is…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Whisper-GPT -- Continuous Discrete Hybrid Representation Language Models For Speech And Music

We propose WHISPER-GPT: A generative large language model (LLM) for speech and music that allows us to work with continuous audio represent…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Dynamics of Adversarial Attacks on Large Language Model-Based Search Engines

The increasing integration of Large Language Model (LLM) based search engines has transformed the landscape of information retrieval. Howev…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Representational Alignment with Chemical Induced Fit for Molecular Relational Learning

Molecular Relational Learning (MRL) is widely applied in natural sciences to predict relationships between molecular pairs by extracting st…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

CITRAS: Covariate-Informed Transformer for Time Series Forecasting

In time series forecasting, covariates represent external factors that influence target variables. Some covariates are observable only in t…

2026-06-10 13:00 JST | arXiv cs.AI | Image/Video Generation, Robotics

NuWa: Deriving Lightweight Class-Specific Vision Transformers for Edge Devices

Vision Transformers (ViTs) often need to be compressed for deployment on resource-constrained edge devices like drones and smart vehicles.…

2026-06-10 13:00 JST | arXiv cs.AI | Agents, Robotics

A Survey of Robotic Navigation and Manipulation with Physics Simulators in the Era of Embodied AI

Navigation and manipulation are core capabilities in Embodied AI, but training agents to perform them directly in the real world is costly,…

2026-06-10 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

CleanPatrick: A Benchmark for Image Data Cleaning

Robust machine learning depends on clean data, yet current image data cleaning benchmarks rely on synthetic noise or narrow human studies,…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Attacks on Machine-Text Detectors Retain Stylistic Fingerprints

Despite considerable progress in the development of machine-text detectors, the ease with which machine-text can be manipulated to evade de…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding, Research/Papers

ASyMOB: Algebraic Symbolic Mathematical Operations Benchmark

Large language models (LLMs) are increasingly applied to symbolic mathematics, yet existing evaluations often conflate pattern memorization…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Fact-Augmented Lookahead Planning for LLM Agents

Large Language Models (LLMs) are increasingly capable, but LLM agents still struggle to plan effectively in interactive, partially observab…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Quantifying Perception-Based Student Success with Generative AI: An Exploratory Monte Carlo Simulation

Generative artificial intelligence (GenAI) tools such as ChatGPT have attracted growing attention in higher education, particularly in rela…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

TinyTroupe: An LLM-powered Multiagent Persona Simulation Toolkit

Recent advances in Large Language Models (LLM) have led to a new class of autonomous agents, renewing and expanding interest in the area. L…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

GRID: Scaling Task-Agnostic Inference in Continual Prompt Tuning

Prompt-based continual learning (CL) offers a parameter-efficient way to adapt large language models (LLMs) across task sequences. However,…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

LLM-Aided Joint Secrecy Precoding and Trajectory for RSMA-Based Heterogeneous UAV Networks

This paper investigates secure communications in rate-splitting multiple access (RSMA) enabled heterogeneous UAV networks, where multiple U…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Assessment of Personality Dimensions Across Situations in Dyadic Role-Play Scenarios

Prior research indicates that users prefer assistive technologies whose personalities align with their own. This has sparked interest in au…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Whisfusion: Parallel ASR Decoding with Masked Diffusion

Autoregressive (AR) encoder-decoder models dominate high-quality multilingual ASR, but their left-to-right decoders make inference latency…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning

While large language models (LLMs) have demonstrated strong performance on factoid question answering, they are still prone to hallucinatio…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Deep Generative Model for Human Mobility Behavior

Understanding and modeling human mobility is central to challenges in transport planning, sustainable urban design, and public health. Desp…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Learning-Guided Integration Contours Construction for Fast Large-Scale Generalized Eigensolvers

Solving large-scale Generalized Eigenvalue Problems (GEPs) is a fundamental yet computationally prohibitive task in science and engineering…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Lost in Serialization: Invariance and Generalization of LLM Graph Reasoners

While promising, graph reasoners based on Large Language Models (LLMs) lack built-in invariance to symmetries in graph representations. Ope…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

On the Condition Number Dependency in Bilevel Optimization

Bilevel optimization minimizes an objective function, defined by an upper-level problem whose feasible region is the solution of a lower-le…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

When Distance Distracts: Representation Distance Bias in BT-Loss for Reward Models

Reward models are central to Large Language Model (LLM) alignment within the framework of RLHF. The standard objective used in reward model…

2026-06-10 13:00 JST | arXiv cs.AI | Robotics

Model-Based Diffusion Sampling for Predictive Control in Offline Decision Making

Offline decision-making via diffusion models often produces trajectories that are misaligned with system dynamics, limiting their reliabili…

2026-06-10 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

V-REX: Benchmarking Exploratory Visual Reasoning via Chain-of-Questions

While many vision-language models (VLMs) are developed to answer well-defined, straightforward questions with highly specified targets, as…

2026-06-10 13:00 JST | arXiv cs.AI | Image/Video Generation

Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling

Subject-driven image generation has advanced from single- to multi-subject composition, while neglecting distinction, the ability to distin…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Model-Based Reinforcement Learning in Discrete-Action Non-Markovian Reward Decision Processes

Many practical decision-making problems involve tasks whose success depends on the entire system history, rather than on achieving a state…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

SCOPE: Sequential Causal Optimization of Process Interventions

Prescriptive Process Monitoring (PresPM) recommends interventions during running business processes to optimize key performance indicators…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

HiGR: Industrial-Scale Hierarchical Generative Slate Recommendation Framework in Tencent

Slate recommendation, which presents users with a ranked item list in a single display, is ubiquitous across mainstream online platforms. W…

2026-06-10 13:00 JST | arXiv cs.AI | Image/Video Generation

MMD Guidance: Training-Free Distribution Adaptation for Diffusion Models via Maximum Mean Discrepancy Guidance

Pre-trained diffusion models have emerged as powerful generative priors for both unconditional and conditional sample generation, yet their…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

torch-sla: Differentiable Sparse Linear Algebra with Adjoint Solvers and Sparse Tensor Parallelism for PyTorch

Differentiable sparse linear algebra is foundational for scientific machine learning, yet PyTorch lacks a unified library for it: torch.spa…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Adoption of Generative Artificial Intelligence in the German Software Engineering Industry: An Empirical Study

Generative artificial intelligence (GenAI) tools have seen rapid adoption among software developers. While adoption rates in the industry a…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Structure-Preserving Learning Improves Geometry Generalization in Neural PDEs

We aim to develop physics foundation models for science and engineering that provide real-time solutions to Partial Differential Equations…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

MemCast: Memory-Driven Time Series Forecasting with Experience-Conditioned Reasoning

Time series forecasting (TSF) plays a critical role in decision-making for many real-world applications. Recently, large language model (LL…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

ASA: Backbone-Training-Free Representation Engineering for Tool-Calling Agents

Adapting LLM agents to domain-specific tool calling remains notably brittle under evolving interfaces. Prompt and schema engineering is eas…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Capture Timing-Attention of Events in Clinical Time Series

The contemporary paradigm of trajectory learning operates fundamentally at the level of group dynamics, systematically reducing individual-…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding, Research/Papers

RankLLM: Weighted Ranking of LLMs by Quantifying Question Difficulty

Benchmarks establish a standardized evaluation framework to systematically assess the performance of large language models (LLMs), facilita…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Exploring Accurate and Transparent Domain Adaptation in Predictive Healthcare via Concept-Grounded Orthogonal Inference

Deep learning models for clinical event prediction on electronic health records (EHR) often suffer performance degradation when deployed un…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Improving Topic Modeling by Distilling Soft Labels from Language Models

Traditional neural topic models are typically optimized by reconstructing the document's Bag-of-Words (BoW) representations, overlooking co…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

GRAU: Generic Reconfigurable Activation Unit Design for Neural Network Hardware Accelerators

With the continuous growth of neural network scales, low-precision quantization is widely used in edge accelerators. Classic multi-threshol…

2026-06-10 13:00 JST | arXiv cs.AI | Agents, Robotics, Research/Papers

TaCarla: A comprehensive benchmarking dataset for end-to-end autonomous driving

Collecting a high-quality dataset is a critical task that demands meticulous attention to detail, as overlooking certain aspects can render…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

MedFeat: Model-Aware and Explainability-Driven Feature Engineering with LLMs for Clinical Tabular Prediction

In clinical tabular prediction, classical machine learning models with feature engineering often outperform neural methods. LLMs are increa…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Quantifying Uncertainty in AI Visibility: A Statistical Framework for Generative Search Measurement

AI-powered answer engines are inherently non-deterministic: identical queries submitted at different times can produce different responses…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

FinTradeBench: A Financial Reasoning Benchmark for LLMs

Real-world financial decision-making is a challenging problem that requires reasoning over heterogeneous signals, including company fundame…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Reasoning over Semantic IDs Enhances Generative Recommendation

Recent advances in generative recommendation have leveraged pretrained LLMs by formulating sequential recommendation as autoregressive gene…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

The Model Says Walk: How Surface Heuristics Override Implicit Constraints in LLM Reasoning

Large language models fail when a salient surface cue conflicts with an unstated feasibility constraint. We introduce the Heuristic Overrid…

2026-06-10 13:00 JST | arXiv cs.AI | Hardware/Semiconductors

Trust and Reliance on AI in Education: AI Literacy and Need for Cognition as Moderators

As generative AI systems are integrated into educational settings, students often encounter AI-generated output while working through learn…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

SAFE: An LLM-as-Verifier Framework for Evidence-Grounded Multi-Hop Reasoning

Multi-hop QA benchmarks often reward Large Language Models (LLMs) for spurious correctness, where models reach correct answers through inva…

2026-06-10 13:00 JST | arXiv cs.AI | Agents

Prosociality by Coupling, Not Mere Observation: Homeostatic Sharing in an Inspectable Recurrent Artificial Life Agent

Artificial agents can be made to "help" through explicit social rewards, hard-coded prosocial bonuses, or direct access to another agent'…

2026-06-10 13:00 JST | arXiv cs.AI | Agents

GCA Framework: A GCC Countries-Grounded Dataset and Agentic Pipeline for Climate Decision Support

Climate decision-making in the GCC states increasingly demands systems that can translate heterogeneous scientific and policy evidence into…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Generating Concept Lexicalizations via Dictionary-Based Cross-Lingual Sense Projection

We study the task of automatically expanding WordNet-style lexical resources to new languages through sense generation. We generate senses…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Beyond Single-Model Optimization: Preserving Plasticity in Continual Reinforcement Learning

Continual reinforcement learning must balance retention with adaptation, yet many methods still rely on single-model preservation, c…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

Learning Evidence Highlighting for Frozen LLMs

Large Language Models (LLMs) can reason well, yet often miss decisive evidence when it is buried in long, noisy contexts. We introduce HiLi…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

People-Centred Medical Image Analysis via Fairness-Aware Human-AI Cooperation

Machine learning models for medical image analysis often exhibit subgroup-dependent performance, which impacts how decisions should be allo…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

RAG over Thinking Traces Can Improve Reasoning Tasks

Retrieval-augmented generation (RAG) has proven effective for knowledge-intensive tasks, but is widely believed to offer limited benefit fo…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Communication Dynamics Neural Networks: FFT-Diagonalized Layers for Improved Hessian Conditioning at Reduced Parameter Count

Communication Dynamics Neural Networks (CDNNs) apply the circulant-spectral machinery of the Communication Dynamics framework to neural-net…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

A Theory of Training Profit-Optimal LLMs

Scaling LLMs requires tremendous computational resources, and recent advances in AI have gone hand in hand with massive amounts of capital…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

From Volume to Value: Preference-Aligned Memory Construction for On-Device RAG

With the rapid emergence of personal AI agents based on Large Language Models (LLMs), implementing them on-device has become essential for…

2026-06-10 13:00 JST | arXiv cs.AI | Image/Video Generation

Rotation-Invariant Spherical Watermarking via Third-Order SO(3) Representation Coupling

Reliable watermarking of panoramic imagery is fundamentally challenged by arbitrary 3D rotations. As panoramas are defined on the sphere, t…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Falcon-X: A Time Series Foundation Model for Heterogeneous Multivariate Modeling

Time series foundation models (TSFMs) are transforming the forecasting paradigm through large-scale cross-domain pretraining. However, most…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding, Research/Papers

Does Capability Transfer to Subjective Behavior -- and Would Our Instruments Tell Us? A Self-Evolving, Trust-by-Construction Evaluation Paradigm

Benchmarking is mature where answers are verifiable -- math, code, reasoning -- but the fastest-growing uses of LLMs are subjective and hum…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

On the Learnability of Test-Time Adaptation: A Recovery Complexity Perspective

Test-time adaptation (TTA) aims to adapt models to maintain reliable performance on non-stationary test streams without requiring labeled d…

2026-06-10 13:00 JST | arXiv cs.AI | LLM/Generative AI

PromptEmbedder: Efficient and Transferable Text Embedding via Dual-LLM Soft Prompting

Large Language Models (LLMs) have demonstrated remarkable efficacy in text embedding, yet current adaptation methods like LoRA face signifi…

2026-06-10 13:00 JST | arXiv cs.AI | Image/Video Generation

Updating the standard neuron model in artificial neural networks

From their inception in the 1950s, artificial neural networks (ANNs) started using the so-called point neuron model then prevalent in neuro…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

Variational Learning for Insertion-based Generation

Non-monotonic sequence generation methods, such as masked diffusion models, provide a flexible alternative to left-to-right autoregressive…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

When Do Attention Circuits Form? Developmental Trajectories of Capability and Attention-Sink Emergence Across Three 1B-Class Architectures

We track the developmental trajectory of attention-head circuit formation across three 1B-class language models spanning two architecture f…

2026-06-10 13:00 JST | arXiv cs.AI | Research/Papers

LiveBand: Live Accompaniment Generation in the Audio Domain

We present LiveBand, a real-time system that generates high-fidelity music accompaniments to live audio input, respecting strict causal con…

2026-06-10 13:00 JST | arXiv cs.AI | Agents, Robotics

AgenticRL: Self-Refining Agentic Reinforcement Learning for Vision-Conditioned UAV Navigation

Deep reinforcement learning has shown strong potential for enabling autonomous robots to learn complex navigational tasks. However, its pra…

2026-06-10 13:00 JST | arXiv cs.AI | Robotics

CoRe-MoE: Contrastive Reweighted Mixture of Experts for Multi-Terrain Humanoid Locomotion with Gait Adaptation

Humans primarily rely on walking and running to traverse complex terrains. Similarly, humanoid robots should be able to smoothly transition…

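2026-06-10 12:22 JST | ITmedia AI+ | LLM/Generative AI

Anthropic makes its top-tier "Mythos"-class model generally available: "Claude Fable 5," equipped with safeguards against misuse

On June 9 (local time), US-based Anthropic began general availability of its new AI model, "Claude Fable 5." The model belongs to the top-tier "Mythos class," which the company positions as more capable than the Opus class, at a level it had until now withheld from public release over security concerns…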

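2026-06-10 11:19 JST | ITmedia AI+ | LLM/Generative AI

Anthropic's latest AI "Fable 5": is now the time to try it? Claude rate limits reset; usable on subscriptions until June 22

US-based Anthropic announced that it has reset the 5-hour and weekly rate limits for its chat AI "Claude." The company is encouraging users to try "Claude Fable 5," an AI model in the top-tier "Mythos class."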

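2026-06-10 10:36 JST | ITmedia AI+ | LLM/Generative AI

JR East to trial generative AI at "Midori no Madoguchi" ticket offices: AI talks with passengers, then hands off to staff; joint effort with NEC

Users talk to the AI by voice; it organizes the information needed to buy a ticket and then hands the case over to counter staff.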

2026-06-10 09:26 JST | TechCrunch AI | Other

Google just fired a warning shot in the AI subscription price wars

Google just made it significantly cheaper to enjoy its budget AI subscription tier.

2026-06-10 08:17 JST | TechCrunch AI | Other

How Justin Ernest invested nearly $500M into hot startups without a traditional VC fund

Instead of spending a year raising a formal venture fund, the Sabertooth VC founder used a captive network of LPs to invest in startups lik…

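2026-06-10 08:00 JST | ITmedia AI+ | Other

Osaka Gas forms partnership with IBM Japan and OGIS-RI: what does an AI-centered system transformation look like?

Osaka Gas, OGIS-RI, and IBM Japan have formed a co-creation partnership aimed at next-generation IT systems built around AI. The three companies say they will study and pilot work in areas including modernization of existing systems, AI-driven development, security measures, and talent development.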

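2026-06-10 08:00 JST | ITmedia AI+ | LLM/Generative AI

"Thinking" SaaS dies, SoR survives: what Sansan's fast-growing "Contract One" reveals about winners and losers in legal tech

Legal tech for contract work falls broadly into two categories: "contract review" and "contract management." Of these, contract review is one of the SaaS segments where the impact of generative AI showed up early. Where does the line between winners and losers lie?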

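2026-06-10 07:00 JST | ITmedia AI+ | Agents

Using AI "secretly under the desk": how the Salesforce president puts AI agents to work

US-based Salesforce is focusing on AI agent products. The president of its Japanese subsidiary also reports "using them daily." How are they being used?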

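2026-06-10 07:00 JST | ITmedia AI+ | Other

Will today's "AI for everyone and everything" situation continue? [Part 2] Three considerations needed in the AI era

This series introduces and examines developments in the new security field that has recently come to be called "product security." This installment covers the "three key considerations" the author believes are necessary regarding "the future of AI."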

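2026-06-10 07:00 JST | ITmedia AI+ | Regulation/Policy

Government and celebrity Instagram accounts hijacked one after another: is Meta's AI assistant the cause?

Instagram accounts used by US Space Force leadership and by the Obama-era White House were hijacked by unknown parties, in a string of incidents in which pro-Iran images and messages were posted. The attackers exploited a vulnerability in US-based Meta's "AI support assistant" and reset the passwords of targeted accounts…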

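2026-06-10 07:00 JST | ITmedia AI+ | LLM/Generative AI

Toward an era in which AI exposes system weaknesses and AI attacks: a "new survival strategy" for local government cyber defense

As generative AI advances, software weaknesses are increasingly being discovered, and the environment surrounding cyberattacks is changing significantly. While problems that would previously have been overlooked are coming to light, AI-powered attack automation is also advancing, and conventional measures premised on "blocking everything" are no longer sufficient on their own. AI…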

2026-06-10 05:50 JST | TechCrunch AI | Robotics

Hey, Siri, here’s what I actually want from AI

I'm desperate for a personal AI assistant, but do I really want to become the kind of person who can't function without the friendly robot…

2026-06-10 05:37 JST | TechCrunch AI | LLM/Generative AI

Anthropic’s Fable 5 can make weirdly fun video games with the click of a button

Anthropic's Claude Fable 5 is going to be a big hit with the web's vibe coders.

2026-06-10 03:56 JST | TechCrunch AI | Other

Can tech companies learn to love cheaper AI models?

If those same AI workloads can be handled by cheaper models without affecting quality, it would mean a massive shift in the economics of AI.

2026-06-10 03:04 JST | TechCrunch AI | Other

WWDC 2026: Everything announced on Siri AI, iOS 27, Apple Intelligence, and more

Apple primarily made the case for an improved experience with its long-standing Siri assistant, which like most other announcements had a h…

2026-06-10 02:00 JST | TechCrunch AI | LLM/Generative AI

Anthropic’s Claude Fable 5 is a version of Mythos the public can access today

Anthropic is releasing Claude Fable 5, its first Mythos-class model available to the public. The model comes with guardrails that block res…

2026-06-10 01:09 JST | TechCrunch AI | LLM/Generative AI

It’s not FAANG anymore. It’s MANGOS.

With SpaceX, Anthropic, and OpenAI all eyeing massive public debuts, the tech industry may soon have a new class of corporate overlords — a…

2026-06-10 00:16 JST | Google DeepMind | LLM/Generative AI

Fluid, natural voice translation with Gemini 3.5 Live Translate

Gemini 3.5 Live Translate brings near real-time, natural speech translation to Google AI Studio, Google Translate and Google Meet.

2026-06-09 (791 items)

2026-06-09 23:10 JST | Google DeepMind | Other

Introducing Gemma 4 12B: a unified, encoder-free multimodal model

2026-06-09 23:02 JST | Google DeepMind | Robotics

Powering the future of robotics in Europe

2026-06-09 22:47 JST | TechCrunch AI | Business/Funding

Sandstone raises $30M to bring AI to in-house legal teams

Sandstone's Series A comes just six months after a Sequoia-led seed round.

2026-06-09 22:00 JST | TechCrunch AI | Other

Lovable says it has hit $500M in annualized revenue, with 1 million new projects a week

Lovable says it has now surpassed $500 million in annualized run-rate revenue and its users are building businesses and replacing internal…

2026-06-09 21:00 JST | TechCrunch AI | Business/Funding

How an e-scooter founder raised $5 million to build space data centers

Orbital founder Euwyn Poon built 250,000 scooters at Spin. Now he wants to launch 10,000 space data centers.

2026-06-09 21:00 JST | OpenAI | LLM/Generative AI, Agents

How engineers at Nextdoor use Codex to build without limits

How engineers at Nextdoor use Codex with GPT-5.5 to investigate hard-to-reproduce issues, build across platforms, and focus on product outc…

2026-06-09 19:00 JST | OpenAI | Agents

What Codex unlocks for Notion

How Notion uses Codex to one-shot specs, build AI Voice Input for the web, and multiply engineering power across small teams.

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

PathoSage: Towards Multi-Source Evidence Adjudication in Pathology via Experience-Aware Agentic Workflow

Recent advances in Multimodal Large Language Models (MLLMs) and agent workflows have shown strong promise for computational pathology, yet…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

OmniMem: Perturbation-aware Memory Compression for Streaming Audio-Visual LLMs

Audio-visual large language models (LLMs) hold strong promise for long-form video understanding, yet their long-video inference is fundamen…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

Syll: Open-Source Personal Automation with Cross-Surface Execution

Personal AI agents must increasingly operate across APIs, shells, web surfaces, and desktop GUIs, yet many systems remain tuned to a single…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents, Research/Papers

A case study of evaluating AI agents on a neuroscience data-to-discovery pipeline

Agentic AI tools offer a promising path to automating software development bottlenecks in scientific research pipelines, particularly for s…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Why Limit the Residual Stream to Layers and Not Tokens? Persistent Memory for Continuous Latent Reasoning

Large language models (LLMs) have demonstrated remarkable reasoning abilities on mathematical and multi-hop planning tasks. The CoCoNuT (Ch…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Automatic Extraction of Structured Information from Brain MRI Reports Using an Open-Weight Large Language Model

Objectives: Automatic data extraction from free-text radiology reports enables large-scale research, but few studies assessed the performan…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Some hypotheses on how chatbots work in problem-solving-driven conversations. Large Language Models as confirmation of the Innovation Illusion

This article offers a perspective on the nature of chatbots as genuine conversation partners when discussing problems in relation to their…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

Land cover and flood type govern the detection limits of satellite-based flood mapping across diverse global flood events

Floods are among the most destructive natural hazards, and their increasing frequency under climate change makes satellite-based inundation…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Reconstructing and forecasting disease trajectories of patients with Alzheimer's disease using routine data in resource-constrained settings

Alzheimer's disease is a progressive neurodegenerative disorder, and its progression varies substantially across patients. Existing work ai…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Improving Multimodal Reasoning via Worst Dimension Optimization

Multimodal reasoning requires a path that retains integrity over a wide range of constraints, from visual grounding to logic consistency. H…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

Beyond Goodhart's Law: A Dynamic Benchmark for Evaluating Compliance in Multi-Agent Systems

The rapid evolution of Large Language Models (LLMs) from passive assistants to autonomous, execution-capable agents has introduced critical…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

Where Instruction Hierarchy Breaks: Diagnosing and Repairing Failures in Reasoning Language Models

Reasoning language models deployed in agentic workflows must follow an instruction hierarchy: when instructions from different sources conf…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Scaling Participation in Modular AI Systems

Humanity is a mosaic of multifaceted talents and needs, and any truly intelligent AI must reflect that richness. Yet the LLMs used by all a…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Joint Structural Pruning and Mixed-Precision Quantization for LLM Compression

Recently, the efficiency of Large Language Models (LLMs) deployment has become a critical concern in practical applications. While post-tra…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

Overcoming the Regulatory Bottleneck via Agent-to-Agent Protocols: A Nuclear Case Study

Regulatory review of advanced nuclear reactor designs routinely spans more than three years and consumes hundreds of millions of dollars in…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Safety is Contextual, LLM-Judges Are Not: Navigating the Rigid Priors of Evaluators

LLMs-as-judges are the only way to evaluate safety at scale. Despite their importance, LLM-judges themselves are rarely evaluated beyond hu…

2026-06-09 13:00 JST | arXiv cs.AI | Business/Funding

The AI Epistemic Deference Index: A Continuous Measure of Sycophancy

Current AI models frequently exhibit epistemic sycophancy, endorsing claims to agree with a user. Existing evaluations typically measure th…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Contract2Tool: Learning Preconditions and Effects for Reliable Tool-Augmented LLM Agents

Tool-augmented large language model agents increasingly rely on external APIs, but standard tool schemas describe how to call a tool, not w…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

MemToolAgent overview with a simple restaurant booking scenario where the agent retrieves similar memories, receives feedback on an invalid time format, and generates a reflection to update its memory

Modern large language model (LLM) agents can use external tools to help users solve complex tasks. However, for problems that require learn…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

EditSR: Enhancing Neural Symbolic Regression via Edit-based Rectification

Neural symbolic regression models improve inference efficiency by shifting structural search to pretraining, but their one-pass autoregress…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

The CIFAR Synthetic Evidence Corpus for Detecting AI-Generated Evidence

The growing ability of generative models to produce realistic documents poses a direct challenge to evidentiary workflows in the justice sy…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Stress-testing medical large language models reveals latent safety pathology beyond benchmark accuracy

Large language models (LLMs) are entering clinical practice based on benchmark accuracy that may fail to detect safety-relevant failure mod…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Unification of Closed-Open Industrial Detection Scenarios: New Large-Scale Benchmarks, Challenges and Baselines

Large-scale Visual-Language Models (LVLMs) have achieved remarkable success in natural visual tasks, yet their application to industrial de…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Shared Latent Structures Enable Unified Backdoor Detection and Mitigation in LLMs

Backdoor attacks in large language models (LLMs) are often treated as isolated trigger-response failures, motivating defenses tailored to s…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Zero-Shot Learning in Industrial Scenarios: New Large-Scale Benchmark, Challenges and Baseline

Large Visual Language Models (LVLMs) have achieved remarkable success in vision tasks. However, the significant differences between industr…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors

PAFO: Pareto Fairness Optimization for Personalized Reward Modeling

Large language models (LLMs) increasingly rely on reward models to align their outputs with diverse user preferences. While personalized re…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

VATS: Exploiting Implicit Authority in Error-Path Injection via Systematic Mutation

As the Model Context Protocol (MCP) standardizes tool-calling for autonomous agents, it introduces a critical, unexamined attack surface: t…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

Efficient Skill Grounding via Code Refactoring with Small Language Models

Effective skill grounding is essential for deploying reusable skills in embodied agents, as even minor embodiment or environmental differen…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

UniQL: Towards Dialect-Universal Benchmarking for Text-to-SQL

Existing text-to-SQL benchmarks are largely centered on SQLite, making it difficult to evaluate whether models can generalize across hetero…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

OSMGraphCLIP: Learning Global Location Representations from OpenStreetMap Graphs

We present OSMGraphCLIP, a CLIP-style geospatial representation model that learns global location embeddings from freely available OpenStre…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

SKILL.nb: Selective Formalization and Gated Execution for Durable Agent Workflows

AI agents increasingly turn past experience into reusable artifacts such as code, workflows, and procedural memories. Reuse can improve eff…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

How Small Can You Go? LoRA Fine-Tuning 270M-8B Models for Merchant Information Extraction in Financial Transactions

Financial transaction processing requires extracting structured merchant information from noisy, abbreviated bank transaction strings at sc…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

A Multi-modal Agentic Co-pilot for Evidence Grounded Computational Pathology

Pathology is the cornerstone of modern medicine, where accurate decision-making relies heavily on evidence-based practices. While artificia…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

When Does Delegation Beat Majority? A Delegation-Based Aggregator for Multi-Sample LLM Inference

Majority voting over sampled answers is the dominant unsupervised aggregator for multi-sample LLM inference. We show that piping the signal…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

PACE: Anytime-Valid Acceptance Tests for Self-Evolving Agents

Self-evolving agents improve by repeatedly proposing changes to their own prompts, skills, or workflows and keeping those that score higher…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Think Before You Act: Intention-Guided Reasoning for LLM-Based Location Prediction

Predicting a user's next Point-of-Interest (POI) based on their historical check-in records is a fundamental task in location-based service…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Cross-LLM Consistency in Inference: Evidence from Shared Interactions

Large language models (LLMs) differ in architecture, training data, and optimization procedures, yet they may still develop similar interna…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

SAGE: An LLM-driven Self Reflective Agentic Framework for Fraud Detection

Fraud detection in payment, e-commerce, and telecommunications systems requires accuracy at the individual level, robustness under severe c…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Decision-Aware Memory Cards: Counterfactual-Inspired Context Selection and Compression for Tool-Using LLM Agents

Tool-using LLM agents often fail not because relevant text is absent, but because decisive evidence is not selected, compressed, or surface…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Hardware/Semiconductors, Business/Funding

Online Agent-as-a-Judge: Situation-Generating Evaluation for Interactive Agents

Evaluating LLM-powered interactive social agents is challenging because socially relevant behaviors depend not only on isolated outputs, bu…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

SciTrace: Trajectory-Aware Safety Reasoning for Scientific Discovery Agents

LLM-based scientific agents have shown strong capacity for autonomous research, yet their safety layers remain structurally divorced from c…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

When No Answer Is Correct: Diagnosing Absent Answer Detection for MLLMs in Video Understanding

Multimodal large language models (MLLMs) have made substantial advancements in video understanding, yet the reliability of their responses…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

Traxia: A Framework for Verifiable, Agent-Native Scientific Publishing

Verifiability, attribution, and reproducibility are foundational requirements of scientific knowledge, yet current publishing infrastructur…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

From Validator Selection to Portfolio Collection Optimization in Proof-of-Stake Blockchains

We consider a problem arising in proof-of-stake blockchain environments, where agents called nominators select validators - entities respon…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Beyond Agent Architecture: Execution Assumptions and Reproducibility in LLM-Based Trading Systems

Large language models (LLMs) and agentic systems are increasingly proposed for financial trading, yet their reported performance remains di…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Ablation-Reversible Heads Don't Transfer: A Stress Test for Mechanistic Role Claims in Transformers

In mechanistic interpretability, attention heads are commonly elevated to role claims (e.g., "this head represents addition") when they are…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

Revisiting the shutdown problem

A key premise in leading arguments for existential risk from artificial intelligence is that malfunctioning artificial agents could not be…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

To Nuke or Not to Nuke: LLMs' (Missing) Ethical Reasoning and Actions in a High-Stakes Decision-Making Simulation

Large language models (LLMs) are increasingly deployed as long-horizon agents with decision-making capacities. While LLMs can show ethical…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Curation of a Cardiology Interface Terminology for Highlighting Electronic Health Records using Machine Learning

Electronic health record (EHR) notes are dense medical documents containing large amounts of information, often filled with complex medical…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Neuro-Symbolic Injection of LTLf Constraints in Autoregressive Reinforcement Learning Policies

In this work we study offline reinforcement learning (RL) under temporally extended task constraints expressed in Linear Temporal Logic ove…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Integrating Deep Learning Demand Forecasting with Multi-Objective Optimization for Circular Coffee Supply Chains: A Data-Driven Framework for Cost, Emissions, and Freshness Management

The coffee supply chain is one of the most complex agri-food networks, marked by geographically dispersed production, multi-tier coordinati…

2026-06-09 13:00 JST | arXiv cs.AI | Agents, Research/Papers

Benchmarking Open-Ended Multi-Agent Coordination in Language Agents

As language models are increasingly deployed as autonomous agents, they must coordinate with others over long horizons in open-ended intera…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

TT-DAC-PS: Twin-Target Deterministic Actor-Critic with Policy Smoothing for Optimal Trade Execution

This study addresses the optimal execution of large stock sell programs by introducing TT-DAC-PS (Twin-Target Deterministic Actor-Critic wi…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

Self-Evolving Scientific Agent Discovers Generalizable Physically-Reasoned Fluid Control

While data-intensive deep reinforcement learning can optimize complex control policies, scientific discovery in physical systems fundamenta…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Trajectory-Refined Distillation

On-policy distillation (OPD) has become a central post-training tool for large language models (LLMs), providing dense per-token teacher su…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

GIFT: LLM-Guided State-Reward Interface for Financial Reinforcement Learning

Financial portfolio trading is naturally formulated as a reinforcement learning problem, where an agent sequentially rebalances assets unde…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

A Variability-Based Framework for Interpretable Naming in Formal and Relational Concept Analysis

Knowledge extraction from symbolic data often produces abstractions that are formally defined but not immediately interpretable by users. F…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

Testing the Black Box: Structural Barriers to Independent Evaluation of Consumer-Facing Health LLMs

Background: Consumer-facing large language models are now a common source of health information, and they interpret and personalize respons…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

What Makes a Desired Graph for Relational Deep Learning?

Relational deep learning (RDL) converts relational databases (RDBs) into heterogeneous graphs, but graphs derived directly from database sc…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Explaining Black-Box Language Models: Learning to Optimize Linguistically-Structured Word Subsets

As deep language models (DLMs) are increasingly deployed in high-stakes domains such as healthcare, understanding their decision rationale…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Standpoint Logics with Defeasible Beliefs

In this paper, we integrate the defeasible logic of Kraus, Lehmann and Magidor (KLM) with the standpoint logic framework of Gómez Álvar…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Scaffold Effects on GAIA: A Controlled Comparison

Published agent capability scores conflate what a model can do with what its scaffold lets it do, and the magnitude of this elicitation gap…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Business/Funding

VESTA: A Fully Automated Scenario Generation and Safety Evaluation Framework for LLM Agents

Large language models (LLMs) are increasingly evolving from simple text-based interaction systems into LLM agents that can maintain memory,…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

DN-Hypo-Pipeline: An AI-Driven Workflow for Hypothesis Generation via Large Language Models and Scientific Explanations

A scientific hypothesis is the first step in research and undergoes experimental validation, yet it also reflects a deep understanding of a…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

AgentTrust: A Self-Improving Trust Layer for AI-Agent Actions

AI agents increasingly take consequential actions -- shell commands, cloud operations, and arbitrary tool-calls -- so a trust layer must de…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

PAEC: Position-Aware Entropy Calibration for LLM Reasoning in RLVR

Reinforcement learning with verifiable rewards (RLVR) improves large language model reasoning but often suffers from rapid policy-entropy c…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

Quantitative Promise Theory: Intentionality and Inference in Autonomous Agents

I discuss some quantitative representations of Promise Theory for processes involving autonomous agents. Agent models are common in softwar…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Distilling LLM Reasoning into an Interpretable Policy Tree for Human-AI Collaboration

Constructing efficient and reliable policies to assist humans is indispensable for human-AI collaboration. Existing methods mainly follow t…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

InA-Probe: Instruction-Aware Active Probing for Time Series Forecasting with LLMs

Large Language Models (LLMs) have recently demonstrated impressive potential for time series forecasting. However, existing methods predomi…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Towards Long-Horizon Vessel Trajectory and Destination Forecasting with Reasoning Large Language Models

Long-horizon maritime trajectory prediction is important for shipping management, logistics planning, and maritime risk analysis, yet month…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Extending Ontologies: From Dense Embeddings to Hybrid Quantum-Fuzzy Systems

LLMs have revolutionized knowledge representation and retrieval, but lack the explicit modeling that knowledge ontologies possess. This pap…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

ConMem: Structured Memory-Guided Adaptation in Training-Free Multi-Agent Systems

Recent advances have improved the adaptive capabilities of LLM-based multi-agent systems (MAS) through memory-, skill-, and learning-based…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

Artificial Intelligence for Mathematical Reasoning: An Integrated Survey of Language Models, Neuro-symbolic Systems, and Verified Discovery

Mathematical reasoning has long served as a stringent test of machine intelligence; over the past decade, it has moved from a niche problem…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Structure-Conditioned Actor-Critic Branches for Quality-Diversity Reinforcement Learning

Quality-diversity reinforcement learning (QD-RL) aims to construct policy repertoires that contain both high-performing and behaviorally di…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

RAILS: Verification-Native Clearing For Agentic Commerce

Autonomous agents negotiate, purchase, deploy code, and move funds, but no neutral mechanism determines whether they met their delegated ob…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Bridging Expert Knowledge and Automated Feature Engineering via Self-Evolution

In high-stakes settings such as brand compliance, clinical care, and content moderation, machine learning cannot be deployed as opaque orac…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Q-Delta: Beyond Key-Value Associative State Evolution

Linear attention reformulates sequence modeling as recurrent state evolution, enabling efficient linear-time inference. Under the key-value…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

STAR: Rethinking MoE Routing as Structure-Aware Subspace Learning

Mixture-of-Experts (MoE) scales model capacity efficiently by selectively routing inputs to a specialized subset of experts. However, input…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Momentum for Reasoning: Dense Intrinsic Signals in Policy Optimization

Reinforcement learning with verifiable rewards (RLVR) has emerged as a powerful paradigm for eliciting long-chain reasoning in large langua…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Inference-Time Conformal Reasoning with Valid Factuality Control for Large Language Models

Large language models (LLMs) increasingly perform multi-step reasoning, where intermediate claims form implicit directed acyclic graphs who…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Instrumental convergence and power-seeking

Recent years have seen increasing concern that artificial intelligence may soon pose an existential risk to humanity. One leading ground fo…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding, Research/Papers

Beyond Pass Rate: A Multilingual, Execution-Grounded Evaluation of Open Code LLMs

Code generation models are typically compared using compact execution benchmarks and aggregate pass rates, but such summaries obscure how p…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation, Hardware/Semiconductors

ZIPP: Zero-shot Image Personalization from Personas

Text-to-image diffusion models are increasingly deployed in open-ended creative contexts, yet their outputs remain impersonal, optimized fo…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

A Resilience-as-a-Service assessment framework for coordinated disruption response in interdependent urban transit systems

Urban public transport disruptions require rapid response strategies, yet existing studies rarely provide a decision support framework to c…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

Hybrid E-Assessment in Higher Education: Semi-Automated Grading of Paper-Based Written Examinations

This paper examines the limitations of fully digital and partially digital e-assessment approaches in summative examinations in higher educ…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Can the Environment Speak for Itself? T²-GRPO: A Turn-Trajectory Group Relative Policy Optimization for Caregiver Agents

Optimizing large language models (LLMs) for long-horizon caregiver agents requires balancing delayed task objectives with immediate environ…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

FAME: Forecastability-Aware Mixture of Experts for Heterogeneous Time Series Forecasting

Large-scale retail and industrial forecasting systems contain many heterogeneous time series whose lifecycle, sparsity, volatility, seasona…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors

Order Matters: Unveiling the Hidden Impact of Macro Placement Sequences via Proxy-Guided LLM Evolution

Macro placement is a fundamental step in modern chip physical design, playing a crucial role in determining the solution quality of high-di…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Oversight Has a Capacity: Calibrating Agent Guards to a Subjective, Fatiguing Human

As LLM agents begin to take real, irreversible actions (shell commands, file edits, deploys), the standard safety pattern is a human-in-the…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

AlloSpatial: Agentic Harness Framework for Spatial Reasoning in Foundation Models

Multimodal Foundation Models (MFMs) have made substantial progress, yet remain fragile in spatial reasoning over the physical world. A key…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

An Effective Router for Vision-Language Model Selection

Vision-language models (VLMs) with varying performance and resource requirements are widely deployed, making it difficult for users to sele…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Diverse Thinking Schemata Elicit Better Reasoning in Large Language Models

Large reasoning models (LRMs) have attracted increasing attention for their ability to solve complex mathematical problems by generating ex…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

RTL-BenchLS: A Large-Scale Benchmark for RTL Reasoning and Generation with Large Language Models

LLM-based RTL generation and reasoning is a promising direction for hardware design automation. High-quality benchmarks are critical infras…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

Baichuan-M4: A Clinical-Grade Medical Agent System for Continuous Care

Baichuan-M4 is Baichuan Intelligence's clinical-grade medical large model, designed for continuous care rather than single-turn medi…

2026-06-09 13:00 JST | arXiv cs.AI | Agents, Hardware/Semiconductors

The Token Not Taken: Sampling, State, and the Variability of AI Agent Outputs

Agentic AI systems can behave differently across runs: the same request may produce a different plan, a different tool call, a different co…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

LATTEArena: An Evaluation Framework for LLM-powered Tabular Feature Engineering (Extended Version)

Feature engineering remains essential for tabular data analysis, and Large Language Models (LLMs) have emerged as a promising paradigm for…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

A Multi-Agent System for IPMSM Design Optimization via an FEA-AI Hybrid Approach

Interior permanent magnet synchronous motor (IPMSM) design requires balancing conflicting objectives and multi-physics constraints, while m…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Personalization Meets Safety: Mechanisms, Risks, and Mitigations in Personalized LLMs

Large Language Models (LLMs) have enabled increasingly personalized interactions by adapting to users' preferences, contexts, and long-term…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

Agent Economics: An Entropy-Controlled Pluralistic Alignment Framework for Preventing Artificial Hivemind in Autonomous Agents

This study proposes the Behavioral Protocol Framework (BPF), an entropy-controlled pluralistic alignment framework designed to address two…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

REFLECT: Intervention-Supported Error Attribution for Silent Failures in LLM Agent Traces

Large language model (LLM) agents now solve complex tasks through long plan-and-execution traces, yet the ability to locate errors in a com…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

DynaOD: Dynamic Origin-Destination Flow Generation with Discrete-to-Continuous Temporal Semantic Modeling

Dynamic origin-destination (OD) flow generation seeks to synthesize realistic mobility dynamics from temporal context alone, without relyin…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Graph2Idea: Retrieval-Augmented Scientific Idea Generation with Graph-Structured Contexts

Generating novel, feasible, and high-quality research ideas is an important yet challenging task in scientific discovery. Recent Large Langu…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding, Research/Papers

ComplexConstraints and Beyond: Expert Rubrics for RLVR

As LLM capabilities advance rapidly, the evaluation methods used to assess them increasingly lag behind. Traditional benchmarks relied on p…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

A Regret Minimization Framework on Preference Learning in Large Language Models

Reinforcement learning with verifiable rewards (RLVR) has enabled progress on reasoning-intensive tasks by relying on task-specific verifie…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

Late-Layer Fusion is Enough: Dual-Path Vision Token Routing for Multimodal Large Language Models under Visual Saturation

Multimodal large language models (MLLMs) commonly inherit the deep, symmetric Transformer backbone designed for unimodal text modeling, and…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Vision Language Model Helps Private Information De-Identification in Vision Data

Visual Language Models (VLMs) have gained significant popularity due to their remarkable ability. While various methods exist to enhance pr…

2026-06-09 13:00 JST | arXiv cs.AI | Hardware/Semiconductors, Business/Funding

Reliable to Expressive: A Curriculum for Rubric-Following Safety Judges

Safety judges are increasingly deployed to evaluate model outputs against evolving criteria, yet recent meta-evaluation work shows they rem…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

IMUG-Bench: Benchmarking Unified Multimodal Models on Interleaved Understanding and Generation

In recent years, unified multimodal models (UMMs) have emerged to support both understanding and generation within a single framework. Mast…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

MASS: Deep Research for Social Sciences with Memory-Augmented Social Simulation

Deep Research agents powered by Large Language Models (LLMs) have exhibited extraordinary potential in automated paper writing tasks. Howev…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

FF-JEPA: Long-Horizon Planning in World Models with Latent Planners

Joint Embedding Predictive Architectures (JEPAs) have shown promising world modeling capabilities, enabling planning in latent space by opt…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

Anything2Skill: Compiling External Knowledge into Reusable Skills for Agents

Retrieval-augmented generation (RAG) enables agents to access external knowledge at inference time, but it primarily retrieves fragmented d…

2026-06-09 13:00 JST | arXiv cs.AI | Business/Funding

TRL-Bench: Standardizing Cross-Paradigm Representation-Level Evaluation of Tabular Encoders

Tabular encoders are usually evaluated inside task-specific end-to-end pipelines, so models from different training paradigms are difficult…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Leveraging Structural Constraints for Diffusion-based Neural TSP Solvers

Neural combinatorial optimization has recently achieved strong results on the Euclidean Traveling Salesman Problem (TSP) using generative m…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Experience Makes Skillful: Enabling Generalizable Medical Agent Reasoning via Self-Evolving Skill Memory

Medical agent systems are increasingly expected to support interactive clinical decision making rather than only static question answering.…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Capability-Aligned Hierarchical Learning for Tool-Augmented LLMs

Tool learning enables LLMs to invoke external tools to accomplish tasks. Prior studies have demonstrated the effectiveness of a hierarchica…

2026-06-09 13:00 JST | arXiv cs.AI | Business/Funding

From Coarse to Fine: Managing Temporal Granularity in Spatio-Temporal Data for Fine-Grained Traffic Prediction

Efficient acquisition, storage, and utilization of traffic data are critical challenges in spatio-temporal data management. Most traffic da…

2026-06-09 13:00 JSTarXiv cs.AIエージェント

RunAgent SuperBrowser: A Theory of Autonomous Web Navigation Grounded in Human Browsing Behaviour

We present SUPERBROWSER, an autonomous web-navigation agent designed against a single guiding hypothesis: a web agent should browse the way…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Correct Looks Better: Pairwise Comparisons Reveal Accuracy Rankings

Pairwise comparisons combined with aggregation methods like Elo have become central to evaluating generative models, yet concerns remain th…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIハードウェア/半導体

Capacity, Not Format: Rethinking Structured Reasoning Failures

Prior work treats structured output as a reasoning tax, but this framing is incomplete: the cost of formatting depends strongly on a model'…

2026-06-09 13:00 JSTarXiv cs.AIエージェント研究/論文

WeaveBench: A Long-Horizon, Real-World Benchmark for Computer-Use Agents with Hybrid Interfaces

Computer-use agents (CUAs) increasingly operate in runtimes that combine visual desktop control, command-line execution, code editing, brow…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Bayesian Selective Latent Inference for Wastewater-First Influenza Monitoring

Wastewater influenza surveillance can reveal community circulation before clinical reporting, but wastewater alone is not a fully identifia…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

SIFT: Selective-Index For Fast Compute of RAG Prefill by Exploiting Attention Invariance

Retrieval-Augmented Generation (RAG) injects LLM queries with relevant documents to improve response quality. This injection increases prom…

2026-06-09 13:00 JSTarXiv cs.AIエージェント

AliyunConsoleAgent: Training Web Agents in Real-World Cloud Environments via Distillation and Reinforcement Learning

We present AliyunConsoleAgent, a web agent framework for automated documentation verification in real-world cloud consoles. Major cloud pla…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達研究/論文

TheoremBench: Evaluating LLMs on Theorem Proving in Formal Mathematics

LLMs have recently achieved strong results on formal proving benchmarks. However, existing evaluations remain heavily concentrated on compe…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Emergent alignment and the projectability of ethical personas

Work on 'emergent misalignment' shows that finetuning LLMs on narrow tasks can induce broadly misaligned behavior. This supports the 'perso…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

LLM-Orchestrated Conformance Checking in Stroke Care Without Computer-Interpretable Guidelines

Objective: Conformance checking in healthcare seeks to assess whether patient care pathways adhere to clinical guidelines. However, its pra…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

Deterministic Integrity Gates for LLM-Assisted Clinical Manuscript Preparation: An Auditable Biomedical Informatics Architecture

Objective. Large language models (LLMs) increasingly draft clinical research manuscripts, but their fluency can hide fabricated citations,…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

From Rigid to Dynamic: Entropy-Guided Adaptive Inference for Long-Context LLMs

Existing sparse attention and KV cache compression methods for long-context LLM inference typically apply fixed sparsity patterns or unifor…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェントビジネス/資金調達

AI Scientists Are Only as Good as Their Evidence: A Stratified Ablation of Proprietary Data and Reasoning Skills in Drug-Asset Valuation

AI Scientist agents are often evaluated as if capability were mainly a function of model quality, prompting, or reasoning scaffolds. We tes…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェントハードウェア/半導体

PRISM: Recovering Instruction Sets from Language Model Activations

As LLMs are deployed as agents, reliable monitoring requires knowing not only what they output, but which instructions are steering their b…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Self-Explainability in Self-Adaptive and Self-Organising Systems: Status and Research Directions

The growing complexity of self-adaptive and self-organising systems, fuelled by advances in Artificial Intelligence (AI), has made them inc…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

TABVERSE: Benchmarking Cross-Format Table Understanding in LLMs and VLMs

Large Language Models (LLMs) and Vision-Language Models (VLMs) are increasingly evaluated on table reasoning tasks, but the role of table r…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Optical Reasoning: Rethinking Images as an Expressive Reasoning Medium Beyond Text

Chain-of-Thought (CoT) improves the performance of Large Language Models (LLMs) and has been extended to Multimodal Large Language Models (…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Next-Token Prediction Learns Generalisable Representations of Sleep Physiology

Foundation models offer a promising route to compress multi-modal physiological signals into compact representations of human health, with…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

From 0-to-1 to 1-to-N: Reproducible Engineering Evidence for MetaAI Recursive Self-Design

Recursive self-design refers to AI-assisted modification of the mechanisms by which an AI system is built, evaluated, and improved. This pa…

2026-06-09 13:00 JSTarXiv cs.AIハードウェア/半導体

Frequency-based Constrained Sampling for Interval Patterns

Output space pattern sampling is a powerful alternative to exhaustive pattern mining for exploring large pattern spaces, as it enables user…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント研究/論文

SpatialWorld: Benchmarking Interactive Spatial Reasoning of Multimodal Agents in Real-World Tasks

Spatial reasoning is a foundational capability for multimodal large language models (MLLMs) to perceive and operate within the physical wor…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Correlation Is Not Enough: Embedding Human Metadata for Individual Causal Discovery

Ask a pretrained biomedical language model whether "cortisol 28 ug/dL" and "stock-market volatility" are related, and it returns a cosine s…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

(Auto)formalization is supposed to be easy: Trellis process semantics for spelling out rigorous proofs

We present Trellis: an autoformalization system that leverages LLM agents in a deterministically constrained workflow to enforce incrementa…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Proxy Reward Internalization and Mechanistic Exploitation: A Learned Precursor to Reward Hacking and Its Generalization

Reward hacking is usually studied after it becomes visible, once a model earns high proxy reward while failing the intended task. We instea…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Beyond Probabilistic Similarity: Structural, Temporal, and Causal Limitations of Retrieval-Augmented Generation in the Legal Domain

Retrieval-Augmented Generation (RAG) has become a standard architectural response to unreliability in legal AI, yet high-profile failures,…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント研究/論文

SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research

Large language models are increasingly expected to handle complex, long-horizon real-world tasks whose context demands can grow without bou…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェントハードウェア/半導体ビジネス/資金調達研究/論文

Multi-Turn Evaluation of Deep Research Agents Under Process-Level Feedback

Existing benchmarks for deep research agents (DRAs) assess only single-shot outputs, ignoring a key question: can DRAs improve their report…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Collaborative Human-Agent Protocol (CHAP)

Foundation models are moving from response generation into operational roles. They plan across steps, call tools, request human input, coor…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

SIGA: Self-Evolving Coding-Agent Adapters for Scientific Simulation

Advanced scientific simulators expose specialized input languages that turn simulation goals into executable configurations, but learning t…

2026-06-09 13:00 JSTarXiv cs.AIビジネス/資金調達研究/論文

Evaluation Cards: An Interpretive Layer for AI Evaluation Reporting

AI evaluation results are produced at scale but reported inconsistently across leaderboards, model cards, benchmark papers, and company blo…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

XAInomaly: Explainable and Interpretable Deep Contractive Autoencoder for O-RAN Traffic Anomaly Detection

Generative Artificial Intelligence (AI) techniques have become an integral part of advancing next-generation wireless communication systems by…

2026-06-09 13:00 JSTarXiv cs.AIエージェント

BRAIN: Bayesian Reasoning via Active Inference for Agentic and Embodied Intelligence in Mobile Networks

Future sixth-generation (6G) mobile networks will demand artificial intelligence (AI) agents that are not only autonomous and efficient, bu…

2026-06-09 13:00 JSTarXiv cs.AIロボティクス

Blockchain Infrastructure for Intelligent Cyber-Physical-Social Systems: Post-Quantum Security, Interoperability, and Trustworthy Data Economies in the Era of Embodied AI

The deployment of embodied artificial intelligence via world-model-based robotics presents a transformative opportunity for blockchain infr…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Bidirectional Small-Granularity Search between Code and Text

We introduce the novel task of bidirectional small-granularity search between code and text, where the queries are small snippets of text o…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Evaluating Hallucinations in Domain-Adapted Large Language Models

This study investigates the phenomenon of hallucinations in domain-adapted Large Language Models (LLMs), focusing on the fine-tuning of the…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Retrieval Augmented Generation Framework for the Nepali Legal Domain Question Answering

Legal domains in high-resource languages like English have widely adopted artificial intelligence for legal question answering. However, da…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

ABLE: Representing and Mapping LLMs via Attribution-Based Large-model Embedding

The explosive growth of large language models (LLMs) has created a heterogeneous and poorly documented ecosystem, making systematic model c…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Implicit Causal Graph Construction in Text via Chain Discovery

Causal graphs in text are typically populated by observable, predefined events. In contrast, we study implicit causal graph construction fr…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

GraphLoRA: Structure-Aware Low-Rank Adaptation for Large Language Model Recommendation

Large Language Models (LLMs) have shown strong potential for recommendation (LLMRec) due to their powerful reasoning and generalization abi…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Post-training is (Massive) Supervised Learning

The prevailing paradigm for training LLMs has evolved to rely on a massive post-training phase consisting of SFT and RL. In this position p…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

BEACON: Behavioral Entropy Aggregation for Cross-Model Hallucination Detection in Large Language Models

Hallucination in large language models (LLMs), defined as the generation of factually incorrect or unsupported content, remains a critical…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

CAPruner: Conceptual-Adjacent Scene Graph Pruner for Enhancing 3D Spatial Reasoning of Large Language Models

Large language models (LLMs) have recently been applied to 3D vision-language (3D-VL) tasks, which require spatial reasoning to identify ta…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

mllm-shap: A Shapley Value Explainability Platform for Text-Audio Multimodal Large Language Models

We introduce mllm-shap, an open-source Python framework designed to extend Shapley Value (SV) explainability from text-only Large Language…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Principled Agent Debate: Adversarial Arbitration for Sycophancy Reduction in Large Language Models

RLHF-trained models are systematically biased toward agreement over accuracy, a structural property of the training process. We present Pri…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Bridging Traditional Explainability Methods and Multimodal Multilingual Models: An XAI-Based Analysis

Multimodal Large Language Models (MLLMs) effectively integrate text and audio to interpret context in complex interactive dialogues. Howeve…

2026-06-09 13:00 JSTarXiv cs.AIハードウェア/半導体

Beware of Geeks Bearing Gifts: Building True EU Frontier AI Sovereignty

Frontier artificial intelligence is reshaping all aspects of society, from economic output or military capability to democratic institution…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Bidirectional Semantic Complementary Tool Retrieval for Remote Sensing Agents

Large language model (LLM)-based agents provide a novel paradigm for the automated processing of remote sensing (RS) data. Their success in…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成ビジネス/資金調達

Multimodal Large Language Models as Synthetic Participants in Video-Based Studies: An Evaluation

Multimodal large language models (MLLMs) have shown strong performance on objective tasks such as video understanding and reasoning. Howeve…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

DIYHealth Suite: Dataset, Model, and Benchmark for Health Management at Home

Generative AI is reshaping healthcare, yet most existing advances rely on hospital-grade devices, which limits their accessibility and pote…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Concerns and Strategic Responses of Older Workers Navigating Generative AI in Bridge Employment

Generative AI (GenAI) is transforming workplaces at a rapid pace. This disproportionately affects vulnerable communities, including older w…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

AI-Integrated Learning Management System for Middle School: A Longitudinal Study of Learning Outcomes Through High School and Beyond

Middle school is a key window for building core academic skills and the learning routines students carry into later grades, yet many studen…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Beyond Item IDs: Scaling Short-Form-Video Recommendation via Semantic-Native Long Sequence Modeling

Capturing user interests across extensive watch histories is critical for short-form video recommendation, yet scaling sequence length is l…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIハードウェア/半導体

Liberating LLM Capabilities in Full-Duplex Speech Models

Speech-based large language models are typically constrained to spoken replies, which limits their user-facing outputs to what can be verba…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Evaluating Advanced Prompting on Gemini Flash for Multi-Hop Biomedical QA

The MedHopQA challenge presents a critical test for Large Language Models (LLMs): complex, multi-hop reasoning in the high-stakes biomedica…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Offline Reinforcement Learning for Plasma Control in Nuclear Fusion: Codebase and Benchmark

Offline reinforcement learning (RL) offers a promising route for developing plasma controllers from historical tokamak data, since online t…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Symbolic Reasoning Frameworks Modulate LLM Risk Aversion in Multi-Agent Strategic Settings

Large language models exhibit innate behavioral tendencies when deployed as strategic agents, notably a risk-averse "turtle" bias toward…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

MedicalRec: Medical recommender system for image classification without retraining

The emergence of machine learning and deep learning has revolutionized the efficiency of diagnostic, therapeutic, and administrative system…

2026-06-09 13:00 JSTarXiv cs.AIビジネス/資金調達

Selecting New Measurement Locations to Diversify Traffic-Pattern Coverage: A Real-World Evaluation for Total Traffic Volume Estimation

Accurate measurement of traffic volumes and flows is vital for modern intelligent transportation. However, despite recent technological adv…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

Page image classifier fine-tuned on century-spanning archives of scanned documents for further content-specific processing

Purpose: Digitization projects in the humanities produce vast, heterogeneous archives of historical documents, making manual sorting imprac…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Phantom transitions in language model fine-tuning

Fine-tuning a language model on contexts whose correct completion has a near-synonym competitor often fails silently. The cross-entropy los…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

The Montparnasse Algorithm for RNA Design

RNA design consists of discovering a nucleotide sequence that optimizes predefined criteria, such as secondary structure. It is useful for…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Emergence via Phase Transitions: Mechanism Landscapes and Universal Convergence Across Complex Systems

Across machine learning, biology, and physics, independently evolving systems often converge toward strikingly similar high-level structure…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Considerations for an Integrated Detector Design at FCC-ee: A Human-AI Exploration

This report explores detector design considerations for the Future Circular Collider in its electron-positron mode (FCC-ee) through an exte…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

SurfDesign: Effective Protein Design on Molecular Surfaces

Protein function is largely determined by molecular surface geometry and physicochemical complementarity, yet most protein design methods c…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

A Systematic Study of Behavioral Cloning for Scientific Data Annotation

Scientific data annotation, such as tracking animals in video or proofreading neural reconstructions, remains bottlenecked by the "last mil…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Enabling KV Caching of Shared Prefix for Diffusion Language Models

Key-value (KV) caching for shared prefixes is essential for high-throughput large language model (LLM) serving, but it faces critical chall…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Accelerating Birkhoff Projection for Manifold-Constrained Hyper-Connections

Manifold-constrained hyper-connections (mHCs) have recently been proposed as a principled extension of hyper-connections, where the residua…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Training-Inference Kernel Contracts: Bounding Divergence in Post-Training and Deployment

A modern post-training pipeline often writes one symbol for its policy, pi_theta, while evaluating it through two different programs: a tra…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Customer Churn Prediction on Structured Data Using FT-Transformer and Stacking Ensembles

Customer churn prediction is essential across data-driven industries such as insurance, digital banking, eCommerce, and subscription platfo…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Outage Detection in Self-Healing Smart Grids Using Reinforcement Learning with Spectral Graph Neural Networks

Self-healing smart grids can quickly adjust their network configuration during outages to minimize power disruptions. During an outage, sev…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

Multimodal Group Emotion Recognition In-the-Wild Towards a Privacy-Safe Non-Individual Approach

This thesis addresses group emotion recognition (GER) in-the-wild with a focus on privacy preservation. Unlike traditional emotion recognit…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

From Human Guidance to Autonomy: Agent Skill System for End-to-End LLM Deployment on Spatial NPUs

Spatial neural processing units (NPUs) provide an energy-efficient platform for edge LLM inference, but efficiently deploying an LLM end-to…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

SlideCheck: Guiding Self-Supervised Pretraining of Pathology Foundation Models via Dataset Distributions

Pathology foundation models are pretrained on large streams of WSI-derived patches, while supervision during data construction is often sli…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント研究/論文

ResearchClawBench: A Benchmark for End-to-End Autonomous Scientific Research

AI coding agents are increasingly used for scientific work, but their end-to-end autonomous research capability remains difficult to verify…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

A Mechanistic Analysis of Adversarial Fine-tuning of Vision Transformers

The widespread use of image classification models in high-risk, real-world situations necessitates making these models robust to slight dis…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成エージェント

VisualLeakBench: Reproducible Action-Boundary Propagation Failures in Vision-Language Agents

Vision-language agents increasingly consume screenshots, documents, and user interfaces before writing to memory, sending messages, or invo…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Repetition Mismatch: Why Data Mixture Experiments Don't Scale and How to Fix Them

Pre-training data mixtures are commonly tuned by running small-scale experiments and extrapolating to the target training budget. When high…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

A Topological Characterization of Graph Neural Networks via Stochastic Block Model Embeddings on the n-Sphere

We propose a topological framework for comparing trained Graph Neural Networks (GNNs) by mapping the Stochastic Block Models (SBMs) induced…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

DiffoR: A Unified Continuous Generative Framework for Universal Ordinal Regression

Ordinal Regression (OR) aims to predict target values with inherent order, underpinning critical applications across diverse domains, from…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Reachability and asymptotics of Gaussian Transformer dynamics

We formulate data propagation through the Transformer, the machine learning architecture powering large language models, as a nonlinear con…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

LFNO: Bridging Laplace and Fourier via Transient-Steady Decomposition

We introduce the Laplace-Fourier Neural Operator (LFNO), a unified framework for modeling dynamical systems across transient and steady-sta…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Sample-Efficient Post-Training for LEGO Spatial-Physics Reasoning

LLM-based LEGO assembly generation requires both semantic grounding and physical feasibility. We identify a data-induced failure mode, Phys…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

MetaEvo: A Meta-Optimization Framework for Experience-Driven Agent Evolution

Large language models (LLMs) exhibit strong reasoning capabilities, yet most LLM-based agents are statically deployed and unable to improve…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Contribution Weights: A Geometrical Analysis of Self-Attention Transformers

Analyzing attention weights has become a standard approach for interpreting the information flow of Large Language Models (LLMs). However,…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

SRT: Super-Resolution for Time Series via Disentangled Rectified Flow

Fine-grained time series data with high temporal resolution is critical for accurate analytics across a wide range of applications. However…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

Subtitle-Aligned Fine-Tuning of Whisper for Swiss German ASR: Benchmark Contamination, Convention Mismatch, and an Honest Baseline at 25.6% WER (13.8% cWER)

We present a systematic study of fine-tuning OpenAI's Whisper large-v3 for Swiss German ASR, using 1,367 hours of broadcast speech paired w…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

LEAF: Growing Trees Without Branching for Speech-Aware Large Language Model Post-Training

State-of-the-art GRPO-style methods for speech-aware large language model post-training suffer from coarse credit assignment, broadcasting…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

MIRAGE: Metadata-Integrated Repository Analysis and Guided Enhancement for MSR Datasets

This paper proposes an improved approach to the analysis of Mining Software Repositories (MSR) datasets via metadata enrichment, FAIRness a…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Position: Anthropomorphic Misalignment Research Needs Stronger Evidence

We argue that many Anthropomorphic Misalignment Research (AMR) studies need stronger evidence to ensure that they can provide a robust foun…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

Can You Trust What You See? Human and AI Detection of Synthetic Legal Evidence

Visual evidence has long been treated as a reliable form of legal proof, but advances in artificial intelligence (AI) are undermining that…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Structured Neuron Pruning in Deep Neural Networks Using Multi-Armed Bandits

Deep neural networks often contain redundant hidden units. Removing individual weights can reduce parameter count, but unstructured sparsit…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Item Response Scaling Laws: A Measurement Theory Approach for Efficient and Generalizable Neural Scaling Estimation

Scaling laws provide a fundamental framework for understanding the performance of Language Models (LMs), yet deriving them requires prohibi…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Query Lens: Interpreting Sparse Key-Value Features with Indirect Effects

While sparse autoencoders provide features more interpretable than individual neurons, reliably characterizing them remains challenging. We…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

ScaleSweep: Accurate NVFP4 Post-Training Quantization of LLMs via Block Scale Initialization

NVFP4 is a recently introduced hardware-supported FP4 format that improves the fidelity of 4-bit quantization through fine-grained block sc…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成エージェント

SENTRY: Statistical Reliability Analysis of Vision Transformers Under Soft Errors

With the growth of Vision Transformers in safety-critical domains like autonomous systems and medical imaging, ensuring their reliability a…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

HASA: Subnet Allocation for Compute-Constrained Model-Heterogeneous Federated Learning

Edge services increasingly use federated learning to personalize on-device models while keeping sensitive data local. In practice, deployme…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成エージェント

Eyes All Around: Design and Analysis of 360-Degree LiDAR Perception Using Equivariant Feature Learning in Unstructured Traffic

Perception in dense, unstructured urban traffic remains a major challenge for autonomous driving because of the wide variety of road users,…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Large Language Models Should Learn Personalized Rather Than Aggregated Human Preferences

Current approaches to aligning large language models (LLMs) aggregate diverse human preferences into a single reward signal, effectively op…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Active Learning with Foundation Model Priors: Efficient Learning under Class Imbalance

Real-world datasets across image and text domains are often characterized by skewed class distributions and noisy annotations, which jointl…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Trait-space Monitoring for Emergent Misalignment During Supervised Finetuning

Emergent misalignment (EM) occurs when narrow finetuning causes a model to behave dangerously outside the finetuning task. Standard trainin…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

AMN: An Adaptive Multi-Scale Fusion Network with Boundary and Uncertainty Modeling for Nuclei Segmentation

Accurate classification of nuclei subtypes in histopathology images is critical for downstream tasks including tumor grading, immune infilt…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

NeuroAlign: Hierarchical Multimodal Fusion of Dynamic and Structural Neuroimaging for MCI Analysis

Multimodal neuroimaging fusion of functional MRI (fMRI) and diffusion tensor imaging (DTI) provides complementary information for cognitive…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

Anchor-Conditioned Compositional Control for Landscape Image Generation

Image generative models, though widely used as creative tools, offer limited support for the kind of compositional control that photographe…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

MOSS-Video-Preview: Toward Real-Time Video Understanding via Cross-Attention

Video understanding is shifting from the offline paradigm -- taking a fully recorded video as input and producing a single answer after it…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

No Free Lunch for Synthetic Images under Data Scarcity Conditions

This study investigates the trade-offs between fidelity, privacy, and utility in synthetic data generation under conditions of data scarcit…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

AVI-Bench: Toward Human-like Audio-Visual Intelligence of Omni-MLLMs

Recent advances in Omni-Multimodal Large Language Models (Omni-MLLMs) have enabled strong integration of vision, audio, and language. Howev…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成エージェント

FineGen: A VLM-based Multi-Agent Framework for Fine-Grained Image-Text Dataset Construction

The scarcity of hard negative samples in current vision-language datasets significantly hinders fine-grained perception. To address this, w…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

DOME: Learning Transferable Domain Variables from Sparse Supervision for Test-Time Adaptation

Test-time adaptation (TTA) aims to align a model to shifting test domains using only unlabeled streaming data. Most existing methods implic…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

AQIFormer: A Transformer-Based Multi-View Architecture for Cross-City Air Quality Classification

Air pollution represents one of the most critical environmental and public health challenges globally, with traditional sensor-based monito…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成エージェント

ViMax: Agentic Video Generation

Long-form video generation requires systematic narrative planning and visual consistency that current short-clip methods cannot provide. Ex…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

A Dataset for Dynamic Human Preferences for Vision Language Models

Given the increased adoption of Vision Language Models (VLMs) in human-interactive settings, it is important that we evaluate how well thes…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

MM-Matryoshka: Towards Budget-Elastic Visual Document Retrieval via a 2D Multimodal Matryoshka Training Framework

Multi-vector visual document retrievers achieve strong fine-grained matching by representing each page with multiple vectors from deep Visi…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Seq103: A Unified Neuroevolution Framework for Compact Sequence Architecture Discovery

Neuroevolution is a representative neural architecture search paradigm that evolves both network topology and weights through evolutionary…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

AgentCompile: An LLM-Guided Compiler for Direct CUDA Inference

Transformer inference increasingly depends on specialized compiler and runtime support, but real model graphs still require semantic decisi…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

MemoVAD: Resource-Efficient Video Anomaly Detection via Dynamic Semantic Memory in Edge Computing Scenarios

Deploying Video Anomaly Detection (VAD) in real-world surveillance faces a fundamental tension between the demand for high-level semantics…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

Liquid Neural Networks as a Drop-in Continuous-Time Deformation Field for Dynamic 3D Gaussian Splatting

Deformable 3D Gaussian Splatting (D-3DGS) reconstructs dynamic scenes from monocular video by deforming a canonical set of 3D Gaussians th…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

A Hierarchical Feature Engineering Framework for Automated Classification of Phonotraumatic and Non-Phonotraumatic Vocal Hyperfunction

Ambulatory neck-surface acceleration enables non-invasive monitoring of vocal hyperfunction, yet robust biomarkers for its subtypes remain…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Single-Cell Cross-Modal Transfer by Adversarial Fine-Tuning of Foundation Models

Spatial transcriptomics (ST) is a powerful tool for exploring biological properties dependent on structure, proximity, and interaction in t…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

DOG-DPO: Dynamic Optimization in Geometry for Safety Alignment

Safety alignment for large language models relies on preference data, but current pipelines often train on large, redundant datasets. Exist…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Systematic LLM Translation of Legacy Scientific Code to Differentiable Frameworks: Application to a Land Surface Model

Differentiable programming offers transformative capabilities for scientific modeling, enabling gradient-based parameter estimation, sensit…

2026-06-09 13:00 JSTarXiv cs.AIエージェント

SWE-Marathon: Can Agents Autonomously Complete Ultra-Long-Horizon Software Work?

AI agents are increasingly expected to complete long-horizon workflows that require sustained progress over hours, millions of tokens, and…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Semantic Cache Distillation: Efficient State Transfer via Reuse and Selective Patching

Disaggregated serving alleviates memory bottlenecks in Large Language Model (LLM) inference but creates a severe communication bottleneck:…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Test-Time Adaptive Composition for Machine Learning as a Service (MLaaS) in IoT Environments

The dynamic nature of Internet of Things (IoT) environments affects the long-term effectiveness of Machine Learning as a Service (MLaaS) co…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Knowledge-Inclusive Adaptive Physics-Informed Neural Network for Microbial Interaction Modelling

Physics-Informed Neural Network (PINN) is a way of including knowledge in the form of equations in Machine Learning methods. Beyond equatio…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

What Makes Video World Model Latents Action-Relevant: Prediction over Reconstruction

Video world models are increasingly used to provide predictive visual representations, yet it remains unclear which pretraining signals ind…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

TRACER: Token ReAssignment for Concept ERasure in Generative Recommendation

Generative recommendation formulates next-item prediction as autoregressive generation over semantic ID (SID) sequences derived from users'…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

HARP: Efficient Data Selection for Finetuning Large Language Models

Finetuning data selection requires balancing two competing goals: selecting examples that improve the downstream objective, and doing so wi…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

BCG-FM: A Foundation Model for Ambient Cardiac Health Sensing

Foundation models for wearable biosignals have matched or exceeded supervised specialists across a range of clinical tasks, yet all rely on…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

DSFNet: Learning Dual-Domain Spectral Operators for Multi-Modality Spatio-Temporal Forecasting in Urban Transportation Systems

Multi-Modality Spatio-Temporal Forecasting (MoSTF) extends traditional spatio-temporal forecasting by incorporating diverse traffic modalit…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Adversarial Robustness of Activation Steering in Large Language Models

Activation steering has become a popular training-free method to control LLM behavior by injecting precomputed direction vectors into the m…

2026-06-09 13:00 JSTarXiv cs.AIエージェント研究/論文

TianJi-Environ: An Autonomous AI Scientist for Atmospheric Environmental Research

As atmospheric environmental prediction continues to improve, interpretable validation of pollution mechanisms and feedback processes has b…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Pharmacogenomic Knowledge Graph Augmentation for Graph Neural Network-Based Drug-Drug Interaction Prediction

Graph neural networks (GNNs) applied to drug-drug interaction (DDI) prediction rely exclusively on molecular structure encoded as SMILES-de…

2026-06-09 13:00 JSTarXiv cs.AIハードウェア/半導体

EssentialGIN: a new approach for gene essentiality prediction based on graph isomorphism neural networks

Background: Prediction of essential genes (proteins) is a basic and challenging problem but at the same time very costly and time-consumin…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

EvoCSFL: Surrogate-Assisted Evolutionary Client Selection for Efficient and Robust Federated Learning

The heterogeneity of client data and systems makes it difficult to achieve satisfactory convergence speed and robustness in federated learn…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

How Much Dense Attention is Necessary? Oracle-Guided Sparse Prefill for Full/GQA Layers in Hybrid Long-Context Models

Long-context prefill remains expensive because full/GQA layers still score the historical sequence, even in hybrid models with local, spars…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

FunctionEvolve: Structure-Guided Symbolic Regression with LLMs

Symbolic regression aims to uncover explicit scientific laws from data. Recent methods use LLMs to guide mutation from background text, whi…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

SAW: Stage-Aware Dynamic Weighting for Multi-Objective Reinforcement Learning in Large Language Models

Although multi-objective reinforcement learning (MORL) is central to aligning large language models with complex human preferences, the pre…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

MLingualFC: Evaluating Jailbreak Vulnerabilities in Multilingual Vision-Language Models

Vision-Language Models (VLMs) have demonstrated strong performance across multimodal tasks, yet their safety robustness remains an open cha…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成ロボティクス研究/論文

Cross-View Urban Traffic Dataset: Drone-Supervised Ground Truth for Monocular Bird's-Eye View Localization

We introduce a dataset and benchmark for cross-view urban traffic perception built from synchronized ego-centric bicycle videos and aerial…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

WhiFlash: Accelerating Speculative Decoding with Token-Level Cross-Paradigm Routing

The autoregressive nature of large language models (LLMs) remains a significant bottleneck for inference, particularly in complex agentic w…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Rosetta Memory: Adaptive Memory for Cross-LLM Agents

Memory is the key component for transforming a stateless LLM into a persistent, evolving agent through experience accumulation, long-horizo…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

MatMind: A Structure-Activity Knowledge-Driven Generative Foundation Model for Materials Science

Progress in AI-driven crystal materials science has so far been carried by narrow architectures purpose-built for individual tasks -- graph…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Attention at the Theoretical Minimum: A Mathematics of Arrays Framework for Memory-Optimal Transformer Kernels

The attention mechanism is the dominant computational bottleneck in modern transformer-based AI. Its standard implementation incurs quadrat…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Beyond Accuracy: Interpreting Topic Representation in Suicide Ideation Detection Models

Suicide ideation detection models are typically evaluated using aggregate performance metrics, yet little is known about how they internall…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

SHIELD-IDS: Structurally Heterogeneous Ensemble with Integrated Layered Defense for Intrusion Detection Systems

Adversarial attacks pose a serious and growing threat to Machine Learning (ML)-based Intrusion Detection Systems (IDS), where imperceptible…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

Multi-planar 2D-U-Net Segmentation of 3D-CT Abdominal Organs augmented by Spatial Occurrence Maps

This work proposes a lightweight 2D-U-Net-based framework for segmenting five abdominal organs in large field-of-view 3D CT scans. The meth…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

Quantum-Enhanced Similarity Measures for Polarimetric Materials Classification

We present a quantum-classical hybrid pipeline for polarimetric material classification that casts this as a point-matching problem. Voxel…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Beyond Point Estimates: Benchmarking Uncertainty Quantification Methods on the AION-1 Astronomical Foundation Model

Foundation models for astronomical surveys offer powerful learned representations that can be transferred to downstream regression tasks su…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Memetic Capture: A Pluralistic Policy Framework for Governing AI-Driven Cultural Disempowerment

Culture is the most insidious vector of gradual human disempowerment by AI: unlike economic or political displacement, cultural displacemen…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIハードウェア/半導体

SLMJury: Can Small Language Models Judge as Well as Large Ones?

Large language models (LLMs) are widely used as judges for evaluating model outputs, but their high cost, latency, and opacity limit scalab…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

The ACUTE Protocol: Operationalizing Language Model Activations for Better Calibration, Utility, and Trust

As language models improve and become increasingly deployed to solve a variety of tasks, trustworthiness becomes essential. Calibration is…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Jas: AI-Paired Engineering as a Revival of N-Version Programming

I report a case study in AI-paired software engineering: five working ports of a vector illustration application across Rust, Swift, OCaml,…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達

Beyond Pass/Fail: Using Process Mining to Understand How LLMs Resist (and Fail) Red Team Attacks

Standard AI red teaming evaluations reduce adversarial campaigns to a single binary outcome, attack success rate (ASR), not taking into acc…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Cherry-pick Override: Unsafe Directional Commitment in LLM Judges under Mixed Evidence

LLM judges increasingly turn verdicts into system commitments. Under mixed evidence (claims with both supporting and refuting sources) this…

2026-06-09 13:00 JSTarXiv cs.AIエージェント

Agentic multi-fidelity learning of quasiparticle and excitonic properties

Many-body GW-Bethe-Salpeter equation calculations are essential for accurate simulations of electronic structure and optical properties in…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Does Persona Make LLMs K-pop Fans? A Pilot Study of LLM-Based Online Concert Audience Agents

A concert is a collective experience, but recorded performance videos are typically watched alone, stripping away the shared audience prese…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Cost-Aware Speculative Execution for LLM-Agent Workflows: An Integrated Five-Dimension Method

LLM-agent workflows chain model calls and tool invocations, and spend most of their wall-clock time waiting on upstream operations before d…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達研究/論文

Beyond English benchmarks: clinical LLM evaluation in Brazilian Portuguese

Large Language Models are transforming the support for clinical decision and their application in real scenarios. Yet, most benchmarks are…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Model Multiplicity for Adversarial Detection in Small Language Model Training on Edge Devices

The rise of edge-based machine learning has enabled distributed adaptation of language models across mobile and IoT devices, offering priva…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

The Last Visible Pixel: Probing Fine-Scale Perception in Vision-Language Models

Recent vision-language models (VLMs) excel at multimodal understanding and reasoning, yet their fine-grained visual perception remains unde…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Instrumented data for causal scientific machine learning

Scientific machine learning is limited less by model size than by the data it is trained on. Observational data records what happened but n…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

The Cross-Architecture Substrate: A Domain-Transcendent, Calibration-Surviving Geometric Invariant of Modern Vision Encoders

Different vision neural networks -- trained to classify, contrast, reconstruct, or match images to text -- should have correspondingly diff…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Strained Coherence: A Pre-Failure Signal in Coding Agent Execution Trajectories

LLM-based coding agents sometimes acknowledge a problem in their own reasoning and then proceed anyway. We call this pattern strained coher…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

3D Oral Modelling with Improved Vertex Distribution Using Matching-Based Learning

In our previous work, a deep learning-based framework for 3D intraoral reconstruction was proposed. The model directly predicts explicit 3D…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Larch: Learned Query Optimization for Semantic Predicates

With the advent of Large Language Models (LLMs), many database systems introduced semantic operators that enabled analytical queries over u…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成研究/論文

Decoupling Semantics and Logic: A Training-Free Coarse-to-Fine Pipeline for Video Retrieval-Augmented Generation

This paper presents our system description for the 2nd Workshop on Multimodal Augmented Generation via MultimodAl Retrieval (MAGMaR). Addre…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達

Illusions of the Gold Standard: A Large-scale Analysis of Human Evaluation Protocols for Long-form Text Generation

Human evaluation plays a critical role in assessing the quality of generated text. However, the reliability and reproducibility of these ev…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

POISE: Position-Aware Undetectable Skill Injection on LLM Agents

Agent skills provide a lightweight mechanism for extending general-purpose agents, but their open format exposes them to skill-poisoning at…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

From 'May' to 'Is': Certainty Distortion in Language Model Rewriting

Humans increasingly turn to Language Models (LMs) in ways that shape beliefs and drive decisions, including discussing, rewriting, and summ…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Minibatch Selection via Partition Matroid Constrained Gradient Matching

Training large language models (LLMs) on heterogeneous data requires selecting minibatches that balance convergence speed with coverage acr…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

RecurGuard: Runtime Monitoring for Reasoning-Token Consumption Attacks

Reasoning-capable large language models can be induced to spend their generation budget on injected decoy tasks rather than answering the u…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Neutrality Bites: Gender Representation in AI-Generated Animal Stories

Gender bias in AI-generated stories is a well-documented problem. While much attention has been paid to reducing or mitigating this bias, i…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Defending Against Malicious Finetuning by Scaling Train-time Adversarial Attacks

Current open-weight large language models (LLMs) are prone to malicious finetuning attacks, which could compromise the safety alignment of…

2026-06-09 13:00 JSTarXiv cs.AIロボティクス

PRISM: PRior-guided Imagination Sampling in world Models

A learned world model provides a powerful physical intuition for evaluating future states. But its effectiveness in continuous control also…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

MC-PDD: Masked Corpus-Level Pretraining Data Detection for Black-Box Large Language Models

Pretraining is fundamental to the development of Large Language Models (LLMs), yet the opacity of pretraining data complicates model analys…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達

Enhancing AI Interpretability and Safety through Localised Architectures

Recent advances in generative AI, especially powerful Large Language Models (LLMs) and Large Reasoning Models (LRMs), raise concerns over t…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Summarization is Not Dead Yet

The progress of large language models (LLMs) has fueled claims that model-generated summaries rival or even surpass human-written reference…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Rewrite to Translate, Translate to Reward: Reinforcement Learning for Source Rewriting in Machine Translation

Although directly prompting off-the-shelf Large Language Models (LLMs) to generate meaning-preserving source rewrites can effectively enhan…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

GVC-Seg: Training-Free 3D Instance Segmentation via Geometric Visual Correspondence

Accurate 3D instance segmentation in point cloud data is critical for machine vision applications. Recent advancements leverage multiple pr…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成エージェント

IEA: Amateur-Friendly Conversational Image Editing Agent via Three Stages of Multitask Alignment

Current image editing software often hinges on fixed filters or expert tuning, leaving a gap between amateur users' intent and outcomes. Cr…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Repair Before Veto, When Repair Is Hidden: Quantum-Accessible Features for Repair-Augmented Constraint Learning

Hard-constraint decision systems usually veto infeasible candidates. This is too rigid when the system can act: if a known affordable repai…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Semantic Quorum Assurance: Collective Certification for Non-Deterministic AI Infrastructure

As large language model (LLM) agents are integrated into autonomous cloud operations, distributed systems face a semantic reliability probl…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

CausShield: Sample Reconstruction-Resilient Vertical FL via Causal Representation Learning

Vertical federated learning (VFL) is a distributed learning paradigm that leverages vertically partitioned features across isolated parties…

2026-06-09 13:00 JSTarXiv cs.AIエージェント

Voting Protocols as Coordination Mechanisms for Role-Constrained Multi-Agent Tutoring Systems

Agentic tutoring systems introduce a coordination challenge: multiple agents may propose different but reasonable interventions, yet only o…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成研究/論文

Sci-Rho: A Multilingual Visually-Grounded Symbolic Benchmark for STEM Problems

Symbolic benchmarks have emerged as a key approach to assess model robustness under minor modifications to STEM-related questions. However,…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

GIScholarBench: Benchmarking LLM Overconfidence in GIS Research

Large language models (LLMs) are increasingly used in academic research workflows, but scholarly tasks require high factual precision and t…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

SafeECGMatch: Calibration-Aware Joint Frequency and Time Space Semi-Supervised Learning for Open-Set ECG Classification

Electrocardiogram (ECG) classification models often suffer from severe label scarcity, making semi-supervised learning (SSL) an attractive…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達

When Behavioral Safety Evaluation Fails: A Representation-Level Perspective

Large Language Model (LLM) safety has often been evaluated at the behavior level, which provides limited evidence of internal robustness, a…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

What's the Point? Spatial Grammar & Index Resolution for Sign Language Processing

Sign language models are predominantly trained with gloss-sequence or text supervision, thereby under-modeling non-lexical and productive c…

2026-06-09 13:00 JSTarXiv cs.AIロボティクス

EgoAERO: Learning Dexterous Manipulation from a Single Egocentric Video without Object Assets

Egocentric RGB-D videos offer a natural source of human dexterous manipulation demonstrations, but existing data is difficult to use for ro…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

Robust-U1: Can MLLMs Self-Recover Corrupted Visual Content for Robust Understanding?

Multimodal Large Language Models (MLLMs) have demonstrated remarkable success in visual understanding, yet their performance degrades signi…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

"I understand your perspective": LLM Persuasion and Sycophancy through the Lens of Communicative Action Theory

Large Language Models (LLMs) can generate high-quality arguments, yet their ability to engage in nuanced and persuasive communicative actio…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Aligned but Not Partner-Specific: Distinguishing How Multimodal LLM Agents Succeed in Reference Games Without Human-Like Conventions

Repeated reference games test whether interlocutors replace their initially long descriptions with shorter, partner-specific conventions gr…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Fast LLM-Based Semantic Filtering: From a Unified Framework to an Adaptive Two-Phase Method

Evaluating a natural-language yes/no predicate over a document corpus under an accuracy target - the semantic filter - is a cornerstone of…

2026-06-09 13:00 JSTarXiv cs.AIロボティクスハードウェア/半導体

vla.cpp: A Unified Inference Runtime for Vision-Language-Action Models

Vision-Language-Action (VLA) policies are typically shipped as Python/PyTorch stacks that assume a workstation-class GPU, a mismatch for th…

2026-06-09 13:00 JSTarXiv cs.AIロボティクス

Continual Quadruped Robots Coordination via Semantic Skill Discovery

Multi-quadruped coordination has attracted increasing attention due to its enhanced payload capacity, broader contact coverage, and improve…

2026-06-09 13:00 JSTarXiv cs.AIロボティクス研究/論文

Ego-Pi: VLA Fine-Tuning for Ego-Centric Human and Robot Data

Robotics faces a fundamental challenge of data scarcity. Unlike language or vision research, there is no internet-scale dataset for robotic…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成研究/論文

Human-Centered Benchmarking of Driver Monitoring Models

Vision-based driver monitoring systems are increasingly deployed in safety-critical intelligent transportation settings, yet they are almos…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

LCAM: A Framework for Diagnosing Interactional Alignment Failures in Conversational AI

Conversational AI is increasingly used for advice, interpretation, reassurance, and decision support in contexts where users may be vulnera…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

LogNEO: A GPT-Neo Reinforcement Learning Framework for Accurate Real-Time Log Anomaly Detection

Detecting anomalies in large-scale system logs is critical for the reliability and security of modern computing infrastructure. We present…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

RAPID: Layer-Wise Redundancy-Aware Pruning and Importance-Driven Token Merging for Efficient ViT

Vision Transformers (ViTs) achieve strong performance but suffer from high computational costs due to quadratic self-attention complexity.…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Constrained Paraphrase Consistency for LLM Hallucination Detection

Large language models (LLMs) can generate factually inconsistent claims, motivating accurate and scalable hallucination detectors. Prior wo…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Explaining Data Mixing Scaling Laws

Recent research has established empirical scaling laws to predict model performance on multi-domain data mixtures. However, a theoretical u…

2026-06-09 13:00 JSTarXiv cs.AIエージェントビジネス/資金調達

Closing the Sim-to-Real Gap: An Evaluation Framework for Autonomous Cyber Defense Configuration of Commercial EDR

Leading commercial endpoint detection and response (EDR) products have shifted from operator-configured rule sets to multi-component system…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIロボティクス

CLASP: Language-Driven Robot Skill Selection and Composition using Task-Parameterized Learning

Enabling robots to understand and execute tasks from natural language commands while maintaining data efficiency remains challenging. Found…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

The Governance of Human-LLM Interaction: Safety Gating, Civility Steering, and Affective Default Lock-In

Large language models (LLMs) increasingly mediate high-stakes interactions in finance, medicine, and mental-health support, yet users have…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Frequency-Domain Latent Attention Gating for Cross-Domain Token Aggregation

Token aggregation is a common bottleneck in models that map token representations to sample-level predictions, yet most pooling methods ope…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達研究/論文

GlobeAudio: A Multilingual Multicultural Benchmark for Naturalistic Evaluation of Large Audio-Language Models

Large Audio-Language Models (LALMs) integrate audio perception and language understanding within a unified framework, enabling a wide range…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Beyond Additivity: Causal Discovery in Location-Scale Noise Models with Hidden Variables

We study causal discovery from observational data when some variables are hidden and the data-generating process follows a location-scale n…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

How Deep Are Deep GPs, Really? A Sharp Threshold and a Non-Gaussian Limit for Compositional GPs

Compositional priors describe the generic properties of layered functions in deep Bayesian models, where deep neural networks with random w…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

AeroSpectra Sentinel: An Auditable LLM Prompt-Chaining Decision-Support Workflow for Acute Asthma Risk Assessment from Respiratory Sounds and Clinical Signals

Acute asthma risk assessment requires rapid interpretation of respiratory sounds, oxygenation, airflow limitation, speech ability, work of…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Contemporary AI lacks the imagination to diverge or negate in science

Bold projections that artificial intelligence will accelerate scientific discovery have raced ahead of evidence from working scientists, an…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Post-AGI Economies: Superposition and the Second Fundamental Theorem of Welfare Economics

The classical Second Welfare Theorem decentralizes any Pareto efficient allocation through prices and transfers under convexity and regular…

2026-06-09 13:00 JSTarXiv cs.AIエージェント

An AI Security Agent for University ACMIS: Multi-Vector Threat Detection and Automated Response

University Academic Management Information Systems (ACMIS) are high-value targets for a wide spectrum of security threats including brute-f…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

AgriGov: A Structured Multilingual Dataset Curation for Indian Government Schemes for Farmers

AgriGov is a curated, trilingual (English-Hindi-Marathi) dataset designed to address the scarcity of domain-grounded multilingual resources…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Causal Agent Replay: Counterfactual Attribution for LLM-Agent Failures

When an LLM agent fails -- issues a refund it should not have, calls the wrong tool, leaks data -- existing tooling answers what happened (…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

"So There's a Catch-22 Here": How Early Adopters Who Build Multi-Agent LLM Systems Conceptualize Transparency

Multi-agent large language model (LLM) systems are rapidly emerging, yet transparency, a cornerstone of responsible AI, remains under-defin…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

Set-Based Transformer for Atmospheric Compensation in Standoff LWIR Hyperspectral Imaging

Passive long-wave infrared (LWIR) hyperspectral imaging under a standoff geometry depends on atmospheric absorption and emission, as well a…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Chiaroscuro Attention: Spending Compute in the Dark

Standard transformers apply self-attention uniformly at every layer and token, regardless of whether the input requires dynamic cross-token…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Generative Frontier Planning for Adaptive Peer-Referral Recruitment under Covariate-Dependent Arrivals

Peer-referral recruitment systems such as respondent-driven sampling are critical for studying and intervening on hidden populations affect…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

Self-Supervised Vision Transformers for CBCT-Based Detection of Temporomandibular Joint Osteoarthritis

Temporomandibular joint osteoarthritis (TMJ OA) is a prevalent degenerative condition whose osseous changes are often subtle on cone-beam C…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Pre-Intervention Prediction of Sparse Autoencoder Steering Side Effects

Sparse autoencoder (SAE) features are increasingly used to steer language models, but feature steering is rarely clean: the same interventi…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェントビジネス/資金調達

Emergence World: A Platform for Evaluating Long-Horizon Multi-Agent Autonomy

Most evaluations of LLM agents look like exams: a discrete task, a clean environment, a score in minutes or hours. We argue that this appro…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

An Information-Theoretic Definition for Open-Ended Learning

A growing body of work points to the great promise of AI systems that can continually expand their capabilities as they operate in an open-…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

RiskNet: A large-scale dataset of AI risk incidents from news with alignment and multi-dimensional annotations

As artificial intelligence (AI) systems are increasingly deployed across socially consequential domains, reports of AI-related harms and fa…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Auditing Proprietary Alignment in Large Language Models: A Comparative Framework Without a Ground-Truth Standard

Large language models (LLMs) are increasingly released and deployed through opaque development and deployment pipelines, enabling model pro…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

STAR-KV: Low-Rank KV Cache Compression via Soft Thresholding for Adaptive Rank Control

Low-rank projection has emerged as a promising approach for compressing the KV cache by exploiting hidden-dimension redundancy. However, pr…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

Impacts of Histories and Models on LLM Grading: A Study in Advanced Software Engineering Courses

Graduate-level research reading report assessment creates a substantial labor burden for educators. While large language models (LLMs) hold…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成エージェント

SceneConductor: 3D Scene Generation from Single Image with Multi-Agent Orchestration

Generating complete 3D scenes from a single image requires inferring globally consistent geometry, object relationships, and environmental…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Hiding in Plain Floats: Steganographic Carriers for Indirect Prompt and Content Injection

Text-centered prompt-injection defenses assume that the malicious signal is visible in one of the inspected text views. We study a reproduc…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

TimpaTeks: Automatic In-place Text Sequence Modification via Diffusion Language Model Steering

We extend activation steering to diffusion language models (DLMs) and study a novel problem that arose due to the inference mechanism of DL…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Provably Efficient Personalized Multi-Objective Bandits with Proactive Conversational Queries

Personalized decision-making in multi-objective bandits requires learning user-specific trade-offs among competing objectives. Since arm ut…

2026-06-09 13:00 JSTarXiv cs.AIロボティクス

PACT: Self-Evolving Physical Safety Alignment for Diffusion Policies in Embodied Manipulation

Diffusion policies have achieved remarkable success in robotic manipulation, yet they often fail to satisfy strict physical constraints req…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

CoVEBench: Can Video Editing Models Handle Complex Instructions?

While recent text-guided video editing models excel at elementary tasks (e.g., style transfer, object insertion), real-world user requests…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達

Hacking Generative Perplexity: Why Unconditional Text Evaluation Needs Distributional Metrics

Diffusion and continuous flow-based language models have emerged as the leading non-autoregressive alternatives to language modeling. Progr…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

AI Code Sandboxes: A Comparative Security Study. Part 1 of 2 -- Engine-Level Properties (Attack Surface, Leakage, Stackability, CVE History, Patch Cadence, Fuzzing)

This paper reads six engine-level measurements together -- 1.1 host attack surface, 1.2 information leakage, 1.3 defense-in-depth stackabil…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Segment-level Tree Search for Long Meeting Document Summarization

Meeting documents are challenging to summarize due to their length and complex conversational structure. Existing approaches typically adop…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Sparrow: Sparse Rollout for Stable and Efficient Long-context RL of Large Language Models

Despite being powerful, reinforcement learning with verifiable rewards (RLVR) induces extremely long CoT, making it computationally expensi…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Not Just After One: Sleep-Inspired Replay Prevents Catastrophic Forgetting After Sequential Tasks

One of the critical limitations of artificial neural networks is their lack of ability to continually learn: training on new tasks often le…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Sycophancy as a Multilingual Alignment Failure: How Safety Degrades Across Languages, Topics, and Models

Safety-aligned large language models often exhibit sycophancy, which is the tendency to affirm users' opinions regardless of factual accura…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

The Confidence Trap: Calibration Attacks for Graph Neural Networks

While confidence calibration is essential for trustworthy decision-making in safety-critical applications, the robustness of calibrated GNN…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

More Yap Less Meaning: Uncovering Self-Improvement Behavior in SLMs

Recently, language models have made rapid progress across various domains and applications. However, their capability for self-improvement,…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

FlashCP: Load-Balanced Communication-Efficient Context Parallelism for LLM Training

Context parallelism (CP) is essential for training large-scale, long-context language models, as it partitions sequences to reduce memory o…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Adaptive Loss Balancing for Noise-Robust GRPO in Generative Recommendation

Reinforcement learning (RL) presents a promising avenue for enhancing generative recommendation beyond supervised imitation, leveraging rew…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

PIPE-Cypher: Automatic Enterprise Benchmark Generation for Text-to-Cypher Systems

Enterprise property graphs vary widely in schema structure, internal terminology, domain assumptions, governance constraints, and user inte…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

STELLAR: Spatio-Temporal Environmental Learning with Latent Alignment and Refinement for Long-Tailed Species Distribution Modeling

Joint Species Distribution Modeling (JSDM) is a key enabler for biodiversity monitoring and conservation planning. However, accurate JSDM f…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

Seeing is Believing: Aligning Prompt Rewriting with Visual Anchors for Text-to-Image Generation

Despite the impressive capabilities of text-to-image (T2I) models, an intent-generation gap often persists due to the brevity and ambiguity…

2026-06-09 13:00 JSTarXiv cs.AIエージェント

Projecting the Emerging Mindset of SWE Agent by Launching a Wild Code Understanding Journey

Software engineering agents (SWE agents) increasingly work through tool-mediated trajectories in real repositories, yet their behavior rema…

2026-06-09 13:00 JSTarXiv cs.AIロボティクス

ActProbe: Action-Space Probe for Early Failure Detection of Generative Robot Policies

Generative robot policies fail unpredictably at deployment: they hesitate at critical moments, drift off-task, or commit to unrecoverable a…

2026-06-09 13:00 JSTarXiv cs.AIロボティクス研究/論文

GEAR-VLA: Learning Geometry-Aware Action Representations for Generalizable Robotic Manipulation

Vision-Language-Action (VLA) models achieve strong benchmark performance but still struggle in real-world deployment with unseen objects, b…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成ロボティクス

When Video Misreads: Closed-Loop Distillation of Reading Heuristics for Exploratory Manipulation Trace QA

Exploratory manipulation often turns an apparent failed attempt into the key evidence for what to do next. For example, a robot pulls a loc…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

EinSort: Sorting is All We Need for Tensorizing LLM

Tensor networks provide efficient representations for compressing large neural networks. By carefully designing shapes and topologies, they…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Calibration of Structured Ignorance Certificates for Diagnosing Unknown Unknowns in Reasoning Models

Large language models frequently fail in a characteristic way: rather than acknowledging ignorance, they produce fluent but incorrect answe…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Auditable Graph-Guided Root Cause Analysis for Kubernetes Incidents

Kubernetes incidents are diagnosed reliably only when a root-cause system's reported gains come from incident evidence rather than scenario…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Reinforcement Learning for Flow-Matching Policies with Density Transport

We present an online reinforcement learning (RL) algorithm for fine-tuning flow-matching policies in continuous-control problems. Our key i…

2026-06-09 13:00 JSTarXiv cs.AIエージェントロボティクス

HARBOR: A Harness Framework for Agentic Robot Reinforcement Learning

Reinforcement learning (RL) has become a powerful paradigm for robot learning, particularly in sim-to-real settings, but its broader adopti…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Tyan-WP: A Wind Power Foundation Model for Ultra-Short-Term Probabilistic Forecasting

Global wind power capacity, especially in China, is booming, with new farms spanning diverse terrains and climates. The industry urgently n…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

A retrieval conditioned rebinding circuit for dynamic entity tracking in large language models

To interpret context correctly and retrieve relevant information, large language models must bind entities to their attributes and update t…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Sample-Efficient LLM-Based Detection of Malicious Web Server Logs with Forensically Explainable Reasoning

Forensic analysis of web server logs demands both accurate detection and human-readable explanations that can satisfy legal requirements. W…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

Reconstructing Synthetic SDO/AIA 193 Å EUV Images from He I 10830 Å Observations with Diffusion Model Translator

Routine full-disk EUV imaging has been available only since the modern era, such as SOHO and SDO. To extend EUV coronal context into earlie…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成ロボティクス

FiberTune: Preserving Action-Fiber Visual Residuals in Vision-Language-Action Fine-Tuning

Action-supervised fine-tuning of vision-language-action (VLA) policies fits demonstrations effectively but constrains only the directions t…

2026-06-09 13:00 JSTarXiv cs.AIロボティクス

Latent Diffusion Policy: Shaping Latent Spaces for Diffusion-Based Robotic Manipulation

Diffusion-based visuomotor policies operating directly in raw action spaces conflate scene comprehension with trajectory generation within…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Data Agents Under Attack: Vulnerabilities in LLM-Driven Analytical Systems

Data agents integrate LLM-driven reasoning with relational data access, executable analytical tools, and multi-step workflow orchestration,…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

BioVid: Autoregressive Video Generation with Biological Behavior Semantic Comprehension

Existing video generation frameworks treat sequence duration as an externally prescribed parameter -- fixed frame counts or text prompts --…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Lost in the Flow with Code Talkers: Unveiling the Instruction-Tuning Tax of Large Language Models in Code Tasks

AI coding assistants have significantly improved developer productivity by automatically suggesting code that aligns with user intent, and…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達

Activation Steering Induces Emergent Misalignment: A More Comprehensive Evaluation

Activation steering has emerged as a popular inference-time technique for modulating the behavior of large language models (LLMs). By const…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Agentic Search for Counterfactual Recourse under Fixed LLM Budgets

Counterfactual recourse aims to provide actionable feature changes that would alter an unfavorable decision made by a predictive model. In…

2026-06-09 13:00 JSTarXiv cs.AIエージェント

Structuring agentic AI for HPC code modernization

Modernization of legacy scientific codes is often necessary to keep up with the ever-evolving changes in the compute resource ecosystem. Pa…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

SNR-ST-Mix: Sample-specific Neighborhood Regression Mixup for Augmented Spatial Transcriptomics Imputation with Deep Neural Network

Purpose: Spatial transcriptomics (ST) enables gene expression measurements within the tissue context. However, these measurements are often…

2026-06-09 13:00 JSTarXiv cs.AIロボティクス

Hybrid Neural Network and Conventional Controller Approach for Robust Control of Highly Unstable Systems: Application to Tilt-Rotor Control

Multirotors are widely used in applications ranging from surveillance to precision agriculture, yet conventional designs remain limited by…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Deep Active Re-Labeling: Toward Noise-Resilient Annotation Efficiency

While Deep Active Learning (DAL) effectively reduces human annotation costs, its efficacy is constrained by human annotation errors. This i…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

APEX4: Efficient Pure W4A4 LLM Inference via Intra-SM Compute Rebalancing

W4A4 quantization promises full utilization of INT4 Tensor Cores, yet group dequantization overhead on CUDA Cores has driven existing syste…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達

RadOT-Eval: Auditable Structured-Evidence Transport for Radiology Report Evaluation

Automatic evaluation is critical for high-stakes text generation, where errors often involve omitted findings, hallucinated content, polari…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成ハードウェア/半導体

TeamHerald@CHIPSAL 2026: Hate Speech Detection and Sentiment Analysis of Nepali Memes using Transformer-based Architectures and Ensemble Learning

The analysis of internet memes in the Nepali language is complicated by frequent code-mixing and a lack of established baseline resources.…

2026-06-09 13:00 JSTarXiv cs.AIロボティクス

Unifying Object-Centric World Models and Diffusion Policy: A Hierarchical Framework for Multi-Stage Robotic Tasks

Visual world models have shown great potential in learning complex system dynamics. Recent advancements leverage these models as transition…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

How Many Counterfactuals Does It Take? Probing VLM Hallucinations Through Circuits and Causal Effects

Visual Language Models (VLMs) are known to produce hallucinated predictions that are not grounded in visual evidence, yet existing approach…

2026-06-09 13:00 JSTarXiv cs.AIハードウェア/半導体ビジネス/資金調達

Evaluating AI Investment Strategies

We study the problem of auditing a black-box algorithmic decision-maker from observable inputs and outputs alone. Our main result is an exa…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

AI-Augmented Closed-Loop Quality Engineering: A Reference Architecture for Continuous Software Quality Intelligence

Software engineering quality remains a challenge because of disjointed processes between requirements, testing, and production, w…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Scaling Decision-Focused Learning to Large Problems with Lagrangian Decomposition

Decision-focused learning has shown great promise for addressing predict-then-optimize problems, particularly in the presence of under-spec…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Governance Controls for AI-Generated Test Artifacts in Autonomous Software Testing

Artificial Intelligence (AI) and Large Language Models (LLMs) are increasingly used in autonomous software testing; however, AI-generated t…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Knowledge Graphs and Reasoning LLMs for Finding Simple Yet Effective Transcriptomic Perturbation Predictors

Predicting the effect of an unseen gene knockout perturbation on transcriptomic gene expression remains a highly challenging problem for vi…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

BLM-SGAN: Bidirectional Language Modeling for Semantic-Spatial Text-to-Image Generation

Despite the success of image generation from text descriptions, it still faces challenges that are difficult to overcome in domains such as…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Intrinsic Selection and Particle Resampling for Inference-Time Scaling Beyond Domain Verifiability

Inference-Time Scaling (ITS) has largely succeeded in verifiable domains like math and coding, where cheap verification enables scalable ou…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

sGPO: Trading Inference FLOPs for Training Efficiency in RLVR

Standard Reinforcement Learning with Verifiable Rewards (RLVR) training allocates a fixed rollout budget to every query, without regard for…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

Intelligent Character Recognition of Handwritten Forms with Deep Neural Networks

The automatic processing of handwritten forms remains a challenging task, wherein detection and subsequent classification of handwritten ch…

2026-06-09 13:00 JSTarXiv cs.AIロボティクスビジネス/資金調達研究/論文

Benchmarking Vision-Language-Action Models on SO-101: Failure and Recovery Analysis

Vision-Language-Action (VLA) models have demonstrated strong generalization in robotic manipulation, yet existing evaluations are primarily…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Cheap Reward Hacking Detection

A small transformer encoder is trained to map Terminal-Wrench trajectories onto a unit sphere where embedding distance approximates the $L_…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成エージェントビジネス/資金調達

A multi-agent system for spine MRI report generation from multi-sequence imaging

Spinal pathology is a leading cause of pain and disability worldwide. Spine MRI is central to clinical evaluation, yet its interpretation r…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Few-shot Class-variable Incremental Audio Classification via Prototype Adaptation and Pseudo Class-variable Training

In the task of few-shot class-incremental audio classification, the number of classes is assumed to always increase without considering the…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

Failure-Aware Refinement of Vision-Language Model for Lithography Defect Detection

Semiconductor lithography inspection requires reliable detection of small pattern defects such as bridge, burr, pinch, and contamination. I…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

PolyBuild: An End-to-End Method for Polygonal Building Contour Extraction from High-Resolution Remote Sensing Images

Extracting building polygon contours from high-resolution remote sensing images is a fundamental task for various mapping applications. How…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

From Statute to Control Flow: Span-Grounded Deontic Trees for Defeasible Scope Parsing

Rule-following agents tasked with executing policies and regulations often fail via Silent Scope Omission (SSO): a model applies a general…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

PAI: Preserving Amplitude Information in Representation-Based Time-Series Anomaly Detection

Representation-based time-series anomaly detection algorithms significantly outperform other methods on diverse anomaly detection tasks. Ho…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Report on CHIIR 2026 Workshop on Generative AI and Academic Search (GAI&AS)

This report summarizes the CHIIR 2026 Workshop on Generative AI and Academic Search (GAI&AS), which examined how GenAI is reshaping academ…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

PACT: Learning Diverse Diagnostic Strategies via Privileged Synthesis and Branch Consensus

Clinical diagnosis requires flexible use of multiple reasoning paradigms under incomplete patient information. Existing LLM-based medical a…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

NutriMLLM: Multimodal Large Language Models for Dietary Micronutrient Analysis

Comprehensive estimation of dietary micronutrients from food images could improve clinical nutrition care, but training such models require…

2026-06-09 13:00 JSTarXiv cs.AIエージェント研究/論文

Hardening Agent Benchmarks with Adversarial Hacker-Fixer Loops

Agent benchmarks score submissions with outcome verifiers that are typically hand-written and brittle, leaving them open to reward hacking.…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIハードウェア/半導体

CARE: A Conformal Safety Layer for Medical Summarization

Large language models (LLMs) are increasingly used for medical summarization, but their outputs can omit medically important information an…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成エージェントロボティクス

SpaceVLN: A Zero-Shot Vision-and-Language Navigation Agent with Online Spatial Cognitive Memory and Reasoning

Vision-and-Language Navigation in continuous environments requires agents to understand the spatial structure of previously unseen environm…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Sustainability and Artificial Intelligence: Necessary, Challenging, and Promising Intersections

Both digital economy and digital technology researchers increasingly recognize the need to better address the role that artificial intellig…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Understanding Quantization-Aware Training: Gradients at Quantized Weights Bias to the Low-Loss Basin

Post-training quantization (PTQ) converts a trained full-precision model into low-bit weights without task-level retraining, while quantiza…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

TLDR: Compressing Audio Tokens for Efficient Autoregressive Text-to-Speech

Codec-based autoregressive (AR) speech language models have achieved strong text-to-speech (TTS) quality by modeling speech as sequences of…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

SafeRun: Enabling Determinism in LLM Planning for Running

Large Language Models enable flexible natural-language planning but remain unreliable in determinism-critical domains due to their probabil…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成ロボティクス

ATM: Action-Consistency Transfer Matrix for Diagnosing and Improving Latent World Models

Latent world models are increasingly used for control and goal-conditioned planning, yet assessing whether their learned representations ar…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

TRIAGE: Dialectical Reasoning for Explainable Risk Prediction on Irregularly Sampled Medical Time Series with LLMs

Clinical early warning systems built on electronic health records, in which clinical observations are recorded as irregularly sampled medic…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

BareWave: Waveform-Native Flow-Matching Text-to-Speech

Removing intermediate representations and separately trained decoding stages has become an important direction in generative modeling. In t…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

INFUSER: Influence-Guided Self-Evolution Improves Reasoning

Self-evolution offers a scalable path to stronger reasoning: a pretrained language model improves itself with only minimal external supervi…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

Stage-1 Controls the Entropy Regime, Not the Outcome

Two-stage post-training -- a Stage-1 warm-start (supervised fine-tuning, SFT, or on-policy distillation, OPD) followed by Stage-2 reinforce…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

See More, Think Deeper: Query-Expanded Visual Evidence and Answer-Clue Guided Reflection for Long Video Understanding

Recent advances in Video Large Language Models (Video-LLMs) have enabled performance on long-video understanding tasks. However, existing m…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

OnlyDense: Reduced-Order Modeling for Lagrangian simulation

In science and engineering, Lagrangian simulation methods such as Smoothed Particle Hydrodynamics (SPH) or Material Point Method (MPM) are of…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

A Unifying Lens on Reward Uncertainty in RLHF

Reinforcement learning from human feedback (RLHF) is bottlenecked by reward hacking, where the policy exploits errors in a proxy rew…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIハードウェア/半導体

FlashMemory-DeepSeek-V4: Lightning Index Ultra-Long Context via Lookahead Sparse Attention

Conventional LLMs keep the full KV cache loaded during decoding, causing a severe GPU memory bottleneck for ultra-long context serving. In…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Context-Fractured Decomposition Attacks on Tool-Using LLM Agents: Exploiting Artifact Provenance Gaps

Tool-using LLM agents interact with the world through actions that persist state in artifacts (e.g., workspace files or logs). Consequently…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Context Rot in AI-Assisted Software Development: Repurposing Documentation Consistency for AI Configuration Artifacts

Developers increasingly provide AI coding assistants with persistent context through configuration files such as CLAUDE.md, AGENTS.md, and…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Addressing Market Regime Changes and Heavy-Tailed Returns in Portfolio Optimization via Bayesian VAR and Elliptical Black-Litterman

Deep reinforcement learning (DRL) frameworks for portfolio optimization have shown promise for their ability to learn allocation rules dyna…

2026-06-09 13:00 JSTarXiv cs.AIハードウェア/半導体

Hybridizing Equilibrium Propagation with Ising Machines for Efficient Energy-Based Learning

The rapid evolution of artificial intelligence has led to substantial advances in deep neural networks. Nonetheless, conventional GPU-based…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Optimizing Energy-based Neural Network Training with Coherent Ising Machine

While Ising machines serve as advanced physical solvers for the Ising model, enabling applications in combinatorial optimization and neural…

2026-06-09 13:00 JSTarXiv cs.AIエージェント

Autonomous Incident Resolution at Hyperscale: An Agentic AI Architecture for Network Operations

Cloud network infrastructure at hyperscale presents unique operational challenges where traditional human-driven incident response cannot k…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

An Enhanced Geometric-Spectral Feature Learning Framework for Airborne Multispectral Point Cloud Classification

Multispectral point cloud (MPC) is composed of 3D spatial-spectral information, which holds tremendous potential for accurate land-cover cl…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Unveiling Privacy Risks in Multi-modal Large Language Models: Task-specific Vulnerabilities and Mitigation Challenges

Privacy risks in text-only Large Language Models (LLMs) are well studied, particularly their tendency to memorize and leak sensitive inform…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成ロボティクス

From USD Scenes to Knowledge Graphs: Zero-Shot Ontology Grounding with LLMs

Constructing knowledge graphs from 3D simulation scenes is essential for robot task reasoning, but the key bottleneck, grounding scene obje…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Steganography Without Modification: Hidden Communication via LLM Seeds

We demonstrate that widely deployed Large Language Model (LLM) inference stacks harbor a steganographic channel that requires no modificati…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

Decoding Pedestrian Crossing Intention from Egocentric Vision via Vision Language Models

Egocentric vision offers a first-person view of human perception and decision making, yet its potential for traffic-safety prediction remai…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達研究/論文

SEF-CLGC at SemEval-2026 Task 11: Logical Notation Impact on Language Model Performance

This paper revisits our pipeline called Syllogistic Evaluation Framework-Common Logic Grammar Construction (SEF-CLGC). We combine formal lo…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Unified Energy for Invariant and Independent Decoding in Diffusion Language Models

Diffusion Language Models (DLMs) enable parallel text generation by iteratively denoising a full sequence, offering attractive flexibility…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Crop Recommendation and Agricultural Query Answering System Using Spatio-Temporal Graph Neural Networks and Hybrid Retrieval Augmentation

This paper presents a unified system designed to support precision agriculture by integrating advanced weather prediction, crop recommendat…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

CANS: Accelerating Multiuser Collaborative Edge Inference via Cooperative Autodidactic NeuroSurgeon

Recently, mobile edge computing (MEC)-enabled collaborative deep neural network (DNN) inference has emerged as a promising approach for del…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達研究/論文

Culturally-Adapted Red-Teaming Across East and Southeast Asian Contexts: A Methodological and Comparative Analysis

Multilingual safety evaluation of large language models (LLMs) has predominantly relied on direct translation (DT) of English benchmarks in…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Pretrained, Frozen, Still Leaking: Auditing Cross-Encoder Attribute Transfer in EEG Foundation Models

EEG foundation-model releases are usually audited one endpoint at a time: raw-reconstruction, membership inference, identity linkage, or DP…

2026-06-09 13:00 JSTarXiv cs.AIハードウェア/半導体

Resource-aware Computation-Communication Overlap for multi-GPU ML Workloads

The rapid growth of large-scale machine learning (ML) has made distributed training across multiple GPUs a fundamental component of modern…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Trustworthy Smart Fabs via Professional Proxies: Scaling Safe and Sustainable by Design (SSbD) through Industrial Data Spaces

The convergence of the 2026 European Union Safe and Sustainable by Design (SSbD) framework, Corporate Sustainability Due Diligence Directiv…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

End-to-End Training for Discrete Token LLM based TTS System

Recent state-of-the-art (SOTA) text-to-speech (TTS) systems typically adopt a cascaded pipeline consisting of a speech tokenizer, an autore…

2026-06-09 13:00 JSTarXiv cs.AIエージェントロボティクス

Self-Paced Curriculum Reinforcement Learning for Autonomous Superbike Racing in Simulation

Autonomous Racing has seen remarkable progress through deep Reinforcement Learning (RL), primarily for four-wheeled vehicles. However, moto…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成ロボティクス

EgoTactile: Learning Grasp Pressure for Everyday Objects from Egocentric Video

Estimating full-hand grasp pressure from egocentric video is critical for immersive VR and robotic manipulation, yet dense tactile sensing…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

Proposal Refinement for Few-Shot Object Detection

Few-shot object detection has gained wide attention in recent years. Some excellent algorithms have been proposed to handle this task. Ho…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

BSTabDiff: Block-Subunit Diffusion Priors for High-Dimensional Tabular Data Generation

High-Dimensional Low-Sample Size (HDLSS) tabular domains (e.g., omics) are characterized by $n \ll m$, where $n$ = number of samples, and $…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Physics-Guided Sequence-Based Generative Framework for Acoustic Metamaterial Inverse Design

Acoustic metamaterial (AMM) inverse design is particularly challenging for broadband target responses due to acoustic dispersion: a structu…

2026-06-09 13:00 JSTarXiv cs.AIハードウェア/半導体

Internalizing Geometric Law: Learning from Solver Residuals for Precision-Critical Generation

Large Language Models frequently hallucinate in precision-critical domains such as technical diagramming and mechanical design, where outpu…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Brain-Prompt Injection: A Route-Safety Audit for BCI-LLM Agents

BCI-to-agent pipelines turn decoded neural activity into an authorization channel for tool-use agents, exposing a new attack surface we cal…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

A Universal Dense Football Event Representation Based on TabTransformer

Football event data constitute a rich spatiotemporal source for quantitative analysis of player actions in team sports. These datasets cont…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Conan-embedding-v3: Fusing Modality-Specific Models for Omni-Modal Embedding

Omni-modal retrieval promises a single embedding space for text, image, video, document, and audio inputs, but building such a unified retr…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

Beyond Humans: Multispecies Animal Face Recognition Using Transfer Learning

Individual animal recognition can be useful in the search for lost or stolen pets, the tracking of individuals of endangered species, and t…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成

PhysScene: A Scene Graph Dataset for Scientific Visual Reasoning in Physics Experiments

Scene Graphs (SGs) provide structured representations of visual scenes by modeling objects and their pairwise relationships. Despite recent…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Scaling Neural Network Verification with Tensor Parallelism and Fully Sharded Data Parallelism

Formal neural network verification -- proving that a network satisfies safety properties for all inputs in a specified domain -- is…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

Reasoning Arena: Trace Tournaments When Verifiable Rewards Fall Short

Reinforcement learning with verifiable rewards (RLVR) has become a leading paradigm for improving the reasoning ability of large language m…

2026-06-09 13:00 JSTarXiv cs.AI画像/動画生成ロボティクス

Real-time body pose non-verbal communication with a consistency-based reliability measure

Body movement communicates intent at distances and in conditions where neither the face, nor speech can be captured. We study the recogniti…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

SAILS: Surrogate-based Analysis of Interactions via Local Effect Smooths

Feature interactions drive much of the predictive power of machine learning models, yet existing explanation methods only detect and quanti…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Can Data Work be Reparative?

We present an ethnographic study of an alternative approach to data work, developed by a civic-tech initiative that builds datasets for tra…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

AI Assurance in UK Defence: Challenges in Operationalising JSP 936

This report examines practical challenges in operationalising JSP 936 Part 1 for AI assurance in UK Defence. Using a structured interpretiv…

2026-06-09 13:00 JSTarXiv cs.AIロボティクス

Harness Engineering for Physical AI: Robot Middleware Is the Harness Layer

Robot middleware faces a new role in the era of Physical AI. Learned policies, planners, and vision-language-action (VLA) models now enter…

2026-06-09 13:00 JSTarXiv cs.AI研究/論文

Context-Aware Deep Learning for Defect Classification in Atomic-Resolution STEM

Artificial intelligence is rapidly advancing materials characterization, yet most applications in electron microscopy rely solely on image…

2026-06-09 13:00 JSTarXiv cs.AIエージェント

LargeMonitor: Monitoring Online Task-Free Continual Learning via Large Pretrained Models

Online task-free continual learning (TFCL) requires intelligent agents to sequentially accumulate knowledge from an unbounded, non-stationa…

2026-06-09 13:00 JSTarXiv cs.AILLM/生成AI

A Finetuned SpeechLLM for Joint Multi-Granular L2 Assessment and Natural-Language Rationales

Automated L2 speech assessment can assign proficiency labels, but often lacks interpretability. We propose a rubric-guided SpeechLLM for mu…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Memory Beyond Recall: A Dual-Process Cognitive Memory System for Self-Evolving LLM Agents

Long-term memory for an LLM agent is more than retrieving the right passage at the right time. Current memory systems collapse belief revis…

2026-06-09 13:00 JST | arXiv cs.AI | Robotics

Targeting World Models to Compromise Robot Learning Pipelines

World models have recently seen a rapid growth in both their popularity and capability as more data efficient tools for generating robot tr…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Closing the Prior-Posterior Loop: Self-Reflective Molecular Design with Analysis-Driven LLM Iteration

Can a general-purpose large language model design molecules with the precision of a seasoned chemist? Current LLM-based frameworks answer t…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Emergence of Context Characteristics Sensitivity in Large Language Models

During instruction fine-tuning (IFT), large language models (LLMs) learn to follow instructions by using the provided context to answer a q…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Model Poisoning Against Federated Model Adaptation with Chain of Bit-Flips

Federated Learning (FL) allows a set of clients to collectively train a global model without sharing local training data. Giving the respon…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

SecureClaw: Clawing Back Control of LLM Agents

Tool-using large language model (LLM) agents face two distinct security failures: unauthorized external actions and exposure of sensitive p…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

FuseFSS: Efficient Secure LLM Inference with Function Secret Sharing

Two-server secure inference allows a client to query a hosted large language model (LLM) without revealing prompts or embeddings. Recent GP…

2026-06-09 13:00 JST | arXiv cs.AI | Robotics

Safe-RULE: Safe Reinforcement UnLEarning

Offline safe reinforcement learning (Safe RL) enables policy learning without online interactions, making it suitable for safety-critical s…

2026-06-09 13:00 JST | arXiv cs.AI | Robotics

CT-VAM: A Cerebello-Thalamic-Inspired Vision-Action Model for Efficient Visuomotor Control

Vision-language-action models have shown strong promise for robot manipulation, yet raw language is primarily needed to specify task intent…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Seeing the Hivemind: A Consensus-Aware Interaction Technique for Mitigating AI Homogenization

People are increasingly using AI for creative tasks such as writing. While adoption continues to grow, this form of use risks undermining i…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

I Was Scrolling and Then I Saw a Pregnant Strawberry

AI minidramas (also known as fruit dramas) are short, algorithmically distributed generative AI video series featuring anthropomorphized ch…

2026-06-09 13:00 JST | arXiv cs.AI | Hardware/Semiconductors

Closure-Validated Circuit Discovery in Attention Heads: Co-activation Proposes, Ablation Disposes

Interpretability increasingly treats groups of components, not individual units, as the basic object, and proposes to find them by clusteri…

2026-06-09 13:00 JST | arXiv cs.AI | Agents, Robotics

Shape Formation for the Cooperative Transportation of Arbitrary Objects Using Multi-Agent Reinforcement Learning

Cooperative object transportation is essential in numerous domains, from industrial to domestic services. A popular transportation str…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

AGENTSERVESIM: A Hardware-aware Simulator for Multi-Turn LLM Agent Serving

Multi-turn LLM agents interleave model calls with external tool invocations, shifting serving from stateless request processing to stateful…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Powering the Future of AI: Navigating the Trade-offs for Europe's Energy Transition and Net-Zero Goals

The rapid expansion of AI globally has led to the proliferation of energy-intensive hyperscale data centres (DCs), making them as a structu…

2026-06-09 13:00 JST | arXiv cs.AI | Robotics

ReCoVLA: VLM-Guided Reward Compilation for Failure Recovery in Vision-Language-Action Policies

Vision-language-action (VLA) policies provide strong priors for language-conditioned manipulation, but remain brittle in off-nominal states…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

ATN3D: Density-Aware LiDAR-Radar Early 3D Object Detection Under Extreme Sparsity

3D object detection is the backbone of perception for automated vehicles (AV) and broader intelligent transportation systems applications.…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

FMplex: Model Virtualization for Serving Extensible Foundation Models

Foundation models (FMs) are increasingly used as backbones for downstream tasks across language, vision, time-series, and multimodal applic…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

Do Video Foundation Models Understand Intuitive Physics? A Layerwise Probing Analysis

We study whether pretrained video foundation models encode intuitive-physics information in their frozen representations, and how this info…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

ArtiFact: A Large-Scale Multi-Modal Cultural Heritage Dataset

Multi-modal data management has emerged as a central research topic in the database community, spanning data integration, semantic query pr…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Muon Learns More Robust and Transferable Features than Adam

Muon has recently emerged as a state-of-the-art optimizer for pretraining Large Language Models (LLMs) and vision classifiers. Despite its…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

End-to-End Context Compression at Scale

Long-context language model inference is bottlenecked by memory, as the KV cache grows with context length. Recent techniques to compress t…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

Visual Prompting Meets Feature Reconstruction-Based Anomaly Detection with Dual-Teacher Supervision

Recent Anomaly Detection methods achieve perfect detection and segmentation scores on well-established datasets, such as MVTec. However, ma…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Transition-Based Digital Twin Modelling for Alzheimer's Disease under Sparse Longitudinal Data

Alzheimer's disease (AD) progression is highly heterogeneous and is typically observed through sparse and irregular longitudinal data, posi…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

MeCo: One-Step MeanFlow-based Corrector for Multi-Channel Speech Separation

While discriminative models for multi-channel speech separation excel in reference-based metrics, they often exhibit suboptimal human liste…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

An 84-Format Numeric Catalog with Bit-Exact Conformance Vectors: A Vendor-Neutral Reference for FP8, BF16, MXFP4, and Microscaling Formats

Numeric format proliferation in machine learning hardware -- FP8 (E4M3 and E5M2), BF16, MXFP4, microscaling block formats, and dozens of re…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

Observability for Delegated Execution in Agentic AI Systems

Delegation-scoped execution is not identifiable from standard observables: audit logs and execution traces can be identical under multiple…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Learning to Attack and Defend: Adaptive Red Teaming of Language Models via GRPO

AI red teaming must continually adapt to evolving attackers and defenders. Reinforcement learning offers a promising approach to discoverin…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

Hybrid Robustness Verification for Spatio-Temporal Neural Networks

With AI increasingly deployed in safety-critical systems, providing formal robustness guarantees for the underlying models is essential. Ex…

2026-06-09 13:00 JST | arXiv cs.AI | Robotics

Difference-Aware Retrieval Policies for Imitation Learning

Parametric imitation learning via behavior cloning can suffer from poor generalization to out-of-distribution states due to compounding err…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Preserving Plasticity in Continual Learning via Dynamical Isometry

Continual training of deep neural networks under non-stationarity often leads to a progressive loss of plasticity, eventually limiting furt…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Data Synthesis and Parameter-Efficient Fine-Tuning for Low-Resource NMT: A Case Study on Q'eqchi' Mayan

Neural machine translation for digitally low-resource Indigenous languages is often hindered by extreme data scarcity, prompting reliance o…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Who Earns the Safety? Intervention-Aware Quantum Predictive Control with Safety Attribution

Hard safety filters are increasingly placed downstream of learned controllers to guarantee constraint satisfaction at run time. Yet a filte…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

FASE: Fast Adaptive Semantic Entropy for Code Quality

Multi-agent code generation offers a promising paradigm for autonomous software development by simulating the human software engineering li…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Bandits for Efficient Experimentation: Adapting to Control Group, Preferences, and Context Drifts

We consider a variant of the linear contextual stochastic multi-armed bandits, where the learner must provide recommendations to a group of…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Topological Neural Operators

We introduce Topological Neural Operators (TNOs), a principled framework for operator learning on cell complexes that lifts neural operator…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation, Robotics

AHA-WAM: Asynchronous Horizon-Adaptive World-Action Modeling with Observation-Guided Context Routing

World-action models have emerged as a promising paradigm for robot manipulation, jointly modeling visual scene dynamics and actions to inje…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

PTL-Diffusion: Manifold-Aware Diffusion with Periodic Terminal Laws

Standard diffusion models typically use a single time-homogeneous Gaussian terminal distribution as the reference law for generation. While…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

An Agency-Transferring Model-Free Policy Enhancement Technique

Training reinforcement learning (RL) policies from scratch is costly: it requires careful reward and environment design, extensive tuning,…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents, Research/Papers

OmniGameArena: A Unified UE5 Benchmark for VLM Game Agents with Improvement Dynamics

Vision-language model (VLM) agents are increasingly deployed in interactive game environments. Yet game benchmarks for VLM agents typically…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

A Survey on Large Language Model-Based Game Agents

Game environments provide rich, controllable settings that simulate many aspects of real-world complexity. As such, game agents offer a va…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

TQA-Bench: Evaluating LLMs for Multi-Table Question Answering

The advance of large language models (LLMs) has unlocked great opportunities in complex multi-modal data management tasks, particularly in…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

IDEQ -- Improving Diffusion Models for the Traveling Salesman Problem (TSP) by Leveraging the Structure of the Solution Space

We investigate diffusion models to solve the Traveling Salesman Problem. Building on the recent DIFUSCO and T2TCO approaches, we propose ID…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation, Robotics, Research/Papers

HA-VLN 2.0: An Open Benchmark and Leaderboard for Human-Aware Navigation in Discrete and Continuous Environments with Dynamic Multi-Human Interactions

Vision-and-Language Navigation (VLN) has been studied mainly in either discrete or continuous spaces, with little attention to dynamic, cro…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Can Global XAI Methods Reveal Injected Behaviours in LLMs? SHAP vs Rule Extraction vs RuleSHAP

Large language models (LLMs) can amplify misinformation, undermining societal goals such as the UN SDGs. We study three documented drivers…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents, Research/Papers

FieldWorkArena: Agentic AI Benchmark for Real Field Work Tasks

This paper introduces FieldWorkArena, a benchmark for agentic AI targeting real-world field work. With the recent increase in demand for ag…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Modeling the Diachronic Evolution of Legal Norms: An LRMoo-Based, Component-Level, Event-Centric Approach to Legal Knowledge Graphs

Representing the temporal evolution of legal norms is a critical challenge for automated processing. While foundational frameworks exist, t…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Sound and Complete Neurosymbolic Reasoning with LLM-Grounded Interpretations

Large language models (LLMs) have demonstrated impressive capabilities in natural language understanding and generation, but exhibit proble…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Discovering heuristics in a complex SAT solver with large language models

The Satisfiability problem (SAT) is fundamental in computational complexity theory and has a wide range of industrial applications. Optimiz…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

CLPO: Curriculum Learning meets Policy Optimization for LLM Reasoning

Online reinforcement learning with verifiable rewards (RLVR) has become an effective paradigm for improving the reasoning abilities of larg…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

MixReasoning: Switching Modes to Think

Reasoning models enhance performance by tackling problems in a step-by-step manner, decomposing them into sub-problems and exploring long c…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

MatSciBench: Benchmarking the Reasoning Ability of Large Language Models in Materials Science

Large Language Models have shown strong scientific reasoning ability, but their performance on materials science problems remains less stud…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

AlphaOPT: Formulating Optimization Programs with Self-Improving LLM Experience Library

Optimization modeling underlies critical decision-making across industries, yet remains difficult to automate: natural-language problem des…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

TempoBench: Evaluating Temporal Causal Reasoning in Large Language Models

Temporal reasoning involves understanding how systems evolve over time through input-driven state transitions. A key aspect is temporal cau…

2026-06-09 13:00 JST | arXiv cs.AI | Agents, Robotics

QuickLAP: Quick Language-Action Preference Learning for Semi-Autonomous Agents

Robots must learn from both what people do and what they say, but either modality alone is often incomplete: physical corrections are groun…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

Knowing How to Edit: Reliable Evaluation Signals for Diagnosing and Optimizing Prompts at Query Level

Prompt optimization has become a central mechanism for eliciting strong performance from LLMs, and recent work has made substantial progres…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

A Geometric Unification of Concept Learning with Concept Cones

Two traditions of interpretability have evolved side by side but seldom spoken to each other: Concept Bottleneck Models (CBMs), which presc…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

A Geometric Theory of Cognition for Machine Intelligence

Developing artificial agents that unify representation, memory, adaptation, and prediction remains a fundamental challenge in artificial in…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

MAR: Multi-Agent Reflexion Improves Reasoning Abilities in LLMs

LLMs have shown the capacity to improve their performance on reasoning tasks through reflecting on their mistakes, and acting with these re…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

ReTreVal: Reasoning Tree with Validation and Cross-Problem Memory for Large Language Models

Every existing inference-time reasoning framework discards all failure context at problem boundaries, leaving a model solving problem 500 n…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Thinking-Based Non-Thinking: Solving the Reward Hacking Problem in Training Hybrid Reasoning Models via Reinforcement Learning

Large reasoning models (LRMs) have attracted much attention due to their exceptional performance. However, their performance mainly stems f…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Dynamic Distributed Constraint Optimization and Metareasoning for Continual, Large-Scale Satellite Operations

As Earth-observing satellite constellations grow in size and capability, distributed onboard control offers a pathway to novel responses an…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Payoff scaling shapes cooperation in LLM agents across languages

Large language models (LLMs) are increasingly deployed as autonomous agents that negotiate, coordinate, and act on behalf of users. Whether…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Language-based Trial and Error Falls Behind in the Era of Experience

While Large Language Models (LLMs) excel in language-based agentic tasks, their applicability to unseen, nonlinguistic environments (e.g.,…

2026-06-09 13:00 JST | arXiv cs.AI | Agents, Research/Papers

TAME: A Trustworthy Test-Time Evolution of Agent Memory with Systematic Benchmarking

Test-time evolution of agent memory represents a pivotal paradigm for advancing AGI, as it strengthens complex reasoning through experience…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

Weak-Driven Learning: How Weak Agents make Strong Agents Stronger

As post-training optimization becomes central to improving large language models, we observe a persistent saturation bottleneck: once model…

2026-06-09 13:00 JST | arXiv cs.AI | Agents, Research/Papers

Web Agents Should Use Typed Actions Instead of Click-Based Browsing

This position paper argues that building a reliable agentic Web requires shifting from low-level interaction primitives to typed actions su…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents

NoRD: A Data-Efficient Vision-Language-Action Model that Drives without Reasoning

Vision-Language-Action (VLA) models are advancing autonomous driving by replacing modular pipelines with unified end-to-end architectures.…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

2-Step Agent: A Framework for the Interaction of a Decision Maker with AI Decision Support

Predictions from ML models support human decision making in several fields, including high-stakes ones such as healthcare and the judiciary…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

An Alternative Trajectory for Generative AI

The generative artificial intelligence (AI) ecosystem is undergoing rapid transformations that threaten its sustainability. As models trans…

2026-06-09 13:00 JST | arXiv cs.AI | Agents, Hardware/Semiconductors

IRAM-Omega-Q: A Computational Framework for Uncertainty Regulation in Adaptive Agents

Adaptive agents operating under uncertainty must do more than optimize task outputs: they must maintain a workable internal state under noi…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Reflection in the Dark: Exposing and Escaping the Black Box in Reflective Prompt Optimization

Automatic prompt optimization (APO) has emerged as a powerful paradigm for improving LLM performance without manual prompt engineering. Ref…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Counterfactual Credit Policy Optimization for Multi-Agent Collaboration

Collaborative multi-agent large language models (LLMs) can solve complex reasoning tasks by decomposing roles, but reinforcement learning f…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Signals Are Not States: Neuro-Symbolic Safeguards for Culturally Aware Classroom AI

Classroom AI systems increasingly infer high-level educational states such as engagement, confusion, collaboration, participation, and inst…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

MC-CPO: Mastery-Conditioned Constrained Policy Optimization for Pedagogically Safe Intelligent Tutoring Systems

Intelligent tutoring systems increasingly rely on reinforcement learning to personalise instruction, yet optimising for observable engageme…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

EvoMaster: A Foundational Evolving Agent Framework for Agentic Science at Scale

The convergence of large language models and agents is catalyzing a new era of scientific discovery: Agentic Science. While the scientific…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

The Topological Dual of a Dataset: A Logic-to-Topology Encoding for AlphaGeometry-Style Data

AlphaGeometry represents a milestone in neuro-symbolic reasoning, yet its architecture faces a log-linear scaling bottleneck within its sym…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Explainable AML Triage with LLMs: Evidence Retrieval and Counterfactual Checks

Anti-money laundering (AML) transaction monitoring generates large volumes of alerts that must be rapidly triaged by investigators under st…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Deconstructing Superintelligence: Identity, Self-Modification and Différance

Self-modification is routinely treated as constitutive of artificial superintelligence (SI), yet modification is a relative action…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond

As AI systems move from generating text to accomplishing goals through sustained interaction, the ability to model environment dynamics bec…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Correct Is Not Enough: Training Reasoning Planners with Executor-Grounded Rewards

Reinforcement learning with verifiable rewards has become a common way to improve explicit reasoning in large language models, but final-an…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

Executable World Models for ARC-AGI-3 in the Era of Coding Agents

We evaluate an initial coding-agent system for ARC-AGI-3 in which the agent maintains an executable Python world model, verifies it against…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Engagement Process: Rethinking the Temporal Interface of Action and Observation

Task completion in digital and physical environments increasingly involves complex temporal interaction, where actions and observations unf…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Goal-Oriented Reasoning for RAG-based Memory in Conversational Agentic LLM Systems

LLM-based conversational AI agents struggle to maintain coherent behavior over long horizons due to limited context. While RAG-based approa…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

ASH: Agents that Self-Hone via Embodied Learning

Long-horizon embodied tasks remain a fundamental challenge in AI, as current methods rely on hand-engineered rewards or action-labeled demo…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

ANNEAL: Adapting LLM Agents via Governed Symbolic Patch Learning

LLM-based agents can recover from individual execution errors, yet they repeatedly fail on the same fault when the underlying process knowl…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

CatalyticMLLM: A Graph-Text Multimodal Large Language Model for Catalytic Materials

Property prediction and inverse structural design of catalytic materials are typically modeled as two independent tasks: the former predict…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

When Tabular Foundation Models Meet Strategic Tabular Data: A Prior Alignment Approach

Tabular foundation models based on pretrained prior-data fitted networks (PFNs) have shown strong generalization on diverse tabular tasks,…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

Beyond Rational Illusion: Behaviorally Realistic Strategic Classification

Strategic classification (SC) studies the interaction between decision models and agents who strategically manipulate their features for fav…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Playing Devil's Advocate: Off-the-Shelf Persona Vectors Rival Targeted Steering for Sycophancy

We study the effect of different personas on sycophancy: a model's agreement with users even when the user is incorrect. The standard…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

MBABench: Evaluating LLM Agents on End-to-End Spreadsheet Tasks in Finance

LLM agents are increasingly expected to carry out end-to-end workflows, producing complete artifacts from high-level user instructions. To…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Advancing Mathematics Research with AI-Driven Formal Proof Search

Large language models (LLMs) increasingly excel at mathematical reasoning, but their unreliability limits their utility in mathematics rese…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

LGMT: Logic-Grounded Metamorphic Testing for Evaluating the Reasoning Reliability of LLMs

Large Language Models (LLMs) achieve strong performance on logical reasoning benchmarks, yet their reliability remains uncertain. Existing…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

CUA-Gym: Scaling Verifiable Training Environments and Tasks for Computer-Use Agents

Reinforcement learning with verifiable rewards (RLVR) has driven breakthroughs in domains such as math, tool-use, and software engineering,…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Neural Scalable Symbolic Search Framework for Complex Logical Queries with Multiple Free Variables

Complex Query Answering (CQA) is a fundamental knowledge representation and reasoning task over incomplete knowledge graphs (KGs). Answerin…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Quantifying and Optimizing Simplicity via Polynomial Representations

Deep networks often exhibit a preference for "simple" solutions, and such a simplicity bias is widely believed to play a key role in genera…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation, Agents

VESTA: Visual Exploration with Statistical Tool Agents

Fitting quantitative models to data is a central step in scientific workflows, yet it remains one of the least automated. Recent agent-base…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

ReSkill: Reconciling Skill Creation with Policy Optimization in Agentic RL

Agentic reinforcement learning (RL) enables LLM agents to improve continuously from environment rewards, yet the resulting policies do not…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

WorldCoder-Bench: Benchmarking Physically Grounded 3D World Synthesis

Large language models (LLMs) are increasingly asked not only to write static interfaces, but to construct executable interactive worlds fro…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

ChatHealthAI: Aligning Electronic Health Record Representations with Large Language Models for Grounded Clinical Reasoning

Large language models (LLMs) exhibit strong natural-language reasoning abilities for clinical decision support, but struggle to effectively…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

The Shadow Price of Reasoning: Economic Perspective on Optimal Budget Allocation for LLMs

Inference-time scaling has emerged as a critical avenue for enhancing Large Language Models' performance, yet real-world deployment is cons…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Trivium: Temporal Regret as a First-Class Objective for Causal-Memory Controllers

Many current agentic systems and LLM pipelines correct mistakes by optimizing outcome reward. This addresses only the what of failure: when…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

MIRAGE: Mobile Agents with Implicit Reasoning and Generative World Models

Mobile agents are increasingly expected to operate everyday applications from screenshots and language goals, where reliable control requir…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation, Agents, Business/Funding

Entropy-Based Evaluation of AI Agents: A Lightweight Framework for Measuring Behavioral Patterns

AI agents are commonly evaluated using task success, reward, latency, and cost. These metrics are useful, but they often miss important asp…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

A Pre-Registered Causal Partition of Self-Consistency Elicitation and Reward Design in RLVR

Reinforcement learning from verifiable rewards (RLVR) improves reasoning even when the reward signal is spurious -- assigning credit to the…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

Learning Visual Spatial Planning from Symbolic State via Modality-Gap-Aware Self-Distillation

While vision-language models excel at general multimodal understanding, they still struggle with visual spatial planning. We attribute this…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

Towards Healthy Evolution: Exploring the Role and Mechanisms of Human-Agent Interaction in Self-Evolving Systems

Self-evolving agents improve through continual self-play and self-generated learning signals, but autonomous evolution can also cause capab…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

An Infectious Disease Spread Simulation Based on Large Language Model Decision Making

Modelling individual decision-making during infectious disease outbreaks is crucial for understanding behavioural dynamics and informing ef…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Humans' ALMANAC: A Human Collaboration Dataset of Action-Level Mental Model Annotations for Agent Collaboration

Recent advances in LLM agents have enabled complex cognitive capabilities, such as multi-step reasoning, planning, and tool use, that incre…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Large Models for Time Series and Spatio-Temporal Data: A Survey and Outlook

Temporal data, including time series and spatio-temporal data, are pervasive in real-world applications. Generated in massive volumes by ph…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Toward autocorrection of chemical process flowsheets using large language models

The process engineering domain widely uses Process Flow Diagrams (PFDs) and Process and Instrumentation Diagrams (P&IDs) to represent proce…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Investigating the Histogram Loss in Regression

It is becoming increasingly common in regression to train neural networks that model the entire distribution even if only the mean is requi…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Strategic Integration of Artificial Intelligence in the C-Suite: The Role of the Chief AI Officer

The integration of Artificial Intelligence (AI) into corporate strategy has become critical for organizations seeking to maintain competiti…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Discovering Data Structures: Nearest Neighbor Search and Beyond

We propose a general framework for end-to-end learning of data structures. Our framework adapts to the underlying data distribution and pro…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Graph-to-SFILES: Control structure prediction from process topologies using generative artificial intelligence

Control structure design is an important but tedious step in P&ID development. Generative artificial intelligence (AI) promises to reduce P…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Complement or substitute? How AI increases the demand for human skills

Artificial Intelligence (AI) is transforming the nature of work, yet there is limited empirical evidence on how it affects demand for human…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

FIT-Print: Towards False-claim-resistant Model Ownership Verification via Targeted Fingerprint

Model fingerprinting has emerged as a crucial mechanism for safeguarding the intellectual property of open-source models, offering a non-in…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Dealing with Annotator Disagreement in Hate Speech Classification

Hate speech detection is a crucial task, especially on social media where harmful content can spread quickly. Collecting social media conte…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

Deep Tree Tensor Networks

Originating in quantum physics, tensor networks (TNs) have been widely adopted as exponential machines and parametric decomposers for recog…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Audio-FLAN: An Instruction-Following Dataset for Unified Audio Understanding and Generation of Speech, Music, and Sound

Recent advancements in audio tokenization have significantly enhanced the integration of audio capabilities into large language models (LLM…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Rule-based autocorrection of Piping and Instrumentation Diagrams (P&IDs) on graphs

A piping and instrumentation diagram (P&ID) is a central reference document in chemical process engineering. Currently, chemical engineers…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

LoTUS: Large-Scale Machine Unlearning with a Taste of Uncertainty

We present LoTUS, a novel Machine Unlearning (MU) method that eliminates the influence of training samples from pre-trained models, avoidin…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

Brain2Text Decoding Model Reveals the Neural Mechanisms of Visual Semantic Processing

Decoding sensory experiences from neural activity to reconstruct human-perceived visual stimuli and semantic content remains a challenge in…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Hyperflux: Pruning Reveals Importance

Network pruning is used to reduce inference latency and power consumption in large neural networks. However, most methods focus on empirica…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

Robust Renal Mass Segmentation on CT: A Validation Study of an AI-Based Framework

Renal mass segmentation has important potential to enhance the clinical workflow, especially in settings requiring quantitative assessments…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Harmonia: End-to-End RAG Serving Optimization

Retrieval-Augmented Generation (RAG) improves the reliability of large language models by integrating external knowledge, but serving RAG p…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

ePC: Fast and Deep Predictive Coding in Digital Simulation

Predictive Coding (PC) offers a brain-inspired alternative to backpropagation for neural network training, described as a physical system m…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

ACTIVE-o3: Empowering MLLMs with Active Perception via Pure Reinforcement Learning

Active vision, also known as active perception, refers to actively selecting where and how to look in order to gather task-relevant informa…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching

Autoregressive Models (ARMs) have long dominated the landscape of Large Language Models. Recently, a new paradigm has emerged in the form o…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Failure by Interference: Language Models Make Balanced Parentheses Errors When Faulty Mechanisms Overshadow Sound Ones

Despite remarkable advances in coding capabilities, language models (LMs) still struggle with simple syntactic tasks such as generating bal…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

AMix-1: A Pathway to Test-Time Scalable Protein Foundation Model

We introduce AMix-1, a powerful protein foundation model built on Bayesian Flow Networks and empowered by a systematic training methodology…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Learning Task Mixtures from Task Affinities: A Probabilistic Graphical Model for Supervised Fine-Tuning

Supervised fine-tuning performance for large language models depends strongly on how training budget is distributed across a heterogeneous…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

CLONE: A 3DGS-Based Closed-Loop Differentiable Optimization Framework for Single-Image Normal Estimation

We propose CLONE, a 3DGS-based Closed-Loop differentiable Optimization framework for single-image Normal Estimation. The core idea is to co…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

Unsupervised Partner Design Enables Robust Ad-hoc Teamwork

We introduce Unsupervised Partner Design (UPD), a population-free multi-agent reinforcement learning method for robust ad-hoc teamwork. UPD…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

In-Context Reinforcement Learning via Communicative World Models

Reinforcement learning (RL) agents often struggle to generalize to new tasks and contexts without updating their parameters, mainly because…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Discovering Expert-Level Nash Equilibrium Algorithms with Large Language Models

Designing polynomial-time algorithms for approximate Nash equilibria (ANE) with provable worst-case guarantees is a fundamental open proble…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

Video Understanding by Design: How Datasets Shape Video Models

Research in video understanding has advanced rapidly, driven by increasingly diverse datasets and more powerful model architectures. While…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

I-Segmenter: Integer-Only Vision Transformer for Efficient Semantic Segmentation

Vision Transformers (ViTs) have recently achieved strong results in semantic segmentation, yet their deployment on resource-constrained dev…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Decoupling the "What" and "Where" With Polar Coordinate Positional Embeddings

The attention mechanism in a Transformer architecture matches key to query based on both content -- the what -- and position in a sequence…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

Understanding Benchmark Language Under Weakened Formal Semantics

State-of-the-art NLP benchmarks require interpretation of natural language that specifies conditions, procedures, and exceptions, often rel…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Generation Properties of Stochastic Interpolation under Finite Training Set

This paper investigates the theoretical behavior of generative models under finite training populations. Within the stochastic interpolatio…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

SecureVibeBench: Benchmarking Secure Vibe Coding of AI Agents via Reconstructing Vulnerability-Introducing Scenarios

Large language model-powered code agents are rapidly transforming software engineering, yet the security risks of their generated code have…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

VFEM: Visual Feature Empowered Multivariate Time Series Forecasting with Cross-Modal Fusion

Large time series foundation models often adopt channel-independent architectures to handle varying data dimensions, but this design ignore…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

Projection and Quantisation: A Unifying View of Learning to Hash, from Random Projections to the RAG Era

Approximate nearest neighbour (ANN) search underpins large-scale retrieval, increasingly within the retrieval-augmented generation pipeline…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Large Language Models for Imbalanced Classification: Diversity makes the difference

Oversampling is one of the most widely used approaches for addressing imbalanced classification. The core idea is to generate additional mi…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Efficient Onboard Vision-Language Inference in UAV-Enabled Low-Altitude Economy Networks via LLM-Enhanced Optimization

The rapid advancement of Low-Altitude Economy Networks (LAENets) has enabled a variety of applications, including aerial surveillance, envi…

2026-06-09 13:00 JST | arXiv cs.AI | Hardware/Semiconductors

TAO: Tolerance-Aware Optimistic Verification for Floating-Point Neural Networks

Neural networks increasingly run on hardware outside the user's control (cloud GPUs, inference marketplaces). Yet ML-as-a-Service reveals l…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

PLAGUE: Plug-and-play framework for Lifelong Adaptive Generation of Multi-turn Exploits

Large Language Models (LLMs) are improving at an exceptional rate. With the advent of agentic workflows, multi-turn dialogue has become the…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

SmartMixed: A Two-Phase Training Strategy for Adaptive Activation Function Learning in Neural Networks

The choice of activation function plays a critical role in neural networks, yet most architectures still rely on fixed, uniform activation…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Learning Quantized Continuous Controllers for Integer Hardware

Deploying continuous-control reinforcement learning policies on embedded hardware requires meeting tight latency and power budgets. Small F…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors

Correcting Mean Bias in Text Embeddings: A Refined Renormalization with Training-Free Improvements on MMTEB

We find that current sentence-embedding models produce outputs with a consistent bias: every embedding $e$ decomposes as $\tilde e + \mu$,…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

SMART: Shot-Aware Multimodal Video Moment Retrieval with Audio-Enhanced MLLM

Video Moment Retrieval is a task in video understanding that aims to localize a specific temporal segment in an untrimmed video based on a…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

AttnRegDeepLab: A Two-Stage Decoupled Framework for Interpretable Embryo Fragmentation Grading

Assessing embryo fragmentation is crucial for predicting IVF success, yet manual grading is prone to subjectivity, and existing AI models s…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

SAGE: Shape-Adapting Gated Experts for Adaptive Histopathology Image Segmentation

The significant variability in cell size and shape continues to pose a major obstacle in computer-assisted cancer detection on gigapixel Wh…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

MedVision: Benchmarking Quantitative Medical Image Analysis

Current vision-language models (VLMs) in medicine are primarily designed for categorical question answering (e.g., "Is this normal or abnor…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

SVRG and Beyond via Posterior Correction

Stochastic Variance Reduced Gradient (SVRG) and its variants aim to speed up training by using gradient corrections. Originally proposed ov…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Developing Distance-Aware Physics-Constrained Probabilistic Frameworks for Industrial Prognostics

Development of reliable and physically interpretable probabilistic frameworks for industrial prognostics remains nascent, and existing liter…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Unambiguous Representations in Neural Networks: An Information-Theoretic Approach to Intentionality

Representations pervade our daily experience, from letters representing sounds to bit strings encoding digital files. While such representa…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

FADTI: Fourier and Attention Driven Diffusion for Multivariate Time Series Imputation

Multivariate time series imputation is fundamental in applications such as healthcare, traffic forecasting, and biological modeling, where…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

Collaborative Edge-to-Server Inference for Vision-Language Models

We propose a collaborative edge-to-server inference framework for vision-language models (VLMs) that reduces communication cost while maint…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Exploring the Effect of Basis Rotation on NQS Performance

Neural Quantum States (NQS) are powerful variational representations of quantum many-body wavefunctions, yet their performance depends sens…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

GenTSE: Enhancing Target Speaker Extraction via a Coarse-to-Fine Generative Language Model

Language Model (LM)-based generative modeling has emerged as a promising direction for TSE, offering potential for improved generalization…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Supracompetitive Pricing Under AI Monoculture

When competing sellers delegate pricing to a shared AI model, such as a large language model, correlated recommendations combined with perf…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Adversarial Instance Generation and Robust Training for Neural Combinatorial Optimization with Multiple Objectives

Deep reinforcement learning (DRL) has shown great promise in addressing multi-objective combinatorial optimization problems (MOCOPs). Never…

2026-06-09 13:00 JST | arXiv cs.AI | Robotics

Vision-Based Early Fault Diagnosis and Self-Recovery for Strawberry Harvesting Robots

Strawberry-harvesting robots faced challenges such as poor visual perception, gripper misalignment, empty grasp/misgrasp, and slippage, whi…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

A large-scale nanocrystal database with aligned synthesis and properties enabling generative inverse design

The synthesis of nanocrystals has been highly dependent on trial-and-error, due to the complex correlation between synthesis parameters and…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

One if by Land, Two if by Sea, Three if by Four Seas, and More to Come -- Values of Perception, Prediction, Communication, and Common Sense in Decision Making

This work aims to rigorously define the values of perception, prediction, communication, and common sense in decision making. The defined q…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

How Context Shapes Truth: Geometric Transformations of Statement-level Truth Representations in LLMs

Large Language Models (LLMs) often encode whether a statement is true as a vector in their residual stream activations. These vectors, also…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Revisiting Training Scale: An Empirical Study of Token Count, Power Consumption, and Parameter Efficiency

Research in machine learning has questioned whether increases in training token counts reliably produce proportional performance gains in l…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

DYCP: Dynamic Context Pruning for Long-Form Dialogue with LLMs

Large Language Models (LLMs) increasingly operate over long-form dialogues with frequent topic shifts. While recent LLMs support extended c…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

MMR-GRPO: Accelerating GRPO-Style Training through Diversity-Aware Reward Reweighting

Group Relative Policy Optimization (GRPO) has become a standard approach for training mathematical reasoning models; however, its reliance…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

A Comparative Study of Student Perspectives on Technical Writing Feedback Quality: Evaluating LLMs, SLMs, and Humans in Computer Science Topics

To address the scalability of feedback in computer science while mitigating the privacy and cost limitations of commercial Large Language M…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Multimodal Generative Engine Optimization: Rank Manipulation for Vision-Language Model Rankers

Vision-Language Models (VLMs) integrate visual and textual knowledge into unified representations that increasingly underpin modern retriev…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Research/Papers

XCR-Bench: Benchmarking Cross-Cultural Reasoning in LLMs via Culture-Specific Items and Hall's Triad

Cross-cultural competence in large language models (LLMs) requires understanding and adapting Culture-Specific Items (CSIs) across varying…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

The Flexibility Trap: Rethinking the Value of Arbitrary Order in Diffusion Language Models

Diffusion Large Language Models (dLLMs) break the rigid left-to-right constraint of traditional LLMs, enabling token generation in arbitrar…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

CURE: Curriculum-guided Multi-task Training for Reliable Anatomy Grounded Report Generation

Medical vision-language models can automate the generation of radiology reports but struggle with accurate visual grounding and factual con…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Meeting SLOs, Slashing Hours: Automated Enterprise LLM Optimization with OptiKIT

Enterprise LLM deployment faces a critical scalability challenge: organizations must optimize models systematically to scale AI initiatives…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation, Business/Funding

Comparative evaluation of training strategies using partially labelled datasets for segmentation of white matter hyperintensities and stroke lesions in FLAIR MRI

White matter hyperintensities (WMH) and ischaemic stroke lesions (ISL) are key imaging biomarkers of cerebral small vessel disease (SVD) de…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Mobility-Embedded POIs: Learning What A Place Is and How It Is Used from Human Movement

Recent progress in geospatial foundation models highlights the importance of learning general-purpose representations for real-world locati…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

More Bang for the Buck: Improving the Inference of Large Language Models at a Fixed Budget using Reset and Discard (ReD)

The performance of large language models (LLMs) on verifiable tasks is usually measured by pass@k, the probability of answering a question…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Mechanistic Data Attribution: Tracing the Training Origins of Interpretable LLM Units

While Mechanistic Interpretability has identified interpretable circuits in LLMs, their causal origins in training data remain elusive. We…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

UA-DCM: Uncertainty-aware Causal Decision Making via Effect Bound Decomposition

Causal inference from observational data can provide strong evidence for finding the best action in a decision-making scenario without havi…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

MEnvAgent: Scalable Polyglot Environment Construction for Verifiable Software Engineering

The evolution of Large Language Model (LLM) agents for software engineering (SWE) is constrained by the scarcity of verifiable datasets, a…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

VideoGPA: Distilling Geometry Priors for 3D-Consistent Video Generation

While recent video diffusion models (VDMs) produce visually impressive results, they fundamentally struggle to maintain 3D structural consi…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

How Hyper-Datafication Impacts the Sustainability Costs in Frontier AI

Large-scale data has fuelled the success of frontier artificial intelligence (AI) models over the past decade. This expansion has relied on…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

DIVERGE: Diversity-Enhanced RAG for Open-Ended Information Seeking

Existing retrieval-augmented generation (RAG) systems often assume that each query has a single correct answer. This assumption overlooks o…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Reward Shaping for (Inference-Time) Alignment: A Stackelberg Game Perspective

Existing alignment methods directly use the reward model learned from user preference data to optimize an LLM policy, subject to KL regular…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Performative Learning Theory

Performative predictions influence the very outcomes they aim to forecast. We study performative predictions that affect a sample (e.g., on…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Variational Speculative Decoding: Rethinking Draft Training from Token Likelihood to Sequence Acceptance

Speculative decoding accelerates inference for (M)LLMs, yet a training-decoding discrepancy persists: while existing methods optimize singl…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

Implementing Grassroots Logic Programs with Multiagent Transition Systems and AI (Full Version)

Grassroots Logic Programs (GLP) is a concurrent logic programming language in which logic variables are partitioned into paired readers and…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Generative Reasoning Re-ranker

Recent studies increasingly explore Large Language Models (LLMs) as a new paradigm for recommendation systems due to their scalability and…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

When Benign Inputs Lead to Severe Harms: Eliciting Unsafe Unintended Behaviors of Computer-Use Agents

Although computer-use agents (CUAs) hold significant potential to automate increasingly complex OS workflows, they can demonstrate unsafe u…

2026-06-09 13:00 JST | arXiv cs.AI | Business/Funding

Kunlun: Establishing Scaling Laws for Massive-Scale Recommendation Systems through Unified Architecture Design

Deriving predictable scaling laws that govern the relationship between model performance and computational investment is crucial for design…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Cosmo3DFlow: Wavelet Flow Matching for Spatial-to-Spectral Compression in Reconstructing the Early Universe

Reconstructing the early universe from the evolved present-day universe is a challenging and computationally demanding problem in modern as…

2026-06-09 13:00 JST | arXiv cs.AI | Robotics

Transforming Police-Car Swerving for Mitigating Isolated Stop-and-Go Traffic Waves: A Practice-Oriented Jam-Absorption Driving Strategy

Stop-and-go traffic waves, a major form of freeway congestion, impose severe and persistent adverse impacts, including reduced traffic effi…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

On the Complexity of Offline Reinforcement Learning with $Q^\star$-Approximation and Partial Coverage

We study offline reinforcement learning under $Q^\star$-approximation and partial coverage, a setting that motivates practical algorithms s…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Know More, Know Clearer: A Meta-Cognitive Framework for Knowledge Augmentation in Large Language Models

Knowledge augmentation has significantly enhanced the performance of Large Language Models (LLMs) in knowledge-intensive tasks. However, ex…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Prescriptive Scaling Reveals the Evolution of Language Model Capabilities

Machine learning model performance improvements tend to arise from competition and application. For deployment, we consider prescriptive sc…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Condition-Gated Reasoning for Context-Dependent Biomedical Question Answering

Current biomedical question answering (QA) systems often assume that medical knowledge applies uniformly, yet real-world clinical reasoning…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Training-Free Intelligibility-Guided Observation Addition for Noisy ASR

Automatic speech recognition (ASR) degrades severely in noisy environments. Although speech enhancement (SE) front-ends effectively suppres…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Scaling Search Relevance: Augmenting App Store Ranking with LLM-Generated Judgments

Large-scale commercial search systems optimize for relevance to drive successful sessions that help users find what they are looking for. T…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

A Mixed Diet Makes DINO An Omnivorous Vision Encoder

Pre-trained vision encoders like DINOv2 have demonstrated exceptional performance on unimodal tasks. However, we observe that their feature…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

From Conflict to Consensus: Boosting Medical Reasoning via Multi-Round Agentic RAG

Large Language Models (LLMs) exhibit high reasoning capacity in medical question-answering, but their tendency to produce hallucinations an…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

CodeTaste: Can LLMs Generate Human-Level Code Refactorings?

LLM coding agents can generate working code, but their solutions often accumulate complexity, duplication, and architectural debt. Human de…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

POET-X: Memory-efficient LLM Training by Scaling Orthogonal Transformation

Efficient and stable training of large language models (LLMs) remains a core challenge in modern machine learning systems. To address this…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

Context Over Compute: Human-in-the-Loop Outperforms Iterative Chain-of-Thought Prompting in Interview Answer Quality

Behavioral interview evaluation using large language models presents unique challenges that require structured assessment, realistic interv…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

RetroReasoner: A Reasoning LLM for Strategic Retrosynthesis Prediction

Retrosynthesis prediction aims to identify reactants that can synthesize a given product molecule. Although molecular large language models…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

How Transformers Reject Wrong Answers: Rotational Dynamics of Factual Constraint Processing

When a decoder-only transformer is forced to process matched correct and incorrect single-token continuations of a factual query, the two p…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

CHIMERA-Bench: A Benchmark Dataset for Epitope-Specific Antibody Design

Computational antibody design has seen rapid methodological progress, with dozens of deep generative methods proposed in the past three yea…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

AgroOmni: A Large-Scale Multi-view Agricultural Dataset for Cross-Scale Multimodal Reasoning

Modern agricultural data is sourced from diverse platforms and spans multiple spatial scales, ranging from ground-level close-up photograph…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Component Ablation for Efficient Hybrid Language Model Architectures: Performance, Resilience, and Compression Implications

Hybrid language models combine softmax attention with linear-time sequence mechanisms such as state-space or linear-attention layers, but t…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

DecepGPT: Schema-Driven Deception Detection with Multicultural Datasets and Robust Multimodal Learning

Multimodal deception detection aims to identify deceptive behavior by analyzing audiovisual cues for forensics and security. In these high-…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

Vision Hopfield Memory Networks for Image Recognition

Recent vision backbones, such as Transformer families and state-space models like Mamba, have achieved remarkable progress on image recogni…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Train at Moving Edge: Online-Verified Prompt Selection for Efficient RL Training of Large Reasoning Model

Reinforcement learning (RL) has become essential for post-training large language models (LLMs) in reasoning tasks. While scaling rollouts…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

UnWeaving the knots of GraphRAG -- turns out VectorRAG is almost enough

One of the key problems in Retrieval-augmented generation (RAG) systems is that chunk-based retrieval pipelines represent the source chunks…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Automated Framework to Evaluate and Harden LLM System Instructions against Encoding Attacks

System Instructions in Large Language Models (LLMs) are commonly used to enforce safety policies, define agent behavior, and protect sensit…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

Distributional Open-Ended Evaluation of LLM Cultural Value Alignment Based on Value Codebook

As LLMs are globally deployed, aligning their cultural value orientations is critical for safety and user engagement. However, existing ben…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Securing Retrieval-Augmented Generation: A Taxonomy of Attacks, Defenses, and Future Directions

Retrieval-augmented generation (RAG) extends large language models (LLMs) with external knowledge, but this access path also introduces sec…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

SatIR: Scalable High-Recall Constraint-Satisfaction-Based Information Retrieval for Clinical Trials Matching

Many important retrieval problems are not merely problems of semantic similarity, but problems of constraint satisfaction: a retrieved item…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Muon$^2$: Boosting Muon via Adaptive Second-Moment Preconditioning

Muon has emerged as a promising optimizer for large-scale foundation model pre-training by exploiting the matrix structure of neural networ…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Resilient Write: A Six-Layer Durable Write Surface for LLM Coding Agents

LLM-powered coding agents increasingly rely on tool-use protocols such as the Model Context Protocol (MCP) to read and write files on a dev…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Capacity-Controlled Global Attention for Graph Transformers

Global self-attention drives modern graph transformers, yet the softmax at its core imposes a structural constraint rarely examined directl…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Business/Funding

Multilingual Training and Evaluation Resources for Vision-Language Models

Vision Language Models (VLMs) have achieved rapid progress in recent years. However, despite their growth, VLM development is heavily groun…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Watts-per-Intelligence Part II: Algorithmic Catalysis

We develop a thermodynamic theory of algorithmic catalysis within the watts per intelligence framework, identifying reusable computational…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

Knee-xRAI: An Explainable AI Framework for Automatic Kellgren-Lawrence Grading of Knee Osteoarthritis

Grading knee osteoarthritis (KOA) on plain radiographs is poorly reproducible across readers. A single-grade disagreement on the Kellgren-L…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Speech Enhancement Based on Drifting Models

We propose Speech Enhancement based on Drifting Models (DriftSE), a novel generative framework that formulates denoising as an equilibrium…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

RAS: a Reliability Oriented Metric for Automatic Speech Recognition

Automatic speech recognition systems often produce confident yet incorrect transcriptions under noisy or ambiguous conditions, which can be…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Skill Retrieval Augmentation for Agentic AI

As large language models (LLMs) evolve into agentic problem solvers, they increasingly rely on external, reusable skills to handle tasks be…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Simple Self-Conditioning Adaptation for Masked Diffusion Models

Masked diffusion models (MDMs) generate discrete sequences by iterative denoising under an absorbing masking process. In standard masked di…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

When Do Diffusion Models Learn to Generate Multiple Objects?

Text-to-image diffusion models achieve impressive visual fidelity, yet they remain unreliable in multi-object generation. Despite extensive…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

DynamicPO: Dynamic Preference Optimization for Recommendation

In large language model (LLM)-based recommendation systems, direct preference optimization (DPO) effectively aligns recommendations with us…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Learning Behavioral Signals from Encrypted Smartphone Network Traffic

Human behavior is challenging to measure continuously at scale, yet traces of daily routines and well-being may be reflected in interaction…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Kernel Affine Hull Machines as Compute-Efficient Encoders for Frozen Semantic Spaces

Transformer-based semantic encoders are effective for retrieval, but in many deployments the recurring bottleneck is online query encoding…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Neuron-Anchored Rule Extraction for Large Language Models via Contrastive Hierarchical Ablation

A central goal of explainable AI is to express large language model (LLM) decision logic symbolically and ground it in internal mechanisms.…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Self-Mined Hardness for Safety Fine-Tuning

Safety fine-tuning of language models typically requires a curated adversarial dataset. We take a different approach: score each candidate…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

APEX: Large-scale Multi-task Aesthetic-Informed Popularity Prediction for AI-Generated Music

Music popularity prediction has attracted growing research interest, with relevance to artists, platforms, and recommendation systems. Howe…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

NavOne: One-Step Global Planning for Vision-Language Navigation on Top-Down Maps

Existing Vision-Language Navigation (VLN) methods typically adopt an egocentric, step-by-step paradigm, which struggles with error accumula…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

MinMax Recurrent Neural Cascades

We introduce MinMax Recurrent Neural Cascades (MinMax RNCs), a class of recurrent neural networks built from a novel form of recurrence ove…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Hardware/Semiconductors

From Detection to Recovery: Operational Analysis on LLM Pre-training with 504 GPUs

Large-scale AI training is now fundamentally a distributed systems problem, and hardware failures have become routine operating conditions…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

CalBench: Evaluating Coordination-Privacy Trade-offs in Multi-Agent LLMs

Personal AI assistants are beginning to act as delegates with access to calendars, inboxes, and user preferences. Calendar scheduling makes…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

Quantifying Rodda and Graham Gait Classification from 3D Markerless Kinematics derived from a Single-view Video in a Heterogeneous Pediatric Clinical Cohort

Cerebral Palsy (CP) is a neurological disorder of movement and the most common cause of lifelong physical disability in childhood. Approxim…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Improving the Performance and Learning Stability of Parallelizable RNNs Designed for Ultra-Low Power Applications

Sequence learning is dominated by Transformers and parallelizable recurrent neural networks (RNNs) such as state-space models, yet learning…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

High-Rate Quantized Matrix Multiplication II

This is the second part of the work investigating quantized matrix multiplication (MatMul). In part I we considered the case of calibration…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Margin-Adaptive Confidence Ranking for Reliable LLM Judgement

Jung et al. (2025) introduce a hypothesis testing framework for guaranteeing agreement between large language models (LLMs) and human judgm…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Ghosted Layers: Unconstrained Activation Alignment for Recovering Layer-Pruned LLMs

Layer pruning removes entire Transformer decoder blocks from large language models, but introduces a mismatch between the hidden state rece…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation, Business/Funding

Evaluating Design Video Generation: Metrics for Compositional Fidelity

Generative video models are increasingly used in design animation tasks, yet no standardized evaluation framework exists for this domain. U…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps

Long-context inference in large language models is bottlenecked by the quadratic cost of full attention. Existing efficient alternatives of…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

WhiteTesseract: Reframing the Interpretation of Cultural Heritage through XR and Conversational AI

Cultural heritage exhibitions often struggle to sustain attention and support reflective engagement. Physical exhibitions rely on fixed int…

2026-06-09 13:00 JST | arXiv cs.AI | Hardware/Semiconductors

LEAP: Learnable End-to-End Adaptive Pruning of Large Language Models

Unstructured sparsity is now natively accelerated by recent GPU kernels and dataflow hardware, shifting the bottleneck from inference execu…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

ConflictRAG: Detecting and Resolving Knowledge Conflicts in Retrieval Augmented Generation

Retrieval-Augmented Generation (RAG) systems implicitly assume mutual consistency among retrieved documents -- an assumption that frequentl…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Post-Trained MoE Can Skip Half Experts via Self-Distillation

Mixture-of-Experts (MoE) scales language models efficiently through sparse expert activation, and its dynamic variant further reduces compu…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Diagnosing Multi-step Reasoning Failures in Black-box LLMs via Stepwise Confidence Attribution

Large Language Models have achieved strong performance on reasoning tasks with objective answers by generating step-by-step solutions, but…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

FormalASR: End-to-End Spoken Chinese to Formal Text

Automatic speech recognition (ASR) systems are typically optimized for verbatim transcription, which preserves disfluencies, filler words,…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Causal Unlearning in Collaborative Optimization: Exact and Approximate Influence Reversal under Adversarial Contributions

Federated learning systems must support data deletion requests to comply with privacy regulations, yet retraining from scratch after each d…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

DySink: Dynamic Frame Sinks for Autoregressive Long Video Generation

Autoregressive long video generation often adopts bounded-memory streaming for efficiency, typically combining local windows for short-term…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

CrossVLA: Cross-Paradigm Post-Training and Inference Optimization for Vision-Language-Action Models

Vision-Language-Action (VLA) models have rapidly converged on a small set of architectural patterns: discrete-token autoregression (e.g. Op…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

DeltaBox: Scaling Stateful AI Agents with Millisecond-Level Sandbox Checkpoint/Rollback

LLM-powered AI agents require high-frequency state exploration (e.g., test-time tree search and reinforcement learning), relying on rapid c…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation, Business/Funding

Learning to Evaluate: Cost-Effective Model Evaluation on Unlabeled Data with Meta-Learning

The rapid advancement of machine learning has led to an unprecedented expansion of model ecosystems, making it increasingly difficult to as…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

How Many Tools Should an LLM Agent See? A Chance-Corrected Answer

Before an LLM agent can use a tool, a retrieval system must decide which candidate tools to show to the agent. How long should that shortli…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Riemannian-Manifold Steering: Geometry-Aware Generative Autoencoders for Label-Free Steering

Steering a language model - intervening on its internal activations to change downstream behaviour - has recently expanded beyond linear in…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Polynomial Context-Truncation Sensitivity in Autoregressive Language Models: Sequential Wyner-Ziv Bounds for KV Cache Compression

We study the rate-distortion limits of online KV cache compression in autoregressive language models, formulating it as sequential Wyner-Zi…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

The Strongest Teacher Is Not Always the Best Teacher: Student-Centric Answer Selection

LLM training increasingly relies on teacher-generated supervision, from synthetic responses to reasoning traces and tool-use demonstrations…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Locality-Aware Redundancy Pruning for LLM Depth Compression

Large language models are known to contain representational redundancy across network depth, making depth pruning an effective approach for…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Pruning and Distilling Mixture-of-Experts into Dense Language Models

Mixture-of-Experts (MoE) is now the dominant architecture for frontier language models, yet it requires all expert parameters to be loaded…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Efficient and Scalable Provenance Tracking for LLM-Generated Code Snippets

Large language models (LLMs) for code completion and generation are increasingly used in software development, yet they may reproduce train…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

S3Mem: Structured Spatiotemporal Scene-Event Memory for Long-Horizon Interactive Question Answering

Long-horizon memory question answering often requires sparse evidence from heterogeneous histories, including events, object states, visual…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Mechanistic origins of catastrophic forgetting: why RL preserves circuits better than SFT?

Fine-tuning large language models (LLMs) frequently induces catastrophic forgetting of prior capabilities. Recent work has shown that reinf…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

MOOSE-Copilot: A Web-Based Interactive Assistant for Unified Exploratory and Fine-Grained Scientific Hypothesis Discovery

Large language models (LLMs) show remarkable potential in scientific hypothesis discovery. However, existing approaches face two critical l…

2026-06-09 13:00 JST | arXiv cs.AI | Robotics

BORA: Bridging Offline Reinforcement Learning and Online Residual Adaptation for Real-World Dexterous VLA Models

Vision-Language-Action (VLA) models have emerged as a promising paradigm for grounding visual-language understanding into real-world roboti…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Exploring Autonomous Agentic Data Engineering for Model Specialization

Large Language Models (LLMs) have demonstrated strong performance on general tasks, while often struggling to adapt to specialized domains…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

Diffusion Image Generation with Explicit Modeling of Data Manifold Geometry

Image generative models aim to sample data points from the underlying data manifold, a task that requires learning and decoding a dense, lo…

2026-06-09 13:00 JST | arXiv cs.AI | Robotics

Continuous Reasoning for Vision-Language-Action

Natural language is a powerful reasoning medium for language and vision-language models, but it is mismatched to the granularity of continu…

2026-06-09 13:00 JST | arXiv cs.AI | Agents

Beyond Independent Manipulation: Individual Fairness-aware Strategic Classification with Peer Imitation

Strategic classification (SC) investigates scenarios where agents manipulate their features to obtain favorable decisions from predictive m…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

MENTIS: What Belief Changes Under Alignment? Measuring Multi-Scale Latent Torsion in Language Models

Preference alignment has substantially improved the observable behavior of large language models, yet it remains unclear what alignment cha…

2026-06-09 13:00 JST | arXiv cs.AI | Robotics, Hardware/Semiconductors

Crazyflow: An Accurate, GPU-Accelerated, Differentiable Drone Simulator in JAX

High-quality, large-scale synthetic data from simulations is becoming a cornerstone for pushing the capabilities of robot algorithms. While…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Defenses & Enablers For Skill Injection Attacks on Terminal Based Agents

Large language model (LLM) agents increasingly rely on reusable skills i.e. documents describing task-specific procedures. However, this in…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Easier to Mislead Than to Correct: Harmful and Beneficial Revision in LLM Conformity

Large language models are increasingly used in multi-agent systems, where they see and respond to other agents' answers. A key risk is conf…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Argument Collapse: LLMs Flatten Long-Form Public Debate

As LLMs are increasingly used to draft public-facing arguments, they may flatten public debate by repeatedly introducing the same polished,…

2026-06-09 13:00 JST | arXiv cs.AI | Robotics

See Less, Specify More: Visual Evidence Budgets for Generalizable VLAs

Generalization remains a central bottleneck for vision-language-action (VLA) models: under distractors, appearance shifts, and semantically…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation, Robotics

Cosmos 3: Omnimodal World Models for Physical AI

We introduce Cosmos 3, a family of omnimodal world models designed to jointly process and generate language, image, video, audio, and actio…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Calibration Data Trade-offs Across Capability Dimensions: Why Multi-Source Mixing Matters for High-Sparsity LLM Pruning

Post-training pruning compresses large language models to high sparsity using a small unlabelled calibration set, and recent work has concl…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Position: Deployed Reinforcement Learning should be Continual

Reinforcement Learning (RL) has received increasing attention and adoption in real-world use cases. Most of these systems follow a train-th…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

Incremental Sheaf Cohomology on Cellular Complexes: O(1)-in-n Lazy Edit Processing under Bounded Local Geometry

We present an algorithmic framework for incremental maintenance of first sheaf cohomology $H^1(X; \mathcal{F})$ on dynamically evolving 1-d…

2026-06-09 13:00 JST | arXiv cs.AI | Image/Video Generation

An Empirical Study of Data Scale, Model Complexity, and Input Modalities in Visual Generalization

Modern deep neural networks usually have large parameter scales and nonlinear hierarchical structures, and they have achieved strong perfor…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI

Multi-SPIN: Multi-Access Speculative Inference for Cooperative Token Generation at the Edge

Speculative inference (SPIN) was originally developed as an efficient architecture to accelerate Large Language Models (LLMs). In this work…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

An Empirical Audit of Input Encoders for Multi-Channel Signal Transformers

Transformers consuming multi-channel scalar signals must embed $C$ simultaneous values into one $d_{\text{model}}$-dimensional vector per t…

2026-06-09 13:00 JST | arXiv cs.AI | Research/Papers

GOTabPFN: From Feature Ordering to Compact Tokenization for Tabular Foundation Models on High-Dimensional Data

We investigate how to make small tabular foundation models effective for High-Dimensional, Low-Sample Size (HDLSS) tabular prediction witho…

2026-06-09 13:00 JST | arXiv cs.AI | LLM/Generative AI, Image/Video Generation

Emotion-Aware Image Generation from Korean Diary Text via LLM-based Prompt Translation and LoRA Fine-Tuning

T2I models cannot effectively capture sentiment from various types of text, including diaries, as they primarily focus on visual object-rel…

2026-06-09 13:00 JST | arXiv cs.AI | Hardware/Semiconductors

OPRD: On-Policy Representation Distillation

On-policy distillation (OPD) supervises the student only in output space by matching next-token probabilities. This output-only paradigm ha…

2026-06-09 13:00 JST | arXiv cs.AI | Agents, Robotics

HANDOFF: Humanoid Agentic Task-Space Whole-Body Control via Distilled Complementary Teachers

For a humanoid robot to be deployed in the real world, the choice of command space (i.e., the interface between task planning and whole-bod…

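2026-06-09 12:10 JST | ITmedia AI+ | Other

AI-composited photo slips past identity checks for the Kindai University entrance exam: are "biometric authentication systems" needed to counter proxy test-taking?

In the proxy test-taking case surrounding the Kindai University entrance exam, the Osaka District Public Prosecutors Office on June 8 indicted 野口瑞希 (35), a former cram school instructor from Daikoku, Naniwa Ward, Osaka, on charges including fraudulent obstruction of business, for allegedly applying to Kindai University using the results of an Eiken test taken while impersonating a student.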

2026-06-09 12:00 JST | ITmedia AI+ | Other

Designers who are used by AI, designers who use AI

Which one you become is up to you.

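2026-06-09 11:07 JST | ITmedia AI+ | Other

Google cuts "AI Plus" price by 40% to 725 yen per month, doubling storage in a pricing push to expand share

The company has also rolled out a series of price cuts and added perks for its higher-tier "AI Pro" and "AI Ultra" plans; with the "Plus" revision, it has completed a round of upgrades across all three plans.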

2026-06-09 10:56 JST | TechCrunch AI | Other

Why Apple’s slow-and-steady AI bet is starting to look pretty smart

Can Apple's new AI glow up put to bed accusations that it's losing an all-important industry race?

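2026-06-09 10:25 JST | ITmedia AI+ | LLM/Generative AI, Business/Funding

OpenAI announces IPO filing; timing undecided

U.S.-based OpenAI, the maker of ChatGPT, announced on June 8 (local time) that it had confidentially filed for an initial public offering (IPO) in the United States.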

2026-06-09 09:45 JST | TechCrunch AI | Business/Funding

Mercor’s Brendan Foody calls out Sequoia, accusing it of ‘dual-pricing’ valuation tricks

Sequoia is just one of the top firms that sells the same equity at two different prices.

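2026-06-09 09:33 JST | ITmedia AI+ | Regulation/Policy

Apple criticizes EU regulators for "not accepting any solution"; "Siri AI" will not be offered on iPhone and iPad in the EU

The reason given is that the European Commission rejected every solution Apple proposed for complying with the Digital Markets Act (DMA) as the EU requires.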

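2026-06-09 08:00 JST | ITmedia AI+ | Other

Why does consulting quality "vary"? Gartner explains the background

Among Japanese companies that use consulting services, fewer than half feel they get results that "exceed expectations." Why does "inconsistent quality," the biggest source of dissatisfaction, arise? And what steps should client companies take?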

2026-06-09 07:41 JST | TechCrunch AI | LLM/Generative AI, Business/Funding

As OpenAI files for IPO, Sam Altman’s eye-scanning company is doing layoffs, report says

Tools for Humanity, Sam Altman's identity verification company, is reportedly struggling to generate revenue and will downsize its staff.

2026-06-09 07:39 JST | TechCrunch AI | Other

Apple’s WWDC AI demos looked more real after $250M false ad settlement

The vibe of Apple's 2026 WWDC keynote felt like a spouse proudly listing all the honey-do-list items tackled. One subtle example: the many…

2026-06-09 06:29 JST | TechCrunch AI | LLM/Generative AI, Business/Funding

OpenAI files confidentially for IPO, following Anthropic

The filing comes a little more than a week after its main rival, Anthropic, also filed to go public, ramping up the race between the two AI…

2026-06-09 06:15 JST | TechCrunch AI | Other

Apple plays catch-up at WWDC

Apple spent much of its WWDC keynote highlighting fixes, performance improvements, and long-requested features before unveiling its upgrade…

2026-06-09 05:53 JST | TechCrunch AI | Other

Apple bets cheaper AI will woo small developers

As AI experimentation grows more expensive, Apple is waiving cloud API costs for developers with fewer than 2 million first-time App Store…

2026-06-09 04:41 JST | TechCrunch AI | Other

WWDC 2026: Everything announced on Siri AI, iOS 27, Apple Intelligence and more

Apple primarily made the case for an improved experience with its longstanding Siri assistant, which like most other announcements had a he…

2026-06-09 03:48 JST | TechCrunch AI | Other

Apple just taught your iPhone to finish your sentences, your photos, and your workflows

Apple is adding new AI-powered features to Safari, Shortcuts, and Password apps.

2026-06-09 03:45 JST | TechCrunch AI | LLM/Generative AI

Apple will let you build workflows using AI in its new Shortcuts app

Shortcuts gets an AI upgrade, letting you describe the workflow you want in a prompt.

2026-06-09 03:38 JST | TechCrunch AI | Other

Apple’s Image Playground doesn’t suck anymore

Apple's AI image generator is getting a makeover that could make it more competitive.

2026-06-09 03:36 JST | TechCrunch AI | Other

Apple’s Photos app is getting new AI editing features

A new spatial "Reframe" feature will let users use AI to adjust perspectives.

2026-06-09 03:33 JST | TechCrunch AI | Other

Apple gives Siri its own dedicated app

Siri is finally getting its own app.

2026-06-09 03:23 JST | TechCrunch AI | Other

Apple is fixing the headache of splitting the bill with its new Siri in Camera feature

"If you're grabbing a bite with friends and point your iPhone at the bill, then [you can] select what you ordered to split the tab with App…

2026-06-09 02:56 JST | TechCrunch AI | Other

Apple’s long-awaited AI Siri overhaul is finally here

The idea behind the new "Siri AI" is to turn the assistant from a voice controlled assistant into an AI companion that can do a lot more.

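2026-06-09 02:27 JST | ITmedia AI+ | LLM/Generative AI, Business/Funding

OpenAI to go public, filing for an IPO following SpaceX and Anthropic

U.S.-based OpenAI announced that it has filed for an IPO with the U.S. Securities and Exchange Commission.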

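2026-06-09 01:51 JST | ITmedia AI+ | LLM/Generative AI

Price cut for consumer "Gemini": "Google AI Plus" drops from 1,200 yen to 725 yen per month, with storage doubled

Google announced a price cut for "Google AI Plus," its consumer subscription plan for AI services. The monthly fee will fall from 1,200 yen to 725 yen.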

2026-06-09 00:49 JST | TechCrunch AI | Other

Amazon now lets you design custom merch using AI

A new feature in the Amazon Shopping app allows users to generate designs with Alexa, then print them on products like T-shirts, hoodies, a…

2026-06-09 00:34 JST | TechCrunch AI | Other

WWDC 2026: What to expect, from Siri’s highly anticipated revamp to Apple Intelligence and iOS 27

Apple's WWDC nears: Here's what you can look forward to.

2026-06-08 (303 items)

2026-06-08 23:00 JST | OpenAI | LLM/Generative AI

Confidential submission of draft S-1 to the SEC

OpenAI confirms a confidential S-1 submission to the SEC and has not yet determined timing for further action.

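2026-06-08 19:14 JST | ITmedia AI+ | Other

"Siri AI" debuts: major overhaul of "Apple Intelligence," co-developed with Google; English version within the year

At WWDC on June 8 (local time), Apple announced a new voice assistant, "Siri AI." It updates "Apple Intelligence" on the basis of "Apple Foundation Model," a multimodal AI model developed in partnership with Google, and, in "Siri," …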

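2026-06-08 18:48 JST | ITmedia AI+ | LLM/Generative AI

Panasonic Energy targets 2 trillion yen in sales in fiscal 2028, shifting its core focus to AI data centers

Panasonic Energy, the Panasonic Holdings subsidiary responsible for the battery business, has unveiled a medium-term policy targeting sales on the order of 2 trillion yen in fiscal 2028. If achieved, that would be substantial growth of about 1 trillion yen from fiscal 2025. Positioning energy storage systems for data centers, where power demand is rising with the spread of generative AI, as its growth pillar, in fiscal 2026 to 2028…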

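2026-06-08 13:20 JST | ITmedia AI+ | Other

How do you update "the very premises of work"? Examining the "requirements for AX" that IBM advocates

IBM has put forward the "AI operating model" as a new guideline for enterprise AX. Based on its content, this article explores the requirements companies should address on the way to AX.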

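2026-06-08 13:00 JST | ITmedia AI+ | LLM/Generative AI, Agents

"AI = asking questions" is behind the times: agentic AI "Claude Cowork" expands admin features for organization-wide rollout

Anthropic announced that "Claude Cowork," its AI-agent-based work assistance feature, will be generally available on all paid plans. It is also expanding admin features for organization-wide deployment at the same time.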

2026-06-08 13:00 JST | arXiv cs.AI | Research/Papers

Detecting and Mitigating Bias by Treating Fairness as a Symmetry Operation

Machine learning systems deployed in high stakes socioeconomic settings routinely display bias. We formalize bias as a symmetry breaking op…

2026-06-08 13:00 JST | arXiv cs.AI | Research/Papers

DiBS: Diffusion-Informed Branch Selection

Sudoku is a representative constraint satisfaction problem that requires global structural reasoning under strict discrete constraints. The…

2026-06-08 13:00 JST | arXiv cs.AI | LLM/Generative AI

SafeGene: Reusable Adapters for Transferable Safety Alignment

Open-weight LLMs are increasingly fine-tuned into customized assistants, but downstream fine-tuning can weaken safety alignment and make mo…

2026-06-08 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Lean4Agent: Formal Modeling and Verification for Agent Workflow and Trajectory

Equipping Large Language Models (LLMs) to execute reliable multi-step workflows has become a central challenge in artificial intelligence.…

2026-06-08 13:00 JST | arXiv cs.AI | Research/Papers

CrowdMath: A Dataset of Crowdsourced Mathematical Research Discussions

Large language models have made substantial progress on mathematical reasoning, but existing benchmarks typically evaluate well-specified p…

2026-06-08 13:00 JST | arXiv cs.AI | Agents, Business/Funding

Attack Selection in Agentic AI Control Evaluations Meaningfully Decreases Safety

An attacker that strategically chooses when to attack is much harder to catch than one that attacks indiscriminately. AI control is a safet…

2026-06-08 13:00 JST | arXiv cs.AI | Research/Papers

CARVE-Q: Quantum-Proposed, Classically Certified Interactive Driving Repair

The critical question after a correct driving veto is not only whether a maneuver is unsafe, but whether the blocked interaction admits a l…

2026-06-08 13:00 JST | arXiv cs.AI | LLM/Generative AI

Position: Don't Just "Fix it in Post": A Science of AI Must Study Training Dynamics

What would it mean to have a scientific understanding of AI? Models are not static objects: they are snapshots of time-evolving processes s…

2026-06-08 13:00 JST | arXiv cs.AI | Hardware/Semiconductors

Accelerated Fourier SAT (AFSAT): Fully Realising a GPU-based Symmetric Pseudo-Boolean SAT Solver

We present Accelerated Fourier SAT (AFSAT), a GPU-accelerated solver for pseudo-Boolean satisfiability based on continuous local search (CL…

2026-06-08 13:00 JST | arXiv cs.AI | Research/Papers

A Study of Parallel Continuous Local Search

We study parallel Continuous Local Search (CLS) as a solution approach for Boolean satisfiability problems with symmetric pseudo-Boolean (P…

2026-06-08 13:00 JST | arXiv cs.AI | Robotics

AEGIS: A Backup Reflex for Physical AI

Long-horizon robot manipulation tends to fail gradually: one bad step degrades the state, and the policy spirals into a basin from which it…

2026-06-08 13:00 JST | arXiv cs.AI | Research/Papers

A Geometric Account of Activation Steering through Angle-Norm Decomposition

Linear activation steering has gained popularity as a simple and empirically effective way to control language model behavior. More recentl…

2026-06-08 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

OpenSkill: Open-World Self-Evolution for LLM Agents

Self-evolving agents require adaptation after deployment, but existing approaches assume a usable learning loop, such as curated skills, s…

2026-06-08 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

AdMem: Advanced Memory for Task-solving Agents

Large Language Models (LLMs) show promise as tool-using agents but remain limited in long-horizon tasks that require remembering, organizin…

2026-06-08 13:00 JST | arXiv cs.AI | Research/Papers

Evidence-Based Intelligent Diagnostic and Therapeutic Visualization System with Large Language Models: Multi-Turn Interaction and Multimodal Treatment Plan Generation

Aim: Existing AI-assisted traditional Chinese medicine diagnostic tools suffer from opaque reasoning processes, passive interaction, and li…

2026-06-08 13:00 JST | arXiv cs.AI | Agents

Workflow-to-Skill: Skill Creation via Routing-Workflow-Semantics-Attachments Decomposition

Large language model agents increasingly rely on Skills to encode procedural knowledge, yet high-quality Skills remain costly to hand-write…

2026-06-08 13:00 JST | arXiv cs.AI | Agents

Declarative Skills for AI Agents in Knowledge-Grounded Tool-Use Workflows

We study orchestration mechanisms for tool-using AI agents in realistic customer-service workflows over an unstructured knowledge base. We…

2026-06-08 13:00 JST | arXiv cs.AI | LLM/Generative AI

Quantum-Inspired Trace-Augmented Evidence Selection for Reasoning over Structured Hypothesis Spaces

Large language models (LLMs) now solve a wide range of expert-level exams at or above human level, yet remain brittle on specialised, evide…

2026-06-08 13:00 JST | arXiv cs.AI | Agents, Business/Funding

Accounting for Context: Shaping Moral Credences for Value Alignment

Ensuring that agent behaviours are aligned with human moral values inevitably raises the problem of how to account for the plurality of mor…

2026-06-08 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Exploring Agentic Tool-Calling Decisions via Uncertainty-Aligned Reinforcement Learning

Large language model (LLM)-based agents often make suboptimal tool-use decisions, including unsupported tool invocation and hallucinated di…

2026-06-08 13:00 JST | arXiv cs.AI | Research/Papers

Teaching the Way, Not the Answer: Privileged Tutoring Distillation for Multimodal Policy Optimization

Recent post-training methods, particularly Reinforcement Learning with Verifiable Rewards (RLVR), have significantly enhanced the reasoning…

2026-06-08 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Robotics

The Sim-to-Real Gap of Foundation Model Agents: A Unified MDP Perspective

Foundation model agents are increasingly deployed for real-world decision-making, but suffer from the sim-to-real gap. While robotics and c…

2026-06-08 13:00 JST | arXiv cs.AI | Agents

StainFlow: Entity-Stain Tracking and Evidence Linking for Process Rewards in GUI Agents

Reinforcement Learning (RL) has become a promising approach for improving GUI Agents in long-horizon, stochastic digital environments, but…

2026-06-08 13:00 JST | arXiv cs.AI | Image/Video Generation

Hierarchical Semantic-Constrained Heterogeneous Graph for Audio-Visual Event Localization

Open-vocabulary audio-visual event localization (OV-AVEL) jointly models audio-visual cues to recognize and temporally localize events, inc…

2026-06-08 13:00 JST | arXiv cs.AI | Research/Papers

Front-to-Attractors: Modifying the Front-to-Front Heuristic in Bidirectional Search

Heuristics play a central role in the performance of bidirectional search algorithms, which commonly rely on two main classes. Front-to-end…

2026-06-08 13:00 JST | arXiv cs.AI | Research/Papers

DyCon: Dynamic Reasoning Control via Evolving Difficulty Modeling

Recent advances in Large Reasoning Models (LRMs) demonstrate remarkable performance improvements by iteratively reflecting, exploring, and…

2026-06-08 13:00 JST | arXiv cs.AI | Research/Papers

Beyond Post-hoc Explanation: Toward Glassbox AI via Probabilistic Mediation

Large language models are rapidly becoming infrastructural components in high-stakes institutional settings, including public administratio…

2026-06-08 13:00 JST | arXiv cs.AI | Research/Papers

Think Fast: Estimating No-CoT Task-Completion Time Horizons of Frontier AI Models

Many efforts to ensure frontier AI models are safe rely on monitoring their chain-of-thought (CoT) reasoning. If models become able to perf…

2026-06-08 13:00 JST | arXiv cs.AI | Research/Papers

TOPSIS-RAD: Ranking According to Desires

Traditional TOPSIS derives its reference points -- the Positive Ideal Solution ($PIS$) and Negative Ideal Solution ($NIS$) -- from the obse…

2026-06-08 13:00 JST | arXiv cs.AI | Agents, Research/Papers

DuMate-DeepResearch: An Auditable Multi-Agent System with Recursive Search and Rubric-Grounded Reasoning

Deep Research (DR) has emerged as a new agentic paradigm to tackle complex, open-ended research tasks, demanding systems that can iterative…

2026-06-08 13:00 JST | arXiv cs.AI | Agents, Business/Funding

Off-Policy Evaluation with Strategic Agents via Local Disclosure

We study off-policy evaluation (OPE) under strategic behavior where decision subjects (or agents) respond to a decision maker's policy by s…

2026-06-08 13:00 JST | arXiv cs.AI | LLM/Generative AI

Online Pandora's Box for Contextual LLM Cascading

Motivated by Large Language Model (LLM) cascading, we propose an online contextual Pandora's Box model for adaptively querying and selectin…

2026-06-08 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents, Research/Papers

Act As a Real Researcher: A Suite of Benchmarks Evaluating Frontier LLMs and Agentic Harnesses in Research Lifecycle

As foundation models advance and agent scaffolding becomes increasingly sophisticated, agents have demonstrated remarkable proficiency in c…

2026-06-08 13:00 JST | arXiv cs.AI | Agents

How AI Agents Reshape Knowledge Work: Autonomy, Efficiency, and Scope

Frontier AI systems are bridging the gap between intelligence and utility by shifting from conversational assistants to autonomous agents t…

2026-06-08 13:00 JST | arXiv cs.AI | LLM/Generative AI

Zero-Shot Embedding Drift Detection: A Lightweight Defense Against Prompt Injections in LLMs

Prompt injection attacks have become an increasing vulnerability for LLM applications, where adversarial prompts exploit indirect input cha…

2026-06-08 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

When Does Multi-Agent Collaboration Help? An Entropy Perspective

Multi-agent systems (MAS) have emerged as a prominent paradigm for leveraging large language models (LLMs) to tackle complex tasks. However…

2026-06-08 13:00 JST | arXiv cs.AI | Research/Papers

Trading Engagement for Sustainability: Carbon-Aware Re-ranking for E-commerce Recommendations

E-commerce recommender systems strongly influence which products users consider and purchase, yet sustainability signals such as Product Ca…

2026-06-08 13:00 JST | arXiv cs.AI | Agents

Autonomous heterogeneous catalyst discovery with a self-evolving multi-agent digital twin

Theoretical heterogeneous catalysis promises rapid catalyst discovery, yet computational and machine-learning predictions often deviate fro…

2026-06-08 13:00 JST | arXiv cs.AI | LLM/Generative AI

Human Adults and LLMs as Scientists: Who Benefits from Active Exploration?

A long-standing finding in the causal learning literature is that adults struggle to identify conjunctive causal rules, where an effect req…

2026-06-08 13:00 JST | arXiv cs.AI | Image/Video Generation

A Geometric Gaussian Mixture Representation of Plane Curves

We introduce a user defined probabilistic polygonal representation for plane curves. Given a curve, we select vertices on the curve and con…

2026-06-08 13:00 JST | arXiv cs.AI | Research/Papers

Which Anatomy Matters Under Limited Labels? A Data-Efficient Anatomy-Aware Benchmark for Cardiac Pathology Prediction

Numerous medical imaging problems must be solved under limited labels and constrained compute, yet it remains unclear whether performance g…

2026-06-08 13:00 JST | arXiv cs.AI | Research/Papers

FP8 is All You Need (Part 1): Debunking Hardware FP64 as the HPC Holy Grail

Conventional HPC dogma holds that native hardware FP64 silicon is the irreducible foundation of scientific computing -- the "holy grail" of…

2026-06-08 13:00 JST | arXiv cs.AI | Research/Papers

DxPTA: An Architecture Design Space Exploration with Optical Dataflow-guided Strategy for HW/SW Co-Design of Photonic Transformer Accelerators

Transformer-based networks have emerged as prominent AI models with state-of-the-art performance, which potentially pave the way toward art…

2026-06-08 13:00 JST | arXiv cs.AI | Research/Papers

P-Cast Precision in FP8 Attention: Sink-Induced Collapse and the Optimality of S=2^8

FP8 (E4M3) acceleration for attention computation offers significant throughput gains, but the 3-bit mantissa introduces precision challeng…

2026-06-08 13:00 JST | arXiv cs.AI | LLM/Generative AI, Agents

Agentic Large Language Models for Automated Structural Analysis of 3D Frame Systems

Large language models (LLMs) have emerged as powerful foundation models with strong reasoning capabilities across domains. Beyond reactive…

2026-06-08 13:00 JST | arXiv cs.AI | Research/Papers

Attention Consistent Longitudinal Medical Visual Question Answering Guided by Vision Foundation Models

Longitudinal medical visual question answering (VQA) requires reasoning about anatomical differences between an image of a current time poi…

2026-06-08 13:00 JST | arXiv cs.AI | Image/Video Generation

Attention-Guided Autoencoder Fusion for Insulator Defect Detection Using UAV Transmission-Line Imaging

Automated defect detection in high-voltage transmission-line insulators remains challenging due to severe class imbalance, large scale vari…

2026-06-08 13:00 JST | arXiv cs.AI | Image/Video Generation, Research/Papers

Synthetic Benchmarks Overstate Forward-Forward Scaling: Real-Data Limits of Layer-Local Training

Forward-Forward (FF) learning [Hinton, 2022] replaces backpropagation with strictly layer-local goodness updates. Recent FF-CNN work has na…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Coordinated optimization of departure sequencing and section-track allocation in railway short-term concentrated departure scenarios based on qubo and hybrid quantum algorithms

This study examines the coordinated optimization of departure sequencing and section-track allocation in railway short-term concentrated de…

2026-06-08 13:00 JSTarXiv cs.AIエージェント

Queen-Bee Agents: A BeeSpec-Centered Architecture for Governed Enterprise MCP Orchestration

Enterprise agent systems increasingly need to connect large language models to private tools, internal knowledge, and Model Context Protoco…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

FAIR-Calib: Frontier-Aware Instability-Reweighted Calibration for Post-Training Quantization of Diffusion Large Language Models

Diffusion Large Language Models (dLLMs) refine tokens iteratively but commit them irreversibly, leading to a "stability lag" where early de…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Geometric Second-Order Feature Correlation Learning for Self-Supervised Speech Emotion Recognition

Self-supervised learning (SSL) yields powerful, context-rich representations for speech emotion recognition (SER), yet aggregating these re…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Multi-Scale Feature Attention Network for Polymer Classification using THz Dual-Comb Spectroscopy

Reliable polymer identification is essential for ensuring the quality and safety of recycled plastics, yet conventional sorting and spectro…

2026-06-08 13:00 JSTarXiv cs.AIエージェント

IRAF: Interference-Resilient Adaptive Fusion for Noise-Robust End-to-End Full-Duplex Spoken Dialogue Systems

Full-duplex spoken dialogue models allow voice agents to listen and speak concurrently, enabling natural interaction with real-time overlap…

2026-06-08 13:00 JSTarXiv cs.AIエージェント研究/論文

MacArena: Benchmarking Computer Use Agents on an Online macOS Environment

Computer-use agents (CUAs) operate graphical user interfaces (GUIs) through vision and control primitives, and their capabilities have adva…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

AI-Driven Test Case Generation from Natural Language Requirements: A Survey of Techniques and Research Gaps

Software testing is critical for verifying that systems meet specified requirements, yet remains among the most time-consuming and expensiv…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

WAV: Multi-Resolution Block Residual Routing for Deep Decoder-Only Transformers

Residual connections are central to training deep Transformers, but standard PreNorm residual streams aggregate sublayer updates with fixed…

2026-06-08 13:00 JSTarXiv cs.AIエージェント

NTILC: Neural Tool Invocation via Learned Compression

Agentic tool-calling language models depend on large registries of callable APIs, functions, and local actions. Placing full tool specifica…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

MalTree: Tracing Malware Evolution from Embeddings at Scale

Malware detection remains largely reactive: machine learning models trained on known samples degrade as threats evolve. Understanding evolu…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Generative Models Erode Human Temporal Learning Through Market Selection

We argue that modern generative models create structural risks for knowledge and cultural production at current, sub-AGI capability levels.…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

Direct 3D-Aware Object Insertion via Decomposed Visual Proxies

Object insertion aims to seamlessly composite a reference object into a specified region of a background image. Recent diffusion-based meth…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達

Re-Centering Humans in LLM Personalization

Despite growing interest, most evaluations of large language models' (LLMs') personalization abilities have relied on synthetic data. It re…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

FIGMA: Towards FIne-Grained Music retrievAl

Retrieving music using natural language descriptions has improved with contrastive audio-text models such as CLAP, but current systems rema…

2026-06-08 13:00 JSTarXiv cs.AIロボティクス

ChronoForest: Closed-Loop Multi-Tree Diffusion Planning for Efficient Bridge Search and Route Composition

How can we plan long-horizon routes that reach designated goals, visit required waypoints, and remain short when only short-horizon offline…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成ロボティクス

What Matters When Cotraining Robot Manipulation Policies on Everyday Human Videos?

Human video datasets used for cotraining robot manipulation policies largely consist of curated demonstrations where motions are orchestrat…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

How Language Models Fail: Token-Level Signatures of Committed and Persistent Reasoning Failures

Failures in language model reasoning emerge through distinct processes that leave identifiable signatures in the reasoning trace. We charac…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIエージェント

CAF-Gen: A Multi-Agent System for Enriching Argumentation Structures

Formalizing complex reasoning from natural text is one of the central challenges in computational linguistics. It requires systems to under…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

Inside the Visual Mind: Neuroscience-Motivated Concept Circuits for Interpreting and Steering Vision Transformers

Despite high accuracy, Vision Transformer (ViT) predictions can be driven by spurious cues, raising the need to understand their inner work…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

HKJudge: A Legal Discourse-Annotated Corpus for Interpreting What Courts Find, How They Reason, and What They Rule

Court judgments are central to legal practice and jurisprudence, yet discourse analysis of Hong Kong judgments has received limited attenti…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

The Geography of Algorithmic Judgment: LLM Intermediaries, Place Identity, and Racial Steering in Housing Search

Large language models (LLMs) are rapidly assuming an intermediary role in housing search through the integration of listing platforms withi…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成研究/論文

MMBU: A Massive Multi-modal Biomedical Understanding Benchmark to Probe the Perception Capabilities of Vision-Language Models

Vision and language models (VLMs) hold immense promise to transform biomedical imaging workflows, from detecting lesions in chest X-rays to…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Data-Efficient Autoregressive-to-Diffusion Language Models via On-Policy Distillation

We study the transformation of autoregressive models (ARLMs) into diffusion language models (DLMs). Rather than pretraining from scratch, p…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Does Topic Sentiment Cause Perceived Ideology? Comparing Human and LLM Annotations in Political News Articles

We ask whether topic sentiment has a causal effect on perceived political ideology, and whether the answer depends on who assigns the ideol…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

ShallowBench: Benchmarking Generative Drug Design Models on Shallow-Pocket Targets

While generative AI models have demonstrated remarkable success in structure-based drug design, they predominantly rely on deep binding poc…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

MSAIC-Net: A Multi-Scale Attention and Imbalance-Aware Contrastive Network for ECG-Based Myocardial Substrate Abnormality Detection

Myocardial substrate abnormalities, such as myocardial scar and myocardial infarction (MI), are associated with adverse cardiovascular outc…

2026-06-08 13:00 JSTarXiv cs.AIロボティクス

SCOUT: Semantic scene COverage via Uncertainty-guided Traversal

Robots that operate over extended periods should not merely visit space; they should progressively understand it. Yet most 3D scene graph p…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Multilingual Multi-Speaker Unit Vocoders: A Systematic Analysis of Discrete Speech Representations

Discrete speech units obtained via k-means clustering of self-supervised embeddings entangle phonetic, speaker, and language information, c…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

HybridCodec: Fast Dual-Stream, Semantically Enhanced Neural Audio Codec

The popularity of neural audio codecs as speech tokenizers has surged with the advent of Multimodal Large Language Models. New codec archit…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Evidence Graph Consistency in Retrieval-Augmented Generation: A Model-Dependent Analysis of Hallucination Detection

Retrieval-Augmented Generation (RAG) reduces but does not eliminate hallucination in large language models. Existing detection methods rely…

2026-06-08 13:00 JSTarXiv cs.AIロボティクス

AxisGuide: Grounding Robot Action Coordinate System in RGB Observations for Robust Visuomotor Manipulation

Visuomotor manipulation policies trained via large-scale behavior cloning have achieved strong semantic scene understanding, yet often fail…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Optimal Rates for Generalization of Gradient Descent Methods with Deep Neural Networks

Recent progress has been made in understanding the statistical generalization performance of gradient descent methods for overparameterized…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Generalization in Deep Neural Networks: Minimax Rates for Gradient Methods

Understanding the generalization performance of over-parameterized neural networks has become a central topic in deep learning theory. Whil…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Mind the Gap: Bridging Behavioral Silos with LLMs in Multi-Vertical Recommendations

In multi-vertical e-commerce platforms like DoorDash, relatively newer product verticals such as grocery and retail present a significant o…

2026-06-08 13:00 JSTarXiv cs.AIエージェント研究/論文

What Your Posts Reveal: A Benchmark and Agentic Framework for User-Level Privacy Leakage on Social Media

Public social media posts can reveal private information through weak cues scattered across text, images, or metadata. Such leakage is ofte…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Exploring Reinforcement Learning for Fluid Transitions Between Clinical Mental Healthcare and Everyday Wellness Support

Mental health struggles wax and wane, yet clinical and wellness interventions typically operate separately, causing frequent breakdowns at…

2026-06-08 13:00 JSTarXiv cs.AIロボティクス

Lane Change Trajectory Planning for Personalized Driving Comfort and Mobility Efficiency

Lane changing entails simultaneous longitudinal and lateral motions that affect driving comfort and mobility efficiency. Because these moti…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

Breaking the Lock-in: Diversifying Text-to-Image Generation via Representation Modulation

Recent text-to-image models built on large-scale Transformer backbones and flow-based objectives deliver strong text-image alignment and hi…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIエージェント

SCALE: Scalable Cross-Attention Learning with Extrapolation for Agentic Workflow Scheduling

Agentic Large Language Model (LLM) systems decompose complex tasks into workflow Directed Acyclic Graphs (DAGs) whose primitives must be sc…

2026-06-08 13:00 JSTarXiv cs.AIエージェント

PandaAI: A Practical Agent CQ2 for Neuro-symbolic Data Analysis And Integrated Decision-Making in Quantitative Finance

While deep learning has excelled in various domains, its application to sequential decision-making in finance remains challenging due to th…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Progress-SQL: Improving Reinforcement Learning for Text-to-SQL via Progressive Rewards

Reinforcement learning has recently shown promise in improving large language models for Text-to-SQL generation, yet existing methods typic…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Hearing the Unspoken: Language Model Priors for Acoustic Adversarial Attacks

Automatic Speech Recognition (ASR) systems operating in real-time settings must process acoustic input under strict temporal constraints, w…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成エージェントロボティクス

Think Like a Pilot: Fine-Grained Long-Horizon UAV Navigation

Language-guided UAV agents must execute long-horizon semantic instructions while producing smooth, physically feasible continuous flight co…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIエージェント研究/論文

LLM Agent-Assisted Reverse Engineering with Quantitative Readability Metrics

Automatic decompilers produce functionally correct but often unreadable C code. This paper addresses one stage of the reverse engineering w…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIハードウェア/半導体

Characterize Then Distill: Mechanistic Reasoning in Large Output Spaces

Modern reasoning models offer surprisingly strong zero-shot performance on challenging multi-label tasks that require selecting a small set…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

MotionEnhancer: Leveraging Video Diffusion for Motion-Enhanced Vision-Language Models

The new era has witnessed a remarkable capability to extend Vision-Language Models (VLMs) for tackling tasks of video understanding. While…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Modeling Nonlinear Feature Interactions with Product-Unit Residual Networks

Understanding nonlinear feature interactions is crucial in science and engineering, yet standard multilayer perceptrons (MLPs) often captur…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成ロボティクス

EgoPressDiff: Multimodal Video Diffusion for Egocentric UV-Domain Hand-Pressure Estimation

Estimating hand-surface contact pressure from an egocentric view is crucial for AR/VR devices, robotic imitation, and ergonomic analysis. E…

2026-06-08 13:00 JSTarXiv cs.AIロボティクス

Neuro-Symbolic Learning for Long-Horizon Task Planning Under Complex Logical Constraints

Task planning often suffers from severe efficiency bottlenecks when robots must reason over long-horizon action sequences under complex log…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

FreeAnimate: Training-Free Human Image Animation with Preview-Guided Denoising

Human Image Animation has seen significant advancements, primarily driven by diffusion models. However, existing methods typically demand s…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

Beyond Skeletons: Learning Animation Directly from Driving Videos with Same2X Training Strategy

Human image animation aims to generate a video from a static reference image, guided by pose information extracted from a driving video. Ex…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

EASE-TTT: Evidence-Aligned Selective Test-Time Training for Long-Context Question Answering

Long-context question answering (QA) remains challenging for smaller language models even when answer-bearing evidence is already present i…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

SpectCount: Spectrotemporal Counting via Synthetic Signals Improves Large Audio Language Models

Large audio language models (LALMs) extend large language models with an audio encoder and large-scale audio data. However, the scarcity of…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

ThinkBooster: A Unified Framework for Seamless Test-Time Scaling of LLM Reasoning

Test-time compute (TTC) scaling has emerged as a powerful paradigm for improving large language model (LLM) reasoning by allocating additio…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

The Fine-Tuning Trap: Evaluating Negative Transfer and the Role of PEFT in Sub-1B Mathematical Reasoning

Deploying Small Language Models (SLMs) on edge devices requires efficient fine-tuning strategies that adapt models to new tasks without deg…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI規制/政策研究/論文

Didact: A Cross-Domain Capability Discovery System for Defence

Policymakers in defence and defence-aligned sectors must monitor rapidly evolving research alongside sector priorities relevant to operatio…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

SS-TPT: Stability and Suitability-Guided Test-Time Prompt Tuning for Adversarially Robust Vision-Language Models

Vision-language models (VLMs) such as CLIP achieve strong zero-shot recognition but remain highly fragile under adversarial perturbations.…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Auditing Training Data in Domain-adapted LLMs: LoRA-MINT

We present LoRA-MINT, a new methodology for Membership Inference Test (MINT) applied to recent Large Language Models (LLMs) fine-tuned for…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

When is 3D Worth It? A Resource-Performance Frontier for CNNs and Transformers in Lung CT

Three-dimensional models are widely assumed preferable for volumetric medical imaging, yet their practical value depends on whether perform…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達研究/論文

OpenHalDet: A Unified Benchmark for Hallucination Detection across Diverse Generation Scenarios

Hallucination detection is essential for the reliable deployment of large language models (LLMs). However, existing evaluations face two co…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

DaX: Learning General Pathology Representations Across Scales

Computational pathology requires visual representations that transfer across diverse clinical endpoints and remain robust to variation in m…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

Don't Pause: Streaming Video-Language Synchrony for Online Video Understanding

Online Video Large Language Models (Video-LLMs) have advanced toward seamless human-AI interaction through frame-by-frame processing and pr…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

DataEvolver: Automatic Data Preparation for Large Language Models through Multi-Level Self-Evolving

High-quality training data is essential to large language models (LLMs) and typically requires extensive and costly manual curation. Existi…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

A Geometric View for Understanding Concept Learning and Neuron Interpretation in Sparse Autoencoders

We propose a unified mathematical framework for a geometric understanding of concept learning and neuron interpretation in sparse autoencod…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Towards Unified Song Generation and Singing Voice Conversion with Accompaniment Co-Generation

While song generation and singing voice conversion (SVC) have evolved significantly, they have long been developed in isolation: the former lac…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Phonetic Error Analysis of Raw Waveform Acoustic Models

We analyse error patterns of raw waveform acoustic models on TIMIT phone recognition beyond the overall phone error rate (PER). PER is deco…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成研究/論文

Never Seen Before: Benchmarking Genuine Zero-Shot Composed Image Retrieval with Consistent Video-Sourced Datasets

Zero-Shot Composed Image Retrieval (ZS-CIR) aims to retrieve a target image based on a query composed of a reference image and a relative c…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

STREAM: Stochastic Riemannian Flow Matching with Anisotropic Decoder for Digital Histopathology Image Generation

Synthetic histopathology image generation addresses critical challenges in computational pathology, including patient privacy and the growi…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIエージェント

TRACE: Trajectory Reasoning through Adaptive Cross-Step Evidence Aggregation for LLM Agents

Autonomous LLM agents can pursue hidden malicious objectives through sequences of individually benign actions, making sabotage difficult to…

2026-06-08 13:00 JSTarXiv cs.AIエージェント研究/論文

SlimSearcher: Training Efficiency-Aware Web Agents via Adaptive Reward Gating

Deep research agents have demonstrated remarkable capabilities in complex information-seeking tasks, yet this power comes at a steep comput…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

dots.tts Technical Report

We present dots.tts, a 2B-parameter continuous autoregressive text-to-speech (TTS) foundation model that models speech in a continuous late…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

On the Geometry of On-Policy Distillation

On-policy distillation (OPD) is increasingly used to improve large language model reasoning, but its training dynamics remain poorly unders…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

MetaConfigurator: AI-Assisted RDF Authoring from JSON Data

Scientific workflows increasingly generate structured JSON data that is easy to exchange but difficult to interpret consistently across sys…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

GP-Adapter: Gaussian Process CLIP-Adapter for Few-Shot Out-of-Distribution Detection

We propose GP-Adapter, a training-free framework that augments CLIP (Contrastive Language-Image Pre-training) with Gaussian Process (GP) un…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

DIFFRACT: Neuralized Utility Maximization for Wireless Networks by Differentiable Programming

Next-generation wireless networks, including satellite-to-Open RAN systems, demand agile and intelligent resource management capable of han…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

OffQ: Taming Structured Outliers in LLM Quantization by Offsetting

Low-bit quantization has been widely adopted to accelerate the inference of large language models (LLMs) by significantly reducing computat…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成研究/論文

Native3D: End-to-End 3D Scene Generation via Unified Mesh-Texture Modeling and Semantic Alignment

This paper presents Native3D, the first end-to-end 3D scene generation framework that completely bypasses 2D intermediate representations.…

2026-06-08 13:00 JSTarXiv cs.AIエージェント

The Three-Ring Architecture: Governing Agents in the Era of On-Platform Organisations

The current phase of enterprise AI deployment faces a structural failure: organisations are acquiring agentic capability without the infras…

2026-06-08 13:00 JSTarXiv cs.AIビジネス/資金調達研究/論文

REMEDI: A Benchmark for Retention and Unlearning Evaluation in Multi-label Clinical Disease Inference

Language models trained for clinical disease inference are trained on patient data, which may include sensitive and private information, an…

2026-06-08 13:00 JSTarXiv cs.AIエージェント

From Privacy to Workflow Integrity: Communication-Graph Metadata in Autonomous Agent Interoperability

Agent-interoperability protocols such as A2A and MCP standardize what agents say to one another, but assume address-based transport over HT…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達研究/論文

UrduMMLU: A Massive Multitask Benchmark for Urdu Language Understanding

Meaningful multilingual evaluation must test models in the target language and educational context. Urdu, spoken by more than 230 million p…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

Textual Supervision Enhances Geospatial Representations in Vision-Language Models

Geospatial understanding is a critical yet underexplored dimension in the development of machine learning systems for tasks such as image g…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

RETROSPECT: RETROsynthesis via Sequential Prediction, and Chemically Transformed-ranking

Single-step retrosynthesis needs both accurate first-ranked suggestions and candidate lists that are rich enough for downstream selection.…

2026-06-08 13:00 JSTarXiv cs.AIエージェントロボティクス

An Abstract Architecture for Explainable Autonomy in Hazardous Environments

Autonomous robotic systems are being proposed for use in hazardous environments, often to reduce the risks to human workers. In the immedia…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

DualGate-Net: A Prior-Gated Dual-Encoder Framework for Histopathology Cell Detection

Cell detection in histopathology images strongly depends on surrounding tissue context, where visually similar cells may belong to differen…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

DEFINED: A Data-Efficient Computational Framework for Fine-Grained Creativity Assessment in Debate Scenarios

Human creativity has emerged as a critical competency in the era of large language models. Assessing creativity in complex, open-ended envi…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

When Large Language Models Fail in Healthcare: Evaluating Sensitivity to Prompt Variations

Large Language Models (LLMs) are increasingly used in healthcare for tasks such as clinical question answering, diagnosis support, and repo…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成エージェントロボティクス

Beyond Waypoints: A Trajectory-Centric Waypointing Paradigm for Vision-Language Navigation

Vision-Language Navigation in Continuous Environments (VLN-CE) requires agents to follow natural-language instructions while navigating in…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

AI Sovereignty: A Qualitative Model of Strategic Competition as AI Becomes an Instrument of National Power

AI sovereignty is the extent to which a nation independently controls its artificial intelligence (AI) technologies. The race toward ever-m…

2026-06-08 13:00 JSTarXiv cs.AI規制/政策

Where Rectified Flows Leak: Characterising Membership Signals Along the Interpolation Path

Understanding what generative models retain from training data remains challenging, with implications for copyright and privacy. Beyond ver…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Acoustic Cue Alignment in Audio Language Models for Speech Emotion Recognition

Instruction-following audio language models (ALMs) can be augmented with explicit acoustic cues, yet it remains unclear whether such cues a…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

CULTURESCORE: Evaluating Cultural Faithfulness in Video Generation Models

As video generation models like Veo 3.1 and LTX-2 advance, their ability to accurately represent diverse global cultures remains a critical…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

SV-Detect: AI-generated Text Detection with Steering Vectors

Detecting machine-generated text is especially difficult under distribution shift, such as transfer across domains, source models, and edit…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Hierarchical Certified Semantic Commitment for Byzantine-Resilient LLM-Agent Collaboration

Byzantine collaboration among large-language-model agents requires a finality-control primitive: given delivered stochastic, structured nat…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

A Temporal Spatial Minimax Rate for Smoothly-Varying Distributions in Wasserstein Space

We study the minimax rate of estimating a future value $\mu_{t_n+h}$ of a curve $t\mapsto\mu_t$ in the $2$-Wasserstein space $\mathcal{P}_2…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

SleepExplain: Explainable Non-Rapid Eye Movement and Rapid Eye Movement Sleep Stage Classification from EEG Signal

Classification of sleep stages is one of the most important diagnostic approaches for a variety of sleep-related disorders. Electroencephal…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

A robust PPG foundation model using multimodal physiological supervision

Photoplethysmography (PPG), a non-invasive measure of changes in blood volume, is widely used in both wearable devices and clinical setting…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成研究/論文

Mitosis Detection in the Wild: Multi-Tumor and Context-Aware Generalization in the MIDOG 2025 Challenge

Automated mitosis detection is a well-established task in computational pathology. While previous benchmarks focused on scanner-induced dom…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIエージェントビジネス/資金調達

Do Coding Agents Deceive Us? Detecting and Preventing Cheating via Capped Evaluation with Randomized Tests

A growing failure mode in agent evaluation and training is that models can achieve high evaluation scores by exploiting shortcuts instead o…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

Impact of Synthetic Lesional MR Images in Automated Focal Cortical Dysplasia Detection in Low-Data Scenarios

Background and Purpose: Automated detection of focal cortical dysplasia (FCD) requires large volumes of voxelwise lesion-delineated MRI dat…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達

A Comprehensive Anatomy of Human and DeepSeek-R1 LLM Mathematical Reasoning

The emergence of "Aha moments" in large language models, particularly DeepSeek-R1-0120, has raised the question of whether these systems ge…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Socratic-SWE: Self-Evolving Coding Agents via Trace-Derived Agent Skills

LLM-driven software engineering agents have become a central testbed for real-world language-model capability, yet their training remains l…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

The Masked Advantage: Uncovering Local-Language Access to Cultural Knowledge in LLMs

Large language models are increasingly used to answer culturally grounded questions across languages, yet it remains unclear whether local…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成研究/論文

Watch, Remember, Reason: Human-View Video Understanding with MLLMs

Video understanding is being rapidly transformed by multimodal large language models (MLLMs), as research moves from short clips to long, m…

2026-06-08 13:00 JSTarXiv cs.AIエージェントロボティクス

Re-imagining ISO 26262 in the Age of Autonomous Vehicles: Enhancing Controllability through Transferability and Predictability

The ISO 26262 standard defines functional safety for road vehicles through risk assessments based on Severity, Exposure, and Controllabilit…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

TEVI: Text-Conditioned Editing of Visual Representations via Sparse Autoencoders for Improved Vision-Language Alignment

Vision-language models such as CLIP are highly useful for diverse tasks due to their shared image-text embedding space. Despite this, the i…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

PaperFlow: Profiling, Recommending, and Adapting Across Daily Paper Streams

Scientific paper recommendation is typically evaluated as static ranking over a fixed candidate set, yet real scientific reading unfolds as…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成エージェントロボティクス

Planning-aligned Token Compression for Long-Context Autonomous Driving

Monolithic vision-action models represent an emerging paradigm in autonomous driving. However, this architecture produces token sequences t…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Whisper Hallucination Detection and Mitigation via Hidden Representation Steering and Sparse AutoEncoders

Whisper, a widely adopted ASR model, is known to suffer from hallucinations - coherent transcriptions generated for non-speech audio entire…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Graph Neural Network leveraging Higher-order Class Label Connectivity for Heterophilous Graphs

Node classification in graph neural networks (GNNs) has been widely applied in various fields of graph analysis. GNNs achieve high-accuracy…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Supervision versus Demonstration-Based In-Context Learning for Multiword Expression Classification

Turkish idiomatic light verb constructions (LVCs) are challenging for multiword expression processing because they often share the same sur…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Twelve quick tips for designing AI-driven HPC workflows

High-performance computing (HPC) clusters remain the backbone of large-scale scientific computation, traditionally executing deterministic,…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Sparse Subspace-to-Expert Sharing for Task-Agnostic Continual Learning

Continual learning in Large Language Models (LLMs) is hindered by the plasticity-stability dilemma, where acquiring new capabilities often…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成エージェント

MemDreamer: Decoupling Perception and Reasoning for Long Video Understanding via Hierarchical Graph Memory and Agentic Retrieval Mechanism

Current Vision-Language Models struggle with hours-long videos because processing full-length visual sequences induces prohibitive token ex…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

How reliable are LLMs when it comes to playing dice?

We investigate the probabilistic reasoning capabilities of large language models through a controlled benchmarking study on discrete probab…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

LLM-Guided Search for Deletion-Correcting Codes

Finding deletion-correcting codes of maximum size has been an open problem for over 70 years, even for a single deletion. We adapt FunSearc…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Finding the Minimal Parameter Budget for Implicit Reasoning: A Data Complexity Driven Scaling Law for Language Models

Reasoning is a core capability of language models (LMs), yet it remains unclear how much model capacity is necessary to support reasoning d…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

ChemQuests: A Curated Chemistry Question-Answer Database Extracted from ChemRxiv papers

The rapid expansion of chemistry literature poses significant challenges for researchers seeking to efficiently access domain-specific know…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIエージェント

EVA: Evolving Semantic Adversaries for Red-Teaming GUI Agents Against Environmental Injection Attacks

Graphical User Interface (GUI) agents powered by Multimodal Large Language Models (MLLMs) are increasingly deployed yet vulnerable to Envir…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Exploring Flow-Lenia Universes with a Curiosity-driven AI Scientist: Discovering Diverse Ecosystem Dynamics

We present a curiosity-driven AI scientist method for discovering system-level dynamics in Flow-Lenia, a continuous cellular automaton (CA)…

2026-06-08 13:00 JSTarXiv cs.AIエージェント

Model Context Protocols in Adaptive Transport Systems: A Survey

The rapid expansion of interconnected devices, autonomous systems, and AI applications has created severe fragmentation in adaptive transpo…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Understanding Generative Recommendation with Semantic IDs from a Model-scaling View

Recent advancements in generative models have allowed the emergence of a promising paradigm for recommender systems (RS), known as Generati…

2026-06-08 13:00 JSTarXiv cs.AIエージェント研究/論文

Small Language Model Agents Enable Efficient and High-Quality Knowledge Mining

At the core of Deep Research is knowledge mining, the task of extracting structured information from massive unstructured text in response…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

MHA-RAG: Improving Efficiency, Accuracy, and Consistency by Encoding Exemplars as Soft Prompts

Adapting Foundation Models to new domains with limited training data is challenging and computationally expensive. While prior work has dem…

2026-06-08 13:00 JSTarXiv cs.AIエージェント

Agentic Physical AI toward a Domain-Specific Foundation Model for Energy Systems: A Case Study on Nuclear Reactor Control

The prevailing paradigm in AI for physical systems, scaling general-purpose foundation models toward universal multimodal reasoning, confro…

2026-06-08 13:00 JSTarXiv cs.AIロボティクス

CHDP: Cooperative Hybrid Diffusion Policies for Reinforcement Learning in Parameterized Action Space

Hybrid action space, which combines discrete choices and continuous parameters, is prevalent in domains such as robot control and game AI.…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

TSAQA: Time Series Analysis Question And Answering Benchmark

Time series data are integral to critical applications across domains such as finance, healthcare, transportation, and environmental scienc…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成エージェント

Dual Latent Memory for Visual Multi-agent System

While Visual Multi-Agent Systems (VMAS) promise to enhance comprehensive abilities through inter-agent collaboration, empirical evidence re…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

MACD: Model-Aware Contrastive Decoding via Counterfactual Data

Video language models (Video-LLMs) are prone to hallucinations, generating plausible but ungrounded content when visual evidence is weak, a…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

VALUEFLOW: Toward Pluralistic and Steerable Value-based Alignment in Large Language Models

Aligning Large Language Models (LLMs) with the diverse spectrum of human values remains a central challenge: preference-based methods often…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達

LLM-Augmented Digital Twin for Policy Evaluation in Short-Video Platforms

Short-video platforms are closed-loop, human-in-the-loop ecosystems where platform policy, creator incentives, and user behavior co-evolve.…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling

Electroencephalography (EEG) enables non-invasive monitoring of brain activity across clinical and neurotechnology applications, yet buildi…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

D5P4: Partition Determinantal Point Process for Diversity in Parallel Discrete Diffusion Decoding

Discrete diffusion models are promising alternatives to autoregressive approaches for text generation, yet their decoding methods remain un…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Multi-Agent Reasoning with Consistency Verification Improves Uncertainty Calibration in Medical MCQA

Miscalibrated confidence scores are a practical obstacle to deploying AI in clinical settings. A model that is always overconfident offers…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Design Once, Deploy at Scale: Template-Driven ML Development for Large Model Ecosystems

Modern computational advertising platforms typically rely on recommendation systems to predict user responses, such as click-through rates,…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

Latent space is rapidly emerging as a native substrate for language-based models. While modern systems are still commonly understood throug…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Don't Make the LLM Read the Graph: Make the Graph Think

We investigate whether explicit belief graphs improve LLM performance in cooperative multi-agent reasoning. Through 3,000+ controlled trial…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIエージェント

To Call or Not to Call: A Framework to Assess and Optimize LLM Tool Calling

Agentic AI architectures augment LLMs with external tools, unlocking strong capabilities. However, tool use is not always beneficial; some…

2026-06-08 13:00 JSTarXiv cs.AIエージェント

Beyond the Black Box: Interpretability of Agentic AI Tool Use

AI agents are promising for high-stakes enterprise workflows, but dependable deployment remains limited because tool-use failures are diffi…

2026-06-08 13:00 JSTarXiv cs.AIエージェント

Robust Instruction Compliance in Cooperative Multi-Agent Reinforcement Learning

Multi-agent reinforcement learning (MARL) in real-world use cases may need to adapt to external natural language instructions that interrup…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Insights Generator: Systematic Corpus-Level Trace Diagnostics for LLM Agents

Diagnosing failures in LLM agents remains largely manual. Practitioners inspect a small subset of execution traces, form ad-hoc hypotheses,…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Latent-space Attacks for Refusal Evasion in Language Models

Safety-aligned language models are trained to refuse harmful requests, yet refusal behavior can be suppressed by steering their internal re…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

CORE: Contrastive Reflection Enables Rapid Improvements in Reasoning

Language models can use verifiable rewards to improve at a wide variety of reasoning tasks. However, both parametric (e.g. RLVR) and non-pa…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

From "Weak" Signals to Strong Models: Preference Delta Aggregation with LoRA Merging

Training strong large language models (LLMs) requires high-quality supervision, which is often scarce. Recent work shows that paired prefer…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Diagnosing LLM Arbitration Behavior over Pre-evidence Epistemic States in RAG-based Fact-Checking

In RAG-based fact-checking, LLMs are increasingly used as verifiers to check given claims against retrieved evidence. Their parametric know…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

A Negative Result on Cross-Model Activation Transfer in a Pythia Multi-Hop Setting

Recent work shows that language models can transmit behavioural traits through hidden signals in generated data during training. We ask whe…

2026-06-08 13:00 JSTarXiv cs.AIエージェント研究/論文

SentinelBench: A Benchmark for Long-Running Monitoring Agents

AI agents are increasingly asked to carry out work that spans minutes, hours, or longer. Yet the default model of agent behavior is continu…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIハードウェア/半導体

Beyond Output Matching: Preserving Internal Geometry in NVFP4 LLM Distillation

Demand for low-precision inference, including NVFP4-based approaches, has grown as large language models are increasingly deployed in laten…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIエージェント研究/論文

SubtleMemory: A Benchmark for Fine-Grained Relational Memory Discrimination in Long-Horizon AI Agents

Persistent AI assistants, such as OpenClaw, accumulate large collections of related memories over long-term interactions. As these memories…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Should You Use Your Large Language Model to Explore or Exploit?

We evaluate the ability of the current generation of large language models (LLMs) to help a decision-making agent facing an exploration-exp…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Towards Efficient and Exact Forgetting Services in Pre-Trained-Model-based Continual Learning

In Continual Learning (CL), using a Pre-Trained Model (PTM) as the feature extractor has become a popular practice. Accompanied by analytic…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Bounded-Abstention Pairwise Learning to Rank

Ranking systems influence decision-making in high-stakes domains like health, education, and employment, where they can have substantial ec…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

MoDA: Modulation Adapter for Fine-Grained Visual Grounding in Instructional MLLMs

Multimodal Large Language Models (MLLMs) have achieved remarkable success in instruction-following tasks by integrating pretrained visual e…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

CoQuIR: A Comprehensive Benchmark for Code Quality-Aware Information Retrieval

Code retrieval is essential in modern software development, as it boosts code reuse and accelerates debugging. However, current benchmarks…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

Telling stories, making Hanzi: AI-assisted co-creation with elderly migrants in urban China

This paper explores how older migrants in urban China can record stories that everyday language and design often miss. We ran two co-creati…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Database Normalization via Dual-LLM Self-Refinement

Database normalization is crucial to preserving data integrity. However, it is time-consuming and error-prone, as it is typically performed…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI研究/論文

Limitations of Normalization in Attention Mechanism

This paper investigates the limitations of the normalization in attention mechanisms. We begin with a theoretical framework that enables th…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

A Mechanism-Coupled Split Window Network for Medium- to High-Resolution Land Surface Temperature Retrieval

Land surface temperature (LST) is a fundamental physical variable in land-atmosphere interactions, surface energy budgets, and climate proc…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Standard vs. Modular Sampling: Best Practices for Reliable LLM Unlearning

A conventional LLM Unlearning setting consists of two subsets, "forget" and "retain", with the objectives of removing the undesired knowled…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

MVCL-DAF++: Enhancing Multimodal Intent Recognition via Prototype-Aware Contrastive Alignment and Coarse-to-Fine Dynamic Attention Fusion

Multimodal intent recognition (MMIR) suffers from weak semantic grounding and poor robustness under noisy or rare-class conditions. We prop…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

Scalable GANs with Transformers

Scalability has driven recent advances in generative modeling, yet its principles remain underexplored for adversarial learning. We investi…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Proxy Reconstruction Pre-training for Ramp Flow Prediction at Highway Interchanges

Interchanges are crucial nodes for vehicle transfers between highways, yet the lack of real-time ramp detectors creates blind spots in traf…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達

SWE-IF: Aligning Code Evaluation with Human Preference

Large Language Models (LLMs) have catalyzed vibe coding, where users leverage LLMs to generate and iteratively refine code through natural…

2026-06-08 13:00 JSTarXiv cs.AIエージェント

Robust Driving Control for Autonomous Vehicles: An Intelligent General-sum Constrained Adversarial Reinforcement Learning Approach

Deep reinforcement learning (DRL) has demonstrated remarkable success in developing autonomous driving policies. However, its vulnerability…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成エージェントロボティクス

MatterDoor: Sampling Zero-shot Spatio-semantic Priors using Generative Models

Autonomous robots often view rooms only partially, through a doorway, where the walls and scene structure hide the geometry and task-releva…

2026-06-08 13:00 JSTarXiv cs.AIエージェント

ReclAIm: A Multi-Agent Framework for Monitoring and Correcting Performance Decline in Medical Imaging AI

Purpose: To develop and evaluate a multi-agent framework (ReclAIm) for automated monitoring, detection, and correction of performance decli…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

LoRA-DA: Data-Aware Initialization for Low-Rank Adaptation via Asymptotic Analysis

LoRA has become a widely adopted method for PEFT, and its initialization methods have attracted increasing attention. However, existing met…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

On the importance of multiple training seeds for evaluating machine unlearning

Machine unlearning aims to remove the influence of certain data points from a trained model without costly retraining. Most practical unlea…

2026-06-08 13:00 JSTarXiv cs.AIエージェント

Towards Iterative End-to-End Software Development: A Feature-Driven Multi-Agent Framework

Recent advances in large language model agents offer the promise of automating end-to-end software development from natural language requir…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Measuring Agents in Production

LLM-based agents already operate in production across many industries, yet we lack an understanding of what technical methods make deployme…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

Calibrating Uncertainty for Zero-Shot Adversarial CLIP

CLIP delivers strong zero-shot classification but remains highly vulnerable to adversarial attacks. Prior adversarial fine-tuning work prim…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

RePo: Language Models with Context Re-Positioning

In-context learning is fundamental to modern Large Language Models (LLMs); however, prevailing architectures impose a rigid and fixed conte…

2026-06-08 13:00 JSTarXiv cs.AIエージェント研究/論文

It's a TRAP! Task-Redirecting Agent Persuasion Benchmark for Web Agents

Web-based agents powered by large language models are increasingly used for tasks such as email management or professional networking. Thei…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Analysing Differences in Persuasive Language in LLM-Generated Text: Uncovering Stereotypical Gender Patterns

Large language models (LLMs) are increasingly used for everyday communication tasks, including drafting interpersonal messages intended to…

2026-06-08 13:00 JSTarXiv cs.AIエージェント研究/論文

Autonomous computational catalysis through an agentic research system

Autonomous agents are beginning to transform scientific research from tool-assisted workflows toward self-sustaining discovery processes. C…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

E2Former-V2: On-the-Fly Equivariant Attention with Linear Activation Memory

Equivariant Graph Neural Networks (EGNNs) have become a widely used approach for modeling 3D atomistic systems. However, mainstream archite…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIエージェント

Just-In-Time Reinforcement Learning: Continual Learning in LLM Agents Without Gradient Updates

While Large Language Model (LLM) agents excel at general tasks, they inherently struggle with continual adaptation due to the frozen weight…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

Enhancing Video Representations with Spatiotemporal-Semantic Residual to Mitigate Hallucinations in Video Large Multimodal Models

Although Video Large Multimodal Models have achieved strong performance in video understanding, they still suffer from hallucination. Exist…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Learning to Execute Graph Algorithms Exactly with Graph Neural Networks

Understanding what graph neural networks can learn, especially their ability to learn to execute algorithms, remains a central theoretical…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

Rethinking Genomic Modeling Through Optical Character Recognition

Recent genomic foundation models largely adopt large language model architectures that treat DNA as a one-dimensional token sequence. Howev…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Step-Wise Refusal Dynamics in Autoregressive and Diffusion Language Models

Diffusion language models (DLMs) have recently emerged as a competitive alternative to autoregressive (AR) models, offering parallel decodi…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIハードウェア/半導体

Extracting Recurring Vulnerabilities from Black-Box LLM-Generated Software

LLMs are increasingly used for code generation, but their outputs often follow recurring templates that can induce predictable vulnerabilit…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Endogenous Resistance to Activation Steering in Language Models

Large language models can recover mid-generation from task-misaligned activation steering, producing explicit verbal restarts (e.g., "wait…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

The Geometry of Representational Failures in Vision Language Models

Vision-Language Models (VLMs) exhibit puzzling failures in multi-object visual tasks, such as hallucinating non-existent elements or failin…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models

Despite the success of multimodal contrastive learning in aligning visual and linguistic representations, a persistent geometric anomaly, t…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Discovering Interpretable Algorithms by Decompiling Transformers to RASP

Recent work has shown that the computations of Transformers can be simulated in the RASP family of programming languages. These findings ha…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

TokaMind: A Multi-Modal Transformer Foundation Model for Tokamak Plasma Dynamics

We present TokaMind, to our knowledge the first open-source foundation model for tokamak plasma dynamics, based on a Multi-Modal Transforme…

2026-06-08 13:00 JSTarXiv cs.AIエージェントロボティクス研究/論文

ScenicRules: An Autonomous Driving Benchmark with Multi-Objective Specifications and Abstract Scenarios

Developing autonomous driving systems for complex traffic environments requires balancing multiple objectives, such as avoiding collisions,…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Position: A Dynamical Systems Perspective is Needed to Advance Time Series Modeling

Time series (TS) modeling has come a long way from early statistical, mainly linear, approaches to the current trend in TS foundation model…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

TRUE: A Trustworthy Unified Explanation Framework for Large Language Model Reasoning

Large language models (LLMs) have demonstrated strong capabilities in complex reasoning tasks, yet their decision-making processes remain d…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

Forecasting as Rendering: A 2D Gaussian Splatting Framework for Time Series Forecasting

Time series forecasting remains a challenging problem due to the intricate entanglement of intra-period fluctuations and inter-period trend…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Training for Technology: Adoption and Productive Use of Generative AI in Legal Analysis

Can targeted user training unlock the productive potential of generative artificial intelligence in professional settings? We study this qu…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Benchmarking Language Modeling for Lossless Compression of Full-Fidelity Audio

Autoregressive "language" models (LMs) trained on raw waveforms can be repurposed for lossless audio compression, but prior work is limited…

2026-06-08 13:00 JSTarXiv cs.AIハードウェア/半導体

VeriHGN: Heterogeneous Graph-Based Congestion Prediction for Chip Layout Verification

As Very Large Scale Integration (VLSI) designs continue to scale in size and complexity, layout verification has become a central challenge…

2026-06-08 13:00 JSTarXiv cs.AIエージェント

EvoClaw: Evaluating AI Agents on Continuous Software Evolution

With AI agents increasingly deployed as long-running systems, it becomes essential to autonomously construct and continuously evolve custom…

2026-06-08 13:00 JSTarXiv cs.AIビジネス/資金調達

$\mathrm{ECI}_{\mathrm{sem}}$: Semantic Residual Effective Contrastive Information for Evaluating Hard Negatives

Hard-negative source selection for dense retrieval is usually decided only after fine-tuning and downstream evaluation. We propose $\mathrm…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Evaluating AI-based Scientific Knowledge Synthesis with Epidemiological Systematic Reviews

Systematic literature reviews (SLRs) are a demanding and high-stakes form of scientific knowledge synthesis that remains underspecified as…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成ロボティクス

Chameleon: Control-Indexed Prospective Memory for Visuomotor Manipulation

Robots often observe information that determines a future action long before that action is executed. In a shell game, for example, a robot…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Stable Reasoning, Unstable Responses: Mitigating LLM Deception via Stability Asymmetry

As Large Language Models (LLMs) expand in capability and application scope, their trustworthiness becomes critical. A vital risk is intrins…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

CountsDiff: A Diffusion Model on the Natural Numbers for Generation and Imputation of Count-Based Data

Diffusion models have excelled at generative tasks for both continuous and token-based domains, but their application to discrete ordinal d…

2026-06-08 13:00 JSTarXiv cs.AIエージェント研究/論文

SW-$A^2$-Bench: Benchmarking Autonomous Software Agent Generation for Agentic Web

The Agentic Web is emerging as a paradigm in which autonomous software agents interact with online resources and with each other to accompl…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

OGA-AID: Clinician-in-the-loop AI Report Drafting Assistant for Multimodal Observational Gait Analysis in Post-Stroke Rehabilitation

Gait analysis is essential in post-stroke rehabilitation but remains time-intensive and cognitively demanding, especially when clinicians m…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIエージェント

More Capable, Less Cooperative? When LLMs Fail At Zero-Cost Collaboration

Large language model (LLM) agents increasingly coordinate in multi-agent systems, yet we lack an understanding of where and why cooperation…

2026-06-08 13:00 JSTarXiv cs.AIロボティクス

ViVa: A Video-Generative Value Model for Robot Reinforcement Learning

Vision-language-action (VLA) models have advanced robot manipulation through large-scale pretraining, but real-world deployment remains cha…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AIビジネス/資金調達

MCERF: Advancing Multimodal LLM Evaluation of Engineering Documentation with Enhanced Retrieval

Engineering rulebooks and technical standards contain multimodal information like dense text, tables, and illustrations that are challengin…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Self-Consistency from Only Two Samples: CoT-PoT Ensembling for Efficient LLM Reasoning

Self-consistency (SC) is a popular technique for improving the reasoning accuracy of large language models by aggregating multiple sampled…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

RAVEN: Retrieval-Augmented Vulnerability Exploration Network for Memory Corruption Analysis in User Code and Binary Programs

Large Language Models (LLMs) have demonstrated remarkable capabilities across various cybersecurity tasks, including vulnerability classifi…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Automatic Causal Fairness Analysis with LLM-Generated Reporting

AutoML, intended as the process of automating the application of machine learning to real-world problems, is a key step for AI popularisati…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

InvEvolve: Evolving White-Box Inventory Policies via Large Language Models with Performance Guarantees

We study how large language models can be used to generate inventory policies in online settings with non-stationary demand. Our work is mo…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

MidSteer: Optimal Affine Framework for Steering Generative Models

Steering intermediate representations has emerged as a powerful strategy for controlling generative models, particularly in post-deployment…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

MACS: Modality-Aware Capacity Scaling for Efficient Multimodal MoE Inference

Mixture-of-Experts Multimodal Large Language Models (MoE MLLMs) suffer from a significant efficiency bottleneck during Expert Parallelism (…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

COF26: A new on-top functional for multiconfiguration pair-density functional theory

Multiconfiguration pair-density functional theory (MC-PDFT) provides an efficient and accurate framework for computing electronic energies…

2026-06-08 13:00 JSTarXiv cs.AIエージェント

Superintelligent Retrieval Agent: The Next Frontier of Agentic Retrieval

Retrieval-augmented agents are increasingly the interface to large knowledge bases, yet most treat retrieval as a black box: they issue exp…

2026-06-08 13:00 JSTarXiv cs.AIエージェント

Debugging the Debuggers: Failure-Anchored Structured Recovery for Software Engineering Agents

Software engineering agents are increasingly deployed in evaluable engineering environments, yet post-failure recovery remains costly, manu…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

CHoE: Cross-Domain Heterogeneous Graph Prompt Learning via Structure-Conditioned Experts

Heterogeneous Graph Prompt Learning (HGPL) has emerged as a promising paradigm for bridging the gap between the objectives of pre-training f…

2026-06-08 13:00 JSTarXiv cs.AIエージェント

Rethinking Code Review in the Age of AI: A Vision for Agentic Code Review

Code review has evolved for decades, from informal peer checking to today's pull request (PR) workflows, yet it remains a largely manual an…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Automated Root-Cause Subclassification and No-Code Fix Generation for Invalid Bug Reports

Issues faced when using software are reported in the form of bug reports. However, many bug reports are invalid, meaning they do not requir…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

Focus-then-Context: Subject-Centric Progressive Visual Token Reduction for Vision-Language Models

Vision-Language Models (VLMs) face a bottleneck of prohibitive computational costs arising from massive visual token sequences during infer…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

ActQuant: Sub-4-bit Action-Guided Quantization for Vision-Language-Action Models

Vision-Language-Action (VLA) models exhibit remarkable action generation for embodied intelligence, but their heavy compute makes deployment…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Scale When Needed: Adaptive Neuron-level Mixed Precision Quantization Aware Training

Deploying deep neural networks on resource-constrained 6G edge devices demands aggressive compression with minimal accuracy loss. Quantizat…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Autoregression-Free Neural Operators for Time-Dependent PDEs

Neural operators learn mappings from function-dependent inputs to solutions, providing an effective framework for solving partial different…

2026-06-08 13:00 JSTarXiv cs.AIハードウェア/半導体

Fine-Tuning and Serving Gemma 4 31B on Google Cloud TPU: A Technical Comparison with GPU Baselines

We present the first end-to-end demonstration of fine-tuning and serving Google's Gemma 4 31B model on TPU hardware, providing an empirical…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI

Do Language Models Need Sleep? Offline Recurrence for Improved Online Inference

Transformer-based large language models are increasingly used for long-horizon tasks; however, their attention mechanism scales poorly with…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Evolving Features vs Evolving Entire Trees with GP for Interpretable Survival Analysis

Survival analysis concerns the task of predicting the time until an event occurs. Often used in the medical field, survival analysis deals…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Linear Ordering Problem: Time for a Change

The Linear Ordering Problem (LOP) is a fundamental combinatorial optimization problem with important applications in areas such as economic…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Beyond Tool Adoption: A Practical Five-Stage Developmental Continuum for AI Literacy in Higher Education

Artificial intelligence (AI) literacy is increasingly recognized as a foundational competency for all university graduates. Yet students' e…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

MOSS-Audio Technical Report

MOSS-Audio is a unified audio-language model for speech, environmental sound, and music understanding, supporting audio captioning, time-aw…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Building Better Activation Oracles

Activation Oracles (AOs) are promising methods for interpreting residual stream activations. However, current AOs face important issues, su…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Perplexity Can Miss SAE Feature Damage Under Quantization

Quantization is a standard path to deploying large language models, and quantized models are typically judged acceptable when perplexity or…

2026-06-08 13:00 JSTarXiv cs.AIエージェント研究/論文

OpenAgenet / OAN White Paper: Open Infrastructure for Trusted Agent Interconnection

OpenAgenet, abbreviated as OAN, is an open infrastructure project for trusted Agent interconnection. It addresses a problem that becomes vi…

2026-06-08 13:00 JSTarXiv cs.AIエージェント研究/論文

OpenAgenet / OAN Yellow Paper: Technical Architecture for Trust-Governed Resource Identity and Discovery

This yellow paper describes the technical architecture of OpenAgenet / OAN. OAN is a protocol-neutral trust layer for open Agent interconne…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Local Guidance, Global Impact: Gaussian-Reshaped Trust Region Unlocks Behavior Transitions

While Proximal Policy Optimization (PPO) demonstrates strong performance in stationary settings, we show that its standard optimization par…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Optimizing Explicit Unit-Distance Lower-Bound Certificates

The 2026 disproof of Erdős's unit-distance conjecture and Sawin's quantitative refinement show that the maximum number $u(n)$ of unit d…

2026-06-08 13:00 JSTarXiv cs.AI研究/論文

Spectral Scaling Laws of Muon

Orthonormalized update rules have rapidly become a leading choice of optimizer for training large language models, with recent open-source…

2026-06-08 13:00 JSTarXiv cs.AILLM/生成AI画像/動画生成

MorphoQuant: Modality-Aware Quantization for Omni-modal Large Language Models

Conventional Post-Training Quantization (PTQ) methods struggle with 4-bit Omni-modal Large Language Models (OLLMs) due to the extreme distr…

2026-06-08 13:00 JSTarXiv cs.AI画像/動画生成

Selective Coupling of Decoupled Informative Regions: Masked Attention Alignment for Data-Free Quantization of Vision Transformers

Data-Free Quantization (DFQ) addresses data security concerns by synthesizing samples, without accessing real data. It has garnered increas…

2026-06-08 13:00 JSTarXiv cs.AIエージェント