Generative AI for Connected and Autonomous Vehicles

This book is a practical, applications-first guide to generative AI for connected and autonomous vehicle (CAV) systems, from large language models and multimodal generative models through prompt engineering, retrieval, tool use, agentic systems and protocols, fine-tuning, simulation, evaluation, and safety. It is designed for graduate courses, advanced undergraduate electives, and industry self-study. The hands-on labs are designed to run locally with open-source software and modest computing resources.

Order from Wiley →

Hands-On Labs

The companion labs are the core of the book's hands-on track: self-contained Jupyter notebooks that run locally on a laptop (CPU, Apple Silicon, or CUDA) using open-source tools. They are released under the MIT license, so readers and instructors can clone, run, and adapt them freely.

40+ hands-on labs across the book

Local-first: CPU, Apple Silicon, or CUDA

No API keys: open-source models running locally

MIT licensed: clone, run, and adapt

A sample of what the labs build, grounded in connected and autonomous vehicle scenarios:

Ch. 3: Predict vehicle trajectories with diffusion models
Ch. 5: Answer questions about a driving scene with a vision-language model
Ch. 8: Build a retrieval-augmented generation pipeline for context grounding
Ch. 9: Drive vehicle actions through LLM function calling
Ch. 15: Negotiate a platoon maneuver between agents over the A2A protocol
Ch. 16: Coordinate a work-zone scenario with MCP and A2A together
Ch. 17: Fine-tune a small model on a CAV task
Ch. 22: Add a runtime safety guardrail in a driving simulator
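
To give a flavor of what "function calling" means in the Chapter 9 item above, here is a minimal sketch. It is not taken from the book's labs: the tool names (`set_target_speed`, `change_lane`), the speed limit, and the JSON shape of the tool call are all hypothetical, and a hard-coded string stands in for the model's output. The idea it illustrates is that the model only proposes a structured call, while ordinary code validates and executes it.

```python
import json

# Hypothetical vehicle tools; names, arguments, and limits are illustrative only.
def set_target_speed(speed_mps: float) -> str:
    if not 0.0 <= speed_mps <= 40.0:
        raise ValueError("speed out of range")
    return f"target speed set to {speed_mps:.1f} m/s"

def change_lane(direction: str) -> str:
    if direction not in ("left", "right"):
        raise ValueError("direction must be 'left' or 'right'")
    return f"lane change {direction} requested"

# Registry of the only functions the model is allowed to invoke.
TOOLS = {"set_target_speed": set_target_speed, "change_lane": change_lane}

def dispatch(tool_call_json: str) -> str:
    """Validate a model-emitted tool call and run the matching function."""
    call = json.loads(tool_call_json)
    name = call.get("name")
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    try:
        return TOOLS[name](**call.get("arguments", {}))
    except (TypeError, ValueError) as exc:
        # Bad or missing arguments are reported back instead of executed.
        return f"error: {exc}"

# Stand-in for a model response (assumed JSON shape, not a specific API's format).
model_output = '{"name": "change_lane", "arguments": {"direction": "left"}}'
print(dispatch(model_output))  # lane change left requested
```

Keeping an explicit registry and rejecting unknown or out-of-range calls is one simple way to keep the model's output advisory rather than directly executable.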

Labs on GitHub: available soon

The lab repository will be published on GitHub when the book is released, and the link will go live at that time.

Quickstart once the repository is available:

git clone https://github.com/han-kt/genai-for-cav-labs.git
cd genai-for-cav-labs
# Run the next two commands from inside the lab directory you want to work on
uv sync
uv run jupyter lab

Each lab's README covers setup and prerequisites in detail.
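
Because the labs target CPU, Apple Silicon, or CUDA, a notebook typically has to choose a compute device at startup. The sketch below shows one common way to do that. It assumes the labs use PyTorch, which this page does not confirm, and `pick_device` is a hypothetical helper name rather than anything from the lab repository. It falls back to the CPU when PyTorch is not installed or no accelerator is found.

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu", preferring an accelerator when present."""
    try:
        import torch  # assumption: PyTorch-based labs; not confirmed by the source
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # The MPS backend (Apple Silicon) is absent in older PyTorch builds.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"

print(pick_device())
```

The returned string can be passed wherever PyTorch accepts a device, for example `model.to(pick_device())`.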

What's Inside

This section outlines the book's full scope to help readers and instructors decide whether it fits their needs. The book is organized into seven parts spanning 22 chapters, grouped into four conceptual layers: generative and multimodal models, the model interface, agentic systems and protocols, and adaptation and validation. Each chapter pairs concepts with a CAV-grounded application and a runnable lab. Chapters are largely self-contained, so the structure below also maps the reading and teaching paths that follow.

Parts & Chapters

The seven parts and their chapters, grouped into the book's four conceptual layers.

Layer	Part	Chapters
Generative & Multimodal Models	Part 1: Foundations	1. Introduction to Generative AI for CAV Systems; 2. Large Language Models
Generative & Multimodal Models	Part 2: Multimodal Representation	3. Diffusion and Flow Matching; 4. GANs, VAEs, and Autoregressive Models; 5. Vision-Language Models; 6. Video Large Language Models
Model Interface	Part 3: Language-Based Interaction	7. Prompt Engineering; 8. Retrieval-Augmented Generation; 9. Function Calling and Tool Use
Agentic Systems & Protocols	Part 4: Agentic Intelligence	10. GenAI Agent Architectures; 11. Multi-Agent Systems; 12. Orchestration Frameworks
Agentic Systems & Protocols	Part 5: Communication Protocols	13. Communication Protocols; 14. Model Context Protocol (MCP); 15. Agent-to-Agent (A2A) Protocols; 16. Capability Composition with MCP and A2A
Adaptation & Validation	Part 6: Adaptation and Deployment	17. Supervised Fine-Tuning; 18. Reinforcement Learning Fine-Tuning; 19. Knowledge Distillation
Adaptation & Validation	Part 7: Validation and Safety	20. Simulation Platforms; 21. Evaluation of Generative AI Systems; 22. Ethical and Safety Considerations

Reading Pathways

Different readers can enter the material along different paths:

Graduate / Research

Read all seven parts in sequence for comprehensive theoretical and practical coverage. This path is the best fit for coursework and research.

Upper-Division Undergraduate

An applied track focused on using generative models: Parts 1, 3, 4, 5, and 7 (language interaction, agents, and the protocols that connect them). Part 2 (multimodal models) is optional enrichment; Part 6 (model fine-tuning and adaptation) is an advanced option.

Industry Practitioner

Start with Foundations (Part 1), then cover language interaction (Part 3), agentic systems and protocols (Parts 4 and 5), and adaptation and validation (Parts 6 and 7). Consult Part 2 as specific applications require it.

Sample Course Schedules

Two example one-semester (15-week) layouts are shown below. Additional course schedules and editable syllabus templates are available to adopting instructors.

Graduate Seminar: Full Sequence

Wk	Topic	Ch.
1	Introduction to Generative AI for CAVs	1
2	Large Language Models	2
3	Diffusion and Flow Matching	3
4	GANs, VAEs, Autoregressive Models	4
5	Vision-Language and Video Models	5–6
6	Prompt Engineering	7
7	Retrieval-Augmented Generation	8
8	Function Calling and Tool Use	9
9	GenAI Agent Architectures	10
10	Multi-Agent Systems, Orchestration	11–12
11	Communication Protocols, MCP	13–14
12	A2A and Capability Composition	15–16
13	Supervised and RL Fine-Tuning	17–18
14	Knowledge Distillation, Simulation	19–20
15	Evaluation and Ethics; project	21–22

Undergraduate: LLM-Systems Track

Wk	Topic	Ch.
1	Introduction to Generative AI for CAVs	1
2	Large Language Models	2
3	How LLMs Generate	2
4	Prompt Engineering	7
5	Retrieval-Augmented Generation	8
6	Function Calling and Tool Use	9
7	Review and midterm; project kickoff	–
8	GenAI Agent Architectures	10
9	Multi-Agent Systems	11
10	Orchestration Frameworks	12
11	Connecting Agents: MCP	14
12	Agent-to-Agent Communication	15
13	Simulation Platforms	20
14	Evaluation	21
15	Ethics and Safety; project	22

Part 2 (Chapters 3–6) is optional enrichment for students interested in image, video, and multimodal models.

Wiley Resources

Instructor and supplementary materials are provided through Wiley. The book's Wiley page hosts ordering information, supplementary resources, and instructor materials for adopters.

Visit the Wiley book page →