Comments to author (Associate Editor)
=====================================
Seven reviews have been gathered, generally highlighting the soundness and quality of the proposed approach. As can be seen from the Reviewers' comments, some Reviewers suggest checking (important) details, e.g., on the experimental setting, that seem to have been left out and that reduce the clarity and impact of the work. Another Reviewer also suggests commenting on the limitations of the proposed approach. These suggestions, along with the other comments of the Reviewers, should be taken into account to improve the quality of the paper.

Reviewer 1 of SMC 2023 submission 1342
Comments to the author
======================
The paper introduces a cloud-vehicle collaborative framework for personalized adaptive cruise control. It addresses the issue of pre-defined settings not aligning with driver preferences. The framework incorporates offline training using naturalistic driving data and online adaptation based on driver feedback. Human-in-the-loop (HuiL) simulation experiments show significant reductions in driver interventions. Overall, the paper is well written and presents valuable contributions to personalized ADAS.

Reviewer 2 of SMC 2023 submission 1342
Comments to the author
======================
Strengths:
- The paper addresses a significant issue in Advanced Driver Assistance Systems (ADAS) research by proposing a personalized ACC framework that considers driver preferences and habits, which has the potential to improve driver trust and comfort in the system.
- The proposed framework incorporates real-time driver feedback, which is an improvement over existing approaches that rely solely on historical data. This ensures that the algorithm continuously adapts to the driver's preferences, which could lead to better performance in unexpected situations.
- The use of human-in-the-loop (HuiL) simulation experiments provides strong evidence for the effectiveness of the proposed method in reducing driver intervention in automatic control systems. The HuiL simulation also provides a safe and controlled environment for testing the proposed algorithm, which can be more practical than real-world experiments.
- The stochastic scenario generation approach used to generate speed profiles for the experiments is a notable strength of the study. The approach meets multiple requirements, such as avoiding driver fatigue, making the scenarios realistic and unpredictable, and covering a wide range of speeds. This ensures that the experiments represent real-world driving scenarios and yield more accurate results.
- The study tests and compares four different control strategies, which provides a comprehensive evaluation of the proposed algorithm's performance. This allows the authors to determine the most effective algorithm for the given scenarios and to understand the strengths and weaknesses of each approach.
- The IRL-based offline personalized DGPT learning and online DGPT adaptation algorithms used in the study have been extensively validated in previous works. This strengthens the validity of the study's results and ensures that the proposed algorithm is built upon existing and reliable algorithms.

Weaknesses:
- The study only includes five test drivers, which may limit the generalizability of the results. A larger and more diverse sample size could provide more robust and generalizable results.
- The study does not consider the effects of weather and road conditions on the proposed algorithm's performance, which may affect the algorithm's applicability in real-world situations. Future studies could incorporate weather and road conditions into the experiment design to ensure the algorithm's reliability in various environments.
- The heuristic algorithm used in the online DGPT adaptation module is a simplified version of the proposed framework, which may limit the algorithm's performance and robustness. A more complex and optimized algorithm could provide more accurate and reliable results.
- Finally, more discussion and analysis of the limitations of the results compared to the related literature are necessary to understand the proposed algorithm's contribution to the existing body of research. This could provide valuable insights for future studies in the field.

Reviewer 3 of SMC 2023 submission 1342
Comments to the author
======================
Strengths:
- Proposes a personalized ACC framework that considers driver preferences and habits to improve driver trust and comfort in the system
- Incorporates real-time driver feedback, continuously adapting to the driver's preferences and potentially leading to better performance in unexpected situations
- Uses human-in-the-loop (HuiL) simulation experiments to provide strong evidence for the effectiveness of the proposed method in reducing driver intervention in automatic control systems
- Uses a stochastic scenario generation approach to generate realistic and unpredictable speed profiles for experiments, ensuring more accurate results
- Tests and compares four different control strategies, providing a comprehensive evaluation of the proposed algorithm's performance
- Uses validated offline personalized DGPT learning and online DGPT adaptation algorithms to strengthen the validity of the study's results

Weaknesses:
- Only includes a small sample size of five test drivers, limiting the generalizability of the results
- Does not consider the effects of weather and road conditions on the proposed algorithm's performance, which may affect its applicability in real-world situations
- Uses a simplified heuristic algorithm in the online DGPT adaptation module, potentially limiting the algorithm's performance and robustness
- Lacks sufficient discussion and analysis of the limitations of the results compared to the related literature, which could provide valuable insights for future studies in the field

Reviewer 4 of SMC 2023 submission 1342
Comments to the author
======================
In this paper, the authors propose a personalized ACC system that learns from offline and online data acquired while driving. The method uses inverse reinforcement learning (IRL) in the offline part and a custom heuristic approach in the online part. All processing is performed in the cloud, following a digital twin approach. The system learns how to set the speed of the car to effectively control car-following maneuvers. The paper is well written; the language is clear and the text flow is understandable. Minor notices:
- in Algorithm 1, lines 3-8 can be shortened to the single line "update_flag = (v_f - v) >= V_D or p_t <= P_t" (see the sketch below)
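A minimal Python sketch of Reviewer 4's suggested simplification (the variable names v_f, v, V_D, p_t, and P_t are taken verbatim from the review; their semantics and the surrounding context of Algorithm 1 are assumptions, since the paper's pseudocode is not reproduced here):

    # Hypothetical condensation of Algorithm 1, lines 3-8, into a single
    # boolean expression. Assumed meanings: v_f and v are speeds, V_D is a
    # speed-difference threshold, and p_t is the quantity compared against
    # the threshold P_t; none of this is confirmed by the paper.
    def should_update(v_f: float, v: float, p_t: float,
                      V_D: float, P_t: float) -> bool:
        """One-line replacement for the original if/else cascade."""
        return (v_f - v) >= V_D or p_t <= P_t

    # Illustrative call with arbitrary values:
    update_flag = should_update(v_f=25.0, v=20.0, p_t=0.9, V_D=3.0, P_t=1.0)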
Reviewer 5 of SMC 2023 submission 1342
Comments to the author
======================
Good job overall! However, it would be even better if you could provide more information on how the system handles aggressive driving behaviors. This would help to provide a more complete understanding of how the technology works in real-world situations. As we know, not all drivers are good drivers.

Reviewer 6 of SMC 2023 submission 1342
Comments to the author
======================
This paper is about a new approach to autonomous driving, and I believe the idea presented in the paper is excellent. However, the experimental conditions are unclear, resulting in reduced reliability of the experimental results. Please consider revising the following aspects:
1. When using abbreviations for the first time in the main text, it is necessary to include the full name alongside the abbreviation.
2. Based on the images of the experiment, it does not seem that the experimental apparatus accurately replicates driving conditions. As a result, the overall reliability of the experimental results is compromised. It is necessary to provide a detailed description of the reliability of the experimental conditions. For example, the use of head-mounted displays or similar devices may be necessary to improve the reproducibility of the driving conditions.
3. It is necessary to provide details for Scenario A and Scenario B.
4. There is no explanation regarding the attributes of the subjects. Without this information, it is unclear whether the experiment was conducted appropriately.
5. The statement "offline learning may fail when there are significant changes in driving scenarios or driver mood" lacks a clear basis.

Reviewer 10 of SMC 2023 submission 1342
Comments to the author
======================
Overall, great paper! I have a few comments from different sections of the paper for improvement.
Section I:
- I'm not sure that the third contribution (running tests with a HuiL driving sim) should be considered a unique contribution, as this method is widely used to validate/demonstrate ADAS. Also, I would be careful stating "multiple drivers with diverse styles" unless you back up the fact that their driving styles are diverse in the results section.
Section II:
- Please clarify what is meant by "optimize the behavior of the human driver" in subsection A or remove this phrase. In my mind, these models replicate how a driver follows vehicles on the road but don't "optimize" or improve on humans' driving behavior.
- Could you more precisely define which "prior knowledge" is needed to design ODE car-following models? This would help clarify why ODE models are insufficient for the proposed ACC system.
- In subsection B, what is meant by "high-level driving style"? I'm not sure I understand what high-level vs. non-high-level driving attributes are.
Section III:
- A main confusion of mine is what exactly "takeover" means. Does it mean the ACC is OFF and the driver is fully in control of throttle and brake, or that the ACC is ON but the driver is modifying its behavior online by pressing the throttle or brake?
- If ACC is still ON during the takeover segments, could you clarify how the incremental learning works? It seems that the IRL process primarily uses manually driven trajectories. When you send trajectories taken during the takeover segments to the IRL process, does the IRL algorithm treat these trajectories the same as the manual trajectories? Or are they handled differently, since they come from an ACC system rather than from a human driver? And how are trajectories during incremental learning weighted relative to the manual trajectories?
I would imagine there is much more manual data than takeover segment data, but those interventions by the driver are really important to capture, as they show a clear desire by the driver for the ACC to behave differently.
Section IV:
- The other major question that should be addressed more clearly: how exactly do the driver's pedal commands during takeover segments affect the DGPT online? The exact online modification of the DGPT seems to occur in line 14 of Algorithm 1, but this line doesn't incorporate the driver's command or specify how the gap would increase with a brake pedal press and vice versa for pressing on the throttle (one possible shape of such an update is sketched at the end of these comments).
- Could you clarify how the safety time gaps are computed? I'm not sure what a maximum safety time gap means; don't you only need a minimum time gap to prevent collision with the vehicle ahead? My intuition is that these safety time gaps should depend on the lead vehicle's speed and the distance to the lead vehicle, but it seems they are fixed parameters set online.
- One small formatting comment: some of the variables in this section have inconsistent underscore formatting (i.e., g_{desired} in equation (4) doesn't match g_desire in line 14 of Algorithm 1). Cleaning this up would help a bit.
Section V:
- In Table 1, several of the data entries are bolded, but it's unclear why those specific entries are bold. Clarifying what the bold font indicates, or removing the bold font, would make the results more straightforward to interpret.
- Additionally, a plot showing data from a representative test, indicating time gaps to the lead vehicle, DGPT modification through online learning, vehicle acceleration command vs. gap to the lead vehicle, etc., would be helpful. Table 1 shows good overall data, but it's hard to understand the different components of the proposed system without an example from a driving test. If space constraints make this difficult, some of the experimental setup in subsection A could be made more concise.
Section VI:
- It is stated that "the model gradually becomes more consistent with the driver's driving preferences" through incremental learning. Is there a way to show this with a plot or chart? I'm very curious how much the differences in the DGPT are driven by incremental learning between instances where ACC is ON versus by online adaptation within one instance when ACC is ON.
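An illustrative Python sketch related to Reviewer 10's Section IV question. This does not come from the paper: the function, parameter names, and constants below are all hypothetical. It shows one way a takeover-time pedal input could adjust the desired gap time, with a brake press widening the gap and a throttle press narrowing it, clamped to fixed minimum and maximum safety gaps as the reviewer reads them to be:

    # Hypothetical pedal-driven adjustment of the desired time gap.
    # g_desired loosely corresponds to g_{desired} in the paper's
    # equation (4); delta, g_min, and g_max are invented constants.
    def adapt_desired_gap(g_desired: float, brake_pressed: bool,
                          throttle_pressed: bool, delta: float = 0.1,
                          g_min: float = 0.8, g_max: float = 3.0) -> float:
        """Return an updated desired time gap (in seconds) after a takeover."""
        if brake_pressed:
            g_desired += delta   # driver signals a wish for more headway
        elif throttle_pressed:
            g_desired -= delta   # driver signals a wish to follow closer
        # Clamp to fixed safety bounds, matching the reviewer's reading
        # that the safety time gaps are fixed parameters set online.
        return min(max(g_desired, g_min), g_max)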