WO2016202009A1 - Road traffic light coordination and control method based on reinforcement learning - Google Patents

Road traffic light coordination and control method based on reinforcement learning

Info

Publication number: WO2016202009A1
Authority: WO, WIPO (PCT)
Prior art keywords: traffic, lane, intersection, phase state, road
Application number: PCT/CN2016/075265
Other languages: French (fr), Chinese (zh)
Inventors: 朱斐, 朱海军, 伏玉琛, 刘全, 杨炯, 任勇
Original Assignee: 苏州大学张家港工业技术研究院
Priority date: 2015-06-17 (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date: 2016-03-01
Publication date: 2016-12-22

Classifications

    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/07 Controlling traffic signals
    • G08G1/08 Controlling traffic signals according to detected number or speed of vehicles

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Traffic Control Systems (AREA)

Abstract

A road traffic light coordination and control method based on reinforcement learning, in which a monitoring device is provided at each intersection and each monitoring device is connected to a remote server through a network module. The control method comprises: (1) the remote server calculates a waiting time S from the received video signal; (2) the remote server analyses the road congestion condition under each phase state a_i; (3) the remote server obtains a feasibility c_i^{a_i} under the phase state a_i, where the feasibility is 1 when traffic can pass and the road is clear, and 0 when the road is congested; (4) the waiting time S and the feasibility c_i^{a_i} are used to calculate the optimal driving phase state a_i of the intersection; (5) the traffic lights are adjusted accordingly. By using video information acquired in real time to coordinate and control the traffic lights of multiple intersections in one area, the method improves traffic efficiency, maximizes traffic flow in the area, and alleviates road congestion.

Description

Road traffic signal light coordinated control method based on reinforcement learning
Technical field
[0001] The present invention relates to a road traffic signal light control method, and in particular to a coordinated control method for road traffic signal lights based on reinforcement learning.
Background
[0002] Transportation is the foundation of modern society and the lifeblood of the economy, and people's social activities are closely bound up with it. A city has large numbers of motor and non-motor vehicles, and its intersections and road sections are varied and complex. Controlling such a large-scale, dynamic, highly uncertain distributed system effectively is a very complicated task. Without building new roads, improving road utilization through sensible traffic control, and thereby improving traffic efficiency, is an effective way to relieve urban traffic problems quickly.
[0003] Nevertheless, traffic congestion is becoming increasingly serious. One cause is that the number of vehicles keeps growing while traffic planning and design lag behind. Another is that many traffic signal control systems are outdated: the signal lights do not adjust traffic flow according to real-time traffic conditions and so do little to improve traffic efficiency. Using computing technology and machine intelligence to help solve traffic problems is attracting growing attention and has become a trend.
[0004] In recent years a large number of road traffic monitoring devices have been put into use, and real-time traffic video data is transmitted continuously to traffic management departments. How to make full use of this video data to improve the control of road traffic signal lights, and thereby improve road traffic efficiency, has attracted more and more attention.
[0005] Some intelligent traffic control systems are already in use, but congestion between adjacent intersections within a traffic area, a problem faced in real traffic control, has not been solved well. Regional coordinated control of road traffic handles this problem better. By taking into account the traffic conditions at multiple intersections in an area, regional signal control can achieve higher traffic efficiency than control that considers a single intersection only. One example is the "green wave" method: on a designated route, once the speed range for each road section has been specified, the signal controller adjusts the start time of the green light at each intersection along the route according to the section lengths, so that a vehicle meets a green light on arriving at each intersection and traffic on that route achieves the highest efficiency.
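The green-wave idea in [0005] can be illustrated with a short calculation: the green start at each intersection is delayed by the travel time from the first intersection at the planned speed. This is only a sketch under stated assumptions. The function name, the input format, and the example distances, speed and cycle length are hypothetical and are not taken from the patent.

```python
def green_wave_offsets(distances_m, speed_mps, cycle_s):
    """Return green-start offsets in seconds (modulo the signal cycle).

    distances_m[i] is the length of the road section between intersection i
    and intersection i + 1 (hypothetical input format).
    """
    offsets = [0.0]          # the first intersection is the reference
    travel_time = 0.0
    for distance in distances_m:
        travel_time += distance / speed_mps   # time to reach the next intersection
        offsets.append(travel_time % cycle_s)
    return offsets


# Three road sections, a planned speed of 10 m/s and a 90 s cycle (made-up values).
print(green_wave_offsets([400.0, 600.0, 500.0], speed_mps=10.0, cycle_s=90.0))
# [0.0, 40.0, 10.0, 60.0]
```

Because the offsets depend only on fixed distances and a planned speed, they do not react to what is actually happening on the road, which is the limitation the next paragraph raises.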
[0006] However, this method cannot adjust to actual real-time road traffic conditions, so regional signal control cannot realize its advantages and exists in name only. During the morning and evening peaks, for example, many factors come into play: buses bunch near bus stops, vehicle and pedestrian traffic near schools surges at the start and end of the school day, and so on. These factors can leave some intersections obstructed or even paralysed. At present many traffic management departments can only rely on officers directing traffic on site and switching the signal lights manually. Manual management is prone to oversights. It can also generally handle only the lights of a single intersection, so coordinated regional control is hard to achieve: road users may pass one intersection only to run into congestion again because traffic ahead is heavy. If regional coordination were taken into account in that situation, holding the traffic back would probably be the best option. How to make the most of existing real-time traffic video data and equipment to achieve coordinated regional control, adapt to changing road conditions in real time, reduce the workload of traffic management departments and relieve congestion is therefore a problem in urgent need of a solution.
Technical problem
Solution to problem
Technical solution
[0007] The object of the present invention is to provide a coordinated control method for road traffic signal lights based on reinforcement learning. By collecting real-time video data and working from vehicle state transitions, the method automatically adjusts and controls the traffic signal lights of an area, improving the efficiency with which road users travel, relieving congestion and thereby reducing the workload of traffic management departments.
[0008] The technical solution of the present invention is a coordinated control method for road traffic signal lights based on reinforcement learning, in which a monitoring device is provided at each intersection and each monitoring device is connected to a remote server via a network module. The control method is as follows:
[0009] (1) The remote server receives the video signal sent by the monitoring device and calculates the waiting time S of the vehicles in each lane of the corresponding intersection; the waiting time is the time a vehicle spends stopped under red and green lights.
[0010] (2) Each combination of red and green lights at the intersection, with the lane movements it permits, is treated as one phase state a_i. Under each phase state a_i, the remote server analyses the road congestion condition from the waiting time obtained in step (1).
[0011] (3) From the traffic flow in the lanes that have a green light under the current phase state a_i, the remote server obtains the feasibility c_i^{a_i} for that phase state. When traffic can pass, the road is clear and the feasibility c_i^{a_i} is 1; otherwise the road is congested and the feasibility c_i^{a_i} is 0.
[0012] (4) From the waiting time S obtained in step (1) and the feasibility c_i^{a_i} obtained in step (3), the remote server assesses the driving conditions under each phase state a_i of the intersection. Driving-condition data is recorded and updated over a period of time, and the program software analyses it to calculate the optimal driving phase state a_i for the intersection.
[0013] (5) According to the optimal driving phase state a_i, the durations for which the intersection's red and green light combinations are lit are adjusted so as to obtain the maximum traffic flow.
[0014] In the above technical solution, the phase state a_i is the travel state of each lane under a given combination of red and green lights. In a lane with a green light, vehicles may go straight through the intersection to the opposite lane, and the right-turn lane is also open. Only when both going straight and turning right are possible is the feasibility c_i^{a_i} in step (3) equal to 1; otherwise the road is regarded as congested and the feasibility is 0. In a lane with a red light, vehicles are stopped.
[0015] In the above technical solution, the waiting time includes the time vehicles in the lane spend stopped under a red light and the time they spend stopped under a green light because they are unable to move forward.
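The definition of waiting time in [0015] can be sketched as a sum over per-sample observations of one vehicle. The `Observation` record and its fields are hypothetical. The patent does not say how the video signal is turned into such samples, so this only illustrates that stopped time under green counts as well as stopped time under red.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    dt: float       # seconds covered by this video sample (assumed format)
    light: str      # "red" or "green" for the vehicle's lane
    stopped: bool   # True if the vehicle did not move during the sample


def waiting_time(observations):
    """Total stopped time: under red, plus under green when the vehicle cannot advance."""
    red_wait = sum(o.dt for o in observations if o.stopped and o.light == "red")
    green_wait = sum(o.dt for o in observations if o.stopped and o.light == "green")
    return red_wait + green_wait


samples = [
    Observation(1.0, "red", True),
    Observation(1.0, "red", True),
    Observation(1.0, "green", True),    # light is green but the vehicle is blocked
    Observation(1.0, "green", False),   # vehicle is moving, so no waiting is added
]
print(waiting_time(samples))  # 3.0
```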
[0016] In the above technical solution, a weight value is set for each lane according to the traffic-flow requirements of main roads, secondary roads or bus lanes.
[0017] In the above technical solution, the "program software analysis and calculation" in step (4) is a kernel function. The kernel function compares the similarity between the current driving situation and known driving situations previously stored in the database. Taking the driving conditions under the intersection's multiple phase states together, the method gives priority to phase states that have not been executed for a long time and to important phase states, choosing the phase state whose execution maximizes the sum, over all waiting vehicles, of the difference between their "waiting time" under red and under green. An important phase state is one that serves a main road or a bus lane; this priority is implemented by setting the initial weight value of the corresponding lanes.
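Paragraph [0017] describes a kernel function that compares the current driving situation with stored ones, but the kernel itself does not survive in this text. The following is therefore only a sketch of the general idea of kernel-weighted value estimation, not the patent's formula. The Gaussian form, the `bandwidth` parameter, the (position, lane) state layout and all function names are assumptions made for illustration.

```python
import math


def kernel(pos_a, lane_a, pos_b, lane_b, similar_lanes, bandwidth=2.0):
    """Hypothetical kernel: 0 unless the lanes are equal or declared similar,
    otherwise a Gaussian in the difference of vehicle positions."""
    comparable = lane_a == lane_b or frozenset((lane_a, lane_b)) in similar_lanes
    if not comparable:
        return 0.0
    return math.exp(-((pos_a - pos_b) ** 2) / (2.0 * bandwidth ** 2))


def estimate_q(state, action, q_table, similar_lanes):
    """Kernel-weighted average of stored Q values over states similar to `state`."""
    pos, lane = state
    numerator = 0.0
    denominator = 0.0
    for (stored_pos, stored_lane, stored_action), q_value in q_table.items():
        if stored_action != action:
            continue
        k = kernel(pos, lane, stored_pos, stored_lane, similar_lanes)
        numerator += k * q_value
        denominator += k
    return numerator / denominator if denominator > 0.0 else 0.0


# Made-up values: lanes 3 and 11 are treated as similar, lane 5 is unrelated.
similar = {frozenset((3, 11))}
q_table = {(2, 3, "red"): 4.0, (2, 11, "red"): 8.0, (2, 5, "red"): 100.0}
print(estimate_q((2, 11), "red", q_table, similar))  # 6.0
```

The point of such a kernel is that a server can reuse what it has learned for one lane when it sees a vehicle in a comparable lane, instead of learning every lane from scratch.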
[0018] In the above technical solution, the network module is a wired Ethernet module or a wireless data transmission network module.
Advantageous effects of the invention
Beneficial effects
[0019] By applying the above technical solution, the present invention has the following advantages over the prior art:
[0020] 1. The invention takes the video recorded by the monitoring devices and extracts the traffic flow under the different signal phase states. The server adjusts the signal lights in real time according to road traffic conditions, maximizing traffic flow at the intersection and reducing congestion.
[0021] 2. The server collects real-time video data and, working from vehicle state transitions, calculates vehicle waiting times. It uses a kernel-based reinforcement learning algorithm to select the phase state, finding the one that minimizes the waiting time of all vehicles, and adjusts the signal lights in real time to keep up with rapidly changing road conditions.
[0022] 3. The invention takes into account the relative importance of different lanes and the special nature of certain vehicles by setting initial weight values, that is, a different weight value for each lane. When the server makes its selection, it gives priority to traffic in the more heavily weighted lanes, such as main roads or bus lanes, optimizing the road traffic control system as a whole.
Brief description of the drawings
Description of the drawings
[0023] Figure 1 is a schematic diagram of the arrangement of lanes and vehicle positions under phase state 1 in Embodiment 1 of the present invention;
[0024] Figure 2 is a schematic diagram of phase states 1-4 in Embodiment 1 of the present invention;
[0025] Figure 3 is a schematic diagram of phase states 5-8 in Embodiment 1 of the present invention;
[0026] Figure 4 is a network topology diagram of a traffic area in Embodiment 1 of the present invention;
[0027] Figure 5 is a network topology diagram of an intersection in Embodiment 1 of the present invention.
Embodiments of the invention
[0028] The present invention is further described below with reference to the drawings and an embodiment:
[0029] Embodiment 1: As shown in Figures 1 to 5, a coordinated control method for road traffic signal lights based on reinforcement learning, in which a monitoring device is provided at each intersection and each monitoring device is connected to a remote server via a wired Ethernet module (or a wireless network module). The control method is as follows:
[0030] (1) The remote server receives the video signal sent by the monitoring device and calculates the waiting time S of the vehicles in each lane of the corresponding intersection; the waiting time is the time a vehicle spends stopped under red and green lights.
[0031] (2) Each combination of red and green lights at the intersection, with the lane movements it permits, is treated as one phase state a_i. Under each phase state a_i, the remote server analyses the road congestion condition from the waiting time obtained in step (1).
[0032] (3) From the traffic flow in the lanes that have a green light under the current phase state a_i, the remote server obtains the feasibility c_i^{a_i} for that phase state. When traffic can pass, the road is clear and the feasibility is 1; otherwise the road is congested and the feasibility is 0. Under phase state 1 shown in Figure 1, for example, the exit lanes are lanes 1, 2, 5, 6, 9, 10, 13 and 14; when all of them are clear, the feasibility of phase state 1 is 1.
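The feasibility rule in [0032] reduces to a simple check over the exit lanes of a phase. A minimal sketch follows. The exit-lane list for phase state 1 comes from the Figure 1 example in the text, while the function name and the representation of congestion as a set of lane numbers are assumptions.

```python
# Exit lanes of phase state 1, as listed in the Figure 1 example.
PHASE_1_EXIT_LANES = [1, 2, 5, 6, 9, 10, 13, 14]


def feasibility(exit_lanes, congested_lanes):
    """Return 1 if every exit lane of the phase is clear, otherwise 0."""
    return 0 if any(lane in congested_lanes for lane in exit_lanes) else 1


print(feasibility(PHASE_1_EXIT_LANES, set()))   # 1: all exit lanes are clear
print(feasibility(PHASE_1_EXIT_LANES, {6}))     # 0: exit lane 6 is congested
print(feasibility(PHASE_1_EXIT_LANES, {3}))     # 1: lane 3 is not an exit lane of phase 1
```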
[0033] (4) From the waiting time S obtained in step (1) and the feasibility c_i^{a_i} obtained in step (3), the remote server assesses the driving conditions under each phase state a_i of the intersection. Driving-condition data is recorded and updated over a period of time, and the program software analyses it to calculate the optimal driving phase state a_i for the intersection.
[0034] (5) According to the optimal driving phase state a_i, the durations for which the intersection's red and green light combinations are lit are adjusted so as to obtain the maximum traffic flow.
[0035] Figures 2 and 3 show the eight phase states of a four-lane intersection. Dashed arrows indicate directions in which traffic may pass, that is, lanes with a green light; solid arrows indicate directions in which traffic may not pass, that is, lanes with a red light.
[0036] The control steps are as follows:
[0037] (1) Initialize the Q-value lookup table of every intersection server in the road traffic network. The Q table stores the values of an expression [formula illegible], in which one symbol (Figure imgf000009_0001) denotes the vehicle position as shown in Figure 1 and another denotes the lane as shown in Figure 1. All entries in the Q table are initialized to 0. Initialize the discount factor and the learning rate. Initialize the phase weights of all servers, randomly initialize each server's starting action (Figure imgf000009_0002) and execute it. The simulation time step t starts at 0.
[0038] (2) Each intersection server uses the formula (Figure imgf000010_0001) to compute the k value between every vehicle state and each state present in the Q table, and stores the results in the K table. In this formula, similarity refers to whether two lanes are similar; in Figure 1, for example, lane 3 and lane 11 are similar. One term indicates whether two lanes are rotationally symmetric. The indicator term equals 1 when the condition in its brackets is satisfied and 0 otherwise. The set symbol denotes the set of states approximately related to a given state.
[0039] (3) (Figure imgf000011_0001) Each intersection server observes its entry lanes and updates its congestion parameter from the observations of the connected intersections: if an exit lane is congested, (Figure imgf000011_0002); otherwise, (Figure imgf000011_0003). The weights are updated according to a weight-update formula [formula illegible]. Whenever t is an integer multiple of 500, the learning rate is updated according to a second formula [formula illegible], in which % is the remainder operator.
(4) Each server in the system independently uses the observed vehicle state transitions, the Q table and the K table in the formula (Figure imgf000011_0004) to update the Q values of each state s that appears both in the Q table and in the current observation, together with the action of the individual road traffic light into which the phase is decomposed. The formula has two cases: one applies when a stated condition holds (Figures imgf000011_0005 and imgf000012_0001), the other applies otherwise [formula illegible].
(5) Each server in the system uses the values in the Q table and the K table in the formula (Figure imgf000012_0002) to select the action with the greatest gain, where (Figure imgf000012_0003). Two phase-related parameters, the weight and the congestion parameter, favour phases that carry a large weight, have not been executed for a long time and have no congestion at their exits. The congestion parameter also makes a server take congestion at other intersections into account when it decides, so that the servers cooperate by sharing road traffic conditions. Phase selection lets vehicles with long bodies leave first: one term in the formula is the body length of vehicle s, which gives buses priority. Another term is the difference between the gain of a waiting vehicle s when its traffic light is red and when it is green. The phase action for which the sum of these gain differences over all waiting vehicles is greatest is the one that gives vehicles the shortest average waiting time, which matches the ultimate goal of maximizing traffic flow at the intersection and reducing congestion.
(6) Each server in the system adjusts the road traffic signal lights according to the selected phase, then returns to step (3).
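The selection rule described in step (5) can be sketched as follows. This is not the patent's formula, which survives only as image references. It is an illustrative reading of the prose: a phase is scored by its weight times its feasibility times the sum, over waiting vehicles it would release, of body length times the difference between the red and green gains. The data layout, the function names and the way the factors are multiplied together are all assumptions.

```python
def phase_gain(phase, waiting, q, weight, feasible):
    """Score one phase from the gain differences of the vehicles it would release.

    waiting: list of (state, lane, body_length) tuples for stopped vehicles.
    q: dict mapping (state, "red" | "green") to a learned gain value.
    """
    total = 0.0
    for state, lane, body_length in waiting:
        if lane in phase["green_lanes"]:
            difference = q.get((state, "red"), 0.0) - q.get((state, "green"), 0.0)
            total += body_length * difference   # longer vehicles (buses) count more
    return weight * feasible * total            # a congested exit (feasible = 0) zeroes the score


def select_phase(phases, waiting, q, weights, feasibility):
    """Pick the phase name with the largest score."""
    return max(
        phases,
        key=lambda name: phase_gain(phases[name], waiting, q, weights[name], feasibility[name]),
    )


# Made-up example with two phases and two waiting vehicles.
q = {("a", "red"): 5.0, ("a", "green"): 1.0, ("b", "red"): 3.0, ("b", "green"): 2.0}
phases = {"P1": {"green_lanes": {1}}, "P2": {"green_lanes": {2}}}
waiting = [("a", 1, 1.0), ("b", 2, 1.0)]
weights = {"P1": 1.0, "P2": 1.0}

print(select_phase(phases, waiting, q, weights, {"P1": 1, "P2": 1}))  # P1
print(select_phase(phases, waiting, q, weights, {"P1": 0, "P2": 1}))  # P2
```

In the second call, phase P1 would release the vehicle with the larger gain difference, but its exit is congested, so the server holds it back and serves P2 instead. This is the coordination effect the description attributes to the congestion parameter.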

Claims

[Claim 1] A coordinated control method for road traffic signal lights based on reinforcement learning, in which a monitoring device is provided at each intersection and each monitoring device is connected to a remote server via a network module, the control method being:
(1) the remote server receives the video signal sent by the monitoring device and calculates the waiting time S of the vehicles in each lane of the corresponding intersection, the waiting time being the time a vehicle spends stopped under red and green lights;
(2) each combination of red and green lights at the intersection, with the lane movements it permits, is treated as one phase state a_i, and under each phase state a_i the remote server analyses the road congestion condition from the waiting time obtained in step (1);
(3) from the traffic flow in the lanes that have a green light under the current phase state a_i, the remote server obtains the feasibility c_i^{a_i} for that phase state; when traffic can pass, the road is clear and the feasibility c_i^{a_i} is 1; otherwise the road is congested and the feasibility c_i^{a_i} is 0;
(4) from the waiting time S obtained in step (1) and the feasibility c_i^{a_i} obtained in step (3), the remote server assesses the driving conditions under each phase state a_i of the intersection; driving-condition data is recorded and updated over a period of time, and the program software analyses it to calculate the optimal driving phase state a_i for the intersection;
(5) according to the optimal driving phase state a_i, the durations for which the intersection's red and green light combinations are lit are adjusted so as to obtain the maximum traffic flow.
[Claim 2] The coordinated control method for road traffic signal lights based on reinforcement learning according to claim 1, characterized in that: the phase state a_i is the travel state of each lane under a given combination of red and green lights; in a lane with a green light, vehicles may go straight through the intersection to the opposite lane, and the right-turn lane is also open; only when both going straight and turning right are possible is the feasibility c_i^{a_i} in step (3) equal to 1, otherwise the road is regarded as congested and the feasibility is 0; in a lane with a red light, vehicles are stopped.
[Claim 3] The coordinated control method for road traffic signal lights based on reinforcement learning according to claim 1, characterized in that: the waiting time includes the time vehicles in the lane spend stopped under a red light and the time they spend stopped under a green light because they are unable to move forward.
[Claim 4] The coordinated control method for road traffic signal lights based on reinforcement learning according to claim 1, characterized in that: a weight value is set for each lane according to the traffic-flow requirements of main roads, secondary roads or bus lanes.
[Claim 5] The coordinated control method for road traffic signal lights based on reinforcement learning according to claim 1, characterized in that: the "program software analysis and calculation" in step (4) is a kernel function; the kernel function compares the similarity between the current driving situation and known driving situations previously stored in the database; taking the driving conditions under the intersection's multiple phase states together, priority is given to phase states that have not been executed for a long time and to important phase states, and the phase state chosen is the one whose execution maximizes the sum, over all waiting vehicles, of the difference between their "waiting time" under red and under green; an important phase state is one that serves a main road or a bus lane, and this priority is implemented by setting the initial weight value of the corresponding lanes.
[Claim 6] The coordinated control method for road traffic signal lights based on reinforcement learning according to claim 1, characterized in that: the network module is a wired Ethernet module or a wireless data transmission network module.
PCT/CN2016/075265 2015-06-17 2016-03-01 Road traffic light coordination and control method based on reinforcement learning WO2016202009A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510338644.6 2015-06-17
CN201510338644.6A CN105046987B (en) 2015-06-17 2015-06-17 A kind of road traffic Control of coordinated signals method based on intensified learning

Publications (1)

Publication Number Publication Date
WO2016202009A1 true WO2016202009A1 (en) 2016-12-22

Family

ID=54453489

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/075265 WO2016202009A1 (en) 2015-06-17 2016-03-01 Road traffic light coordination and control method based on reinforcement learning

Country Status (2)

Country Link
CN (1) CN105046987B (en)
WO (1) WO2016202009A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110738860A (en) * 2019-09-18 2020-01-31 平安科技(深圳)有限公司 Information control method and device based on reinforcement learning model and computer equipment
US11080602B1 (en) 2020-06-27 2021-08-03 Sas Institute Inc. Universal attention-based reinforcement learning model for control systems
CN113487891A (en) * 2021-06-04 2021-10-08 东南大学 Intersection joint signal control method based on Nash Q learning algorithm

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105046987B (en) * 2015-06-17 2017-07-07 苏州大学 A kind of road traffic Control of coordinated signals method based on intensified learning
CN105513376A (en) * 2015-11-20 2016-04-20 小米科技有限责任公司 Traffic light adjustment method and device
CN105654744B (en) * 2016-03-10 2018-07-06 同济大学 A kind of improvement traffic signal control method based on Q study
CN106910351B (en) * 2017-04-19 2019-10-11 大连理工大学 A kind of traffic signals self-adaptation control method based on deeply study
CN106991707B (en) * 2017-05-27 2020-02-18 浙江宇视科技有限公司 Traffic signal lamp image strengthening method and device based on day and night imaging characteristics
CN109979191B (en) * 2017-12-28 2022-02-11 杭州海康威视系统技术有限公司 Traffic signal control method, traffic signal control device, electronic equipment and computer-readable storage medium
CN108961788B (en) * 2018-07-20 2020-09-08 张鹏 Intelligent traffic signal lamp changing method
CN109215355A (en) * 2018-08-09 2019-01-15 北京航空航天大学 A kind of single-point intersection signal timing optimization method based on deeply study
CN109035812B (en) * 2018-09-05 2021-07-27 平安科技(深圳)有限公司 Traffic signal lamp control method and device, computer equipment and storage medium
CN110060475B (en) * 2019-04-17 2021-01-05 清华大学 Multi-intersection signal lamp cooperative control method based on deep reinforcement learning
CN110246345B (en) * 2019-05-31 2020-09-29 闽南师范大学 Signal lamp intelligent control method and system based on HydraCNN
CN111047884A (en) * 2019-12-30 2020-04-21 西安理工大学 Traffic light control method based on fog computing and reinforcement learning
WO2021146918A1 (en) * 2020-01-21 2021-07-29 深圳元戎启行科技有限公司 Traffic light control method and apparatus, computer device, and storage medium
CN111260937B (en) * 2020-02-24 2021-09-14 武汉大学深圳研究院 Cross traffic signal lamp control method based on reinforcement learning
CN111710177B (en) * 2020-05-11 2021-07-27 华东师范大学 Intelligent traffic signal lamp networking cooperative optimization control system and control method
CN113763730B (en) * 2020-06-05 2023-01-24 杭州海康威视数字技术股份有限公司 Method and device for determining utilization rate of green wave bandwidth
CN112863206B (en) * 2021-01-07 2022-08-09 北京大学 Traffic signal lamp control method and system based on reinforcement learning
CN113487902B (en) * 2021-05-17 2022-08-12 东南大学 Reinforced learning area signal control method based on vehicle planned path
CN113393679B (en) * 2021-06-10 2022-09-06 中南大学 Regional traffic guidance method and system based on traffic intersection traffic flow identification and statistics
CN113870589B (en) * 2021-09-03 2023-05-02 复旦大学 Intersection signal lamp and variable lane joint control system and method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6647361B1 (en) * 1998-11-23 2003-11-11 Nestor, Inc. Non-violation event filtering for a traffic light violation detection system
CN1936999A (en) * 2006-10-17 2007-03-28 大连理工大学 City area-traffic cooperative control method based wireless sensor network
CN101901550A (en) * 2010-06-24 2010-12-01 北京航空航天大学 Vehicle flow detection system and traffic lamp control method
CN102142197A (en) * 2011-03-31 2011-08-03 汤一平 Intelligent traffic signal lamp control device based on comprehensive computer vision
CN104008659A (en) * 2014-06-12 2014-08-27 北京易华录信息技术股份有限公司 System and method capable of accurately monitoring control effects of intersection signal controller
CN105046987A (en) * 2015-06-17 2015-11-11 苏州大学 Pavement traffic signal lamp coordination control method based on reinforcement learning

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4738031B2 (en) * 2005-03-18 2011-08-03 小糸工業株式会社 Traffic signal control apparatus and traffic signal system using the same
CN103280114B (en) * 2013-06-24 2015-01-07 电子科技大学 Signal lamp intelligent control method based on BP-PSO fuzzy neural network
CN104077918B (en) * 2014-07-02 2016-08-17 上海理工大学 Adaptive control method for urban traffic intersection signal lights based on vehicle-mounted data
CN104575035B (en) * 2015-01-22 2016-08-17 大连理工大学 Adaptive control method for intersections in an Internet of Vehicles environment

Also Published As

Publication number Publication date
CN105046987A (en) 2015-11-11
CN105046987B (en) 2017-07-07

Similar Documents

Publication Publication Date Title
WO2016202009A1 (en) Road traffic light coordination and control method based on reinforcement learning
CN105489034B (en) Complete main-line traffic control system and method
CN109767630A (en) Traffic signal control system based on vehicle-road cooperation
CN105608912A (en) City road traffic intelligent control method and city road traffic intelligence control system
CN107274684A (en) Single-point intersection signal control strategy selection method in a vehicle-road cooperative environment
CN110136455A (en) Traffic light timing method
CN107730886A (en) Dynamic optimization method for traffic signals at urban intersections in Internet of vehicles environment
CN110136456A (en) Traffic light anti-congestion control method and system based on deep reinforcement learning
CN108603763A (en) Travel plan generation device, travel plan generation method, and travel plan generation program
CN109300325A (en) Lane prediction method and system based on V2X
CN110097751B (en) Two-phase signal control intersection pedestrian special phase dynamic setting method
CN104575038A (en) Intersection signal control method considering priority of multiple buses
CN109902899A (en) Information generating method and device
CN109841059A (en) Method for predicting and avoiding congested road sections in a VANET environment
CN110491147A (en) Information processing method, traffic information processing apparatus and terminal device
CN107610488A (en) Automatic traffic light control system and automatic traffic light control method based on the system
CN103236164A (en) Vehicle controlling method for guaranteeing public transport vehicle priority passing
CN108986509A (en) Urban area path real-time planning method based on vehicle-road cooperation
CN108122420B (en) Method for setting clearing distance of on-road dynamic bus lane
CN105761520A (en) System for realizing adaptive induction of traffic route
CN112562333A (en) Road congestion processing method and device based on intelligent traffic
CN108492589A (en) Traffic lights intelligent adjusting method and device
WO2023035666A1 (en) Urban road network traffic light control method based on expected reward estimation
CN109115220B (en) Method for parking lot system path planning
CN108922205A (en) Traffic light switching time control system and method for congested at-grade intersections

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 16810750; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 16810750; Country of ref document: EP; Kind code of ref document: A1)