Tsinghua AI Model Published in Nature Sub-journal: Revolutionizing Urban Spatial Planning, 3000x Faster Than Humans
Now, in the field of urban spatial planning, human designers have an AI partner.
A research team from Tsinghua University has proposed a deep reinforcement learning model that performs complex urban spatial planning based on the 15-minute city concept. Combined with human input, this machine learning-assisted planning of land use and roads outperforms other algorithms and professional human designers, improving all considered metrics by approximately 50% while running 3000 times faster.
The paper, titled "Spatial planning of urban communities via deep reinforcement learning," was published in the latest issue of Nature Computational Science, a sub-journal of Nature.
In a concurrent News & Views article, Paolo Santi, a research scientist at MIT Senseable City Lab, wrote: "Yu Zheng et al. addressed key conceptual and computational challenges while successfully demonstrating the feasibility of integrating AI with human workflows in the field of spatial layout planning, opening up rich avenues for future research."
Cities have become hubs of innovation, creativity, and opportunity, attracting people from all walks of life seeking entertainment, education, healthcare, and employment. Effective spatial planning is crucial for a city's economic activity and sustainable development.
Modern urban planning often prioritizes vehicles, favoring centralized functions and car-dependent transportation systems. This approach not only causes traffic congestion but also exacerbates global warming. Meanwhile, the COVID-19 pandemic has exposed urban vulnerabilities during lockdowns. Therefore, urban planning urgently needs transformation, accelerating the shift from vehicle-oriented to people-oriented models.
Notably, the '15-minute city' concept is gaining popularity in planning new urban communities and renovating existing ones, where residents can access essential services within a 15-minute walk or bike ride. This reflects people's expectations for spatially efficient layouts within urban neighborhoods.
However, despite decades of effort to develop computational models and support tools that automate urban planning, tedious layout tasks still require manual work, even though today's GIS tools make urban planners orders of magnitude more productive than they were decades ago.
To address these challenges, Tsinghua University's research team has proposed a deep reinforcement learning-based urban planning model capable of generating land use and road layouts for urban communities.
Yet compared with tasks set on regular grids, such as chip design or Go, urban communities take more diverse and irregular geometric forms, which makes automated planning more complex.
To address this issue, the research team proposed an urban continuity graph to describe the topological structure of urban geometry, where urban geographic elements serve as nodes and spatial continuity as edges. The graph construction allows capturing the fundamental spatial relationships of any community form. Consequently, they framed spatial planning as a sequential decision-making problem on the graph, operating at the topological level rather than the geometric level.
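The graph construction described here can be illustrated with a small Python sketch. It is not the authors' implementation: the element names (`parcel_A`, `road_1`, `junction_X`), the `build_continuity_graph` helper, and the toy list of touching pairs are all hypothetical, chosen only to show geographic elements as nodes and spatial continuity as edges.

```python
def build_continuity_graph(elements, touching_pairs):
    """Return an adjacency map: element id -> set of spatially adjacent ids."""
    graph = {name: set() for name in elements}
    for a, b in touching_pairs:
        # Continuity is symmetric, so record the edge in both directions.
        graph[a].add(b)
        graph[b].add(a)
    return graph

# Hypothetical toy community: two land parcels, one road segment, one intersection.
elements = {
    "parcel_A": "land",
    "parcel_B": "land",
    "road_1": "road_segment",
    "junction_X": "intersection",
}
# Pairs of elements that share a boundary or endpoint (assumed, not real data).
touching_pairs = [
    ("parcel_A", "parcel_B"),
    ("parcel_A", "road_1"),
    ("parcel_B", "road_1"),
    ("road_1", "junction_X"),
]

graph = build_continuity_graph(elements, touching_pairs)
for node in sorted(graph):
    print(node, elements[node], sorted(graph[node]))
```

Because the graph stores only which elements touch, the same structure describes a rectangular block or an irregular one, which is the point of working at the topological level.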
Another major challenge in spatial planning is the enormous solution space and the even larger accompanying action space. For a medium-sized community, the number of possible action sequences can easily exceed 4000 to the power of 100 (4000 possible actions per step, over the 100 steps needed to plan a community), making exhaustive search infeasible.
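To get a sense of that scale, the figure can be computed exactly, since Python integers have arbitrary precision. The two inputs are the numbers quoted in the article; the variable names are illustrative.

```python
import math

actions_per_step = 4000  # figure quoted in the article
steps = 100              # figure quoted in the article

# Exact count of distinct action sequences of this length.
num_sequences = actions_per_step ** steps

num_digits = len(str(num_sequences))
log10_size = steps * math.log10(actions_per_step)

print(num_digits)            # 361
print(round(log10_size, 1))  # 360.2
```

The result is a 361-digit number, roughly 10 to the power of 360, so enumerating every candidate plan is out of the question.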
To make this space tractable, the researchers trained an AI agent composed of a value network and two policy networks, which efficiently explores and exploits the vast action space to identify optimal planning strategies. Specifically, the value network predicts the quality of a spatial plan based on how well it implements the "15-minute city" concept, while the two policy networks guide the agent in choosing locations for land use and for roads. Sampling actions from the policy networks and estimating rewards with the value network significantly narrows the part of the action space that has to be searched.
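The interplay between policies and value estimate can be sketched as follows. This is a toy illustration under loud assumptions, not the model from the paper: `land_use_policy`, `road_policy`, and `value_network` are hypothetical fixed formulas standing in for trained neural networks, and ranking a handful of sampled plans is a simplification of how a learned agent would actually be trained and used.

```python
import math
import random

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_index(logits, rng):
    """Draw one index with probability softmax(logits)[index]."""
    return rng.choices(range(len(logits)), weights=softmax(logits), k=1)[0]

# Hypothetical stand-ins for the trained networks: fixed formulas, no learning.
def land_use_policy(free_parcels):
    return [-0.3 * p for p in free_parcels]   # toy preference for low ids

def road_policy(free_segments):
    return [-0.3 * s for s in free_segments]  # toy preference for low ids

def value_network(plan):
    # Toy score: plans whose chosen parcels and roads have close ids score higher.
    return -sum(abs(p - r) for p, r in zip(plan["land_use"], plan["roads"]))

def rollout(num_parcels, num_segments, steps, rng):
    """Build one plan by sampling from both policies at every step."""
    free_parcels = list(range(num_parcels))
    free_segments = list(range(num_segments))
    plan = {"land_use": [], "roads": []}
    for _ in range(steps):
        i = sample_index(land_use_policy(free_parcels), rng)
        plan["land_use"].append(free_parcels.pop(i))
        j = sample_index(road_policy(free_segments), rng)
        plan["roads"].append(free_segments.pop(j))
    return plan

rng = random.Random(0)  # fixed seed so the sketch is repeatable
candidates = [rollout(6, 6, 3, rng) for _ in range(5)]
best = max(candidates, key=value_network)
print(best, value_network(best))
```

Only five plans are ever built here, each guided by the policy scores, and the value estimate picks among them. The exhaustive alternative would have to consider every ordering of parcels and segments.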
To obtain effective representations of urban geographic elements, the researchers further developed a state encoder based on a graph neural network (GNN). This encoder uses message passing and neighbor aggregation on the urban continuity graph to capture spatial relationships among land parcels, road segments, and intersections. The GNN state encoder is shared between the value network and the policy networks, supporting both reward prediction and location selection. Ultimately, the AI agent generates more efficient planning solutions than human experts.
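Message passing with neighbor aggregation can be shown in its simplest form: each node replaces its feature vector with the mean of its own and its neighbors' vectors. The sketch below assumes a hypothetical four-node graph and made-up two-dimensional features; a real GNN encoder would also apply learned weights and nonlinearities, which are omitted here.

```python
def message_passing_round(graph, features):
    """One round: each node averages its own feature vector with its neighbours'."""
    updated = {}
    for node, neighbours in graph.items():
        group = [features[node]] + [features[n] for n in neighbours]
        dim = len(features[node])
        updated[node] = [sum(vec[k] for vec in group) / len(group) for k in range(dim)]
    return updated

# Hypothetical toy graph: two parcels touch a road, which ends at a junction.
graph = {
    "parcel_A": ["road_1"],
    "parcel_B": ["road_1"],
    "road_1": ["parcel_A", "parcel_B", "junction_X"],
    "junction_X": ["road_1"],
}
# Made-up features: [is_land, is_road_or_intersection].
features = {
    "parcel_A": [1.0, 0.0],
    "parcel_B": [1.0, 0.0],
    "road_1": [0.0, 1.0],
    "junction_X": [0.0, 1.0],
}

h1 = message_passing_round(graph, features)
h2 = message_passing_round(graph, h1)
print(h1["junction_X"])  # [0.0, 1.0]
print(h2["junction_X"])  # [0.25, 0.75]
```

After one round the junction has only seen the road next to it. After two rounds its vector also reflects the land parcels two hops away, which is how stacked rounds let each element's representation absorb its wider spatial context.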
Extensive experimental results show that under identical initial conditions and planning constraints, this method significantly outperforms state-of-the-art algorithms and human experts, improving objective spatial efficiency metrics by over 48.6%. Particularly when using existing real communities as initial conditions, the model can generate land-use renovation plans that increase residents' accessibility to various facilities by over 18.5%.
Considering the maturity and complexity of urban planning methods, the researchers proposed a human-AI collaborative workflow based on this DRL model, in which human designers focus on conceptual prototyping while the model handles the heavy, time-consuming planning tasks.
The results show that human designers benefit from this AI-assisted workflow: it outperforms a fully human-driven process both on objective planning metrics and in blind subjective evaluations by 100 professional designers, while being 3000 times faster.
Furthermore, the model can learn general planning skills from simple scenarios and apply them to design large-scale complex planning tasks with different styles, such as green communities and service-oriented communities.
However, while this experiment generated over 1 million spatial plans, the dataset remains smaller than those used in comparable DRL tasks like Go or chip design. Scaling this approach to city-level applications would require collecting massive training samples from distributed clusters and training larger neural networks across multiple GPU servers.
Notably, the three decomposed subspaces (what to plan, where to plan, and how to plan) could be jointly optimized by agents, but this requires more training samples. The 'what to plan' component could also expand to include other urban sustainability elements like public transport routes. The framework currently doesn't account for subjective evaluation metrics like aesthetic and artistic scores.
The current framework is primarily guided by static metrics, which can generate spatially efficient community plans. However, planning an entire city is a more complex task that requires consideration of diverse objectives, including economic growth and resident health. It is nearly impossible to assess the impact of city-level planning using just a few static metrics.
In most experiments, the researchers set aside hundreds of urban planning rules and did not account for critical real-world issues such as land ownership, public access, urban segregation, and renovation. Yet, with necessary and reasonable adjustments, the method can address these practical planning rules and political challenges.
Despite its shortcomings, the significance of this research cannot be denied.
Machine learning, as a supportive tool, can enhance the productivity of human planners and potentially create more sustainable urban living. Beyond accelerating the spatial layout process for planners, it can also offer broader benefits to other stakeholders. By incorporating customizable options into models, public platforms can be built to encourage participation from residents and developers in the planning process.
As the research paper notes, urban planning is far from a simple game of selecting land use and road locations; it is a complex interaction among multiple stakeholders. The proposed framework in this study demonstrates the potential for greater participation from all stakeholders, representing a small step toward more transparent and inclusive cities.