image: In traditional architectural planning and design, both the client and the designer often spend considerable time and effort on multiple rounds of consultation and repeated coordination to clarify what the requirements actually mean (a). In architectural planning and design based on text-to-image models, data-driven text-to-image technology will become a real-time online digital production line, helping the client and the designer communicate clearly in a single pass, reach feasible design solutions, and present them efficiently and conveniently.
Credit: Beijing Zhongke Journal Publishing Co. Ltd.
Recently, Geographic Information Science published the research findings of Academician Zhang Xinchang and his team from the School of Geography and Remote Sensing at Guangzhou University. The study reviews the development of text-to-image technology, analyzes the existing issues in its application within urban and rural planning design, discusses potential solutions, and presents application experiments.
The study proposes three directions. To address the shortage of training data for text-to-image technology in urban and rural planning design, a data augmentation strategy specific to the field can be used to improve the model's adaptability to diverse scenarios. To strengthen the generation model's understanding and control of spatial information, a large model with spatial information enhancement based on instruction expansion can be constructed. To support local editing and precise layout, with fine control and dynamic adjustment in complex scenes, a large model for local editing in text-to-image generation based on induced layout can be explored. Experimental results show that text-to-image technology has great potential in urban and rural planning design, particularly in tasks such as the renovation of old residential areas, industrial zone planning, and localized rural redevelopment. Its interactive, real-time, and professional character offers a fresh perspective and tool support for innovation in urban and rural planning research.
With the rapid development of generative artificial intelligence, text-to-image technology has brought new opportunities to urban and rural planning design. It has shown significant advantages in design process optimization, image generation efficiency, and complex scene modeling. However, current research is still at an exploratory stage and faces many challenges, including data scarcity, insufficient use of domain-specific prior knowledge, and a lack of control over generated content. Even so, the technology is reshaping the design paradigm of urban and rural planning. As existing technical bottlenecks are overcome, text-to-image technology is expected to show greater creativity and practicality in more complex scenarios, pushing urban and rural design and planning applications toward a higher level of intelligence.
For more details, please refer to the original article:
Research and Application of Text-to-Image Technology Based on AI Foundation Models.
https://www.sciengine.com/JGIS/doi/10.12082/dqxxkx.2025.240657 (If you want to see the English version of the full text, please click on “iFLYTEK Translation” on the article page.)
Article Title
Research and Application of Text-to-Image Technology Based on AI Foundation Models
Article Publication Date
25-Jan-2025