| Title | Enhancing efficiency, correctness, and social fairness in automated code generation |
|---|---|
| Authors | Huang, Dong (黄东) |
| Advisors | Cui, H |
| Issue Date | 2025 |
| Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
| Citation | Huang, D. [黄东]. (2025). Enhancing efficiency, correctness, and social fairness in automated code generation. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
| Abstract | Large Language Models (LLMs) are increasingly integrated into IDEs to assist with software development tasks such as code generation, debugging, and testing. LLMs have significantly enhanced developer productivity by generating code from natural language instructions. However, despite these advancements, LLM-generated code often suffers from critical shortcomings: functional incorrectness, poor efficiency, and social biases. These limitations hinder the practical deployment of LLMs in real-world software engineering, particularly in performance-critical and socially sensitive contexts.
Functional incorrectness in LLM-generated code requires extensive manual intervention to debug and repair, slowing down software development workflows. Poor efficiency leads to increased execution time and resource consumption, rendering the code impractical for resource-constrained environments such as embedded systems or mobile devices. Inefficiency also exacerbates energy consumption, a growing concern for sustainable software engineering. Meanwhile, biases embedded in LLM-generated code can perpetuate inequities in critical applications, such as hiring algorithms or healthcare systems, limiting their societal applicability. Addressing these challenges is essential to unlock the full potential of LLMs in software development.
This thesis proposes a comprehensive framework to address these challenges, presenting three key contributions that improve the efficiency, correctness, and social fairness of LLM-generated code. First, we propose EffiBench and EffiLearner to address the inefficiency of LLM-generated code. EffiBench is the first benchmark specifically designed to measure code efficiency, comprising 1,000 efficiency-critical problems paired with canonical solutions optimized for time and space complexity; it integrates comprehensive test cases and diverse metrics, such as execution time and memory usage, to evaluate the efficiency of LLM-generated code. Building on this foundation, EffiLearner introduces a self-optimization framework inspired by human coding practices: it iteratively refines LLM-generated code using execution profiles that reveal computational overheads, enabling LLMs to reduce execution time and memory usage while improving overall efficiency.
Second, to simultaneously improve correctness and efficiency, we introduce EffiCoder, a fine-tuning dataset and framework that extends existing efforts. EffiCoder aggregates optimized solutions from multiple datasets and generates rich metadata and test cases to evaluate execution performance. By incorporating iterative self-optimization into the dataset construction process, EffiCoder enables LLMs to produce correct and high-performing code that balances functional requirements and computational efficiency. This framework bridges the gap left by previous fine-tuning approaches, which often focused exclusively on correctness.
Finally, to address social fairness, we propose the Code Bias Score (CBS) framework for evaluating and mitigating biases in LLM-generated code on bias-sensitive tasks. CBS employs automated test generation and Abstract Syntax Tree (AST) analysis to detect and quantify biased behavior in generated code. Beyond evaluating fairness, CBS provides feedback to LLMs, guiding them to reduce biases during code generation. This approach ensures that LLMs produce code that adheres to ethical and equitable standards without sacrificing performance.
These contributions provide a unified framework for addressing the core limitations of LLM-generated code. By ensuring efficiency, correctness, and social fairness, this thesis paves the way for the broader adoption of LLMs in real-world software engineering, fostering sustainable, reliable, and socially responsible practices. |
| Degree | Doctor of Philosophy |
| Subject | Code generators; Automatic programming (Computer science) |
| Dept/Program | Computer Science |
| Persistent Identifier | http://hdl.handle.net/10722/356592 |
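
To ground the EffiBench and EffiLearner contributions described in the abstract, here is a minimal sketch, assuming a Python harness: execution time and peak memory are measured with the standard `time` and `tracemalloc` modules, and that measurement drives a profiling-guided refinement loop. The function names and the `llm_optimize` placeholder are illustrative assumptions, not the thesis's actual implementation.

```python
# A minimal sketch of EffiBench/EffiLearner-style evaluation (assumed
# design, not the thesis's implementation): time a generated solution,
# record its peak memory, and iterate with an LLM-based rewriter.
import time
import tracemalloc

def profile_source(source: str, call: str) -> tuple[float, int]:
    """Define the function in `source`, run `call`, and return
    (wall-clock seconds, peak bytes allocated)."""
    env: dict = {}
    exec(source, env)                     # define the function under test
    tracemalloc.start()
    start = time.perf_counter()
    eval(call, env)                       # run the workload
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, peak

def llm_optimize(source: str, report: str) -> str:
    """Hypothetical placeholder: a real system would prompt a code LLM
    with the source plus its execution profile and request a rewrite."""
    return source

def refine(source: str, call: str, rounds: int = 3) -> str:
    """EffiLearner-style loop: profile, request a rewrite, and keep a
    candidate only if it strictly improves measured execution time."""
    best, (best_t, best_m) = source, profile_source(source, call)
    for _ in range(rounds):
        report = f"last run: {best_t:.4f}s, {best_m} peak bytes"
        candidate = llm_optimize(best, report)
        t, m = profile_source(candidate, call)
        if t < best_t:
            best, best_t, best_m = candidate, t, m
    return best

# Usage: a deliberately quadratic solution to a "keep unique items" task.
slow = "def uniques(xs):\n    return [x for x in xs if xs.count(x) == 1]"
print(profile_source(slow, "uniques(list(range(300)) * 2)"))
print(refine(slow, "uniques(list(range(300)) * 2)") == slow)  # True: stub
```

The loop keeps a rewrite only when it strictly improves the measured time, mirroring the abstract's point that execution profiles, rather than the model's intuition alone, drive the optimization.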
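The Code Bias Score paragraph lends itself to a similar sketch. Below is an assumed illustration of the Abstract Syntax Tree side of such an analysis, using Python's standard `ast` module to flag `if` tests that branch on sensitive attribute names; the attribute list and the detection rule are hypothetical stand-ins rather than CBS's actual scoring.

```python
# Illustrative sketch (not the actual CBS implementation): walk the AST
# of generated code and flag if-tests that branch on sensitive names.
import ast

SENSITIVE = {"gender", "race", "age", "religion"}  # assumed attribute list

def find_bias_sites(source: str) -> list[tuple[int, str]]:
    """Return (line, attribute) pairs where an if-test references a
    sensitive name (a crude static proxy for biased branching)."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.If):
            for sub in ast.walk(node.test):
                if isinstance(sub, ast.Name) and sub.id in SENSITIVE:
                    hits.append((node.lineno, sub.id))
    return hits

generated = """
def screen(applicant, gender, score):
    if gender == "male" and score > 50:
        return "interview"
    return "reject"
"""
print(find_bias_sites(generated))  # [(3, 'gender')]
```

A full scorer in the spirit of the abstract would pair such static hits with automated test generation, running the code on inputs that differ only in a sensitive attribute and comparing the outcomes.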
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | Cui, H | - |
| dc.contributor.author | Huang, Dong | - |
| dc.contributor.author | 黄东 | - |
| dc.date.accessioned | 2025-06-05T09:31:19Z | - |
| dc.date.available | 2025-06-05T09:31:19Z | - |
| dc.date.issued | 2025 | - |
| dc.identifier.citation | Huang, D. [黄东]. (2025). Enhancing efficiency, correctness, and social fairness in automated code generation. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
| dc.identifier.uri | http://hdl.handle.net/10722/356592 | - |
| dc.language | eng | - |
| dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
| dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
| dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
| dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
| dc.subject.lcsh | Code generators | - |
| dc.subject.lcsh | Automatic programming (Computer science) | - |
| dc.title | Enhancing efficiency, correctness, and social fairness in automated code generation | - |
| dc.type | PG_Thesis | - |
| dc.description.thesisname | Doctor of Philosophy | - |
| dc.description.thesislevel | Doctoral | - |
| dc.description.thesisdiscipline | Computer Science | - |
| dc.description.nature | published_or_final_version | - |
| dc.date.hkucongregation | 2025 | - |
| dc.identifier.mmsid | 991044970874303414 | - |
