
Postgraduate thesis: Enhancing efficiency, correctness, and social fairness in automated code generation

Title: Enhancing efficiency, correctness, and social fairness in automated code generation
Authors: Huang, Dong (黄东)
Advisor(s): Cui, H
Issue Date: 2025
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Huang, D. [黄东]. (2025). Enhancing efficiency, correctness, and social fairness in automated code generation. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Large Language Models (LLMs) are increasingly integrated into IDEs to assist with software development tasks such as code generation, debugging, and testing. LLMs have significantly enhanced developer productivity by generating code from natural language instructions. However, despite these advancements, LLM-generated code often suffers from critical shortcomings: functional incorrectness, poor efficiency, and social biases. These limitations hinder the practical deployment of LLMs in real-world software engineering, particularly in performance-critical and socially sensitive contexts. Functional incorrectness in LLM-generated code requires extensive manual intervention to debug and repair, slowing down software development workflows. Poor efficiency leads to increased execution time and resource consumption, rendering the code impractical for use in resource-constrained environments such as embedded systems or mobile devices. Inefficiency also exacerbates energy consumption, a growing concern for sustainable software engineering. Meanwhile, biases embedded in LLM-generated code can perpetuate inequities in critical applications, such as hiring algorithms or healthcare systems, limiting societal applicability. Addressing these challenges is essential to unlock the full potential of LLMs in software development.

This thesis proposes a comprehensive framework to address these challenges, presenting three key contributions that focus on improving the efficiency, correctness, and social fairness of LLM-generated code. First, we propose EffiBench and EffiLearner to address the inefficiency of LLM-generated code. EffiBench introduces the first benchmark specifically designed to measure efficiency, incorporating a collection of 1,000 efficiency-critical problems paired with canonical solutions optimized for time and space complexity. It integrates comprehensive test cases and diverse metrics, such as execution time and memory usage, to evaluate the efficiency of LLM-generated code. Building on this foundation, EffiLearner leverages the insights from EffiBench to introduce a self-optimization framework inspired by human coding practices. EffiLearner refines LLM-generated code iteratively using execution profiles that reveal computational overheads, enabling LLMs to reduce execution time and memory usage while improving overall efficiency.

Second, to simultaneously improve correctness and efficiency, we introduce EffiCoder, a fine-tuning dataset and framework that extends existing efforts. EffiCoder aggregates optimized solutions from multiple datasets and generates rich metadata and test cases to evaluate execution performance. By incorporating iterative self-optimization into the dataset construction process, EffiCoder enables LLMs to produce correct and high-performing code that balances functional requirements and computational efficiency. This framework bridges the gap left by previous fine-tuning approaches, which often focused exclusively on correctness.

Finally, to address social fairness, we propose the Code Bias Score (CBS) framework for evaluating and mitigating biases in LLM-generated code for bias-sensitive tasks. CBS employs automated test generation and Abstract Syntax Tree analysis to detect and quantify bias behaviors in generated code. In addition to evaluating fairness, CBS provides feedback to LLMs, guiding them to reduce biases during code generation. This approach ensures that LLMs produce code that adheres to ethical and equitable standards without sacrificing performance.

These contributions provide a unified framework for addressing the core limitations of LLM-generated code. By ensuring efficiency, correctness, and social fairness, this thesis paves the way for the broader adoption of LLMs in real-world software engineering, fostering sustainable, reliable, and socially responsible practices.
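The profiling-guided loop the abstract attributes to EffiLearner can be pictured with a short sketch. The Python below is illustrative only, not the thesis's implementation: `llm_refine` is a hypothetical placeholder for a call to any code LLM, and the profiler uses only the standard library to measure the execution time and memory usage the abstract mentions.

```python
import time
import tracemalloc

def profile(code: str) -> dict:
    """Run candidate code and record wall-clock time and peak memory."""
    tracemalloc.start()
    start = time.perf_counter()
    exec(code, {})  # execute the candidate solution
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {"time_s": elapsed, "peak_kb": peak / 1024}

def llm_refine(code: str, report: dict) -> str:
    """Hypothetical LLM call: the model would be shown the overhead
    profile and asked to rewrite the code; this stub returns it unchanged."""
    return code

def self_optimize(code: str, rounds: int = 3) -> str:
    """Iteratively refine code, keeping a candidate only if it is faster."""
    best, best_report = code, profile(code)
    for _ in range(rounds):
        candidate = llm_refine(best, best_report)
        report = profile(candidate)
        if report["time_s"] < best_report["time_s"]:
            best, best_report = candidate, report
    return best

print(profile("sum(i * i for i in range(10**5))"))
```

The design point the abstract makes is visible here: the model receives measured overheads rather than a generic "make it faster" instruction, and each rewrite is accepted only if the profile actually improves.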
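The AST-based detection behind the Code Bias Score can likewise be sketched. The CBS metric itself is defined in the thesis; the simplified assumption below merely flags generated functions whose branch conditions reference a sensitive attribute name, and reports the flagged fraction across samples.

```python
import ast

# Hypothetical list of sensitive attribute names for illustration.
SENSITIVE = {"gender", "race", "age", "religion", "nationality"}

def uses_sensitive_attribute(code: str) -> bool:
    """Flag code whose if-conditions read a sensitive attribute."""
    tree = ast.parse(code)
    for node in ast.walk(tree):
        if isinstance(node, ast.If):
            for sub in ast.walk(node.test):
                if isinstance(sub, ast.Name) and sub.id.lower() in SENSITIVE:
                    return True
                if isinstance(sub, ast.Attribute) and sub.attr.lower() in SENSITIVE:
                    return True
    return False

def code_bias_score(samples: list[str]) -> float:
    """Fraction of generated samples whose branching depends on a sensitive attribute."""
    flagged = sum(uses_sensitive_attribute(s) for s in samples)
    return flagged / len(samples) if samples else 0.0

# Example: a hiring-style snippet that branches on gender is flagged.
biased = "def screen(c):\n    if c.gender == 'male':\n        return 'hire'\n    return 'reject'"
fair = "def screen(c):\n    return 'hire' if c.score > 0.8 else 'reject'"
print(code_bias_score([biased, fair]))  # 0.5
```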
Degree: Doctor of Philosophy
Subject: Code generators; Automatic programming (Computer science)
Dept/Program: Computer Science
Persistent Identifier: http://hdl.handle.net/10722/356592

 

DC Field: Value
dc.contributor.advisor: Cui, H
dc.contributor.author: Huang, Dong
dc.contributor.author: 黄东
dc.date.accessioned: 2025-06-05T09:31:19Z
dc.date.available: 2025-06-05T09:31:19Z
dc.date.issued: 2025
dc.identifier.citation: Huang, D. [黄东]. (2025). Enhancing efficiency, correctness, and social fairness in automated code generation. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
dc.identifier.uri: http://hdl.handle.net/10722/356592
dc.language: eng
dc.publisher: The University of Hong Kong (Pokfulam, Hong Kong)
dc.relation.ispartof: HKU Theses Online (HKUTO)
dc.rights: The author retains all proprietary rights (such as patent rights) and the right to use in future works.
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.subject.lcsh: Code generators
dc.subject.lcsh: Automatic programming (Computer science)
dc.title: Enhancing efficiency, correctness, and social fairness in automated code generation
dc.type: PG_Thesis
dc.description.thesisname: Doctor of Philosophy
dc.description.thesislevel: Doctoral
dc.description.thesisdiscipline: Computer Science
dc.description.nature: published_or_final_version
dc.date.hkucongregation: 2025
dc.identifier.mmsid: 991044970874303414
