by Shigeyasu Fujisaka, Tatsuya Hishiki, Takahiro Yoshimi
In recent years, the use of generative AI in application development has attracted a lot of attention. DXC Technology Japan (DXC) is also promoting specific initiatives in Japan to improve quality and productivity using generative AI. As an example, we will focus on the contents of a PoC (proof of concept) that utilized generative AI in application development with a customer.
Verifying the effects and possibilities of application development using ChatGPT
DXC was quick to pay attention to the technological innovation of generative AI, including ChatGPT, which was announced in November 2022, and is implementing measures to fundamentally improve quality, cost, and delivery (QCD) in application development, operation, and maintenance. As such, we have been evaluating and verifying how to use it in Japan ahead of the rest of the world. We believe that usability in actual development projects is most important, so we conducted a PoC with the customer.
In the PoC, we used Microsoft’s Azure AI services, including ChatGPT4, as the specific generative AI technology and verified their effectiveness. To meet our QCD improvement goals, we also needed to verify the processes required for system development with generative AI and to redefine management methods and implementation techniques.
PoC and results in three themes
The PoC covered the following three themes.
1. Analysis of legacy programs using generative AI
Analyze existing legacy application programs using generative AI to clarify their structure and functions.
2. Establishment of a modern application development process based on generative AI
By utilizing generative AI in the development of new applications, we aim to streamline the development process.
3. Utilization of company data in generative AI (source code, design documents, etc.)
Aim for more advanced development methods by utilizing generative AI that incorporates source code and design documents accumulated by companies.
For these themes, DXC uses a system built on Azure AI services, including Azure OpenAI Service. It provides a variety of functions such as registering corporate data, editing input prompts, generating text, and reflecting the generated results, and it uses a proprietary mechanism to work around the issues and constraints of Azure OpenAI Service (and ChatGPT). We call this system “DXC J Chat”.
Feedback from the PoC and other projects has clarified concrete use cases for the effective use of generative AI in application development:
・Analyze program code and generate documents such as design documents
・Generate program code from the detailed design document
・Generate test items and test code from program code based on test policy
In addition, for program code in which an error has occurred, we confirmed that generative AI can explain the reason for and cause of the error, suggest a fix, refactor the code, and generate other program code from existing code.
Challenges of utilizing generative AI and countermeasures
There are also various restrictions when using ChatGPT. Through the PoC, DXC has also been considering solutions.
Typical limitations include that input and output are in text format only, that the maximum number of tokens is approximately 32,000, and that the output process is a black box. While the output can be largely stabilized through parameters, small modifications to the prompt can make a big difference. The challenges are therefore opacity, since the output cannot be fully controlled, and accuracy and completeness, since omissions and summarization can occur even within the maximum number of tokens.
Therefore, during the analysis process, we cleanse the source code and divide the processing to address input constraints and accuracy and completeness issues. Regarding input constraints related to text format limitations, we have made it possible to input and output characters and tables in HTML format. In addition, we have made it possible to input and output figures in script format. Furthermore, we created a results display framework (website) to enable the management of analysis results and design documents.
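The cleansing and dividing step can be illustrated with a minimal sketch. It is not DXC's implementation: the function names, the four-characters-per-token estimate, and the choice to use half of the limit as a budget are all assumptions made for illustration, and a real system would likely count tokens with the model's own tokenizer.

```python
import re

MAX_TOKENS = 32_000      # approximate limit mentioned in the article
CHARS_PER_TOKEN = 4      # rough assumption, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Crude estimate: about one token per CHARS_PER_TOKEN characters."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def cleanse(source: str) -> str:
    """Strip trailing whitespace and collapse runs of blank lines."""
    lines = [line.rstrip() for line in source.splitlines()]
    text = "\n".join(lines)
    return re.sub(r"\n{3,}", "\n\n", text).strip()


def split_for_prompt(source: str, budget: int = MAX_TOKENS // 2) -> list[str]:
    """Split cleansed source on line boundaries into chunks within the budget.

    A single line that exceeds the budget on its own still becomes
    one (oversized) chunk; this sketch does not split inside a line.
    """
    chunks, current, used = [], [], 0
    for line in cleanse(source).splitlines():
        cost = estimate_tokens(line + "\n")
        if current and used + cost > budget:
            chunks.append("\n".join(current))
            current, used = [], 0
        current.append(line)
        used += cost
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Leaving part of the token limit unused keeps room for the instructions and the generated output, which share the same limit as the source code being analyzed.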
To deal with the opacity of a black-box output process, we have taken measures such as creating a “prompt template” consisting of a “fixed part” and a “variable part” so that uniform output results can be obtained.
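As a hedged illustration of the fixed-part and variable-part idea, the following sketch uses Python's standard `string.Template`. The wording of the instructions, the placeholder names, and the `build_prompt` helper are hypothetical; only the three output sections are taken from the deliverables named in this article.

```python
from string import Template

# Fixed part: instructions that stay identical for every request,
# so the structure of the output stays uniform.
FIXED_PART = (
    "You are analyzing a legacy program.\n"
    "Return exactly three sections: "
    "Flowchart, Input/Output List, Processing Details.\n"
    "Do not summarize or omit any branch.\n"
)

# Variable part: placeholders that are filled in per request.
VARIABLE_PART = Template(
    "Language: $language\n"
    "Program name: $program_name\n"
    "Source code:\n$source_code\n"
)


def build_prompt(language: str, program_name: str, source_code: str) -> str:
    """Combine the fixed part with a filled-in variable part."""
    return FIXED_PART + VARIABLE_PART.substitute(
        language=language,
        program_name=program_name,
        source_code=source_code,
    )
```

Because every prompt begins with the same fixed text, differences between outputs can be traced to the variable part rather than to incidental changes in wording.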
Another major challenge during the PoC was establishing a process that lets generative AI understand company-specific internal data. General-purpose LLMs (large language models) like ChatGPT do not have company-specific information. DXC has developed a system that stores company-specific information and adds the business information that matches the “intent” of the input prompt to the input. This makes it possible for generative AI to generate information specific to a company with higher accuracy.
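The general pattern described here, storing company information and adding the matching pieces to the prompt, can be sketched as follows. This is an assumption-laden toy, not the DXC system: matching by shared keywords stands in for whatever intent matching is actually used (a production system would more likely use embedding-based search), and the `Document`, `retrieve`, and `augment` names are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase alphanumeric words; a stand-in for real intent matching."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(prompt: str, store: list[Document], top_k: int = 2) -> list[Document]:
    """Return the stored documents sharing the most words with the prompt."""
    query = tokenize(prompt)
    scored = [(len(query & tokenize(d.title + " " + d.text)), d) for d in store]
    scored = [(score, d) for score, d in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:top_k]]


def augment(prompt: str, store: list[Document], top_k: int = 2) -> str:
    """Prepend matching company material to the prompt, if any was found."""
    context = retrieve(prompt, store, top_k)
    if not context:
        return prompt
    blocks = "\n\n".join(f"[{d.title}]\n{d.text}" for d in context)
    return f"Company reference material:\n{blocks}\n\nRequest:\n{prompt}"
```

For example, a request to generate code that follows the naming standard would be sent together with the stored coding standard, so the model can apply rules it was never trained on.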
Confirmed efficiency of over 50% compared to manual labor
We also present the results that were measured quantitatively.
We have confirmed that legacy program analysis (source analysis) is more than 50% more efficient than manual analysis. The analysis consists of flowcharts, input/output lists, and processing details, all of which have been highly rated for readability, accuracy, and difficulty. However, since ChatGPT cannot guarantee the integrity of each deliverable, it has become clear that human checking and final adjustments are essential.
Regarding the establishment of a modern application development process, we confirmed significant effects on both productivity and quality indicators. The productivity index improved by 120% to 300% compared to standard values. Regarding quality indicators, the quality of development and unit tests exceeded the standard, indicating that sufficient quality can be ensured.
The quality of the generated code was also highly rated because it complied with coding standards and was highly human-readable.
Regarding the comprehensiveness of the tests, once the test policy and viewpoints had been prepared, ChatGPT was able to select the test items and test scripts required for the application logic and cover them without excess or omission.
Development using generative AI has been shown to achieve more stable and higher quality than human-based coding that depends on individuals and quality assurance through individualized testing.
DXC will use the knowledge gained to promote the use of generative AI in application development and operation. The “Generative AI CoE (Center of Excellence)”, a cross-organizational in-house community launched by DXC in Japan, holds active discussions on a variety of topics daily and contributes to this effort.
We believe that generative AI is not just an automation tool, but has the potential to create new value when combined with human ingenuity. Furthermore, we believe that feeding the results that humans create by adding new value back into generative AI will produce a synergistic effect that can be called co-evolution between humans and generative AI.
DXC’s consultants and engineers will support you by leveraging their extensive knowledge of application development and operation. If you are thinking about using generative AI in your company, please feel free to contact us.