Fundamental AI Problem Solved by UK Startup Zoea for Code Generation

Fundamental AI Problem Solved by UK Startup Zoea for Code Generation
London-based AI startup, Zoea Ltd, has created a ground-breaking yet simple approach that massively minimises the combinatorial explosion in certain classes of problems. This phenomenon arises when the number of potential states in a system grows exponentially, such as the possible board configurations in a chess game, which makes solving many problems extremely difficult, if not impossible. The issue is particularly prevalent in AI but can also arise in other domains. 

The new approach has been developed within the context of the existing Zoea code generation system. Zoea transforms a set of test cases, comprising example inputs and corresponding outputs, into code directly, using a classic AI technique called Inductive Programming. Inductive Programming provides significant benefits over deep learning, such as greater transparency and the elimination of the need for training. However, it also suffers from the combinatorial explosion, which was the primary problem that Zoea set out to solve. 

Zoea’s breakthrough method reduces the amount of work required by between five and twenty orders of magnitude, depending on the size of the required program. This is equivalent to the difference between a problem taking around 950 years to solve verses taking just three seconds. 

The approach relies on the fact that programming languages have approximately 200 instructions, but most individual programs use fewer than ten of them. Therefore, if one could guess the instructions required for a given problem, it would be much easier to solve. In the new approach, guesses take the form of thousands of overlapping subsets of instructions, each containing between ten and fifty instructions derived from existing code. 

The probability of at least one subset containing all the required instructions is very high, as the distribution of instructions used by human developers is highly skewed and these patterns are preserved in the subsets. The subsets also allow problems to be tackled by hundreds or thousands of computers in parallel, with minimal duplication of effort due to overlap. Furthermore, even with thousands of subsets, less than ten guesses are required to find a solution in 50% of cases. 

This approach will enable Zoea to produce much larger programs in considerably less time, making Inductive Programming a more compelling option for AI-based code generation. Additionally, variations of this approach may help address the combinatorial explosion in other areas of AI and computing. 

“Some of the best brains in AI have been trying to solve this problem for decades” says Edward McDaid, CTO at Zoea Ltd. “The answer has in fact been hiding in plain sight the whole time”. 

“So far we’ve only scratched the surface with this new approach. There are a lot of improvements and refinements that are possible which could deliver even bigger benefits”. 

Full details of the approach and the results have been peer reviewed, published and were recently presented at the ICAART 2023 AI conference in Lisbon. 

A preprint of the paper is available here and a non-technical blog post is available here.