Interesting (arguably) case where AI falls down

Okay, so not my finest coding moment here.

I should note that I’m on the free plan, using Claude Sonnet here.

I have some fairly complicated code that includes a section like this:

if conditionAHolds {
    ...
} else {
    thing1
    thing2
    if conditionAHolds {
        codeWhichIsNotCovered
    }
}

And I’m sure you see immediately why codeWhichIsNotCovered never shows up as executed in my coverage tool when I run my test suite. Like I say, not the best bit of code ever.

I asked Cody why, and it told me that I needed a test case to cover it, and wrote a perfectly cromulent test case which… didn’t hit the line in question when run.

So I told it, “nope, that doesn’t cover that line”, and it apologized and promptly gave me pretty much exactly the same test again.

Suddenly, the light went on, and (in case you haven’t had your coffee/tea yet) I realized that inside that else branch, conditionAHolds can’t possibly be true. So codeWhichIsNotCovered is literally impossible to reach.
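
If you want to play with the failure mode yourself, here is a minimal, runnable reduction in Swift (the names are placeholders, not my actual code). A typical compiler won’t flag the inner branch as dead, since that would take flow-sensitive value analysis, but a coverage report will always show it as unexecuted:

// Minimal, runnable reduction of the pattern above.
func process(conditionAHolds: Bool) {
    if conditionAHolds {
        print("A path")
    } else {
        // We only get here when conditionAHolds is false...
        print("thing1")
        print("thing2")
        if conditionAHolds {
            // ...so this branch is unreachable, and no test case
            // can ever make coverage report it as executed.
            print("never printed")
        }
    }
}

process(conditionAHolds: true)
process(conditionAHolds: false)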

This seems to point out that current AIs are, indeed, just doing textual analysis, not logical analysis, so they (in effect) can’t understand the logic error I made here.

So… I’m curious whether people have run across this kind of thing, and whether any of the AIs have been seen to find problems like this.

Thanks!

I tried it with the following prompt:

Identify subtle patterns that might lead to bugs, even if syntactically correct.

and the excerpt from the response was:

Thank you for sharing this code snippet. As a senior software engineer, I’d be happy to analyze it for subtle patterns that might lead to bugs. Let’s go through this code step by step and identify potential issues.

  1. Redundant condition check: The most noticeable issue in this code is the redundant check for conditionAHolds. It’s checked twice, once in the outer if-statement and again in the nested if-statement inside the else block. This redundancy can lead to confusion and potential bugs.
  2. Unreachable code: The nested if-statement inside the else block that checks for conditionAHolds will never be true. This is because we only enter the else block when conditionAHolds is false in the first place. This means codeWhichIsNotCovered() will never be executed, which is likely not the intended behavior.

So Sonnet 3.5 did indeed find the subtle bug in the code.
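
For completeness, the right repair depends on what codeWhichIsNotCovered was actually meant to do, which only the original author knows. One plausible fix, purely as a sketch, is that the inner check was intended to test a different condition; conditionBHolds below is invented for illustration:

// Hypothetical repair: assume the inner check was meant to test
// some other condition (conditionBHolds is an invented name).
func processFixed(conditionAHolds: Bool, conditionBHolds: Bool) {
    if conditionAHolds {
        print("A path")
    } else {
        print("thing1")
        print("thing2")
        if conditionBHolds {
            // Now reachable: a test with conditionAHolds == false
            // and conditionBHolds == true covers this line.
            print("previously dead work")
        }
    }
}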

What prompt did you use?

Interesting – I’ll try that prompt. Evidently I didn’t check in the code with the bug, so I can’t step back and try yours.

The prompt I used was, approximately, “tell me how to get coverage on line 161”.

Prompting is an interesting “science” :slight_smile:

Yes, prompting is both science and art.

It is so fascinating to see how the “behavior” of LLMs changes just by prompting. I did a lot of experimenting, creating semantic networks, replacing words one by one with synonyms, and all that kind of stuff.

In the early days of GPT, once the models had gained enough semantic reasoning, they were like a linguistic mirror of yourself, directly linked to your emotions, thought processes, and conclusions, without anthropomorphizing.

Nowadays, they are more helpful and suggestive. By specifying the writing style I want in the answers, I have learned so many new terms and words, and my vocabulary has become much larger and more diverse. And the more you grow with AI as a tool, the more the AI grows with you.
And hallucinations are great in the sense that, with higher temperature settings, they can lead to other perspectives, nonsense or not.

Fascinating analysis! Really interesting to hear from someone with a lot of history with LLMs.