Issues with Cody Not Finding Relevant Files Compared to Cursor

I just want to let the developers know that Cody returns “empty” context (zero files found) for questions that Cursor handles easily. For example, when I ask something like “Where is the movement system for characters in this game?”, Cursor successfully identifies the main Movement script along with several related scripts. Cody, however, fails to find any relevant files and instead offers explanations based on the project name, without identifying the actual files. This isn’t an isolated case; the same happens with every other question I ask. Because of this, I’ve completely stopped using Cody for navigating my codebase.

Just to clarify, in case it matters: my projects are Unity projects with 10-200 scripts, many folders, and a functioning Codyignore file (which excludes plenty of folders and files). I know the ignore file is working because Cody doesn’t interact with the ignored files.

I’m open to the possibility that my issues are caused by an unstable Codyignore, but when I see articles like “Chat-oriented programming (CHOP) in action”, which includes prompts such as “Let’s ask something a little more complex - such as how would I add a new LLM model to our Cody VS Code extension”, I’m amazed that I can’t achieve anything even remotely similar. Are the developers aware of these issues, and are they being worked on, or am I the only one experiencing this?

Hey sorry you’re running into this, can you send me some screenshots of what you’re seeing in chat? This seems like an indexing-related issue. If you’d prefer to keep these private, feel free to DM me on Twitter or shoot them over to beyang@sourcegraph.com. Appreciate the feedback!

Hi there!

While trying to take a screenshot, I managed to narrow down the issue a bit.

  1. A lot of it has to do with the language used for the query. When using Cursor, I’m used to making queries in a language other than English, and this usually doesn’t cause any issues. With Cody, things at least start working when I make queries in English (in both blended and embeddings modes). This partially resolves the problem for me… But an automatic translation of the query into English (which seems to be critical) would be much more convenient for many users whose native language is not English.

  2. The “Keyword” context mode practically doesn’t work at all (or works strangely) in any language. I also noticed during testing that in this mode it’s critical not to place a punctuation mark immediately after the last letter. I’m attaching two screenshots taken in this mode. In one, the question mark is attached to the last word, and the chat can’t find any files. In the other, it isn’t (apparently this is the correct way to search, but how would a user know that?), and the chat finds a file, but the wrong one! That’s odd: the screenshot clearly shows that I have a file with that name, containing a class with that name, yet Cody suggests a different file for some reason.

  3. As for the embeddings mode, as I found out earlier, it generally works in English, and I could probably rely on it. But two important nuances, especially the second, make that harder than with Cursor:

3.1. First, Cody indexes everything, the entire project. In my case (a Unity game), it indexes 1000 times more than necessary, ignoring the ignore file. I’m not sure if this is good for subsequent efficiency, but I hope that during the queries, the ignore file will at least help avoid distractions. In any case, indexing a new project can take 20-30 minutes. I tried switching to the pre-release version (v1.31.1724080603), and maybe I misunderstood something, but it seems like no indexing happens at all in that version.

3.2. More importantly, there is the indexing frequency. In Cursor, indexing happens every 10 minutes and can be triggered manually. Because the ignore file is taken into account there, it only takes a few seconds. With Cody, I don’t understand at all when changes get re-indexed. Sometimes something significant changes in the project within minutes, and waiting weeks (or however long it takes) for the index to catch up seems strange. I know that you can delete the corresponding files to force re-indexing, but then you have to wait another 30 minutes.
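
To illustrate the punctuation problem from point 2: a keyword search that compares raw whitespace-separated tokens would treat a word with an attached question mark as a different word from the file name. The sketch below shows one way a query could be normalized before matching. It is only an illustration, not Cody's actual implementation; the function names and the file names (PlayerMovement.cs, Enemy.cs) are hypothetical.

```python
import string


def keyword_tokens(query: str) -> list[str]:
    """Split a query into lowercase keywords, dropping attached punctuation."""
    tokens = []
    for raw in query.split():
        # "PlayerMovement?" becomes "PlayerMovement"; a lone "?" becomes "".
        word = raw.strip(string.punctuation)
        if word:
            tokens.append(word.lower())
    return tokens


def matching_files(query: str, filenames: list[str]) -> list[str]:
    """Return file names whose stem equals one of the query keywords."""
    wanted = set(keyword_tokens(query))
    return [
        name for name in filenames
        if name.rsplit(".", 1)[0].lower() in wanted
    ]


# Hypothetical project files, for illustration only.
files = ["PlayerMovement.cs", "Enemy.cs"]
print(matching_files("Where is PlayerMovement?", files))
print(matching_files("Where is PlayerMovement ?", files))
```

With this kind of normalization, both spellings of the question produce the same keywords, so the user would not need to know about the spacing rule.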

I’ve spent a lot of time on these experiments and descriptions for you, and I hope that this helps improve something.

*.*
!/.cody
!.ignore
!/Assets/
/Assets/*
!/Assets/Scripts/
!*.txt
!*.cs
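
For readers unfamiliar with this style of ignore file, here is a rough sketch of the "ignore everything, then re-include a few things" idea, applied before indexing so that only the wanted files are ever processed. It is a deliberately simplified model (ordered patterns, `!` negates, last match wins) and does not reproduce full gitignore semantics or how Cody reads its ignore file. The patterns and paths are hypothetical.

```python
from fnmatch import fnmatchcase


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Apply patterns in order; the last pattern that matches decides."""
    ignored = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        glob = pattern[1:] if negated else pattern
        # Note: fnmatch's "*" also matches "/", unlike real gitignore rules.
        if fnmatchcase(path, glob):
            ignored = not negated
    return ignored


def files_to_index(paths: list[str], patterns: list[str]) -> list[str]:
    """Keep only the paths that survive the ignore patterns."""
    return [p for p in paths if not is_ignored(p, patterns)]


# Hypothetical patterns: ignore everything except C# scripts under Assets/Scripts.
patterns = ["*", "!Assets/Scripts/*.cs"]
paths = [
    "Library/cache.bin",
    "Assets/Textures/hero.png",
    "Assets/Scripts/Movement.cs",
]
print(files_to_index(paths, patterns))
```

Filtering at this stage, rather than only at query time, is what would keep a Unity project's thousands of generated files out of the index in the first place.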


Thanks for sharing this! Will dive in and post back here when these are fixed. We have some upcoming context improvements that should help here (an improved embeddings model and reranker, and better server-side indexing).

Thank you nikobellic for sharing your feedback. We have a lot on the roadmap for improvements to context, both remote context for enterprises, as well as local context for your local code. For example, improving how we index (frequency, w.r.t size of code base, etc.) and foreign languages are on our roadmap!

Just in case: after the latest update, codebase questions don’t work at all if blended mode is enabled and I’m using an ignore file. (It’s not that I’m ignoring the wrong files; I can discuss these files individually with Cody just fine.) If I disable the experimental mode and remove the ignore file, the search starts finding something, but mostly garbage. Again, I have thousands of files that need to be ignored; more precisely, only the Scripts folder should not be ignored.

Hi everyone,

I haven’t found any related news, but I’ve been testing, and the chat’s performance has improved. It now respects the .gitignore file when finding files in embedding mode (and it has actually started to find them!).

I’m still unsure about the frequency of embedding updates and whether manual updates are possible. Also, it still attempts to index embeddings for the entire project, ignoring the .gitignore file during that process. It’s been indexing for 20 minutes and is only at 5%, lol.
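
As a rough illustration of why an incremental update could take seconds rather than a full re-index, here is a sketch of change detection by content hash: only files whose hash differs from the previous run would need new embeddings. This is an assumption about how such a scheme could work, not a description of how Cody or Cursor actually schedule indexing; all names and file contents are hypothetical.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a file's contents."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reindex(
    files: dict[str, str], previous: dict[str, str]
) -> tuple[list[str], list[str], dict[str, str]]:
    """Compare current file contents against hashes saved from the last run.

    Returns (paths to re-embed, paths to drop from the index, new hash state).
    """
    current = {path: content_hash(text) for path, text in files.items()}
    changed = sorted(p for p, h in current.items() if previous.get(p) != h)
    removed = sorted(p for p in previous if p not in current)
    return changed, removed, current


# First run: everything is new, so everything is indexed.
changed, removed, state = plan_reindex({"a.cs": "x", "b.cs": "y"}, {})
print(changed, removed)

# Later run: only b.cs was edited, so only b.cs needs re-embedding.
changed, removed, state = plan_reindex({"a.cs": "x", "b.cs": "z"}, state)
print(changed, removed)
```

A manual "re-index now" button would then just run this comparison on demand instead of waiting for a timer.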

Glad to hear the ignore feature is working properly for you now.

As far as I know, the embeddings are updated every 24 hours depending on the number of file changes.
I will post the exact numbers soon.

The speed of generating embeddings should have been improved in the last few days.