Log most expensive predicates and timings to query log#1349
Log most expensive predicates and timings to query log#1349angelapwen merged 5 commits intogithub:mainfrom
Conversation
| // Write summary to Query Log as well, as this information was present here before structured logging. | ||
| void logger.log(' --- Evaluator Log Summary --- ', { additionalLogLocation: this.logPath }); | ||
| void logger.log(buffer.toString(), { additionalLogLocation: this.logPath }); |
There was a problem hiding this comment.
Note that this will also log to the "CodeQL Extension Log" console view as well, so this information is showing in many different places (Query Server console and Structured Evaluator Log (Summary), and now Query Log and Extension Log). Does this seem too noisy?
There was a problem hiding this comment.
Hmmm...not sure. It seems like query information is best kept in the query log. But, if people are asking for it in the extension log as well, that's probably fine.
There was a problem hiding this comment.
Nobody is actually asking for it in the extension log, only in the query log. But I was under the impression that in order to log to the query log, I had to also log it to the extension log like above. logger.log() without the additionalLogLocation only writes to the extension log, I believe?
There was a problem hiding this comment.
Which log it writes to depends on the logger object at construction time. We have three global logger objects, one each for the extension log, the query server log, and the language server log. Where does this variable come from?
I also think it is best kept in the query server log.
There was a problem hiding this comment.
A ha, I see. I can just continue to use the query server logger object and log to an additional location of the query log (as requested).
aeisenberg
left a comment
There was a problem hiding this comment.
Looks good. Before merging can you try out running two queries at once to see if their results are intermingled in the logs?
In particular, try running 1 long query, and before it finishes, run a short query and see if al the log entries are in the proper file.
My guess is no, but this isn't something you need to worry about now. Pretty sure this is a long standing limitation of the query server, but want to see how it behaves with structured logging.
|
Eek, sorry, I had enabled auto-merge. Let me give that a try now. |
|
Ok. I ran Additionally it seems like both are taking quite a while to display results because the extension is blocking on |
|
Thanks for looking into this. Now that I remember, this is expected behaviour. Queries on the same database are run serially, and queries on separate databases are run in parallel. This is enough of an edge case that I think we are fine the way things are. |
Ah, I see, I had thought about trying a query on another database but didn't. I guess that's rare enough that we haven't heard this as pain point too much. I'm still not sure why both query results blocked on the former's summary file being created. I guess it is that the summary command is being run serially as well? 🤔 not quite what I expected. |
|
Hmmm...not sure. Perhaps @edoardopirovano can weigh in here. I would think that true parallelism here would be a major change to the query server, but it would be nice to better understand the mechanisms going on. |
|
(My apologies to any external contributors who may be following along this conversation: The comment below may be hard to follow as it contains many links to our private code base because that is where the underlying issue is.) Hmm, it looks like something is quite broken here. I don't think the fact we run queries serially on a single DB is the expected behaviour: I think that's a recent regression and some debugging shows that I unfortunately introduced this with a mistake in https://github.com/github/semmle-code/pull/41918. In particular, on this line I think I should have chosen My intent was that we would continue to allow parallel running of queries (as we have before), but would rotate the structured logs so that everything ends up in the log of the query that started last (not ideal, I'll admit, but this is a rare enough occurrence that it seemed okay to avoid the extra complexity of having to keep multiple file handles open and log to all of them). However, what's happened by using Note that using In summary, I think Angela has uncovered a bug and the solution is to change |
|
This all seems reasonable to me, but I am not entirely familiar with this area. Would you be able to create an issue for this in our internal repository and the rest of our team can comment? |

Prior to our move to structured logging, the most expensive predicates and timing summary was available in the Query Log generated for each query run. When we moved to structured logging, this information was only surfaced in the Query Server console and the Structured Evaluator Log (Summary) views. We are now logging this information back to the Query Log for parity with the existing user experience.
The per-query
query.logfile now ends with:Checklist
ready-for-doc-reviewlabel there.