Each time when creating an application that is using document databases like CosmosDB we are facing a challenge of estimation of its costs and later optimisation. Both tasks are not so easy to do. To solve the first one we can use cost calculator. We just need to provide main characteristics of our solution (like a number of documents, information about actions per seconds, …) and also we need to provide sample documents that will be stored in our database. After doing that we will be able to read the estimated cost of our solution:
Based on this data we should decide if we like to use CosmosDB or maybe we should think about the other solution. Based on my observations, I can say that most developers do not have any issues when they are staring using CosmosDB from a coding perspective. At the beginning they need to have some time to change the way of thinking – from relational one to document one. After doing that they will be able to use the whole potential of this product.
But sooner or later they will need to optimise queries. Reasons for those actions can be different. Sometimes they will need to reduce the cost of the database. Sometimes they will need to improve queries performance. In both cases we should based on statistics of system usage. We should measure the cost of queries and the frequency of usage. Then we will be able to select for optimisation queries that have the biggest impact on our solution.
I think most of us have some experience with Azure SQL and know that this product provides good quality statistics:
Data available in Query Performance Insight allows us to start optimisation smartly.
It is hard to say the same in case of CosmosDB. We have only access to basic statistics. We can check only how many RU is our solution using and if the provisioned value is not exceeded or overestimated. By using this data it is very hard to select anything to improve.
From my perspective its not good enough. I want to be able to monitor each query and get detailed information about it. It should show the exact query text, cost of it, information about target collection, etc. That information should allow me to calculate statistics regarding the whole system.
Mentioned information should be displayed in two ways. The first one as details of the specific query:
or as a table with information about a group of queries:
Based on that information finally, we can plan our optimisations. You can see that we can find information like:
- name of the type of query
- query duration
- query text
- name of collection targeted by the query
- cost of query
in the result table.
And of course, you can export those data and conduct any kind of analysis you want.
To achieve this situation, you need to create an extension to CosmosDB that will log mentioned statistics into Application Insights. The source code of this extension looks like this:
public static class ApplicationInsightsQueryTracker { private static readonly TelemetryClient TelemetryClient = new TelemetryClient(); public static string DependencyName { get; set; } = ""; public static async Task<FeedResponse<T>> ExecuteWithStatsLogging<T>(this IQueryable<T> queryable, string commandName = null, string operationId = null) { var documentQuery = queryable.AsDocumentQuery(); var now = DateTimeOffset.UtcNow; var stopwatch = Stopwatch.StartNew(); var response = await documentQuery.ExecuteNextAsync<T>(); stopwatch.Stop(); LogStats(now, stopwatch.Elapsed, response.RequestCharge, commandName ?? string.Empty, operationId, response.ContentLocation ?? String.Empty, queryable.ToString()); return response; } public static void LogStats(DateTimeOffset start, TimeSpan duration, double requestCharge, string commandName, string operationId, string contentLocation, string query) { var dependency = new DependencyTelemetry(DependencyName, commandName, start, duration, true); if (operationId != null) { dependency.Context.Operation.Id = operationId; } dependency.DependencyTypeName = "CosmosDB"; dependency.Properties["contentLocation"] = contentLocation; dependency.Properties["query"] = query; dependency.Properties["requestCharge"] = requestCharge.ToString(CultureInfo.InvariantCulture); TelemetryClient.TrackDependency(dependency); } }
The last thing that you need to do, is the execution of introduced extension when you are querying ComsosDB. You need to use the following code:
IQueryable<Family> query = client.CreateDocumentQuery<Family>( UriFactory.CreateDocumentCollectionUri(DatabaseName, CollectionName), "SELECT * FROM ....", queryOptions); query.ExecuteWithStatsLogging("FamilyByLastName");
As you can see, you can add a query name when you are gathering statistics. With it you will be able to group information about the same type of queries easily.
Leave A Comment