
Qualitative Monitoring: Improving your website’s Core Web Vitals through performance monitoring


In a previous article, Bohdan Kladkovyi shared tips on improving a publisher’s Google Core Web Vitals using news.amomama.com as an example. In this feature, he discusses the importance of these metrics and the best methods of monitoring and responding to them. TLDR: Sitespeed.io was the monitoring tech they eventually settled on.

What can qualitative monitoring do for your website? By adopting tools that monitor performance more effectively, we stopped worrying about how each new website feature might affect overall performance. Instead of waiting up to 28 days for feedback from field data, we now monitor the factors that affect our performance almost in real time and respond to them immediately.

Since AmoMama is a completely ‘organic’ product, one of our main web traffic sources is Google, making it crucial that we “keep our finger on the pulse” to ensure that our performance remains in the green zone. This article will discuss the technology we rely on for performance monitoring. Additionally, we will touch on how we can simulate real user experience under lab conditions.

We will answer several important questions, including the following:

  • What are the differences between lab and field data?
  • What tools exist for Google Core Web Vitals measurement and monitoring?
  • What tools do we use and how? 
  • How do we simulate user experience, and why is it important?

What are the Differences Between Lab and Field Data?

Before getting into any of the minutiae on the topic, we should offer a brief reminder regarding what Google considers lab data versus field data. 

Lab Data: Google considers lab data to be the results of runs against only the first viewport of a page on a specific device. For example, Lighthouse runs by default on an emulated Moto G4 over a slow 4G connection.

With the release of Chrome v103, Google also shipped a new version of Lighthouse that can measure both lab data for the first viewport and simulated field data under lab conditions. Beneath the surface, this is merely an early version of Lighthouse user flows. Regardless, it has already proven helpful in testing our Core Web Vitals, as we will discuss shortly.
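If you want to experiment with this yourself, user flows can be scripted directly. Below is a minimal sketch, assuming Node 18+, Puppeteer, and Lighthouse 10 or newer, run as an ES module; the URL and the scroll step are illustrative, not our production test:

    import fs from 'node:fs';
    import puppeteer from 'puppeteer';
    import {startFlow} from 'lighthouse';

    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    // One flow = one report: a cold navigation (classic lab data for the
    // first viewport) plus a timespan that records what happens afterwards
    const flow = await startFlow(page, {name: 'Article page'});
    await flow.navigate('https://news.amomama.com/');

    await flow.startTimespan();
    await page.evaluate(() => window.scrollBy(0, 2000)); // simulated user action
    await flow.endTimespan();

    // A single self-contained HTML report covering both steps
    fs.writeFileSync('flow-report.html', await flow.generateReport());
    await browser.close();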

Field Data: Field data is essential in ranking your site in search results. So, what is it exactly, and how does one work most effectively with it?

Field data is measured by monitoring all users who visit a page and recording a specific set of performance metrics for each visit. Because it captures actual visits from real users, field data reflects the true devices, real network conditions, and the geographic locations of your audience, all of which are very difficult to predict and reproduce with lab measurements collected in a controlled environment with predefined device and network parameters.
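For context, this is roughly how field data is gathered on the page itself. A minimal sketch using Google's open-source web-vitals library (v3 or newer); the /analytics endpoint is a hypothetical collector on your own backend, not something the article prescribes:

    import {onCLS, onINP, onLCP, onTTFB} from 'web-vitals';

    function sendToAnalytics(metric) {
      // sendBeacon survives page unload, so late metrics like CLS still arrive
      const body = JSON.stringify({name: metric.name, value: metric.value, id: metric.id});
      navigator.sendBeacon('/analytics', body);
    }

    // Each callback fires when its metric is finalized (or when the page is hidden)
    onCLS(sendToAnalytics);
    onINP(sendToAnalytics);
    onLCP(sendToAnalytics);
    onTTFB(sendToAnalytics);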

Managing our field data is challenging: we run more than one website, and in some regions the vast majority of users have older mobile devices, outdated browsers, and poor Internet connections (slow or fast 3G). We certainly cannot turn off or block our website in such regions, since our main goal is to provide users with high-quality, interesting content at any location and on any connection speed. So, to support both the main website and these regions, we built a system for clear, high-quality monitoring of field data, enabling us to deliver quick, smooth-loading entertainment content to all users.

What Tools Exist for Google Core Web Vitals Measurement and Monitoring?

While there are many different tools available, let’s talk about how we came to the best solution for our particular case.

First, we needed to take a good look at and classify the different tools for measuring and tracking field and laboratory data that Google offers to see which best met the needs of our particular company. The image below effectively outlines each of these tools and what they have to offer:

PageSpeed Insights is a powerful tool that lets you immediately view field and lab data for a specific URL and its origin, which is its huge advantage. However, I would also like to highlight a few drawbacks. You must check field data manually (and it arrives with a delay) and, if you need history, maintain your own report, for example in Excel, which is very inconvenient. Furthermore, there is no historical data, only the current state of the key metrics.

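Incidentally, if keeping a manual report in Excel gets tiresome, the same numbers the PageSpeed Insights UI shows can be pulled programmatically from its public v5 API. A minimal sketch, assuming Node 18+ with global fetch; the URL is only an example:

    const pageUrl = 'https://news.amomama.com/';
    const api = 'https://www.googleapis.com/pagespeedonline/v5/runPagespeed';

    const res = await fetch(`${api}?url=${encodeURIComponent(pageUrl)}&strategy=mobile`);
    const data = await res.json();

    // Field data (CrUX) for this URL, when Google has enough samples
    console.log(data.loadingExperience?.metrics);

    // Lab data from the Lighthouse run PSI performed on its servers
    console.log(data.lighthouseResult?.audits['largest-contentful-paint']?.displayValue);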

Chrome UX Report (CrUX) is a publicly available dataset of real user experiences using websites. It measures the field performance of all core Google metrics. 

Unlike lab data, CrUX data is collected from real users. Using it, we can understand the distribution of real user experiences on our websites through reports built on the BigQuery dataset. The CrUX dataset on BigQuery includes detailed Core Web Vitals data for all key metrics, published as monthly reports at the origin level. To generate a report on the site, you can use this link.
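Besides the monthly BigQuery tables, the same dataset is exposed through the CrUX API, which serves a rolling 28-day window that is updated daily. A minimal sketch, assuming Node 18+ and a Google API key in a CRUX_API_KEY environment variable (the variable name is our own choice):

    const endpoint = 'https://chromeuxreport.googleapis.com/v1/records:queryRecord';

    const res = await fetch(`${endpoint}?key=${process.env.CRUX_API_KEY}`, {
      method: 'POST',
      headers: {'Content-Type': 'application/json'},
      body: JSON.stringify({
        origin: 'https://news.amomama.com',
        metrics: ['largest_contentful_paint', 'cumulative_layout_shift'],
      }),
    });

    const {record} = await res.json();
    // p75 is the value Google's good/needs-improvement/poor thresholds apply to
    console.log(record.metrics.largest_contentful_paint.percentiles.p75);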

In our case, this tool came with several advantages and disadvantages we had to consider. 

Advantages:

  • Provides a clear picture of Core Web Vitals for the month and past periods;
  • Offers the ability to create flexible reports; 
  • Integrates with Data Studio for clear, convenient reporting; and
  • Automates the report export process.

Disadvantages: 

  • Publishes each month's report only on the second Tuesday of the following month;
  • Requires costly direct pulls from BigQuery when fresher data is needed; and
  • Reflects changes in metrics, in either direction, only slowly.

Despite these drawbacks, this tool can effectively monitor real field data, and the ability to retrospectively view changes in metrics is pretty neat.

The CrUX report looks like this:

Search Console identifies groups of pages on our website that need attention based on real (field) data from CrUX. URL performance metrics are grouped by status, metric type, and URL group, i.e., groups of similar web pages. It is the primary tool our business stakeholders use, as it paints a fairly clear and understandable picture. When reviewing the report, it is immediately obvious which zone the website is in (green, yellow, or red) and how that affects SEO.

For example,

This simple tool takes a lot of pressure off the business side, reducing the metrics and analysis they have to wade through to determine whether or not a site meets Google's requirements: open Search Console, and it is clear how the webpages are doing. Still, it, too, comes with advantages and disadvantages.

Advantages:

  • Aims to provide clarity and transparency; and
  • Avoids unnecessary words and text, making clear how the Core Web Vitals of a site are doing.

Disadvantages:

  • Requires up to a 28-day wait for the graph to update, according to Google (in our experience, the graph typically changes every two weeks);
  • Has a data delay of two days, whereas PageSpeed Insights offers fresher data; and
  • Is unsuitable for development work, hunting for new areas of performance improvement, or responding quickly to problems.

Chrome DevTools and the Web Vitals extension are additional must-have tools for localizing performance bugs during development.

I could go on about this topic for much longer, but after analyzing the tools Google offers, we still could not choose. None of them seemed to let us respond quickly to problems while monitoring field data in near real time. We attempted to use the PageSpeed Insights API and output metrics to Grafana using the Prometheus Exporter for Google PageSpeed Online Metrics, but we were disappointed: the API did not give us the desired result and was buggy. So we decided to work out another solution for ourselves…

What Tools Do We Use—and How Do We Use Them?

After a great deal of thorough research, we finally chose the appropriate tool, one that lets us track laboratory data, simulate field data, and build the necessary monitoring in Grafana: Sitespeed.io, a set of open-source tools that make it easy to monitor and measure the performance of your website.

We liked the approach of the developers of this tool:

“Measuring performance shouldn’t be hard: you should be able to have full control of your metrics, own your own data, and you should be able to do it without paying top dollars. That’s why we created sitespeed.io.”

Our team has a deep respect for open-source, so we couldn’t ignore this perfect match! 

The most significant advantages of using this tool are:

  • Fast installation (you can run a Docker container or install via npm; more details can be found in the documentation);
  • The ability to choose the browser and device type (mobile or desktop) for a report, as well as the connection type (3G, 3G fast, etc.);
  • The ability to interact with the page (clicks, scrolls, etc.) and collect metrics during this activity, which is very important because a single click on a dropdown menu can significantly worsen the metrics (see the sketch after this list);
  • The ability to set up a performance dashboard, which is convenient for observing changes in metrics (more details can be found here) and looks something like this:
  • Finally, Sitespeed is constantly updated, and its developers respond quickly to requests for changes. For example, when Google introduced new test metrics, a new Sitespeed release including them shipped in less than a week.
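To illustrate the interaction point from the list above: a user script, passed to Sitespeed.io in place of a plain URL, can click around while metrics are still being collected. A minimal sketch based on the scripting commands in the Sitespeed.io documentation; the selector is hypothetical:

    module.exports = async function (context, commands) {
      // Load the page first without measuring it
      await commands.navigate('https://news.amomama.com/');

      // Measure only the interaction: if opening the menu shifts the
      // layout, it shows up in the metrics for this step
      await commands.measure.start('open-menu');
      await commands.click.bySelector('#menu-toggle'); // hypothetical selector
      await commands.wait.byTime(2000);
      return commands.measure.stop();
    };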

The most significant disadvantages were:

  • The need for a specialist who can set up all of the necessary infrastructure and maintain it. 
  • The initial confusion over the volume of new information, data, and reports the solution generates.

With that said, we feel that Sitespeed is a very flexible tool with many interesting features. So, let’s take a more detailed look at exactly how we set up the testing process with the help of Sitespeed. To keep things simple, we prepared two types of tests: lab data on selected links and field data for selected posts.  

I’ll do my best to describe these flows without too many technical details. Let’s start by reviewing the lab and field data; the most important facet is how we analyze them.

Lab Data:

First, we test the link in first-view mode: we open the link with Sitespeed and collect metrics for the first viewport. As a result, we receive a folder containing a report. Inside, you will find a significant amount of data that can help us analyze our performance metrics and create a strategy for improvement.

We see the main page first, which contains the average values of the metrics that interest us. You can configure the order in which these appear (you can read more about that here), but we were satisfied with the default configuration. You will also see four red metrics with the prefix ‘coach’ (coach overall score, coach performance score, coach privacy score, and coach best practice score). These are Sitespeed’s internal performance assessments. Sitespeed makes recommendations within the report about what needs to be done to improve them, which will be discussed shortly.

A detailed report is generated for each tested page. Here we can see the specific link being tested (the screenshot shows the main page of news.amomama.com), the metric values for that particular page, the number of iterations (runs) over which the data was collected, and the average value selected for each metric. In this example, we ran the test only once; Sitespeed’s default is three iterations. But the most interesting pieces are hidden in the video, filmstrip, and coach tabs.

On the video tab, you will find… you guessed it, videos. The benefit of this page is that several visual metrics are dynamically displayed in addition to the page itself. With this visual, we can easily track the moment a certain metric became worse and what actions on the page caused it.

An even greater amount of information about metrics changes at specific moments in time is hidden in the filmstrip tab. There, you will find all the screenshots illustrating the changes that have occurred, which are not necessarily visual. You can also see changes in metrics under the screenshots.

Now, the only tab left from our list is the Coach tab. Above, I mentioned the coach overall score, coach performance score, coach privacy score, and coach best practice score, which are in red. This tab provides detailed instructions on what to do to improve these values. 

How do we measure the field data?

Honestly, it’s done in essentially the same way, but we add a bit of interaction with the help of JavaScript. The Sitespeed test runs much like the lab data test, except that the page is scrolled very slowly: 100 pixels every 500 milliseconds, all the way to the bottom of the article (and our articles are quite long). Checking one post takes an average of two and a half minutes, and, as you might guess, we check more than one link. A rough sketch of such a test is shown below.
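Here is what a scrolling test of this kind might look like as a Sitespeed.io user script, based on the scripting commands in the Sitespeed.io documentation. This is a minimal sketch, not our production script; the URL and the loop bounds are illustrative:

    module.exports = async function (context, commands) {
      // Start measuring without a URL so that navigation plus scrolling
      // all land in one measured step
      await commands.measure.start('article-scroll');
      await commands.navigate('https://news.amomama.com/');

      // Scroll slowly so late-loading ads and embeds have a chance to
      // shift the layout while CLS is being recorded: 100 px per 500 ms
      for (let i = 0; i < 100; i += 1) {
        await commands.scroll.byPixels(0, 100);
        await commands.wait.byTime(500);
      }

      // Stop and hand the collected metrics to Sitespeed's reporting
      return commands.measure.stop();
    };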

So, the question arises: Do we really need these scrolling tests, or would a quick first-view check across more links be enough? The answer is a resounding no, a first-view check would not suffice. Our CLS score did not rise above the yellow zone for a long time, and although we attempted many different technical solutions, we did not achieve the result we were hoping for. Once we started running tests with scrolling, the screenshots showed that the CLS score “jumps” at one particular element: the advertisement video player.

This change is not visible in the first-viewport screenshot because the video player sits below the fold. We have been in contact for some time with the partners who provide this player, since it suits us in every other respect, in the hope that they could reduce its impact on CLS from their side. It was incredibly helpful to have the monitoring and reporting data from Sitespeed to prove that the problem was occurring on their end rather than ours.
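As an aside, for readers who control the embedding code themselves, the classic mitigation for this kind of shift is to reserve the player’s space before the third-party script injects it, so the late insertion cannot push content around. A hypothetical sketch; the element id and dimensions are assumptions, not our partner’s actual integration:

    // Runs before the third-party player script loads
    const slot = document.getElementById('ad-video-player'); // assumed container id
    if (slot) {
      slot.style.aspectRatio = '16 / 9'; // reserve the player's final shape up front
      slot.style.minHeight = '200px';    // fallback where aspect-ratio is unsupported
      slot.style.overflow = 'hidden';    // an oversized embed cannot shift neighbors
    }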

Below, you will find the performance board that helps us to analyze performance metrics so that we may react quickly to any changes: 

So, now that we’ve shown you a bunch of confusing graphs, parameters, and metrics, there is a pretty logical next question: How does it help us? 

The next picture shows the same board with only one parameter selected: TTFB (time to first byte). This metric had recently grown significantly, which we could see on the corresponding chart, and we decided it was important to fix the situation as soon as possible. After the fix was implemented, we did not have to wait or collect statistics ourselves to gauge its effectiveness. We simply opened our board and saw that our idea had worked and the TTFB metric had improved.

Additionally, we can catch potential errors in our features before their release. For example, when a developer misses something in their code, we see it on the same dashboard, but only for the test environment. We can fix the bug immediately and avoid any negative impact on performance.

Conclusion

Now that we have become acquainted with the many tools that help measure and monitor Google Core Web Vitals, we can conclude that they all offer powerful, advanced functionality. Of course, everyone will choose the one that best meets their individual needs, but each tool has significant value.

We chose to highlight sitespeed.io in this article because it opened many doors for us, becoming a springboard for the dynamic development of our product. It has allowed us to offer users high-quality entertainment content as efficiently and quickly as possible from any location and on any Internet connection.

Thank you!

Bohdan Kladkovyi
Delivery Manager, AMO