By the Cybera Data Science Team
To an ever increasing degree, participation in the economy and society at large is mediated by the internet. If people don’t have good internet access, they feel disenfranchised. Currently in Canada, there are many, many people who feel disenfranchised. For several years now, it has seemed that internet offerings grossly favour urban Canadians over rural. And even for those with the fastest network speeds, there seems to be a large disparity between download capabilities, and upload capabilities.
Seeking a Canada-specific data science challenge, the team at Cybera wondered if there was a data-driven, statistically rigorous way to investigate whether or not these internet disparities do, indeed, exist. If so, how large are they, and which areas have it best and worst?
The go-to website for testing a given computer’s network speed is speedtest.net, which is operated by Ookla. It measures upload and download speed, latency and jitter, and the length of the network. Ookla also reports some of the economic aspects of the internet, such as the dollar cost of bandwidth, and actual speeds delivered (compared to what was promised). We obtained Ookla’s global networking data for the period running January 2008 to September 2014. Ookla recorded each network feature daily, captured as a weighted average of data collected over 30 days.
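To make the idea of a 30-day weighted average concrete, here is a minimal sketch. The source does not say how Ookla weighted the daily values, so the example assumes, purely for illustration, that each day is weighted by its number of tests. The function name and the record layout are hypothetical.

```python
from collections import deque


def trailing_weighted_mean(daily, window=30):
    """Yield (day, weighted mean speed) over a trailing window of days.

    `daily` is an iterable of (day, mean_speed_mbps, test_count) tuples in
    chronological order. Weighting by test_count is an assumption made for
    this sketch, not Ookla's documented method.
    """
    recent = deque(maxlen=window)  # holds only the last `window` days
    for day, speed, tests in daily:
        recent.append((speed, tests))
        total_tests = sum(t for _, t in recent)
        if total_tests == 0:
            continue  # no tests in the window, so nothing to report
        yield day, sum(s * t for s, t in recent) / total_tests
```

Under this assumption, a day with many tests pulls the reported value further than a day with few, which smooths out thinly sampled days.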
The raw data is available here for anyone who would like to analyze the data themselves (the 2015-2016 data is only available under a private license).
While many factors determine the total quality of a network — such as latency and jitter — for this post, we will focus on speed.
Internet speeds in Canada
Overall, we can see that average internet speeds in Canada have steadily increased over time.
The trend reported by the Ookla data is in line with the Canadian Internet Registration Authority (CIRA) report on speeds in Canada, published in April 2016. The CIRA report only analyzed speeds in 2015, and its data were collected using a different measurement platform than the Ookla data, so the two cannot be directly compared. However, their results are consistent.
In both reports, Canadian download speeds are shown to have increased at a much higher rate than upload speeds. Ookla reported an average increase of 1.44 Mbps per year for download speeds, versus 0.5 Mbps per year for upload speeds. CIRA shows average speeds of 18.64 Mbps download and 7.26 Mbps upload.
Sidenote: Why are upload speeds important? While fast download speeds are great for catching up on YouTube and Netflix, if Canadians want to contribute anything to the internet — i.e. create and share their own content — they need to have good upload speeds, too. Unfortunately, the data shows this asymmetry is increasing over time.
Comparing internet speeds across provinces
Our understanding of network data in the provinces was complicated by the sparsity of data in some regions. Nevertheless, if we confine ourselves to provinces with reasonably complete data sets, we can still glean some insights.
Within Canada, the data shows major variances between provinces and how their networks have changed over time. New Brunswick, for example, showed a dramatic increase in download speeds in 2013, whereas Manitoba saw little growth.
Surprisingly, the Ookla data showed that New Brunswick (29.3 Mbps), Newfoundland (24.2 Mbps), and the Yukon Territory (23.6 Mbps) had the best mean download speeds in 2014, and were in the top four regions in 2013. This is somewhat consistent with the 2015 CIRA report, which shows both New Brunswick and Newfoundland in the top five provinces and territories. (Note that New Brunswick shot to the top of the ranking in 2013 after being ranked 12th in 2012.) Yukon Territory, however, ranked last among the provinces and territories in the CIRA report.
Alberta’s ranking peaked at 5th place in 2008, and remained in the middle of the pack (with respect to provincial download means) until 2011, after which it dropped to 10th. It has not ranked higher than 8th in the last four years.
Interestingly, CIRA’s report ranked Saskatchewan in 2nd place for 2015, while our analysis ranks the province 12th for the prior year. Also, the Yukon Territory ranked last according to CIRA’s report, while our data shows it mostly ranking in the top five provinces and territories since 2009 (see the table below).
Table 1: Provincial rank of mean download speeds by year.
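Ranks like these can be derived by sorting regions on their annual mean speed. The sketch below is not the code used for this analysis; the function name is hypothetical, and the input contains only the three 2014 download means quoted in this post, so the resulting ranks are relative to those three regions alone.

```python
def rank_by_mean(speeds):
    """Return {region: rank}, where rank 1 is the highest mean speed.

    Ties are broken by the original dictionary order in this sketch.
    """
    ordered = sorted(speeds, key=speeds.get, reverse=True)
    return {region: position for position, region in enumerate(ordered, start=1)}


# 2014 mean download speeds (Mbps) quoted in this post; other regions omitted.
download_2014 = {
    "Newfoundland": 24.2,
    "Yukon Territory": 23.6,
    "New Brunswick": 29.3,
}

ranks = rank_by_mean(download_2014)
for region, rank in sorted(ranks.items(), key=lambda item: item[1]):
    print(rank, region)
```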
Similar to the download speed results, the Atlantic provinces of New Brunswick (10.6 Mbps) and Newfoundland (9.2 Mbps) ranked highest amongst Canadian provinces for mean upload speeds in 2014, with Prince Edward Island (6.5 Mbps) rounding out the top three. These three provinces had similarly strong showings in 2013 (with respect to mean upload speeds), placing in the top four in Canada.
Provincially, upload speeds continue to lag behind download speeds, with the average difference between the two increasing from 8.4 Mbps in 2012 to 11.8 Mbps in 2014. However, there was a three-fold jump in the highest provincial mean upload speed, from 2.8 Mbps in 2012 to 8.5 Mbps in 2013. This suggests that packages with higher upload speeds are becoming available, although we are still not close to seeing symmetric upload and download speeds.
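The two comparisons in the paragraph above are simple arithmetic on the quoted figures, spelled out here so the "gap" and the "three-fold" claims can be checked. The variable names are ours.

```python
# Figures quoted in the paragraph above (all in Mbps).
gap_2012, gap_2014 = 8.4, 11.8               # average download minus upload
top_upload_2012, top_upload_2013 = 2.8, 8.5  # highest provincial mean upload

gap_growth = gap_2014 - gap_2012                   # absolute widening of the gap
upload_factor = top_upload_2013 / top_upload_2012  # relative jump in top upload

print(f"Download/upload gap widened by {gap_growth:.1f} Mbps")
print(f"Highest mean upload speed rose by a factor of {upload_factor:.1f}")
```

So the best upload speeds roughly tripled in a single year, while the average gap between download and upload still grew by about 3.4 Mbps over two years.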
How does where you live impact your network speed?
Is there a real difference in network speeds between rural and urban communities? Are rural Canadians at a disadvantage?
For our research, we used the Statistics Canada Population Centre census criteria to define an urban community as an area with a population of at least 1,000 and a population density of at least 400 people per km². Anything less than that is defined as a rural community. Using this definition, we identified 219 rural communities in the Ookla dataset, and 223 urban.
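The urban/rural rule above can be written as a small function. This is a sketch of the definition only; the function name and parameters are hypothetical, and it assumes a population count and a land area are available for each community.

```python
URBAN_MIN_POPULATION = 1_000    # people
URBAN_MIN_DENSITY = 400         # people per square kilometre


def classify_community(population, area_km2):
    """Label a community "urban" or "rural" using the thresholds above."""
    if area_km2 <= 0:
        raise ValueError("area_km2 must be positive")
    density = population / area_km2
    if population >= URBAN_MIN_POPULATION and density >= URBAN_MIN_DENSITY:
        return "urban"
    return "rural"


print(classify_community(1_500, 3.0))   # density 500 per km2
print(classify_community(1_500, 10.0))  # density 150 per km2
```

Both conditions must hold, so a large but sparsely settled area still counts as rural.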
The data does indeed show differences between rural and urban internet in Canada. Although the annual means are similar, the differences are statistically significant, and speeds are increasing at a faster rate in urban communities than in rural communities. A similar picture emerges in Alberta, which shows statistically significant differences in the mean download speeds of urban and rural communities.
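This post does not say which statistical test was used. As an illustration of what "statistically significant" can mean here, the sketch below runs a two-sided permutation test on the difference of two group means. That is one common option, not necessarily the method behind the results above. The function name is hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_resamples=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in means.

    Group labels are shuffled repeatedly; the p-value is the share of
    shuffles whose mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # The add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)
```

Called with the per-community mean speeds of the urban and rural groups, a small return value (conventionally below 0.05) indicates that a difference this large would rarely arise from a random split of the same communities. Means can be close in absolute terms and still differ significantly when the spread within each group is small.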
Digging a little deeper into the differences between network speeds in urban and rural Alberta, there appears to be a correlation between the national population rank and the download/upload speeds. The following two plots show the least populous areas on the left, and the most populous on the right (Calgary is ranked 3rd and Edmonton 5th, nationally), and how the population size relates to download speeds.
To view the entire plot from 2008-2014, see here.
The plots show a developing linear correlation between population size and speed from 2011-2015, with the R² values (which show the proportion of variance explained by the model) progressing from 0.13 to 0.55. Note that the slope of the “best fit” lines (straight lines that best represent the data on a scatter plot) is increasing over the years. This means the discrepancy in download speeds between smaller and larger municipalities has grown between 2008 and 2014.
The same trend is more difficult to spot in the upload data, as R² values do not exceed 0.31. However, this changes if two outliers are excluded from the dataset (which can be brought into the frame by clicking the autoscale button). These outliers are Olds and Drumheller, with average upload speeds of 21 Mbps and 13 Mbps, respectively, in 2014. Removing these towns from the analysis shows a stronger correlation between population size and upload speeds (the R² value jumps from 0.08 to 0.58).
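For readers who want to see how a best-fit slope and its R² are obtained, here is a minimal ordinary least-squares sketch in plain Python. It is not the code behind the plots; the function name is hypothetical, and `xs` stands for whichever population measure is on the horizontal axis while `ys` stands for the mean speeds.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x values are all identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a straight-line fit, R^2 equals the squared correlation coefficient.
    # It is undefined when every y value is the same, so return NaN there.
    r_squared = (sxy * sxy) / (sxx * syy) if syy else float("nan")
    return slope, intercept, r_squared
```

Fitting each year separately and comparing the returned slopes and R² values is one way to reproduce the kind of year-over-year comparison described above. Because R² depends on squared deviations, a handful of extreme points can lower it sharply.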
Sidenote: Olds is a unique town in Canada, playing home to the country’s first community-owned fibre optic network, which began operating in January 2013. It is not as clear why Drumheller has such unusually high values, as we are not aware of any different broadband providers in that area.
One point to keep in mind is that the Ookla dataset consists of aggregated data that requires a minimum of 300 tests to be conducted within the last 6 months for inclusion. For community-specific speed data (e.g. city_daily_speeds.csv), it is possible that communities at the lower end of the spectrum do not meet this sampling criterion and are therefore not represented in this data. This could skew any urban-rural comparison towards communities that perform a larger number of tests.
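The effect of that inclusion rule can be sketched as a simple filter. The 300-test minimum and the six-month window come from the description above; the function name, the input format, and the choice of 183 days as "six months" are assumptions made for illustration.

```python
from datetime import date, timedelta

MIN_TESTS = 300
WINDOW = timedelta(days=183)  # roughly six months; the exact length is assumed


def meets_sampling_threshold(test_dates, as_of):
    """Return True if at least MIN_TESTS tests fall in the window ending at as_of."""
    start = as_of - WINDOW
    recent = sum(1 for d in test_dates if start < d <= as_of)
    return recent >= MIN_TESTS


# A community with 299 recent tests would be dropped from the aggregated data.
small_town = [date(2014, 9, 1)] * 299
print(meets_sampling_threshold(small_town, as_of=date(2014, 9, 30)))
```

Communities that fall just under the threshold vanish from the dataset entirely, which is how small rural places can end up under-represented.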
Unfortunately, Ookla no longer publicly releases its internet speed data, so this analysis only provides a historical perspective on how the speed and quality of the internet has changed in Canada. Nevertheless, two suspicions many Canadians have about our internet availability and capabilities can be confirmed:
- The gap between available upload and download speeds has been widening for years.
- There are statistically significant differences in network speeds in rural and urban communities.
Interestingly, for rural networking speeds, one of the most noticeable outliers in this trend is Olds (Canada’s first community-owned fibre-to-the-home municipality), which has far higher upload speeds than the rest of Alberta. Alberta as a province compares poorly to the other provinces, ranking 10th in average download speeds over the last three years. This is surprising, given that Alberta possesses the province-wide Alberta SuperNet, which provides connectivity to every community.
This blog post is the first in a series of posts related to the Ookla data. In upcoming posts, we plan to present more details on our methodology and also report on quality metrics available in the dataset. We invite anyone who is interested to download the dataset and explore it themselves, and reach out to us with any questions, comments, or suggestions!
About Data Science at Cybera
Data science and machine learning are transforming the way problems can be approached. Rather than a deductive approach, where theories are tested against observations, machine learning is an inductive approach that allows patterns and theories to emerge from observations. Because the patterns and theories do not have to be stated initially, machine learning can uncover surprising and unexpected features in the data. This technology is now being used in every domain, from sociology to economics to medicine.
That said, it is still difficult to successfully apply the techniques of machine learning to a problem. To understand those difficulties, we at Cybera decided to practice the art ourselves.
Because Cybera has expertise in computer networking, and an interest in advocating for strong networks in Canada, it was appealing to start our explorations by studying network performance.