When the CRTC sought submissions to its 2015 consultation on “Basic Telecommunications Services,” thousands of people offered their opinions on what the future of Canada’s internet should look like. For the first time ever, these submissions have been made searchable via a single, simple-to-use platform, and a surprisingly varied narrative of struggles and concerns has emerged.
The 2015 CRTC consultation set out to determine whether broadband should be considered a basic need for Canadians (the CRTC ultimately decided that it should be). Submissions ranged from telecommunication companies arguing about the impact on their bottom lines, to low-income Canadians describing having to choose between paying for internet and paying for food.
Collating and analyzing the 65,000+ pages of material submitted to the 2015 consultation required building a new data mining tool. This work was carried out by the data science team at Cybera (Alberta’s not-for-profit technology accelerator), with funding from the Canadian Internet Registration Authority’s Community Investment Program. The goal of Cybera’s Policy Browser tool is to better clarify if and how government decisions are made based on the public’s input.
Building this tool was not an easy task: the CRTC submissions were delivered in a wide variety of formats (from PDFs and Word documents to spreadsheets) that are traditionally very difficult to pull formatted text and data from. Cybera’s team used machine learning techniques to extract the text from these submissions, and group related phrases.
Using the Policy Browser tool, the team uncovered a broad spectrum of internet access issues and priorities across Canada. For example, when discussing the concept of “affordability,” larger telecom companies and industry groups tended to use words like “market demand” and “economic benefits,” while the nearly 3,000 individual Canadians who submitted feedback tended to use more personal terms, like “jobs”, “home” and “food”. A search of negative words used by individuals frequently came back with “pay”, “greed”, “ridiculous”, “poor” and “worry,” whereas network operators tended to use more neutral language.
Bigram plot of frequently used words (and how they were associated with other words) in submissions to the 2015 CRTC “Basic Telecommunications” consultation.
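For readers unfamiliar with bigrams, the counting behind a plot like this can be sketched in a few lines of Python. This is only an illustrative sketch, not the Policy Browser's actual code: the function names, the optional stopword filter, and the sample sentences (made-up placeholders, not real submissions) are all assumptions introduced here.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bigram_counts(documents, stopwords=frozenset()):
    """Count adjacent word pairs across a collection of documents.

    Stopwords are dropped before pairing, so two words separated only
    by a stopword are treated as neighbours (an assumption of this sketch).
    """
    counts = Counter()
    for doc in documents:
        tokens = [t for t in tokenize(doc) if t not in stopwords]
        # zip(tokens, tokens[1:]) yields each consecutive pair of tokens.
        counts.update(zip(tokens, tokens[1:]))
    return counts


# Hypothetical placeholder text, NOT taken from the CRTC submissions.
sample_submissions = [
    "Internet access is a basic need.",
    "Affordable internet access matters for jobs.",
    "Rural internet access is slow.",
]

counts = bigram_counts(sample_submissions, stopwords={"is", "a", "for"})
for pair, n in counts.most_common(3):
    print(pair, n)
```

Pairs that occur often, such as ("internet", "access") in the placeholder text, would become the heavier links in a bigram plot.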
Breaking the submissions down into groups (advocacy agencies, government, telecom companies, individuals, etc.) revealed interesting patterns in what issues were emphasized. Advocacy groups focused on affordability and telecom operating costs, while government groups discussed minimum speeds and how government funding should be allocated. The telecom companies focused their responses on the definition of “basic service,” and described how their revenue systems work and what plans they currently offer to address “affordability” needs.
“What is really interesting is how various parties are focusing on their own specific (and often unrelated) problems in these submissions, rather than presenting solutions to the same problems,” notes Barton Satchwill, Vice President of Technology at Cybera. “You get a good sense of the complex landscape the CRTC has to navigate when addressing different needs and demands.”
“Growing the awareness and understanding of the Canadian internet ecosystem is an important step in helping to improve it,” says David Fowler, Vice President of Marketing and Communications at CIRA. “We do this in several ways, including presenting data from Canada’s Internet Factbook, and we also act as a catalyst by supporting others. We’re proud to have funded Cybera’s project, which takes information that is often vast and dense, and turns it into a narrative that all Canadians can access and more easily understand. Growing this knowledge is a positive way for Canadians to engage in building a better online Canada.”
Adds Satchwill: “It’s still early days for the Policy Browser tool and the data analytics it can run. One of our biggest accomplishments so far was making the submissions more accessible! The CRTC made the consultation documents available on its website for individual download, which is a cumbersome process that would take one person a lifetime to go through. Our hope with this platform is that it can be adapted for other government consultations. This will make it easier for researchers to study the role that public submissions play on how regulations and policies are created.”
Cybera built the Policy Browser using open source tools, and is making the source code available to anyone who would like to apply it to other data mining applications involving large numbers of text files. For further information, visit the Policy Browser or contact datascience@cybera.ca.
Background
About Cybera
Cybera is a not-for-profit technology-neutral organization responsible for driving Alberta’s economic growth through the use of digital technology. Its core role is to oversee the development and operations of Alberta’s cyberinfrastructure: the advanced system of networks and computers that keeps government, educational institutions, not-for-profits, business incubators and entrepreneurs at the forefront of technological change.
Working with government, education, and private sectors, Cybera is creating a community that champions vital networking and computing services and utilities for everyone, everywhere. We also provide member organizations with unbiased, highly skilled expertise on technology products, processes or services, and access to shared IT tools.
About CIRA
CIRA is building a better online Canada through the Community Investment Program by funding innovative projects led by charities, not-for-profits and academic institutions that are making the internet better for all Canadians. CIRA is best known for our role managing the .CA domain on behalf of all Canadians. While this remains our primary mandate, as a member-based not-for-profit ourselves, we have a much broader goal to strengthen Canada’s internet. The Community Investment Program is one of our most valuable contributions toward this goal and funds projects in infrastructure and access, digital literacy, online services, and research. Every .CA domain name registered or renewed contributes to this program.
To date, CIRA has supported 102 projects with over $4.2 million in contributions.