AI and Language Data Flaring in Africa: Addressing the Low-Resource Challenge

Policy Brief No. 216 — November 2025

Key Points

→ African languages are under-represented in artificial intelligence (AI) systems due to limited language data, excluding millions from digital participation in their native languages.

→ Factors such as multilingual complexity, foreign-language-dominant policies, weak institutional backing and lack of digital infrastructure contribute to the low-resource classification of African languages.

→ “Language data flaring” — paralleling gas flaring — captures the systemic neglect and poor management of African language data leading to data undercollection, poor storage and limited use in AI.

→ Addressing the gap requires policies that integrate African languages into national digital agendas, support documentation, fund projects and foster inclusive, collaborative AI development.

→ Community-led documentation, open-source tools and growing recognition of linguistic diversity in AI offer promising paths forward.

Introduction

Modern AI systems, built using deep-learning techniques, require massive amounts of data to function effectively and to produce realistic outputs that reflect the patterns and structures within the training data. For language technologies, this data is sourced from news, books, blogs, social media and other digital platforms that host linguistic content. However, only a small number of languages possess sufficient data to support the development of robust AI technologies. Despite having millions of speakers, most African languages are low resource, meaning they lack the data necessary to build robust AI models. Many Africans are therefore excluded from digital tools, online resources and AI-driven services because these are not available in indigenous languages (Adams et al. 2024). Figure 1 shows how Africa lags other regions on cultural and language diversity in AI across government frameworks, government actions and non-state actors.

According to Pratik Joshi et al. (2020), more than 95 percent of African languages are classified as “left-behind” languages, for which it is nearly impossible to build AI-powered language technologies due to insufficient data. This classification is not based on the number of speakers but rather on the availability of digitized text. The discrepancy becomes clearer when comparing speaker populations with data availability. Languages such as Catalan (five million speakers), Finnish (10 million) and Swedish (10.5 million) are classified as high resource because of extensive digital documentation. However, African languages such as Amharic (37 million), Igbo (30 million) and Swahili (80 million) remain under-resourced, despite their large speaker populations. This situation highlights a systemic gap in language representation, where access to digital historical documentation and policy decisions dictate whether a language thrives in AI or remains excluded.

About the Author

Ife Adebara is an AI researcher whose work integrates natural language processing with the preservation and advancement of African languages. She holds a Ph.D. in linguistics from the University of British Columbia, along with master’s degrees in computer science from Simon Fraser University and the University of Birmingham. With more than 10 years of experience in multilingual AI, she specializes in creating inclusive language technologies for low-resource and indigenous languages. Her research focuses on ethical data curation, model development and language policy, advancing linguistic equity in AI. She leads the development of AfroLID, Serengeti and Cheetah, innovative models that provide support for more than 500 African languages and varieties. Ife is the co-founder and chief technology officer of EqualyzAI, a company pioneering agentic AI built on Africa’s most inclusive language data sets.

Language data scarcity is not simply a technical issue; it is shaped by deep-rooted socioeconomic, political and infrastructural challenges. Several factors contribute to the absence of comprehensive data sets for African languages:

→ Government policies and institutional neglect: Many post-colonial African governments have historically prioritized European languages (English, French, Portuguese) in education, governance and media, limiting formal support for indigenous languages.

→ Digital infrastructure gaps: African languages are rarely integrated into mainstream technologies — such as keyboards, search engines and social media interfaces — making digital text collection difficult.

→ Data preservation challenges: Many African languages have strong oral traditions but limited written documentation. Existing linguistic resources are often stored in formats that are inaccessible for AI training, such as physical manuscripts, audio recordings or scattered academic papers.

This gap is what we term “language data flaring,” analogous to gas flaring in the oil industry. Just as gas flaring involves the wasteful burning of natural resources during oil extraction, language data flaring results in the inadequate collection, preservation and utilization of linguistic data. In Africa, this manifests in lost or inaccessible linguistic materials, underutilized archives, poor digitization practices and limited integration of local languages into educational and technological infrastructures. These practices not only squander linguistic resources but also deepen the digital divide, ensuring that African languages remain marginalized in the global AI landscape.

Figure 1: Cultural and Language Diversity in AI

Source: Adams et al. (2024, 48).

In this policy brief, the author explores how African languages have been digitally marginalized due to language data flaring practices. This situation systematically excludes African languages from technological advancements, silencing their presence in the AI-driven digital world. Just as a power outage disconnects communities from critical resources, language data neglect excludes African languages from the digital ecosystem. Over time, this exclusion contributes to linguistic and cultural erasure, limiting opportunities for technological inclusion. However, efforts to address these challenges are gaining momentum. Across Africa, researchers, technologists and policy makers are driving initiatives to improve language data availability and integration into AI systems (Adebara et al. 2022, 2023; Adebara, Elmadany and Abdul-Mageed 2024; Adelani et al. 2022). This policy brief examines the factors contributing to the low-resource status of African languages, highlights emerging success stories in addressing data scarcity, and provides key recommendations for policy makers, researchers and industry leaders. Addressing this challenge requires a concerted effort to document, preserve and integrate African languages into AI systems, ensuring their voices are not only heard but that they can thrive in the digital age. By implementing these strategies, we can bridge the digital divide and ensure African languages play a meaningful role in the future of AI.

Multilingual Africa: A Double-Edged Sword for AI Development

Africa is a highly multilingual continent, with more than 2,000 languages and language varieties, about one-third of all the languages spoken in the world (Hammarström 2018). As a result, many Africans navigate between multiple languages from early childhood. Take, for example, a child growing up in Ethiopia’s capital, Addis Ababa. They might speak Oromo at home with their parents (mother tongue), use Amharic in their neighbourhood and local market (language of their immediate environment) and then encounter English as the primary medium of instruction in secondary school. Each language represents different aspects of their identity and serves distinct social functions. The situation can be even more complicated. Consider a family living in Bauchi, Nigeria: the mother is from the Haba ethnic group and speaks Kilba as her mother tongue, while the father is Idoma and speaks Idoma as his first language. They live in the predominantly Hausa-speaking northern part of Nigeria, but in the military barracks where they reside, their child is exposed to Nigerian Pidgin. From birth the child is immersed in four different languages, before attending primary school where they are exposed to English as a fifth language.

The linguistic landscape in African communities creates unique patterns of linguistic expertise. The aforementioned child growing up in Bauchi, Nigeria, for example, navigates multiple linguistic domains, developing different competencies and registers for each of the five languages in their repertoire. While they may actively use Idoma and Kilba within family settings, they may struggle to use these languages outside the home. Simultaneously, they might use Nigerian Pidgin for interactions with peers but not with elders, while developing broader proficiency in English and Hausa for formal and institutional contexts. The fluid nature of multilingual competence means that a speaker’s ability to use each language is often context-dependent and domain-specific. This child may find themselves unable to effectively communicate in any of these five languages outside their established contexts of use, facing challenges when attempting to use home languages in formal settings or, conversely, struggling to express intimate or cultural concepts in languages reserved for institutional spaces.

This multilingual reality is both a challenge and an opportunity for AI development in Africa. On the one hand, the fluid and context-dependent nature of language use makes it difficult to develop AI models that accurately capture linguistic diversity. On the other hand, Africa’s multilingual expertise offers a unique advantage. The ability of individuals to navigate between multiple languages, dialects and registers presents an opportunity to design more flexible, adaptive AI models capable of handling real-world linguistic diversity. By prioritizing language-inclusive AI development, Africa can pioneer AI systems that better reflect human multilingualism, not just for the continent but also for the world at large.

Colonial Legacies and Policy Gaps: The Silent Gatekeepers of Language Data

Across Africa, the dominant language policy in response to the multilingual landscape is to adopt a foreign language for official business and, in some cases, a few indigenous African languages may be given some official status either at regional or national levels or for education (Petzell 2012; Foster 2021; Ouane and Glanz 2010).

Figure 2, Figure 3 and Figure 4 show the official language use across Africa. In Nigeria, English is the official language, while only three out of 512 indigenous languages are officially recognized as regional languages. In Ghana, English again is the official language, but 10 of the country’s 73 indigenous languages are also used as institutional languages. In Tanzania, Swahili is the only indigenous language with official status, alongside English, while 118 other indigenous languages have none. In Kenya, 12 of 61 languages have some official status, and in South Africa only 12 of 20 indigenous languages are institutional languages (Adebara and Abdul-Mageed 2022).

Even when indigenous languages gain official status alongside foreign languages, they often hold a symbolic rather than functional role. For example, although the African Union recognizes Kiswahili as one of its official languages, its website and official document releases remain in English and French, not Kiswahili. This same pattern extends to education systems: even where indigenous languages are used, their role is typically limited to early childhood education, often alongside a foreign language rather than as the sole medium of instruction (Petzell 2012; Foster 2021; Ouane and Glanz 2010). From secondary school through university, foreign languages dominate as the primary medium of instruction. This policy structure has direct implications for AI development. First, it creates a literacy gap in indigenous languages, where speakers may be fluent and proficient in their mother tongues but lack the written proficiency needed to contribute to digital content. Second, the overwhelming use of foreign languages in official government documents excludes indigenous languages from government discourse, limiting access to crucial public information in indigenous languages. Since most AI models, especially those for natural language processing (NLP), depend on large-scale written text data, the dominance of foreign languages in education not only erases indigenous languages from formal spaces but also prevents them from thriving in digital and AI-driven technologies, reinforcing their continued marginalization in technology.

Figure 2: Official Languages in Africa

Source: Author.

Figure 3: Official Languages of Education in Africa

Map legend: None; Foreign; Both indigenous and foreign languages.

Source: Author.

Note: “None” refers to those countries with no separate official languages for education. In these countries, the language or languages used in education are the same as used officially.

Figure 4: Official Regional Languages in Africa

Map legend: Indigenous; Both indigenous and foreign languages; None.

Source: Author.

Note: “None” refers to those countries with no separate official regional language(s). In these countries, the official languages are used regionally.

Media and the Digital Divide: Where Are Africa’s Indigenous Languages?

Text Media: A Shrinking Space for Indigenous Languages

Low literacy rates in African languages directly impact newspaper readership and sales, restricting indigenous-language publications to only a handful of widely spoken languages. For instance, out of 11 newspapers in Uganda, only four publish in five dominant indigenous languages, while the remaining seven use English (Lugalambi, Mwesige and Bussiek 2010). This pattern is seen across Africa, where only languages with official status or large speaker populations sustain newspaper circulation. Countries such as Ghana and Eswatini have lost all their indigenous-language newspapers, while Nigeria has seen numerous failed attempts at sustaining publications (ResCue and Agbozo 2021; Mthembu and Lunga 2020; Fosu 2024; Onyenankeya 2022). Despite rare successes such as Nigeria’s Alaroye, South Africa’s Isolezwe (a Zulu daily selling more than 100,000 copies) and Ethiopia’s Amharic press, indigenous-language newspapers often struggle to survive (Tshabangu and Salawu 2022; Salawu 2020).

Moreover, many newspapers in African languages operate as subsidiaries of foreign-language media houses. In Nigeria, defunct English-language publishers such as Daily Sketch Press Ltd. and Concord Press used to publish Yoruba, Hausa and Igbo newspapers, but these indigenous editions have mostly disappeared (Salawu 2020; Tshabangu and Salawu 2022; Onyenankeya 2022). Similarly, South Africa’s Perskorporasie (Perskor) once published an isiXhosa newspaper, Imvo Zabantsundu, but it no longer exists. Beyond newspapers, book publishing in African languages is also in decline, as more authors opt for colonial languages to boost their sales potential. In situations where indigenous-language books exist, they rarely have a digital presence, limiting their accessibility for AI-driven applications. The absence of indigenous-language books and newspapers online means AI models lack sufficient high-quality textual data for training, reinforcing digital exclusion.

Radio and Television: A Lingering Dominance of Foreign Languages

Radio remains one of the most widely accessible and influential forms of communication in Africa, with thousands of stations across the continent (Molale and Mpofu 2023; Conroy-Krutz and Koné 2022; Brooke 2024). In Cameroon alone, there are more than 280 stations; Ghana has 354, Uganda 258 and Mali more than 300 (Cheo, Chie and Menguie 2023; National Communications Authority 2023; Myers and Harford 2020; Myers 2009). The affordability of radio makes it indispensable, especially in rural areas, as it requires neither literacy nor constant electricity. While radio broadcasts in a wider range of indigenous languages than newspapers and television (the latter using only the most dominant indigenous languages), foreign languages still dominate the airwaves. In many cases, indigenous-language programs have short time slots, while prime-time broadcasts feature programs in English, French or Portuguese. Even when indigenous-language radio or television content is available, it is rarely archived or digitized, making it difficult to use for AI training or linguistic research.

Another major obstacle is that many stations lack an online presence, and those that offer digital streaming often prioritize foreign-language programming, while indigenous-language content remains offline. Additionally, local broadcasting laws require stations to archive content for only a short period. In Rwanda¹ and Nigeria (National Broadcasting Commission 2016), for example, stations must store media for three months and 90 days, respectively, after which they may delete it at their discretion. With limited funding for data storage, valuable indigenous-language broadcasts are quickly lost, further deepening the digital divide.

¹ Law No. 02/2013 of 08/02/2013 Regulating Media, online: <https://rwandalii.org/akn/rw/act/law/2013/2/eng%402013-03-11>.

Social Media: Digital Platforms and Linguistic Marginalization

Social media platforms enable users to create and share content in their own languages. However, most platforms provide little to no support for African languages. Meta’s Facebook, for instance, supports 112 languages, including 11 African languages, but many of these are only partially supported, and the platform’s interface is not fully localized in any of them. In contrast, Instagram, LinkedIn and X support 32, 36 and 34 languages, respectively — none of which are African. This lack of support excludes indigenous-language speakers from full participation in digital spaces, limiting their ability to engage in online discourse, access information and contribute to global conversations. More critically, it contributes to the under-representation of African linguistic and cultural identities in digital spaces, reinforcing linguistic marginalization and potentially accelerating language shifts away from indigenous languages.

Even when users are literate in indigenous languages, the absence of African-language keyboards makes online writing challenging (Adebara and Abdul-Mageed 2022). Many African languages rely on diacritics to mark tone, vowel length and other features, yet standard keyboards do not support easy input of these characters (ResCue and Agbozo 2021). Omitting diacritics can introduce significant ambiguity, as seen in Yoruba: igbá (calabash, basket), igba (200), ìgbà (time), ìgbá (garden egg) and igbà (rope). Similarly, in Akan, grammatical tone distinctions impact verb meanings: Ama dá ha “Ama sleeps here” and Ama dà ha “Ama is sleeping here” (Adebara and Abdul-Mageed 2022). As a result, social media engagement in indigenous languages is minimal, with users primarily interacting through likes and shares rather than written comments (Sunday et al. 2018; Molale and Mpofu 2023; ResCue and Agbozo 2021). Where written comments are adopted, diacritics are often omitted and code-mixing is prevalent (Molale and Mpofu 2023; Yevudey 2018).
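The ambiguity created by omitted diacritics can be illustrated with a short sketch. It takes the five Yoruba words cited above, removes their tone marks and shows that they all collapse into the same written form. The function and variable names are hypothetical, and modelling diacritic omission as Unicode decomposition followed by removal of combining marks is an assumption made for illustration, not a method described in this brief.

```python
import unicodedata
from collections import defaultdict

# Yoruba examples and glosses as cited in the text above.
YORUBA_EXAMPLES = {
    "igbá": "calabash, basket",
    "igba": "200",
    "ìgbà": "time",
    "ìgbá": "garden egg",
    "igbà": "rope",
}


def strip_diacritics(word: str) -> str:
    """Approximate diacritic-free typing (an illustrative assumption).

    The word is decomposed into base letters plus combining marks (NFD),
    and every nonspacing combining mark (Unicode category "Mn") is dropped.
    """
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def group_by_stripped_form(glosses: dict) -> dict:
    """Group the glosses of all words that share one diacritic-free spelling."""
    groups = defaultdict(list)
    for word, gloss in glosses.items():
        groups[strip_diacritics(word)].append(gloss)
    return dict(groups)


collisions = group_by_stripped_form(YORUBA_EXAMPLES)
for plain_form, meanings in collisions.items():
    print(f"{plain_form}: {len(meanings)} possible meanings -> {meanings}")
```

Under these assumptions, all five words reduce to a single spelling, so text typed without diacritics gives a model no surface cue for telling the meanings apart.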

The AI Implications of the Digital Divide: Recommendations for Change

The dominance of foreign languages as official languages in Africa has significant implications for AI development:

→ Limited digital presence reduces training data. NLP models rely on large-scale text and speech data sets, yet most indigenous languages are either unavailable online or remain undigitized.

→ Lack of transcription and archiving weakens AI capabilities. Without systematic transcription and preservation of indigenous-language content, AI technologies remain underdeveloped, further restricting AI applications for African languages.

Without proactive efforts to digitize, preserve and integrate indigenous languages into AI systems, these languages risk continued marginalization in the digital era. Closing this gap is critical to ensuring that AI technologies reflect Africa’s linguistic diversity rather than reinforce historical inequalities. The following recommendations outline key strategies for addressing these challenges:

→ Institutionalize indigenous languages in education and governance. Across Africa, there is a strong correlation between a language’s official status and its availability in digital media, particularly in countries where local languages are used in education and administration (Adebara and Abdul-Mageed 2022). Ethiopia, Tanzania and South Africa are among the few African nations where one or more local languages are widely used for official and administrative purposes, with Amharic, Swahili and Zulu among the highest-resourced languages in Africa. Fully integrating indigenous languages into education and governance is therefore crucial for expanding language data in Africa. Developing educational materials, documenting official government information and ensuring that legal proceedings and public service announcements are accessible in indigenous languages are important steps in countries with language policies that recognize indigenous languages. Countries without such policies should consider adopting them. Literacy is a long-term process that depends on sustained education, and policies that restrict indigenous-language instruction to only a few years in school are ineffective. Higher literacy rates in African languages can enhance authorship, readership and the commercial viability of books and newspapers, strengthening their presence in both digital and print media.

→ Prioritize digitization and online archival of indigenous-language content. Digital public infrastructures play a crucial role in the digitization of African languages by providing open, scalable and interoperable platforms that support linguistic diversity. These platforms can be developed through collaborations between governments, researchers and technology companies. Policy frameworks can further accelerate progress by incentivizing broadcasters and other content creators to retain and provide open-source indigenous-language transcripts for AI training, ensuring a steady supply of high-quality language data. Digitization efforts must extend beyond new content to include historical archives siloed in different institutions, which provide valuable linguistic and cultural insights over time. Additionally, investing in indigenous-language keyboards and other digital tools will further enhance accessibility and usability, strengthening the digital presence of African languages.

→ Develop and fund indigenous-language content creation and curation. Funding initiatives that support the production of newspapers, books, radio, television and social media content in indigenous languages can incentivize creators and significantly expand the pool of linguistic resources. Media organizations should also be encouraged to archive indigenous-language content for extended periods to ensure long-term accessibility and preservation. Furthermore, community-driven initiatives will be beneficial in ensuring diverse and representative language data. This is because language is dynamic: while a language may have a finite vocabulary, the ways in which words are combined to form sentences are infinite. Capturing an accurate representation of language use requires broad community participation in data collection. Without diverse contributions, digitized language resources risk reflecting only a narrow subset of actual language usage. Community-led efforts in data collection, curation and development of AI technologies for African languages have already yielded tremendous results (Adams et al. 2024; Adelani et al. 2022; Adebara et al. 2022, 2023; Adebara, Elmadany and Abdul-Mageed 2024). Organizations such as the African Languages Technology Initiative, Data Science Nigeria, Deep Learning Indaba and Masakhane have done extensive work in data creation and curation for African-language AI. Organizations such as EqualyzAI have also developed a crowdsourcing platform, Equalyz Crowd, to facilitate language data collection, creation and enrichment in Africa.

Conclusion

Bridging the digital divide for African languages is a complex but necessary endeavour, and there are promising signs that these languages can take a central role in the AI revolution. Achieving this will require collaborative efforts from governments, policy makers, language experts, technologists and local communities. Africa’s youthful and dynamic population is eager to see their languages represented in the digital space, and it is imperative that they are empowered to engage in global conversations in the languages they choose. By implementing the strategies outlined in this policy brief, we can move toward a more inclusive digital future — one where African languages thrive in AI. Finally, truly achieving linguistic and cultural representation requires not just the strategies discussed in this paper but also questioning the assumptions that guide AI’s use, and critically examining how AI technologies incorporate linguistic data at a fundamental level.

Works Cited

Adams, Rachel, Fola Adeleke, Ana Florido, Larissa Galdino de Magalhães Santo, Nicolás Grossman, Leah Junck and Kelly Stone. 2024. Global Index on Responsible AI 2024. South Africa: Global Center on AI Governance. https://girai-report-2024-corrected-edition.tiiny.site/.

Adebara, Ife and Muhammad Abdul-Mageed. 2022. “Towards Afrocentric NLP for African Languages: Where We Are and Where We Can Go.” In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), edited by Smaranda Muresan, Preslav Nakov and Aline Villavicencio, 3814–41. Dublin, Ireland: Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.265.

Adebara, Ife, AbdelRahim Elmadany and Muhammad Abdul-Mageed. 2024. “Cheetah: Natural Language Generation for 517 African Languages.” In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), edited by Lun-Wei Ku, Andre Martins and Vivek Srikumar, 12798–823. Bangkok, Thailand: Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.acl-long.691.

Adebara, Ife, AbdelRahim Elmadany, Muhammad Abdul-Mageed and Alcides Inciarte. 2022. “AfroLID: A Neural Language Identification Tool for African Languages.” In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, edited by Yoav Goldberg, Zornitsa Kozareva and Yue Zhang, 1958–81. Abu Dhabi, United Arab Emirates: Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.emnlp-main.128.

———. 2023. “SERENGETI: Massively Multilingual Language Models for Africa.” In Findings of the Association for Computational Linguistics: ACL 2023, edited by Anna Rogers, Jordan Boyd-Graber and Naoaki Okazaki, 1498–537. Toronto, ON: Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.findings-acl.97.

Adelani, David Ifeoluwa, Jesujoba Oluwadara Alabi, Angela Fan, Julia Kreutzer, Xiaoyu Shen, Machel Reid, Dana Ruiter, et al. 2022. “A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation.” In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, edited by Marine Carpuat, Marie-Catherine de Marneffe and Ivan Vladimir Meza Ruiz, 3053–70. Seattle, WA: Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.naacl-main.223.

Brooke, Peter. 2024. “Radio in Africa: Past and Present.” Journal of African Cultural Studies 36 (1): 1–5. https://doi.org/10.1080/13696815.2023.2294814.

Cheo, Victor Ngu, Esther Phubon Chie and Yollande Menguie. 2023. “An Evaluation of Media Use of Indigenous Languages in Cameroon.” The International Journal of African Language and Media Studies: 30–46. www.rhycekerex.org/an-evaluation-of-media-use-of-indigenouslanguages-in-cameroon.html.

Conroy-Krutz, Jeffrey and Joseph Koné. 2022. “Promise and peril: In changing media landscape, Africans are concerned about social media but opposed to restricting access.” Dispatch No. 509, February 18. Ghana: Afrobarometer. www.afrobarometer.org/wp-content/uploads/2022/04/AD509-PAP7Promise-and-peril-Africas-changing-medialandscape-Afrobarometer-dispatch-19feb22.pdf.

Foster, Danny S. 2021. “Language of Instruction in Rural Tanzania: A Critical Analysis of Parents’ Discursive Practices and Valued Linguistic Capabilities.” Ph.D. thesis, University of Bristol. https://researchinformation.bris.ac.uk/en/studentTheses/language-of-instruction-in-rural-tanzania.

Fosu, Modestus. 2024. “Language Choice and the Problematics of Ideology in the Pre- and Post-Independence Ghanaian Press: A Historical and Cultural Analysis.” Journalism and Media 5 (3): 1194–210. https://doi.org/10.3390/journalmedia5030076.

Hammarström, Harald. 2018. “A survey of African languages.” In The Languages and Linguistics of Africa, edited by Tom Güldemann, 1–57. Berlin, Germany: De Gruyter Mouton.

Joshi, Pratik, Sebastin Santy, Amar Budhiraja, Kalika Bali and Monojit Choudhury. 2020. “The State and Fate of Linguistic Diversity and Inclusion in the NLP World.” In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, edited by Dan Jurafsky, Joyce Chai, Natalie Schluter and Joel Tetreault, 6282–93. Association for Computational Linguistics, July. https://doi.org/10.18653/v1/2020.acl-main.560.

Lugalambi, George W., Peter G. Mwesige and Hendrik Bussiek. 2010. Uganda. A Survey by the Africa Governance Monitoring and Advocacy Project, the Open Society Initiative for East Africa and the Open Society Media Program. Nairobi, Kenya: Open Society Initiative for East Africa. www.opensocietyfoundations.org/uploads/98954150-a6df-49eea680-d9d4a8cd07fe/uganda-publicbroadcasting-20100701.pdf.

Molale, Tshepang and Phillip Mpofu. 2023. “(Dis)continuities of African Language Radio on Social Media: The Case of South Africa’s Motsweding FM and Radio Zimbabwe.” In African Language Media, edited by Phillip Mpofu, Israel A. Fadipe and Thulani Tshabangu. London, UK: Routledge. https://doi.org/10.4324/9781003350194.

Mthembu, Maxwell V. and Carolyne M. Lunga. 2020. “The extinction of siSwati-language newspapers in the Kingdom of Eswatini.” In African Language Media: Development, Economics and Management, edited by Abiodun Salawu. London, UK: Routledge. https://doi.org/10.4324/9781003004738.

Myers, Mary. 2009. Radio and Development in Africa: A Concept Paper. Ottawa, ON: International Development Research Centre. https://idl-bnc-idrc.dspacedirect.org/items/c805347c-9447-48f4-9bdc-b94031b655b4/full.

Myers, Mary and Nicola Harford. 2020. Local Radio Stations in Africa: Sustainability or Pragmatic Viability? Washington, DC: Center for International Media Assistance. June. www.cima.ned.org/publication/local-radio-stations-in-africa-sustainability-or-pragmatic-viability/.

National Broadcasting Commission. 2016. Nigeria Broadcasting Code. 6th edition. www.scribd.com/document/490616209/NBC-Code-6TH-EDITION.

National Communications Authority. 2023. List of Authorised VHF-FM Radio Stations in Ghana. https://nca.org.gh/wp-content/uploads/2023/11/FM-LIST-Q2-2023.pdf.

Onyenankeya, Kevin. 2022. “Indigenous language newspapers and the digital media conundrum in Africa.” Information Development 38 (1): 83–96. https://doi.org/10.1177/0266666920983403.

Ouane, Adama and Christine Glanz. 2010. How and why Africa should invest in African languages and multilingual education: An evidence- and practice-based policy advocacy brief. Hamburg, Germany: UNESCO Institute for Lifelong Learning. https://files.eric.ed.gov/fulltext/ED540509.pdf.

Petzell, Malin. 2012. “The linguistic situation in Tanzania.” Moderna Språk 106 (1): 136–44. https://doi.org/10.58221/mosp.v106i1.8233.

ResCue, Elvis and G. Edzordzi Agbozo. 2021. “Creating Translated Interfaces: The Representations of African Languages and Cultures in Digital Media.” In Rethinking Language Use in Digital Africa: Technology and Communication in Sub-Saharan Africa, edited by Leketi Makalela and Goodith White, 51–72. Bristol, UK: Multilingual Matters and Channel View Publications. https://doi.org/10.2307/jj.22730532.7.

Salawu, Abiodun, ed. 2020. African Language Media: Development, Economics and Management. London, UK: Routledge. https://doi.org/10.4324/9781003004738.

Sunday, Oloruntola, Ayo Yusuff, Simon Godwin Iretomiwa, Vincent Adakole Obia and Samuel Ejiwunmi. 2018. “Use of indigenous languages for social media communication: The Nigerian experience.” In African Language Digital Media and Communication, edited by Abiodun Salawu. 1st ed. London, UK: Routledge.

Tshabangu, Thulani and Abiodun Salawu. 2022. “Indigenous-language Media Research in Africa: Gains, Losses, Towards a New Research Agenda.” African Journalism Studies 43 (1): 1–16. https://doi.org/10.1080/23743670.2021.1998787.

Yevudey, Elvis. 2018. “The representation of African languages and cultures on social media: A case of Ewe in Ghana.” In The Routledge Handbook of African Linguistics, edited by Augustine Agwuele and Adams Bodomo. 1st ed. London, UK: Routledge.

About CIGI

The Centre for International Governance Innovation (CIGI) is an independent, non-partisan think tank whose peer-reviewed research and trusted analysis influence policy makers to innovate. Our global network of multidisciplinary researchers and strategic partnerships provide policy solutions for the digital era with one goal: to improve people’s lives everywhere. Headquartered in Waterloo, Canada, CIGI has received support from the Government of Canada, the Government of Ontario and founder Jim Balsillie.

Credits

Director, Program Management: Dianna English
Program Manager: Ifeoluwa Olorunnipa
Manager, Publications: Jennifer Goyder
Graphic Designer: Abhilasha Dewan

Copyright © 2025 by the Centre for International Governance Innovation

The opinions expressed in this publication are those of the author and do not necessarily reflect the views of the Centre for International Governance Innovation or its Board of Directors.

For publications enquiries, please contact publications@cigionline.org.

The text of this work is licensed under CC BY 4.0. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

For reuse or distribution, please include this copyright notice. This work may contain content (including but not limited to graphics, charts and photographs) used or reproduced under licence or with permission from third parties.

Permission to reproduce this content must be obtained from third parties directly. Centre for International Governance Innovation and CIGI are registered trademarks.

67 Erb Street West, Waterloo, ON, Canada N2L 6C2
www.cigionline.org
