Acquiring trust, not sovereign datasets, is the key to effective AI regulation
Building public trust, not amassing large-scale sovereign datasets, is vital to creating effective artificial intelligence (AI) regulation in Australia. The assertion that AI regulation would be ineffective without large-scale sovereign datasets, as recently argued in the Strategist, is flawed.
Public trust in AI systems is low in Australia. Only 34 percent of Australians are willing to trust AI systems, according to research by the University of Queensland and KPMG. For AI to flourish, Australians need confidence that their data will be safe in the hands of those who will use it to build and improve AI systems, and that these systems will be deployed safely, securely and responsibly. Regulations function to protect users’ rights, and to apply standards and boundaries that encourage responsible innovation. If well crafted, they should be adaptable to the evolving nature of AI. This assurance of trustworthiness is what will pave the way for greater AI development and adoption by businesses and consumers alike.
The recent Strategist article, ‘Sovereign data: Australia’s AI shield against disinformation’, argues that large sovereign datasets are essential for combating disinformation and ensuring trustworthy AI, and that AI regulation in Australia would be ineffective without first establishing such datasets. Several assumptions and assertions underlying this argument are problematic and deserve greater scrutiny.
The assertion that the success of AI regulations depends on the data used in particular AI models overestimates the influence of dataset control on regulatory success. Even the highest quality source data can’t stop an AI being misused, but strong and enforceable regulatory frameworks can. Effective AI governance focuses not just on the source of data but also on how AI systems are deployed and managed.
National sovereignty of data is neither appropriate nor desirable for all AI. Sovereign control of data can be beneficial in specific contexts such as national security and critical infrastructure. But the idea that datasets—whether Australian-generated or otherwise—need always be kept, owned and controlled within Australia overlooks the many cases in which higher-quality data and AI expertise are better sourced from abroad. For example, if developing an AI model for semiconductor chip design, it would be unwise not to collaborate with Taiwan, a global leader in chip manufacturing.
Global datasets are critical in the many cases in which Australia does not have a representative dataset to call upon, cannot generate one that aligns well with the function, or is simply working on projects that require global data analysis, such as climate research. For example, translating foreign languages into English requires datasets from native speakers abroad, as Australia’s multilingual data is limited.
Likewise, there are many cases where the international community has more advanced capabilities than Australia, a fact we can’t ignore. Image datasets such as CIFAR-* and ImageNet for computer vision, object and facial recognition have already undergone intense public scrutiny and bias analysis by researchers and activists online, which makes it difficult to hide potential manipulation. But there is no such thing as unbiased data. The trick is identifying the degree to which biases have been introduced in datasets and determining whether there are any risks that need to be managed to make them appropriate for Australian uses. This is best done by building trusted international partnerships with shared standards for quality and accountability.
It’s also incorrect to assume that bigger datasets are inherently better. In fact, better is better when it comes to data: clean, truthful, secure, appropriate and representative for the purpose. Viral mis- and disinformation and uneducated opinion can generate large amounts of data, so scale can dilute good data rather than protect it. And such disinformation can just as easily be created by AI using large, sovereign datasets as by AI using non-sovereign datasets.
Not all AI systems rely on data that raises privacy concerns. For instance, AI models that use meteorological data to investigate air quality trends or to predict rain rely on data that is neither personal nor inherently sensitive. The regulatory focus in such cases should be on ensuring the accuracy and reliability of the data rather than its geographic origin, if regulation is even needed at all.
In an effort to support domestic AI innovation and manage the risk of overregulation, the Australian Government is taking a risk-based approach to AI regulation. The Department of Industry, Science and Resources has done commendable work in coming to grips with the issues, laying a solid foundation in its proposed mandatory guardrails for high-risk AI settings that addresses key concerns such as safety, rights protection and alignment with international regulations. However, refining certain aspects will further strengthen this framework.
The current definition of ‘high-risk AI’ is too broad, risking overregulation of low-risk systems, which could stifle innovation. A clear, tiered regulatory approach that distinguishes between different levels of risk will promote AI adoption without burdening businesses unnecessarily.
Regulations must also account for the global nature of AI development. Foreign developers of products used in Australia should be held to the same standards as domestic developers to prevent unfair advantages, while AI systems—whether developed locally or abroad—must meet high standards for safety, transparency and accountability. Failure to comply should result in enforcement, to build public and business trust in the regulatory system.
Overall, Australian regulation must be responsive, evolving alongside the technology itself. Regulating a field as dynamic and wide-ranging as AI will require ongoing adaptation and iterative updates that learn from experience.
Sovereign datasets may have limited roles in specific sectors, but they are far from the cornerstone of effective AI regulation. In shaping its AI future, Australia must prioritise trust over territorial control of data. The real key to effective AI regulation lies in building public confidence through clear, consistent rules that keep pace with technological change globally. These rules should protect rights, promote transparency and foster responsible innovation, regardless of the geographical location of the data or the developer.