In October 2020 the US House Antitrust Subcommittee, chaired by Congressman David Cicilline, published its report on competition in digital markets. The subcommittee reviewed the market from top to bottom, focusing on the dominance of the industry’s giants: Facebook, Amazon, Apple and Google. The report zeroes in on their business practices, and on how these could amount to monopolistic conduct.
The subcommittee found that each platform had, in one way or another, gained direct and singular control of a channel of mass distribution. These companies are no longer disruptive and innovative start-ups, but now resemble business monoliths akin to the oil barons and railroad tycoons of the past, controlling their respective industries and absorbing or removing competitors with ease.
The positions held by these companies have allowed them to maintain their power in the market, subjugating any alternatives to their products. The report remarks that they have “abused their role as intermediaries to further entrench and expand their dominance. Whether through self-preferencing, predatory pricing, or exclusionary conduct, [they] have exploited their power in order to become even more dominant.”
The Department of Justice has since taken an initial step to rectify these issues, following up with a lawsuit against Google which alleges that it has used its monopoly power to link search and Chrome to its Android mobile operating system. To see how matters reached this point, it’s worth understanding exactly how these companies came to exert such an excessive level of market control.
Amazon and Apple built their monopolies by having detailed access to their users’ data, which in turn allowed them to build the best AI recommendation systems for their respective ecosystems. Facebook and Google, meanwhile, built their monopolies on the network graphs between their users and their preferred content. Whoever owns the biggest network graph can understand the network best and can most easily influence it in their preferred direction.
The point is that the biggest datasets on user preferences and on network graphs of users were clearly always going to be the key competitive advantages for whoever could own them first. Yet it is only now that regulators seem to have got around to realising this might after all be a problem for wider society. So now may be an opportune moment to consider what other emerging areas of technology, especially those which are AI-related, could lead to future technology monopolies.
For instance, developments in the field of Natural Language Processing (NLP) are increasingly making it possible to understand language not just in a mechanical sense (i.e. based on keywords) but in a cognitive, ‘human’ way, through sentiment analysis and through recognising figures of speech such as sarcasm, nuances in dialects and the changing forms of jargon. When it comes to effective NLP, understanding what is not said can be as important as what is said.
This is easier said than done, as language is incredibly complex. A word or a sentence may have a completely different meaning depending on the language, the tone and the sentiment in which it is used. NLP providers apply algorithms to analyse a given text, extracting the meaning associated with each sentence and collecting the essential data from it.
Currently the most powerful and flexible NLP algorithm is Generative Pre-trained Transformer 3 (GPT3), an autoregressive deep learning language model created by OpenAI which can produce human-like text. The technology has been in beta tests since June 2020, and its output is said to be of such a quality that it is genuinely challenging to distinguish from text written by a human. At the time of writing, it holds the record for the highest number of machine learning parameters (175 billion) in a single model. It was trained on hundreds of billions of words and can even write code in several programming languages, often with no additional training for the specific task. The scale of the undertaking is such that it arrived roughly two years after the original GPT paper was published. GPT3 is especially impressive because it can be steered towards a specific task with only a handful of examples, so that while it is difficult to train, it is comparatively easy to use and to adapt.
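To make the term "autoregressive" concrete, here is a minimal sketch in Python. It is a toy word-pair model, not GPT3 or anything resembling OpenAI's code; the function names and the one-sentence corpus are illustrative assumptions. What it shares with a large language model is only the generation loop: each new word is chosen based on what has already been produced, then fed back in as context for the next one.

```python
import random
from collections import defaultdict

def train_bigram_model(text):
    """Record, for each word, the words that followed it in the training text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers

def generate(followers, start, max_words=10, seed=0):
    """Autoregressive loop: sample each new word given the previous output word."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same text
    output = [start]
    for _ in range(max_words - 1):
        candidates = followers.get(output[-1])
        if not candidates:  # no known continuation, so stop early
            break
        output.append(rng.choice(candidates))
    return " ".join(output)

# Hypothetical toy corpus; a real model is trained on hundreds of billions of words.
corpus = "the model reads the text and the model writes the text"
model = train_bigram_model(corpus)
print(generate(model, "the"))
```

A model like GPT3 replaces the lookup table with a neural network that conditions on a long window of preceding text rather than a single word, which is where the 175 billion parameters come in.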
While GPT3 is merely the latest in what is likely to be a succession of ever larger and more powerful language models, it does nicely illustrate that the size of the model and of the training data set are key drivers of accuracy and robustness. Furthermore, recent research suggests that larger models are easier to compress, so there is little corresponding trade-off when it comes to inference (when models are being used in the wild).
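Some back-of-envelope arithmetic shows why compression matters at inference time. The sketch below estimates the storage needed for the weights of a 175-billion-parameter model at three common numeric precisions. The bytes-per-parameter figures are standard for those formats, but applying them here is an assumption for illustration: it does not describe how GPT3 is actually stored or served, and storing weights at lower precision is only one of several compression techniques.

```python
PARAMETERS = 175_000_000_000  # parameter count quoted above

# Assumed storage cost per parameter at common numeric precisions.
BYTES_PER_PARAMETER = {"float32": 4, "float16": 2, "int8": 1}

def weights_size_gb(parameters, bytes_per_parameter):
    """Storage needed for the weights alone, in gigabytes (10**9 bytes)."""
    return parameters * bytes_per_parameter / 1e9

for precision, size in BYTES_PER_PARAMETER.items():
    print(f"{precision}: {weights_size_gb(PARAMETERS, size):,.0f} GB")
# float32: 700 GB
# float16: 350 GB
# int8: 175 GB
```

Even at the lowest precision shown, the weights alone run to well over a hundred gigabytes, which is why the ability to shrink a large model without losing much accuracy bears directly on who can afford to deploy one.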
Thus, whoever has access to the most training data, and has the most computing resources available for training, is able to build the most accurate models. In the case of language, it seems evident that a single such model could be used for almost all language applications and could be adapted to run on a wide range of hardware.
How would such a universal language model be assessed for bias and fairness? Simply being trained on historic corpora that are less than completely unbiased might be concerning enough. But imagine if the next big tech monopoly started to deliberately engineer in biases that advantaged its own users over others – or that favoured its preferred type of politics.
There are some AI trends that are reasonably predictable, and where these trends point to potential monopoly abuse, it makes sense for the AI community to consider how society could mitigate it, just as the community currently does with other ethical issues such as user bias, fairness and transparency in new AI algorithms.
In the case of a universally adopted language model, for example, it might make sense to view it as a piece of societal infrastructure, like the road network, rather than as a piece of proprietary technology. Alternatively, a regulator akin to the FDA could validate its real-world performance in each proposed use case.
Almost everyone will benefit from a healthy AI-powered economy where innovation and competition thrive, but where the use of data and AI is transparent, accountable and fair. Take DataSwift, for example, which offers a range of products that give consumers ownership and control over their personal data, whilst offering businesses the APIs and tools required for data-rich and scalable applications, all within a rigorous ethical framework.
As AI becomes increasingly powerful, and therefore economically valuable, the opportunities for existing and new monopolistic players to exploit it will only increase. Regulators and the AI community need to become much more aware of the potential for monopolistic practices created by the next wave of AI innovation.
Today’s approach, of building up huge tech monopolies, only to attempt to break them up many years later, is hardly the most efficient (or ethical) path to get there.
Originally published on Forbes’ website.