AI think, therefore AI am
What exactly do people mean when they talk about AI in 2018? Where do I start if I want to embrace AI in my business? Get your questions answered in our Think:Act magazine on artificial intelligence.
by Wolfgang Zehrt
Illustrations by Mungang Kim
If you think disruption of the media business is old news, think again. It's only just getting started, and the largest threat – or opportunity, depending on how you look at it – is looming within sight: robot writers.
Ask anybody to describe a journalist to you and you'll get as many answers as the people you question. The stereotype fits a number of images and descriptions: There's the US cliché of a trench coat-wearing scribbler in a trilby hat with a press card in the band; then there's the hard-playing, hard-drinking hack of London's Fleet Street; and what about the fearless seekers of truth all over the world who are prepared to die – and sometimes do – to get their stories out? However tenacious they are, though, we all think of journalists as human beings, and that the role they play is a vital component in a functioning democracy: speaking truth to power and promoting freedom of thought as well as speech.
In recent years, however, there seems to be an ever-growing set of conditions that threaten to damage the fourth estate, the way journalism works and those sweating over their keyboards. The disruption of the business model with the advent of the internet and digital publishing was one such threat. The specter and accusation of "fake news" striking at the integrity and trustworthiness of the media was another. But added to these is a new and possibly much more mortal threat. It isn't so much about business or about how the news is delivered – online, print, mobile – but more about how the news is written. How it is produced. How it is generated.
According to a recent BBC news report, by 2022 some 90% of all news content will be written by "robots." Digitization and the rapidly increasing amount of data it has made readily available are enabling large parts of today's reporting to be created by a computer: the weather, football and stock markets have been the first areas in which "natural language generation" programs have been able to deliver good, readable stories.
Today, a computer at the Norwegian NTB news agency even writes a large proportion of its election reporting. That said, editor-in-chief Mads Yngve Storvik stresses that he can't see a robot being able to conduct an interview any time soon. At the US news agency The Associated Press, a computer already produces 10,000 economic and baseball reports every month. And under new owner Jeff Bezos, billionaire founder of Amazon, The Washington Post is rapidly developing a new content management system (CMS) that has put automated content generation at its heart right from the get-go.
Robot journalism will likely lead to thousands of media job losses around the world. However, it might not mean that all is lost for the journalists who can find the right way through. Investigative stories such as The Panama Papers-style reports, or outstanding portraits and profiles – the kind of content that differentiates a publication from its competitors – could thrive in this new media age. No matter how good, a stock market report will never win any journalism prizes, whether it was written by a robot or a human. Leaving aside such specialist reporting, though, can the news media become fully automated? The dream might be for day-to-day, high-frequency and personalized news business to be handled by computers that never need to stop for breaks, but will it always need that special human touch? What about the questions of judgment and tone?
While machine learning already works well when it comes to replicating the style and voice of very specific media (tabloid, serious, B2B), artificial intelligence still finds it difficult to summarize the most relevant messages from documents and data if a human has not previously provided examples to explain what the key findings might be. Software has so far not had the world knowledge to realize, for example, that a rate drop of more than 3% in a day would normally be unusual for a stock market heavyweight such as a major auto manufacturer, but that it may well occur in conjunction with new revelations such as a diesel scandal. However, software can now look for a suitable quote from analysts on precisely this rate trend and can incorporate it perfectly, both in terms of content and language. This is something that would have been unthinkable just a year ago.
The robot journalist is getting ready for the next step in its evolution, but its human colleagues still have to tell it which subjects are worth writing about and what data should be used. That could soon change. It is already employing relatively simple-to-use algorithms to work out topics automatically: They take into account the frequency of keywords in internet searches and cross-check each potential theme against the intensity of discussion of the event on social media. Current topics that are shaping opinions can thus be identified with a great degree of reliability. The robot then simply has to track down images from databases using the right keywords. Fully automated video content could also be created in this way.
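To make the idea concrete, here is a minimal sketch in Python of how search frequency might be cross-checked against social media discussion. It is an illustration only, not a description of any newsroom's actual system: The function name `rank_topics`, the input dictionaries and the scoring rule (multiplying the two normalized signals) are all assumptions.

```python
def rank_topics(search_counts, social_mentions, top_n=3):
    """Rank candidate topics by combining two signals (hypothetical scoring rule).

    search_counts:   topic -> number of keyword searches (assumed input)
    social_mentions: topic -> number of social media posts (assumed input)
    """
    # Normalize each signal to the range 0..1 so neither dominates by scale.
    max_search = max(search_counts.values(), default=0) or 1
    max_social = max(social_mentions.values(), default=0) or 1

    scores = {}
    # Only topics present in BOTH sources survive the cross-check.
    for topic in search_counts.keys() & social_mentions.keys():
        search_share = search_counts[topic] / max_search
        social_share = social_mentions[topic] / max_social
        # Multiplying rewards topics that are strong on both signals.
        scores[topic] = search_share * social_share

    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


if __name__ == "__main__":
    searches = {"election": 900, "weather": 1000, "diesel": 800}
    social = {"election": 500, "weather": 50, "diesel": 450, "football": 900}
    print(rank_topics(searches, social, top_n=2))
```

In this made-up example, "weather" is searched for most often but barely discussed, so it ranks below "election" and "diesel"; "football" is dropped because it has no search data to check against.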
There is another significant reason for developing fast new media content that is rapidly updated. Thomas Scialom, a researcher at the French natural language generation startup Recital, put his finger on it: "Mobile media use means that there are ever fewer visual aids to help readers understand content, and the amount of information is also limited by the size of the screen. At the same time, the time spent reading content on a mobile is also falling." Shorter information, written specifically in the tone of the target audience and with personalized, targeted content – that's not something a human writer can produce, but an AI journalist can. For example, why should you read a report about all of the Nasdaq rates if you only need a report on trends for Apple shares? Could a graphic provide that information? Automated text generation can do far more: The report on Apple shares can not only summarize historic trends, but also provide rankings – is it only Apple shares that are under pressure today, or is the trend also affecting China's Tencent? Is Amazon on the up and Alibaba plummeting? Does it have something to do with the start of the vacation in the US or China? Has disappointing economic data been released? The more sources that are used, the better an article will be than a graphic.
Anglo-German business news agency dpa-AFX was one of the first to develop a template solution: It was simply a case of filling in the gaps in pre-written sentences with new data. The sentences provided today are much more varied and sophisticated, but this basic principle is only slowly being replaced. Hamburg-based computer linguist Patrick McCrae explains, "The perfect solution for content automation must consequently not only have extensive opportunities for language diversification, but also have the ability to incorporate powerful analysis. Text generation will not progress far beyond the dynamic filling-in of gaps in the texts unless artificial intelligence is involved. Truly interesting texts with diverse content will be created if surprising, non-trivial findings can be extracted from the relevant data sources. That's exactly why we need artificial intelligence." Here's a powerful example: One German digital publisher can generate a view of the monthly employment and training markets at the touch of a button for 411 regions across Germany, focusing on different professional groups or levels of education, if desired. The software makes discoveries in the mountain of data with each analytical pass that a human editor would have only spotted by chance, if at all.
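The template principle described here, filling the gaps in pre-written sentences with new data, can be sketched in a few lines of Python. This is a simplified illustration and not dpa-AFX's actual software: The sentences in `TEMPLATES`, the function `render_report` and the 0.05% threshold for "little changed" are all invented for the example.

```python
import random

# Hypothetical pre-written sentences with gaps, grouped by market direction.
TEMPLATES = {
    "up": [
        "{name} shares rose {change:.1f}% to close at {price:.2f}.",
        "{name} gained {change:.1f}%, ending the day at {price:.2f}.",
    ],
    "down": [
        "{name} shares fell {change:.1f}% to close at {price:.2f}.",
        "{name} lost {change:.1f}%, ending the day at {price:.2f}.",
    ],
    "flat": [
        "{name} shares were little changed at {price:.2f}.",
    ],
}


def render_report(name, price, change_pct, rng=None):
    """Pick a sentence that matches the data and fill in its gaps."""
    rng = rng or random.Random()
    # Assumed threshold: moves smaller than 0.05% count as "flat".
    if change_pct > 0.05:
        direction = "up"
    elif change_pct < -0.05:
        direction = "down"
    else:
        direction = "flat"
    # Choosing among several variants gives the simple language
    # diversification that keeps reports from reading identically.
    template = rng.choice(TEMPLATES[direction])
    # The verb already carries the direction, so the gap gets the magnitude.
    return template.format(name=name, price=price, change=abs(change_pct))


if __name__ == "__main__":
    print(render_report("Apple", 187.5, -1.234))
```

What the sketch cannot do is exactly what McCrae points to: It fills gaps, but it does not analyze the data for surprising findings worth reporting.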
But the biggest challenge in this relatively new world of automated writing, and one that is causing even the likes of Google and IBM's Watson to thrash their disks, is free text creation from flowing text sources, and not just from data stored or handled in a particular way. McCrae frames the problem clearly: "Automated text comprehension without any limitations in terms of the subject is a problem that computer science has yet to solve."
Yet if the problem were to be solved, it would deliver a great prize: a kind of perpetual media motion where new texts could be created with a seemingly endless output of flowing text. It could be the salvation for news agencies, which could start with one single text created by software and then turn it into 120 different versions for 120 newspaper customers. But will software be capable of understanding flowing text, including irony, sarcasm or annotations? Scialom sums up the problem quite simply: "The biggest challenge is currently getting machines to understand unstructured data." Once this challenge has been met, only a small band of journalists will be relevant, likely writing highly specialized and unique contributions for which readers around the world will be willing to pay a decent price. We will just have to wait and see which publications survive for them to write for.