After being hit with an obnoxious cold when I returned from my trip to Poland, I spent the next few days getting into a Netflix series called Aggretsuko. The anime follows the life of a Japanese millennial, named Retsuko, trying to make it in the corporate world by day, and taking her frustrations out on a karaoke mic at night singing death metal. Aggressively singing Retsuko – Aggretsuko. Everyone’s also a talking animal, but the company behind it made Hello Kitty, so that’s to be expected. Despite the cutesy-sounding premise, it’s quite a mature and interesting series.
But this is an AI blog, so where’s the AI?
In the latter half of season two, Retsuko starts dating a donkey named Tadano whom she meets at driving school. Despite his outwardly lazy exterior (he’s still on the theory part of the class by the time Retsuko graduates), he’s actually a tech mogul who’s looking to revolutionise the world with his new AI. Enter, ENI-O.
I’m a big fan of fiction, and I enjoy seeing what creatives come up with in works of sci-fi when they’re not limited by the beliefs about current tech limits academics might have. This post (and perhaps more posts like these in the future) will be me going over this depiction of AI with a fine-tooth comb, to consider how plausible it is and whether it offers any novel ideas. I’m hoping that by doing so I can give you as a reader a better understanding of what AI can already do, and what you might be able to expect from the technology in the future.
Tadano’s goal as a tech developer is to use ENI-O to replace tedious labour, freeing up individuals to explore more fulfilling avenues and have the opportunity to pursue careers that let them express themselves instead. This sounds all too familiar among idealists in the AI field, and while I’d love to believe it’s as simple as building an AI that can do all the menial jobs for us and the rest will follow, there are plenty of barriers to adopting AI technology at this scale, such as creating the right business environment for using it, and dealing with the displacement of labour. That doesn’t mean it’s not worth pursuing, but it does draw attention to Tadano’s rosy-eyed view of his field. As a man in his mid-20s enjoying exceptional career success, though, it’s understandable that he would think his worldview is flawless.
This brings us to focus on ENI-O itself, specifically: what does ENI-O actually do? If it’s going to replace all the menial jobs in the world, then presumably it needs to do a lot. Before we break down what we see ENI-O do, let’s examine how ENI-O as an AI was trained.
According to Tadano, ENI-O was trained on all forms of media, including social media, videos and written text. “The world’s zeitgeist”, as he put it. In other words, by examining the collective conversations of the human population and the content they shared, it was supposedly able to learn the skills to do humanity’s jobs for it. This would most likely be done using a combination of Natural Language Processing and Computer Vision, which, roughly speaking, let AIs read text and interpret images respectively. He also implies that this is an online learning process, meaning that ENI-O continues to read new data as it comes out rather than finishing its training at a pre-determined point. Depending on your viewpoint and experience, the idea might sound simple enough to be genius, or too non-specific to ever achieve what Tadano’s talking about.
A comparable example in real life would be GPT-3, an autoregressive language model trained on a significant collection of the internet. To summarise, having read most of the internet, it learned the ability to respond to text prompts to do things such as answer questions, write poetry and code, and even power text adventure games like AI Dungeon. Impressive!
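GPT-3’s scale is obviously far beyond anything a blog post can show, but the core idea of autoregressive language modelling – predict the next token from the ones that came before – can be sketched in miniature. The following toy bigram model is entirely my own illustration (no relation to OpenAI’s actual code): it “trains” by counting which word follows which in a tiny corpus, then generates a continuation one word at a time.

```python
import random

def train_bigrams(corpus):
    """Count, for each word, which words follow it: a crude autoregressive model."""
    model = {}
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        model.setdefault(prev, []).append(nxt)
    return model

def generate(model, start, length=5, seed=0):
    """Repeatedly sample the next word given only the previous one."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        followers = model.get(out[-1])
        if not followers:
            break  # dead end: this word never appeared mid-corpus
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = "retsuko sings death metal at karaoke and retsuko sings at night"
model = train_bigrams(corpus)
print(generate(model, "retsuko"))
```

Real models like GPT-3 replace the word-pair counts with billions of learned parameters and a much longer context window, but the generate-one-token-at-a-time loop is genuinely how they work.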
However, a brief playthrough of AI Dungeon’s demo mode indicates that the model is far from flawless at generating its scenarios, and while its punctuation and grammar are top notch, it tended to be rather inconsistent in responding to my prompts, and also seemed quite eager to make scenes a bit steamy. So despite the versatility of GPT-3, it still has a ways to go before it can pass a Turing test.
When comparing examples of AI, it’s important to pay attention to the dates involved, especially in recent years. As you no doubt have heard, AI is an extremely active field at the moment, and technology from two years ago can easily have been superseded by now. While it might be overstating things to say we’re making new breakthroughs every day, a few months can be a key factor in whether a paper is cutting edge or old news.
We don’t have any reason to believe that Aggretsuko isn’t set in the present day, based on the level of technology we’ve seen and its slice of life premise, so let’s assume ENI-O exists in 2019, the year the season came out. GPT-3, if we continue using that as our reference, was released in 2020. Only a one year gap, so the technology isn’t implausibly ahead of its time.
We should also consider the amount of resources something like ENI-O would need to function. Tadano specifically states that the instance of ENI-O in his limo/office (both the same location) is just a terminal, implying the core of ENI-O exists somewhere more appropriate. An AI scanning the world’s media on a regular basis would need a lot of resources (GPT-3 has 175 billion parameters, for comparison), so presumably ENI-O is housed in a building full of processors, memory banks and cooling units, and is just interfacing with Tadano through a microphone and webcam. Tokyo (the show’s setting) apparently has good wireless coverage, so it’s reasonable that ENI-O isn’t disconnecting from the limo every five minutes. Maybe it has trouble going through tunnels, but we can only speculate on that part.
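To put a rough number on why the limo can only be a terminal, here’s a back-of-envelope calculation of the memory needed just to store the weights of a GPT-3-sized model. The figures are illustrative only: real serving needs considerably more memory for activations, redundancy and so on, and the 2-bytes-per-parameter assumption (half precision) is mine, not anything stated about ENI-O.

```python
# Back-of-envelope: memory to merely *store* a GPT-3-sized model's weights.
params = 175e9            # GPT-3's reported parameter count
bytes_per_param = 2       # assuming 16-bit (half-precision) weights
total_gb = params * bytes_per_param / 1e9
print(f"{total_gb:.0f} GB")  # 350 GB
```

350 GB of weights alone is well beyond any single consumer machine, never mind whatever a limo’s dashboard could hold – so a remote datacentre plus a thin client is really the only plausible setup.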
In terms of training method and resources used, ENI-O isn’t doing anything unreasonable. As far as we can see Tadano is a one-man operation, but he apparently has money from previous ventures, and once he figures out the training hyperparameters he can just set ENI-O going with sufficiently configured inputs and equipment. He would need to revisit these if ENI-O started behaving in a consistently weird fashion, as hyperparameters do need adjusting over time, but we can assume he does all this off-screen. While it’s a bit of a stretch that he could achieve this completely solo, as model training requires a lot of trial and error to hit the right notes, it’s by no means impossible as far as budgets are concerned.
Now let’s look at what ENI-O actually does. We’re going to focus on the actions actually depicted, while trying to minimise positing potential things ENI-O could be doing. After all, if it’s going to replace menial labour, then there’d be no limit to what it might be able to solve and we’d be here all day.
Tadano introduces ENI-O to Retsuko’s employers and the audience as a machine that will streamline their accounting department. Although he doesn’t go into a great deal of specifics, we could assume this would include a lot of account balancing and validating transactions. Since the setting is in Japan, Retsuko’s company, Carrier Man Trading Co., Ltd., still uses a lot of paper reports, and data entry is a regular part of her job. From this, we can infer that ENI-O would also do some kind of image recognition to understand values to transfer to a spreadsheet or database. It might also be used to generate automated reports for employees and customers, or highlight errors in the reports or accounts for resolution as needed, if it doesn’t resolve them itself.
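To make “account balancing and validating transactions” a bit more concrete, here’s a minimal sketch of one such check: confirming that a batch of double-entry records balances. The function name and record format are entirely hypothetical stand-ins of my own, not anything from the show, but this is the flavour of rule-based validation an accounting department automates.

```python
def validate_ledger(entries):
    """Check that a batch of double-entry records balances (debits == credits).
    A hypothetical stand-in for one check an accounting AI might automate."""
    debits = sum(e["amount"] for e in entries if e["type"] == "debit")
    credits = sum(e["amount"] for e in entries if e["type"] == "credit")
    return debits == credits, debits - credits

entries = [
    {"type": "debit",  "amount": 1200},   # office supplies
    {"type": "debit",  "amount": 300},    # courier fees
    {"type": "credit", "amount": 1500},   # cash
]
balanced, diff = validate_ledger(entries)
print(balanced)  # True: the books balance
```

Notably, checks like this are plain deterministic logic, not machine learning at all – the learned part of the pipeline would be the image recognition that gets the paper figures into this structured form in the first place.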
Already we’re having to do a lot of speculating on what ENI-O could do, and it already sounds like a lot for a single AI. In particular, an AI trained on social media, as ENI-O was, doesn’t seem like it should have much ability to do these sorts of tasks. After all, there’s a world of difference between sentiment analysis of the latest politician’s tweet versus analysing tax records to see who’s short on their payments.
Another function ENI-O performed: it was actually the one that recommended the company Tadano approached with the offer of ENI-O’s services in the first place. Again, without specifics on how it came to this decision, we can only make assumptions. Based on its training method, I would speculate that it analysed the company’s media profile to check for positive flags, such as openness to new ventures and a good market reputation, as well as fitting the right economic profile. The CEO had previously expressed interest in modernising the company too, so perhaps it picked up on a statement he made in a business report at some point. Of course, his idea of modernising the company was a cold noodle slide through the office, but it’s the thought that counts. Either way, this seems more in line with ENI-O’s training methods as a believable function.
ENI-O also displays capability as a chatbot, which is a given for just about any AI that shows up in media; as an audio/visual medium, your audience needs to be able to hear your robots vocalising, after all. In the real world, chatbots don’t automatically come with humanlike voices, and speech synthesis isn’t quite at ENI-O’s level yet, but the ability to respond to voiced queries is perfectly in line with modern technologies like Alexa. So in that department ENI-O is again hitting the realistic mark, if sounding a bit better than it should.
The most egregious function that ENI-O performs is driving Tadano’s limo. Although Tadano employs a chauffeur, he pays him to sit on his phone all day and just be there in case ENI-O decides to eliminate its oppressive overlord by driving the vehicle into a brick wall, or someone fools its vision system with a carefully altered stop sign. Otherwise, the AI plans the routes and pilots the vehicle.
While the previous functions could be grouped under a very broad umbrella of “Analyse and interpret data, then respond”, self-driving cars are very much their own domain when it comes to AI. A lot of different ingredients go into training self-driving cars, including supervised and unsupervised learning, object recognition, and reinforcement learning. Natural language processing is really just a separate field from all of this; at best, ENI-O’s social media scans might have taught it to recognise what different things on the road are. While it might even have been able to read some driving manuals or roadmaps for downtown Tokyo, it simply wouldn’t be able to actually drive a car without specific programming and training to do so.
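To give a feel for how different that training is from reading the internet, here’s a minimal tabular Q-learning agent – one of the reinforcement learning ingredients mentioned above – learning to navigate a one-dimensional “road”. The environment, rewards and hyperparameters are all my own toy choices; real driving stacks are vastly more complex, but the trial-and-error structure is the point.

```python
import random

# Minimal tabular Q-learning on a 1-D "road": start at cell 0, goal at cell 4.
# Actions: 0 = move left, 1 = move right. Purely illustrative of the RL
# ingredient of self-driving research, not a real driving stack.
GOAL, N_STATES, ACTIONS = 4, 5, [0, 1]
rng = random.Random(0)
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Move along the road, clamped to its ends; small cost per move, bonus at goal."""
    nxt = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    reward = 1.0 if nxt == GOAL else -0.1
    return nxt, reward

for _ in range(500):  # training episodes of pure trial and error
    s = 0
    while s != GOAL:
        # Epsilon-greedy: explore 20% of the time, otherwise take the best-known action
        a = rng.choice(ACTIONS) if rng.random() < 0.2 else max(ACTIONS, key=lambda x: q[(s, x)])
        nxt, r = step(s, a)
        # Q-learning update: nudge the estimate toward reward + discounted best future value
        q[(s, a)] += 0.5 * (r + 0.9 * max(q[(nxt, b)] for b in ACTIONS) - q[(s, a)])
        s = nxt

policy = [max(ACTIONS, key=lambda x: q[(s, x)]) for s in range(N_STATES)]
print(policy)  # the learned greedy action in each cell
```

Notice that none of this involves text at all: the agent learns by acting in an environment and observing rewards, which is a fundamentally different training signal from scrolling the world’s social media.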
Tadano is described in the series as probably the person closest to achieving General AI. While this might seem to be the case if we accept everything in the show as it’s depicted, it’s important to remember the distinction between a general AI and a multi-purpose AI. Specifically, for all Tadano hypes up his invention, there are some functions it explicitly doesn’t do. It doesn’t pilot Tadano’s jet, for example; Tadano does that (yes, he is still failing driving school). It also can’t solve queries about Retsuko’s personal life, since it doesn’t know Retsuko well enough to come to a conclusion. The fact that it’s actually able to acknowledge this lack of data is very much a valuable trait in the AI market, but that’s beside the point.
It would be a stretch to call ENI-O a general AI, because Tadano can’t just show it a new task and have it take a whack at it, like a proper general AI could. For example, introducing it to the accounting department is a joint venture (and one the series doesn’t really follow up on, unfortunately). Tadano still has to do work to actually integrate it with the tasks at hand, rather than just describing the problem to ENI-O and letting it have a go. And while he’s certainly willing to throw ENI-O at new situations (in season 3, he applies it to a dating app), it would seem more the case that he’s extending its ability set rather than using its existing intelligence. It’s a subtle distinction, and debatable if I’m being pedantic, but one I think is worth remembering with what would be the game-changing technology breakthrough that is general AI.
With further research ENI-O might reach the state of being a general AI, and in the real world researchers have suggested timelines ranging from less than 10 years to over 80 for when we’ll reach this new milestone of intelligence. For now though, it probably makes more sense to think of ENI-O as a suite of AI functions that Tadano is continually developing and adding to. Collecting it under one interface that can easily interact with humans is certainly an effective way of managing it, but it’s not going to be revolutionising the world without a bit more work.
Overall, Aggretsuko takes broad enough strokes with its depiction of AI that a lot of its workings and specifics are left to your imagination, which is probably the better way for non-experts to go about using AI in their stories. That said, in the areas where it does get more specific, it’s a bit liberal with just how much ENI-O’s training would be able to achieve, especially since we have a real-life counterpart to ENI-O in the form of GPT-3. The thing that really kills it for me, though, is having ENI-O take on completely unrelated functions, specifically driving the limo, that would be better served by a separate, dedicated AI.
Does this mean ENI-O really is a general AI, and I’m just splitting hairs? Or is it really multiple AIs that Tadano is just packaging together for bigger bragging rights? He is a start-up founder, after all, and they do love to talk up their products. Whether you’re a fan of the series, whether I’ve piqued your interest in checking it out, or whether you just have some thoughts to add to this discussion, I hope you’ll share them in the comments below.