Embrace change, take risks, and disrupt yourself
Hosted by top 5 banking and fintech influencer, Jim Marous, Banking Transformed highlights the challenges facing the banking industry. Featuring some of the top minds in business, this podcast explores how financial institutions can prepare for the future of banking.
NVIDIA Executive Reveals How AI is Reshaping Financial Services
Against a landscape where 91% of financial institutions are either assessing or deploying AI in production, Malcolm deMayo, global vice president of the financial services industry at NVIDIA, joins us on the Banking Transformed podcast.
This episode provides an in-depth look at the rapidly evolving AI landscape in financial services, backed by NVIDIA's latest industry survey of global financial services professionals. We explore why 75% of institutions consider their AI capabilities industry-leading or competitive and discuss the finding that 97% plan to increase their AI infrastructure spending this year.
Malcolm deMayo also shares insights on how institutions tackle key challenges, including data privacy and talent acquisition, while implementing AI across critical areas such as risk management, operations improvement, and customer experiences.
Jim Marous (00:09):
Hello, and welcome to Banking Transformed, the top podcast in retail banking. I'm your host, Jim Marous, Founder and CEO of the Digital Bank Report, and co-publisher of The Financial Brand.
Jim Marous (00:19):
In a landscape where 91% of financial institutions are either assessing or deploying AI in production, Malcolm deMayo, Global Vice President of Financial Services at NVIDIA joins us on the Banking Transformed Podcast. This episode provides an in-depth look at the rapidly evolving AI landscape in financial services, backed by NVIDIA's latest industry survey of global financial services professionals.
Jim Marous (00:46):
We explore why 75% of institutions consider their AI capabilities industry-leading or at least competitive, and discuss the finding that 97% of institutions plan to increase their AI infrastructure spending in the coming year.
Jim Marous (01:05):
Malcolm deMayo also shares his insights on how institutions can tackle key challenges, including data privacy and talent acquisition, while implementing AI across critical areas such as risk management, portfolio optimization, and fraud detection.
Jim Marous (01:21):
Financial services organizations need a comprehensive AI platform to build and scale AI across the enterprise that empowers internal teams and external partners. To that end, executives must prioritize high-yield AI use cases to improve employee productivity while delivering superior customer experiences and investment results.
Jim Marous (01:47):
So, Malcolm, before we start, can you introduce yourself to our audience and share a little bit about your background?
Malcolm deMayo (01:55):
Sure, Jim. Thank you for having me here today. I am responsible for the financial services industry here at NVIDIA, and for the last 30-some-odd years, I've been working in the banking industry with various technology companies like NCR, Teradata, EMC, Oracle, and now, here at NVIDIA.
Jim Marous (02:17):
So, NVIDIA is not a new name to any of our listeners, but it may be new from the perspective of how you're helping financial institutions navigate the landscape to improve scale and speed of AI implementation. Can you discuss just a little bit about the relationship you have with financial institutions to make them more productive in their AI initiatives?
Malcolm deMayo (02:40):
Sure, it’s a great question. So, our journey in financial services is not new. We began working with financial services firms 16, 17 years ago. What we do at NVIDIA is we build an accelerated compute platform, and it's taken 30 years to build. It's very different from hardware accelerators; there are lots of them on the market.
Malcolm deMayo (03:03):
But we build a full stack. And that full stack solution allows us to accelerate a variety of workloads like high performance compute. These are workloads you find very commonly in financial services. Things like options trading, where they're looking to discover price, they're looking to do risk assessments and using back-testing to develop algorithmic trading strategies.
Malcolm deMayo (03:29):
And in insurance for underwriting, these are Monte Carlo simulations. In banking, the regulatory reports, like daily credit loss, are all simulated. And we accelerate these – in some of the largest banks, these workloads take 20 hours to run, and we accelerate them down to four.
Malcolm deMayo (03:50):
So, we've been working with financial services for 17 years, and what we do is accelerate workloads. That enables them to do more with less, or to start to rethink and reimagine the very processes that we're accelerating.
Malcolm deMayo (04:09):
Because when you think about it, when something takes 20 hours, you make all sorts of compromises on the data attributes that you use and the number of scenarios you run. When it takes four hours, you can rethink that and hopefully, do a much better job.
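The trade-off described here, between run time and the number of scenarios, can be illustrated with a toy Monte Carlo simulation. The sketch below is not NVIDIA's platform or any bank's production model. It is a plain-Python example, with a hypothetical function name `monte_carlo_call_price` and made-up option parameters, showing that the standard error of the estimate shrinks as more scenarios are run.

```python
import math
import random


def monte_carlo_call_price(spot, strike, rate, vol, years, n_scenarios, seed=0):
    """Estimate a European call price by simulating terminal prices.

    Returns (price_estimate, standard_error). All inputs are illustrative.
    """
    rng = random.Random(seed)
    drift = (rate - 0.5 * vol ** 2) * years
    diffusion = vol * math.sqrt(years)
    discount = math.exp(-rate * years)

    total = 0.0
    total_sq = 0.0
    for _ in range(n_scenarios):
        z = rng.gauss(0.0, 1.0)                      # one random scenario
        terminal = spot * math.exp(drift + diffusion * z)
        payoff = discount * max(terminal - strike, 0.0)
        total += payoff
        total_sq += payoff * payoff

    mean = total / n_scenarios
    variance = max(total_sq / n_scenarios - mean * mean, 0.0)
    std_error = math.sqrt(variance / n_scenarios)
    return mean, std_error


if __name__ == "__main__":
    # Compare a small run with a run that uses 100x more scenarios.
    for n in (1_000, 100_000):
        price, err = monte_carlo_call_price(100.0, 105.0, 0.03, 0.2, 1.0, n)
        print(f"{n:>7} scenarios: price ~ {price:.3f} +/- {err:.3f}")
```

Because the error falls roughly with the square root of the scenario count, a faster run leaves room for many more scenarios, or richer inputs, in the same time window.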
Jim Marous (04:24):
And do more with less, in many cases.
Malcolm deMayo (04:27):
Substantially more with less.
Jim Marous (04:29):
Can you share your perspective on what you think the most significant changes that AI has brought to financial services in just the past year? Because I could go back five years, but that's ridiculous because things have changed so fast and so many elements have come into the marketplace. But what do you see as the most significant change you've seen in banking brought on by AI?
Malcolm deMayo (04:52):
One of the biggest changes with generative AI is that the programming language is our language. So, instead of less than 2% of the world's population being involved in developing solutions, you now have virtually everyone engaged.
Malcolm deMayo (05:10):
Banks have to be very careful. AI is not new to them. This is a decades-old journey. And so, they have to move in a cautious manner (and they're very good at this) in terms of bringing in new technology. But when you think about the virtually hundreds of processes in banks, and the fact that the language of banking is our language, the impact, or the potential for impact, of AI is enormous.
Malcolm deMayo (05:40):
The initial areas: Goldman Sachs talks very much about how they're using generative AI for code generation, and that 40% of the code being generated is being accepted by their developers. If you think about when you have thousands of developers, 40% of their work is now automatically generated by AI. They're so much more productive. Think of the impact to the bottom line, but more importantly, think about the backlog of work that they can now get to. So, that's one example.
Malcolm deMayo (06:15):
Another example is in customer service. You've probably read about Klarna over in Sweden using OpenAI's API to handle two-thirds of their calls in a single month using generative AI, reducing the average time to errand resolution from 11 minutes to two minutes.
Malcolm deMayo (06:35):
So, when you start thinking about the impact of customer experience, every survey I've read says, if you improve the customer experience, you have the opportunity to sell them more products. So, generative AI is going to be a massive, tsunami-sized wave of change in banking and in financial services.
Jim Marous (06:55):
It's interesting because the initial use of AI was a lot in the fraud and risk areas, where they were trying to anticipate what was happening or be able to build models that would defend an organization. But it's interesting, generative AI is much more of a two-way conversation, as it were, around what you can test, what you can find out, the learnings you can make along the way, and it really has quite a big impact.
Jim Marous (07:21):
One thing that came out is you just released a study on the deployment of generative AI solutions in financial services. When you look at that report, what was your biggest surprise from the research?
Malcolm deMayo (07:34):
The biggest surprise is how engaged financial services firms have gotten. And we're way past the hype. Banks and financial services firms are gearing up. They are building, they're trying to recruit talent, which is a very big challenge for them. They're investing in the technologies and they're experimenting and prioritizing use cases.
Malcolm deMayo (08:00):
So, they understand that this technology can help them to improve productivity, drive operational efficiencies, and even generate new revenue streams. So, that was probably the biggest surprise for me.
Jim Marous (08:15):
It’s interesting, your study showed that 43% of financial institutions already are using generative AI, and 91% (when you take that and say how many are assessing that capability) of financial institutions are at least assessing or using generative AI.
Jim Marous (08:34):
So, what applications have you seen that are driving this rapid adoption, particularly in report generation, back office operation, and customer engagement?
Malcolm deMayo (08:45):
Well, so a big area is customer service in the contact center. The opportunity to improve the productivity of a call center agent is massive. Most of these people have to toggle between five or six systems. They're reading text that's unscannable, and this is after someone's waited anywhere from 10 to 20 minutes to speak to them.
Malcolm deMayo (09:11):
So, we're building solutions today that allow AI to listen in on the question, scan ahead. Think about our accelerated compute platform: every node of our platform gives you four research assistants who can read a thousand books a second.
Malcolm deMayo (09:29):
So, when you think about that power, so now you've got this capability to listen in on the question, read down what the customer's history is, understand all of that in a blink of an eye, be able to suggest to the call center agent how to handle the call.
Malcolm deMayo (09:49):
At the same time, that research assistant has access to how good calls have gone, and all the transcripts are stored. So, the AI can help maybe a newer call center agent be more productive sooner. So, that's one example.
Malcolm deMayo (10:07):
Another example, you mentioned fraud. So, in credit card, being able to embed new capabilities like graph neural nets and transformer capabilities into existing models so that model doesn't have to go back through validation, but it can be more accurate in identifying anomalies and reducing the amount of work that people have to do on the backend.
Malcolm deMayo (10:34):
We're seeing a lot of interest and a lot of drive in modernizing models. And while I'm on that topic, many banks have models in production. They tend to be machine learning models, and they tend to be very small. And like I had said a bit ago, a lot of compromises were made.
Malcolm deMayo (10:54):
The amount of data that they're using and the amount of intelligence in the model, there's an opportunity to upgrade those models with more robust machine learning based models, and even start to introduce transformer architecture, so generative AI.
Malcolm deMayo (11:11):
So, the use cases, there are hundreds of them that are driving adoption, and at the end of the day, they fall into three categories: improving productivity, driving operational efficiency, and creating new revenue streams.
Jim Marous (11:28):
It’s interesting, Malcolm, we've talked about this before in other podcasts we've done, is that one thing with generative AI is you're not only deploying answers to questions, but you're also collecting, as you referenced just a second ago – collecting information for long-term use.
Jim Marous (11:47):
Do you see a time when financial institutions may be able to provide maybe an agent on behalf of the consumer that will look for answers to their questions almost before they're asked? In other words, we use our GPS systems in our cars regularly, but they've gotten more and more sophisticated based on how they learn our driving habits.
Jim Marous (12:11):
Do you see this happening in the banking world where I may be able to call my financial institution, and my financial institution will be able to help me through generative AI models to build a better financial wellness platform for me on my behalf?
Malcolm deMayo (12:27):
I think so. Yeah, I think at the end of the day, the future is hard to predict, but banking and financial services has the canonical use case for generative AI. And it's the ability to train an AI to understand every transaction you've ever had with the bank.
Malcolm deMayo (12:48):
So, in theory, at some point in the future, you could have an assistant for every customer, an AI assistant for every customer. Think of a future where you have human employees working collaboratively with AI employees, it's not that far off.
Jim Marous (13:08):
Exactly. And I think we're seeing in some industries right now to a degree, but in banking there's regulatory and compliance issues as well as safety issues on that. But it's interesting because I just use it on a personal basis for my business and the way that the learning tool helps to learn on my behalf is astounding at times to me.
Jim Marous (13:31):
So, while financial institutions are starting to dip their toe or their entire foot or more into the water of generative AI, what common challenges do you see banking organizations face when implementing AI solutions, and how do they overcome these challenges?
Malcolm deMayo (13:49):
The most prevalent – and this dates back to the beginning of computer science – is data. But interestingly, I'll give you a sneak preview on this year's survey: talent acquisition has just trumped data as the top challenge.
Malcolm deMayo (14:08):
So, there is a fierce competition to bring in data science and data management engineers. And that's going to continue. And they’re going to have to train their employees, retrain their employees, and also augment with new recruits and partnerships with universities. That's what we're seeing them do. Partnerships with universities, bringing in new recruits, and also training their employees.
Malcolm deMayo (14:41):
I don't know if you saw that S&P Global announced they hired Accenture to help them train 35,000 of their employees. JPMorgan Chase has built a program internally to train every new recruit. NVIDIA has a university, it's available to anyone.
Malcolm deMayo (14:58):
So, there's lots of places you can turn to to get smart quickly. But the other challenge is the data challenge, and we're building capabilities inside our platform to help; we call it federated learning. If you want to train a fraud model on your Spanish transactions, on your French transactions, and on your German transactions, and then share the knowledge without exposing the sensitive data, we've enabled that in our platform very securely.
Malcolm deMayo (15:29):
And so, these are opportunities for financial firms to improve the intelligence of their models, improve the ability of their models without exposing sensitive customer information.
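The idea of sharing knowledge without sharing data can be sketched with federated averaging, a common federated learning technique. This is a toy under stated assumptions, not NVIDIA's platform API: the region names come from the example above, while the data is synthetic and the functions `local_update`, `federated_average`, and `train_federated` are hypothetical. Each region fits a tiny linear model on its own records, and only the model weights are combined.

```python
import random


def make_region_data(n_records, seed):
    """Synthetic stand-in for one country's records: y = 2x + 1 plus noise."""
    rng = random.Random(seed)
    records = []
    for _ in range(n_records):
        x = rng.uniform(-1.0, 1.0)
        y = 2.0 * x + 1.0 + rng.gauss(0.0, 0.05)
        records.append((x, y))
    return records


def local_update(weights, data, lr=0.3, epochs=5):
    """Run a few epochs of gradient descent on one region's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in data) / n
        grad_b = sum((w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(updates, sizes):
    """Combine regional weights, weighting each by its number of records."""
    total = sum(sizes)
    w = sum(u[0] * s for u, s in zip(updates, sizes)) / total
    b = sum(u[1] * s for u, s in zip(updates, sizes)) / total
    return (w, b)


def train_federated(regions, rounds=50):
    """Each round: regions train locally, then only weights are averaged."""
    weights = (0.0, 0.0)
    sizes = [len(data) for data in regions.values()]
    for _ in range(rounds):
        updates = [local_update(weights, data) for data in regions.values()]
        weights = federated_average(updates, sizes)
    return weights


regions = {
    "spain": make_region_data(200, seed=1),
    "france": make_region_data(300, seed=2),
    "germany": make_region_data(250, seed=3),
}
global_weights = train_federated(regions)
print(f"global model: w={global_weights[0]:.3f}, b={global_weights[1]:.3f}")
```

Only the two weights leave each region in this sketch; the records themselves stay local. A real fraud model would be far larger, and a production setup would add protections that this example omits.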
Jim Marous (15:45):
You talk about the whole issue of talent, but isn't another solution (and to bring up NVIDIA) working with partners, because not everybody's been able to develop their own tools internally or their own solutions internally. But the partnership with organizations that are investing in the talent and are bringing multiple case studies together for improvement in the models, isn't that also an option?
Malcolm deMayo (16:16):
Absolutely. And there's lots of great partners out there. AI is a full stack problem, and we've built a full stack solution, but it's also a team sport. So, having the right partners is absolutely critical.
Jim Marous (16:33):
Oh, yeah.
Malcolm deMayo (16:34):
And what we've done is built a huge ecosystem of partners, Jim. Over 20,000 startups have come through our Inception program. And for those that listen to your podcast, if you're a startup, it's really easy to join Inception. You get cloud credits, you have access to our smart people, and all of our processes and best practices. And we even have built a launch pad for you to run POCs to have real-world experience. And by the way, enterprises can use that too.
Malcolm deMayo (17:05):
So, from an ecosystem perspective, we try to work with everyone that is building AI. So, over 1,700 companies are building AI solutions using our platform. So, we actually don't sell our platform directly. It's only available through partners and where we'll directly engage is when … by the way, if any enterprise calls us, we'll try to help them.
Malcolm deMayo (17:33):
But where we'll directly engage is where there's a really substantial problem, something that's needle moving for the industry, and it really couldn't be solved without our platform, and there's lots of examples of that in banking.
Jim Marous (17:49):
It’s interesting that as we look at that partnership issue, it really makes it so smaller organizations can really do some pretty phenomenal things that used to be only relegated to the biggest and largest and most progressive financial institutions.
Jim Marous (18:08):
And I think we're seeing this in every industry, is that generative AI and AI in general really allows those organizations that have the right leadership and the right leadership thinking to be able to do as much or more than their larger partner organizations because of the fact that organizations like NVIDIA and others are allowing them to partner to build these solutions at scale, which is kind of exciting when you think about it.
Jim Marous (18:33):
How do organizations balance AI innovation with regulatory compliance, especially in the financial services field?
Malcolm deMayo (18:45):
So, it's a great question. Regulations exist to protect the consumer or protect our society from financial crime. And so, banks, I think don't get enough credit for the work they do for society.
Malcolm deMayo (19:08):
But inside of a financial firm, they're constantly balancing. You talked about how the small companies are more agile, well, they don't have the technical debt. They don't have the legacy systems, and they also don't have the broad base of customers. And there's something that goes along with that, and it's called trust.
Malcolm deMayo (19:32):
So, the large financial institutions want to make sure that they protect their customer, and that they remain the trusted house bank or bank of choice. And at the end of the day, it is a big challenge when a large portion of your budget is dedicated to keeping the lights on, running these legacy systems – how do you innovate?
Malcolm deMayo (19:58):
Generative AI is helping. Probably one of the most exciting use cases right now in financial services is understanding the COBOL code base that exists in a lot of these large organizations, documenting the business practices and helping them think through that transformation to something more agile and modern.
Malcolm deMayo (20:21):
So, you're going to see a lot more modernization as digitization continues; it's a big trend in banking. I think it's job one for banks to constantly think about how to innovate in terms of their product offering and the experiences that they're … and if you look at any 10-K or any annual report, it's in every one of them: we're focused on improving customer experience.
Jim Marous (20:46):
Yeah, it's interesting. As we get to Money20/20 in Las Vegas, and the subsequent ones in Amsterdam and in the Far East, it’s a sharing environment. It's an environment where people share ideas, what's going on in the industry, what's being tested, and everybody's showing their wares out there.
Jim Marous (21:07):
And your research showed that 51% of institutions are focused on identifying additional AI use cases up from 40% last year. What are some of the most exciting new applications you're seeing in AI in the industry right now?
Malcolm deMayo (21:25):
So, I don't know if they're exciting.
Jim Marous (21:29):
You know what, if they make money, and if they move the market forward, that's good.
Malcolm deMayo (21:33):
In capital markets, we're seeing a massive adoption on the buy side of generative AI to bring together structured and unstructured data and divine, or find and discover, alpha signals, new trading strategy signals, and massive, massive data sets.
Malcolm deMayo (21:53):
Some of these companies have exceeded exabyte-size data sets. And in regular banking, we've talked about fraud. There’s a lot of focus on financial crime, and in financial crime, there is an army of people working in every bank doing a lot of manual jobs where the tool of choice is a search engine.
Malcolm deMayo (22:20):
And in many cases, the software is rules-based, which is incredibly manual to maintain: hundreds and hundreds and hundreds of rules that have to be maintained by human programmers, and they're not adaptive to change.
Malcolm deMayo (22:37):
So, being able to take a large transaction model and create a large transaction model that can start to augment some of those transaction monitoring systems that have been in place for years to reduce the amount of alerts and really help the bank focus their resources on the real financial crime, these are exciting – I don't know if they're exciting, but these are real needle movers for financial services.
Malcolm deMayo (23:05):
If you think about the contact center discussion we had, that’s super exciting. Creating a digital avatar, a digital human that is working together with the contact center agents or with wealth managers, or with bankers to help them prepare, to help them answer questions, to do the kinds of things you were talking about earlier, to help provide guidance and education.
Malcolm deMayo (23:32):
So, there's just so many different use cases that are being looked at. One of the things in addition to data and talent that's slowing financial services down, is they need to transform their infrastructure faster. These new applications don't run on yesterday's CPU servers well, and so they need to move to what we're calling an AI factory.
Malcolm deMayo (23:57):
And an AI factory is something that we've built with Bank of New York and a number of other institutions, where they have created an AI hub. That AI hub is a standard, essentially massive compute where their various groups, the wealth managers, the settlement, the treasurers, everyone can build, experiment, and progress AIs into production.
Malcolm deMayo (24:25):
And they're using a stack that would allow the wealth managers, if they want to deploy this on AWS, if they want to run it in their data center, if they’re using our platform, which is available everywhere, makes them more resilient, allows them the flexibility of running it where their data is, not where the compute is.
Malcolm deMayo (24:44):
And so, you think about NVIDIA's mission – our mission is to make high-performance compute, AI, and computer vision available anywhere, in cloud, on-prem, in colo, to every enterprise in the world. And so, if they're using that platform, they have the ability to say to a regulator, “I'm fully resilient.”
Malcolm deMayo (25:06):
If there's a CrowdStrike-like thing that takes a cloud region down or takes a data center down, “I can move these workloads to somewhere else. It's available, I'm resilient. I'm not going to destabilize the system.”
Malcolm deMayo (25:18):
So, I think that it's an exciting time, but financial firms need to move faster to create the AI factories and transform the data center of yesterday, which is where we store things, to create a factory, which generates insights and creates data and content. And that's the big difference. This is the new industrial revolution: data in, insight and intelligence out.
Malcolm deMayo (25:46):
So, those who get there the quickest will have a big advantage in terms of being able to create better experiences for their customers, to be able to reduce cost and to be able to generate new creative revenue streams.
Jim Marous (26:05):
It’s interesting, Malcolm, because you talk about the external resilience, but you also referenced earlier when we talked about talent, is that the way of thinking about the distribution of data within an organization has really changed.
Jim Marous (26:22):
I come from way far back in banking where whenever I wanted information, I had to go to the person who was in charge of data and ask for specific reports. These were green bar reports, they weren't on the computer, and you had to analyze them.
Jim Marous (26:37):
What you're really talking about here with generative AI and with the AI as it's being formulated going forward, it’s a complete rethinking of the back office where you're really democratizing the insights and the ability to use the data that's there in new and innovative ways.
Jim Marous (26:57):
I mean, it's really making it so every single employee will have the ability to access data for their own uses. Correct?
Malcolm deMayo (27:05):
Absolutely. I mean, when you think about the difference between just search and generative AI, you ask a question on search, you get a thousand documents back that you have to read. You ask a very precise question to generative AI, it summarizes those thousand documents into what you tell it to do.
Malcolm deMayo (27:27):
If you want it in a paragraph, it'll give it to you in a paragraph. Now, impose that model on your own corporate data, and think about all of the different data repositories across the bank. You now have the opportunity to rethink how you price for customers.
Malcolm deMayo (27:47):
If a customer is a single-product customer, they might get a different rate than a customer that is across your products. And you now have the ability to understand that quickly, in real time. There have just been so many barriers in the data retrieval era that we both grew up in – the store and retrieve. And at the end of the day today, it's now take that data and generate insights and ideas and suggestions.
Malcolm deMayo (28:29):
I'll give you an example. We did a GTC talk last year – Capital One, this is I think a good example of how we need to modernize models. These people are some of the best on the planet, some of the brightest minds, and they have built incredible models to determine how to serve messages up when customers visit their homepage.
Malcolm deMayo (28:54):
When you look at a customer authenticating on your website, and so you know who they are, but your model's not capable of understanding that they visited on Monday, Tuesday, Wednesday, and Thursday. It just isn't smart enough.
Malcolm deMayo (29:08):
Replacing that with a smarter model that's capable of understanding what you're looking at, what you're interested in, and pinpointing that and making that offer to you, that's all possible today.
Jim Marous (29:20):
It’s interesting too because the old way of asking for information was pretty linear if you visualize in that way. And what's interesting today is the most skillful people using generative AI tools are those that learn how to prompt, learn how to ask the questions, learn the depth of the question that can be asked to give you an answer, and learn completely new ways of asking questions.
Jim Marous (29:48):
We referenced it quite a bit, an author we had on the show that actually went the opposite direction of what you would usually do, and said, “If you were to give certain of my chapters only a one star out of five, why would this only be a one star?” As opposed to asking AI to say, “How do you improve what we're doing?” It asked it, “What was wrong with what I've said?”
Jim Marous (30:09):
And it's a different way of viewing, but the ability to understand how you ask these questions to get the most out of generative AI is really a learning process in and of itself, but it really gives power in a democratized basis to the line people, to the people that aren't used to having to ask these questions, or get the answers they really need to get. So, it’s going to be interesting to see how it all plays out, because we're all still learning.
Jim Marous (30:41):
Your survey indicated that 97% of institutions plan to increase the investment in infrastructure spending. What investments do you see that financial institutions must prioritize as they actually prepare for the application of AI?
Malcolm deMayo (31:01):
Well, I mean, in fact, it's really basic. They need to build the AI factory. They can do it in their data center, or they can do it in cloud, or they can do it in a colo. We're agnostic as to where they do this, but they need to do it. You need to have the right infrastructure. So, that means the accelerated compute platform.
Malcolm deMayo (31:22):
And what we've built, Jim, is just very simple: it's a three-layer cake. It's hardware, it's software, and it's development frameworks, and they're available – you can access this through the server vendors, through the cloud providers, and also through colos. We have partnerships with secondary cloud providers as well.
Malcolm deMayo (31:44):
What the accelerated compute platform is … the problems we're solving – the observation Jensen had 30-ish years ago was that CPU performance scaling was ending. And it's pretty much flatlined at this point. In the last eight years, we've accelerated with GPU acceleration by a … we’re close to 1000x. And let me give you an example of why that matters, why we need an accelerated compute platform.
Malcolm deMayo (32:15):
In 2012, we built Titan, a supercomputer that took up a tennis court at 17,000 GPUs, consumed close to 10 megawatts of power, and generated 20 petaflops of AI performance. The way to think about 20 petaflops of AI performance is – by the way, a petaflop is a thousand million million floating-point operations per second. That's 20 research assistants that can read a thousand books a second.
Malcolm deMayo (32:43):
So, fast forward to 2024, Blackwell fits in the palm of your hand, consumes three quarters of a kilowatt of energy. 10 megawatts to three quarters of a kilowatt. Tennis court to the palm of your hand, and generates 20 petaflops of AI performance. So, 20 research assistants that can read a thousand books a second.
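The comparison above can be checked with a few lines of arithmetic. This is a small sketch that uses only the figures quoted in the conversation (close to 10 megawatts versus three quarters of a kilowatt, both for 20 petaflops). The constant and function names are illustrative, and the figures are taken as stated, not independently verified.

```python
PETAFLOP = 10 ** 15  # floating-point operations per second in one petaflop

# Figures as quoted in the conversation; not independently verified.
TITAN_POWER_WATTS = 10_000_000   # "close to 10 megawatts"
BLACKWELL_POWER_WATTS = 750      # "three quarters of a kilowatt"
QUOTED_PETAFLOPS = 20            # the same throughput is quoted for both


def flops_per_watt(petaflops, watts):
    """Throughput delivered per watt of power drawn."""
    return petaflops * PETAFLOP / watts


titan_efficiency = flops_per_watt(QUOTED_PETAFLOPS, TITAN_POWER_WATTS)
blackwell_efficiency = flops_per_watt(QUOTED_PETAFLOPS, BLACKWELL_POWER_WATTS)
power_reduction = TITAN_POWER_WATTS / BLACKWELL_POWER_WATTS

print(f"Titan:     {titan_efficiency:.2e} FLOPS per watt")
print(f"Blackwell: {blackwell_efficiency:.2e} FLOPS per watt")
print(f"Power reduction at equal throughput: about {power_reduction:,.0f}x")
```

On the quoted numbers, the same throughput is delivered with roughly 13,000 times less power.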
Malcolm deMayo (33:03):
And to think about the ability that that unleashes in an organization that is run on the English language or on any language, your favorite language. So, they need to make the investment. And so, we've built this three-layer cake, that hardware layer – we are the only company on the planet that's creating a new architecture every year.
Malcolm deMayo (33:28):
The reason we're able to do that – there's a lot of them, but one of them is that we have been using AI inside of NVIDIA. We built ChipNeMo. ChipNeMo is a chip design AI that works collaboratively with our chip designers and helps them to understand what won't work faster.
Malcolm deMayo (33:48):
The experimentation process and invention, there's a lot of trial and error. There's a lot of experimentation and failure to get to what will work. And as a result, we've built Hopper, then Blackwell, and then Rubin, every year, a new radically more performant architecture. What does that mean?
Malcolm deMayo (34:09):
That means training times will improve by five times, and inference times will improve by up to 30 times, which means you'll spend five times less money or 30 times less money depending on what you're doing.
Malcolm deMayo (34:20):
The second layer of this accelerated compute platform is an operating system we call NVIDIA AI Enterprise. And it does three things. First, it abstracts the developer from the hardware so they can program in Python, Scala, SQL, whatever they're used to programming in as they build their AI models.
Malcolm deMayo (34:40):
The second thing it does, is we've optimized it to run on the platform so you get the best performance, which means you can do more with less.
Malcolm deMayo (34:47):
And then the third thing it does is it's hardened for enterprise. So, you don't have to worry about security patches, you don't have to worry about … we take care of all of that. And we accelerate all three phases of not just AI, but high-performance compute and computer vision as well. But in AI, we accelerate the data processing, the model building and training, and the inferencing.
Malcolm deMayo (35:12):
And probably one of the most expensive workloads that companies don't know they pay for every day is data pre-processing. This is taking the data from multiple source systems, bringing it together, doing complex joins, doing groupings. It's de-duping the data, it's compressing the white space out of the data. It's extracting HTML and noise out of the data. And we accelerate and provide tools to help improve data quality.
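The pre-processing steps listed above (joining source systems, stripping HTML, compressing white space, de-duping, grouping) can be sketched in plain Python. This is a minimal CPU-only illustration using the standard library, not NVIDIA's accelerated tooling; the field names `customer_id`, `segment`, and `note`, and all function names, are hypothetical.

```python
import html
import re
from collections import defaultdict

TAG_PATTERN = re.compile(r"<[^>]+>")
WHITESPACE_PATTERN = re.compile(r"\s+")


def clean_text(raw: str) -> str:
    """Strip HTML tags, decode entities, and collapse runs of white space."""
    without_tags = TAG_PATTERN.sub(" ", raw)
    decoded = html.unescape(without_tags)
    return WHITESPACE_PATTERN.sub(" ", decoded).strip()


def join_on_customer(customers, notes):
    """Inner-join two record lists on the hypothetical key 'customer_id'."""
    by_id = {c["customer_id"]: c for c in customers}
    return [
        {**by_id[n["customer_id"]], "note": clean_text(n["note"])}
        for n in notes
        if n["customer_id"] in by_id
    ]


def dedupe(records):
    """Drop exact duplicate records, keeping the first occurrence."""
    seen = set()
    unique = []
    for record in records:
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def group_notes_by_segment(records):
    """Group cleaned notes by the hypothetical 'segment' field."""
    groups = defaultdict(list)
    for record in records:
        groups[record["segment"]].append(record["note"])
    return dict(groups)


def preprocess(customers, notes):
    """Join, clean, de-dupe, then group."""
    return group_notes_by_segment(dedupe(join_on_customer(customers, notes)))


if __name__ == "__main__":
    customers = [
        {"customer_id": 1, "segment": "retail"},
        {"customer_id": 2, "segment": "business"},
    ]
    notes = [
        {"customer_id": 1, "note": "<p>Card   declined &amp; reissued</p>"},
        {"customer_id": 1, "note": "Card declined &amp; reissued"},
        {"customer_id": 2, "note": "<b>Wire</b> transfer\n delayed"},
        {"customer_id": 3, "note": "No matching customer"},
    ]
    print(preprocess(customers, notes))
```

Each step is a full pass over the data, so on large volumes the cost adds up, which is why this kind of workload tends to be expensive.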
Malcolm deMayo (35:42):
So, we also have created a training set of capabilities so that really any of the techniques, whether it's fine-tuning or parameter-efficient fine-tuning, or RAG (retrieval-augmented generation), any of these capabilities are available in our platform.
Malcolm deMayo (36:06):
And we've built guardrails, and we've built an inferencing engine called NIM, which takes our inferencing engines, takes open-source models like Llama or Mistral or Cohere or any of them – we NIM them. We put them in a Kubernetes container with our inference engines, and this gives you up to a 5x performance boost in inference.
Malcolm deMayo (36:29):
And these are available at ai.nvidia.com to anyone; you can just go and download them. They're microservices-based. That performance improvement means you spend less money. It means you do more with less. And because it's microservices-based, it also allows you to deploy in minutes, not in days.
Malcolm deMayo (36:50):
So, we're going to spend probably eight and a half billion dollars this year in R&D. We're constantly improving the optimization of our platform.
Malcolm deMayo (37:01):
The third layer is these development frameworks. You're going to have humans working with AIs as a natural course of business. We're already doing that inside of NVIDIA. That means you can envision a future where a human is in the loop all the time. But AI is also in the loop, working to make that human more productive.
Malcolm deMayo (37:25):
And the human's the pilot, he's directing them, telling them what to do, giving them instructions, and that's kind of a quick glimpse of where we're heading.
Jim Marous (37:34):
Wow. A quick glimpse of where you're heading is a very long and really dynamic change in the way we do business. It’s rethinking the back office, rethinking the front office, and rethinking the final results.
Jim Marous (37:50):
So, finally, Malcolm, what emerging AI applications or just overall, the way organizations are going to use AI excites you the most right now?
Malcolm deMayo (38:03):
I think the idea that it's just the constant innovation and the fast pace, we're immersed in it. It's just incredible to be part of. The fact that we have the opportunity to truly rethink how things are done, it's very, very likely that first application will look a lot like the way we do things today. We'll just sort of repave the road the way it is, optimize the workflow the way it is, and then realize quickly, “Wow, we can rethink this and we can do this smarter.”
Malcolm deMayo (38:42):
There are processes in these large financial institutions that have been around for a long time, commonly called dead processes. Nobody knows why they still do them. There's an opportunity to clean house and really reinvent. Like you said, the front office, the middle office, and the back office.
Malcolm deMayo (39:01):
We had a chief operating officer from one of the brands in our office in Santa Clara in our briefing center, and he was telling us that today, they have people that spend their entire day just pulling contracts and reading them and answering questions, and he told us it's a 4 billion euro expense annually.
Malcolm deMayo (39:25):
We had another brand tell us that 7 out of 10 of the questions they get in their contacts aren't asks for advice. They're problems that need to be solved, but not advice questions, not financial advice questions. When you hear these things, you realize just how impactful AI can be.
Malcolm deMayo (39:44):
I don't know if that's exciting, Jim, but I can tell you that it is impactful, and this is going to happen in a very controlled, safe, responsible way. But the opportunity to improve efficiency, improve productivity, and move the needle.
Malcolm deMayo (40:05):
I told you about ChipNeMo. Think about NVIDIA, in 2022, our revenues were 29 billion. The next year it was 60 billion. We added less than 5% headcount. Think about that operating leverage. Some of that is attributed to our ecosystem, some of that is attributed to the fact that we're leveraging AI to do the R&D work that normally would've required hiring more people.
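The operating leverage described above can be put into numbers with a short calculation. This sketch uses only the figures as quoted (29 billion, 60 billion, under 5% headcount growth) and treats 5% as an upper bound on headcount growth; the function names are illustrative.

```python
# Figures as quoted in the conversation, in billions; not independently verified.
REVENUE_BEFORE = 29.0
REVENUE_AFTER = 60.0
MAX_HEADCOUNT_GROWTH = 0.05  # "less than 5%", so this is an upper bound


def growth_rate(before: float, after: float) -> float:
    """Return fractional growth from 'before' to 'after'."""
    return (after - before) / before


def revenue_per_head_multiplier(
    revenue_before: float, revenue_after: float, headcount_growth: float
) -> float:
    """Return how many times revenue per employee grew."""
    return (revenue_after / revenue_before) / (1.0 + headcount_growth)


if __name__ == "__main__":
    revenue_growth = growth_rate(REVENUE_BEFORE, REVENUE_AFTER)
    multiplier = revenue_per_head_multiplier(
        REVENUE_BEFORE, REVENUE_AFTER, MAX_HEADCOUNT_GROWTH
    )
    print(f"Revenue growth: {revenue_growth:.0%}")
    print(f"Revenue per employee grew by at least {multiplier:.2f}x")
```

Because headcount grew by less than 5%, the multiplier computed with 5% is a lower bound: on the quoted figures, revenue per employee roughly doubled.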
Malcolm deMayo (40:29):
So, at the end of the day, we're going to continue to grow our people base, because as companies become more productive, they become more profitable, and they hire more people.
Jim Marous (40:42):
Yeah. It’s interesting, from my perspective, I think the biggest thing I'm excited about is all we discussed today, a lot of it is internal, is making things more efficient, making things more productive, eliminating steps that are not needed so that people can think beyond where they do today.
Jim Marous (41:02):
It's going to be the deployment and the feeling from a consumer or a business standpoint of what this means to them. I think we're just starting to scratch the surface on the deployment. I kid a lot of times on the podcast that I know my financial institutions know a lot about me, that I would be surprised they know.
Jim Marous (41:21):
Maybe they know what they think I'm going to do next, and they may be right or may be wrong, but the challenge is when do I start to feel the impact that the financial institution is having with the use of AI. And I think we're going to start to see the feel of this. We're going to start to see consumers increasing their trust in their financial institutions because of how they're able to help them now with the deployment of AI and generative AI solutions.
Jim Marous (41:49):
We're going to see a person that maybe has used voice banking in Bank of America learning that all the conversations they had in the past are now all combined in a way that makes it so that the solution is at their fingertips as opposed to a step or three or four away.
Jim Marous (42:07):
I think that's really exciting because it's where it travels through the organization to the consumer, to the small business, to the end user. That's going to be really exciting. And to see the role of NVIDIA and actually, the involvement of NVIDIA in organization events like Money20/20 and the ability for you to share what you shared with us today on the podcast around what you're doing to make banking better, and what banks are doing to make banking better as well.
Jim Marous (42:39):
It's really an exciting future, and I think we're probably underestimating what will come out of it, but it may take longer than we think. It is banking.
Jim Marous (42:50):
Malcolm, thank you so much for being on the show today. I really appreciate it.
Malcolm deMayo (42:54):
Yeah, it was great to chat with you today, Jim. Happy to be here.
Jim Marous (42:58):
Thanks for listening to Banking Transformed, the winner of three international awards for podcast excellence. If you enjoyed today's interview, please take some time to give our show a five-star rating. Also, be sure to catch my recent articles on The Financial Brand and check out the research we're doing on the Digital Banking Report.
Jim Marous:
This has been a production of Evergreen Podcasts. A special thank you to our senior producer, Leah Haslage; audio engineer, Chris Fafalios, and video producer Will Pritts. I'm your host, Jim Marous.
[Music Playing]
Jim Marous:
Until next time, remember, the future of banking includes the use of generative AI. Waiting to adopt AI could put your institution at a competitive disadvantage.