I will be the one to say it: AI disappointed most of us in 2024. We started the year with chatbots that were good enough to be entertaining but not good enough to replace a human when accuracy or creativity mattered, and we ended the year with… well, the same thing. The conversations are a little more entertaining, but the world has not been changed.
In 2023 and 2024 we were so enthralled by the first couple of leaps of Generative AI that it seemed likely that eliminating hallucinations and adding reasoning through agents and multi-step thinking (like OpenAI o1) would very soon propel us to another giant leap. Instead, the growth has been much more incremental.
I still believe that AI is going to change the world. This kind of lull is a common pattern in technology. I’m very much reminded of Bill Gates’ quote, “Most people overestimate what they can do in one year and underestimate what they can do in ten years.” This is where the first part of my key to AI Investment comes in: patience.
Patience will be required in two ways.
First, there’s the obvious: don’t give up on AI. The underlying technologies are expanding at an unbelievable rate. AI’s ability to reason jumped forward in 2024 (I posted about OpenAI o1), Agentic AI showed potential, an AI effectively won the Nobel Prize in Chemistry, and the legal frameworks for AI are starting to come together. There may be more work required to make AI practical than we thought, but that work is being done.
The second way that patience is required is a little less obvious. You need to exercise patience in not throwing good money after bad on use cases that are impossible today. One thing that has become clear about AI is that it takes an ecosystem to achieve a use case. Unless you’re NVIDIA or Google, you’re not creating your own LLMs, building your own GPUs, writing your own vector algorithms, etc… You have to use what you can buy. I consider all of these things to be your “AI Foundations”… and no matter how hard you try as a company, some use cases are just not possible with the current set of AI Foundations.
Think of it using this chart:
There are some use cases that are just above the yellow line and not practical to consider today. This is particularly true for use cases for which there isn’t “partial” value. Think about one of the most talked about use cases… replacing your software developers with AI Agents. It’s just not feasible yet with the foundations that we have. The LLMs are not creative enough, the understanding of requirements is not deep enough, the reasoning is not up to the level of critical thinking. Additionally, this problem doesn’t have interim value milestones as currently framed. Either the AI can perform as a software developer or it can’t. It doesn’t make my application better if it submits code that is only 40% right and fails all the tests.
Patience is necessary to avoid a mistake I’ve seen companies make: they pour more and more resources into these impossible use cases because they see the AI agent go from 40% right to 43% right. They make these incremental gains with painstaking analysis of prompts and patches like adversarial AI. It will never be close enough to 100% to be used without a fundamental shift in the AI Foundations we’re building on. Unfortunately, when that shift happens, the work you did on the old foundations may or may not carry over. For example, a lot of the prompts written to make GPT-4 try to reason don’t make any sense when they’re fed to OpenAI o1… so people are just throwing them away.
I know what most of you are thinking: my two recommendations seem to contradict each other. On one hand, I’m saying that AI will be important and you need to continue to invest in it. On the other hand, I’m saying that you should stop investing in many of the AI use cases you think would be most valuable because they’re infeasible. The trick to deploying your 2025 AI investments will be in “The Key to AI Investment in 2025 Part II: Preparation”.
In this post, we’re going to try to give Jonabot a little bit of my history so it can answer questions about my past. The concept we are using for the biographical information is Retrieval Augmented Generation (RAG). Essentially, we augment the AI by giving it access to valuable reference information just before it answers. The best way to think of RAG is as a “cheat sheet”. Imagine asking someone, “Who was the fourth President of the United States?” You would expect them to answer in their own voice: with the right answer if they knew it and, if they didn’t, with a guess or an admission that they didn’t know. One of the problems with Generative AI is that it tends to guess without explaining that it’s guessing. This is called a hallucination, and there are plenty of well-known examples. With RAG, we not only ask, “Who was the fourth President?” but we also give the Large Language Model (LLM) the answer (or a document containing the answer). The result is an answer in the “voice” of the LLM that contains the right answer. No hallucinations.
The way this is accomplished is to take all of the available information that you want on the “cheat sheet” and create a vector database out of it. This allows the information to be easily searched. Then, when the AI is asked a question, we do a quick search and augment the prompt with the results before putting it to the LLM.
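Conceptually, the augmentation step is nothing more than pasting the search results into the prompt. Here’s a minimal sketch in Python with the vector search stubbed out (the function and file names are illustrative, not from the actual project):

```python
def retrieve_facts(question: str) -> list[str]:
    # Stand-in for the vector-database search described above.
    return ["James Madison was the fourth President of the United States."]

def build_augmented_prompt(question: str) -> str:
    # Prepend the retrieved "cheat sheet" to the user's question.
    facts = "\n".join(retrieve_facts(question))
    return (
        "Use the following reference information if it is relevant.\n"
        f"Reference information:\n{facts}\n\n"
        f"Question: {question}\nAnswer:"
    )

print(build_augmented_prompt("Who was the fourth President of the United States?"))
```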
I have seen many clients do things like ingesting their entire FAQ and “help” sections and making them the cheat sheet for their AI chatbot. This is also useful if you need the chatbot to be knowledgeable about things that have happened recently (since most LLMs were trained on an internet snapshot that is 2+ years old). For Jonabot, we want to provide information about me and my history that it wouldn’t have learned by ingesting the internet (since I’m not famous enough to have a Wikipedia page, the base AWS Titan LLM knows very little about me).
To enable this technically, I created a text document with a whole bunch of biographical information about Jonabot, separated into lines. I also broke my entire resume into individual lines and fed them in one at a time. I’m choosing not to put this document in my Git repo since, while none of it is private information, I don’t think it’s a good idea to put all of it out on the internet in one place. Here’s a quick example, though:
And, an example from where I was typing in my resume:
I then created a simple Jupyter Notebook (Biographical_info_import) that works with that file. The notebook does two things:
It creates the vector database. It does this by ingesting each line in the document and committing it to a vector store. For simplicity in this project, I am leveraging the Titan Embeddings model that comes with AWS Bedrock and the LangChain libraries to do the establishing and committing, and I am running the store locally. This obviously wouldn’t scale for massive usage since it recreates the vector database from the text file every time it runs.
It runs a simple query I created to test how accurately the store retrieves information. Eventually, we will use the query results to augment the prompt to the LLM, but for the moment, I want to demonstrate how the retrieval works on its own. Both steps are sketched below.
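Here’s roughly what those two steps look like, assuming a local FAISS index as the vector store (the post doesn’t pin down the exact store; the file name is illustrative):

```python
from langchain.embeddings import BedrockEmbeddings
from langchain.vectorstores import FAISS  # requires the faiss-cpu package

# One biographical fact per line, e.g. "I was raised in Pittsburgh, PA."
with open("biographical_info.txt") as f:
    lines = [line.strip() for line in f if line.strip()]

# Titan Embeddings via Bedrock turns each line into a vector.
embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1")

# Rebuilds the index from the text file on every run -- fine for a
# hobby project, but not something you'd do at scale.
vector_store = FAISS.from_texts(lines, embeddings)

# Step two: a quick semantic search, no LLM involved yet.
results = vector_store.similarity_search("raised", k=1)
print(results[0].page_content)
```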
The results were pretty impressive. I was able to query “raised” and get back “Pittsburgh, PA”, or “Musical Instruments” and get back “Piano and Guitar”. This is, of course, just a pure semantic search. The next step is to link these embeddings to the model with the prompt we built in a previous post and see how Jonabot sounds. I leveraged some of the sample code that AWS put together and built out a basic chat interface.
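Reusing the vector_store from the sketch above, the retriever-to-model wiring looks roughly like this; it’s a simplified stand-in for the AWS sample code, not a copy of it:

```python
from langchain.llms import Bedrock
from langchain.chains import RetrievalQA

# The base Titan text model answers in its own "voice"...
llm = Bedrock(model_id="amazon.titan-text-express-v1")

# ...but every question is first augmented with the top few facts
# pulled from the vector store.
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vector_store.as_retriever(search_kwargs={"k": 3}),
)

print(qa_chain.run("Where were you raised?"))
```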
I have to admit, the results were pretty impressive:
A few reflections on what Jonabot said here:
The first response may not be exactly what I would say. I tend to talk about a holistic approach of working bottom-up on platforms and top-down on business strategy… but the answer is impressive in how much it relates to my background. In particular, I’m impressed that it knew of my focus on cloud, data, agile, etc…
The model is a little more confused by a factual question about where I worked… The response to “Interesting, where have you worked?” is virtually an exact copy of part of the mission statement in my resume but doesn’t mention any of my employers. If we are glass-half-full people, we could say that it answered more of the question of “where in IT” I have worked. Not satisfied, I asked the follow-up, “What firms hired you?” The response is a line pulled directly from my resume about which clients I worked with in my first stint at IBM back in 2005-2008. It’s still not a great answer.
Crown Alley is indeed my favorite bar (it’s around the corner from my house in NYC), but I don’t go there to get the news… everything past that one fact was made up.
Overall, RAG seems to add greatly to Jonabot’s performance. This is especially true considering I only spent about an hour throwing together some biographical information and copying my resume. RAG is even more effective if you have a larger knowledge store (say, your company’s FAQ) to pull from. One concern, which applies to enterprise uses of RAG as much as it does to Jonabot, is that it does seem to fixate on one answer found by the search (like my clients at IBM instead of all the companies that employed me).
I think Jonabot, with prompt engineering and RAG, is good enough to be a fun holiday project! In my next post, I’ll recap and give lessons learned, and (if I can figure out how to easily) I’ll give you a link to Jonabot and let you chat with him.
Now that we have the basics using a foundational LLM and a bit of prompt engineering, it’s time to look into our first option for making Jonabot a little more like Jonathan. This will involve a technique called continued pretraining: providing a training set of additional data that the model did not have access to and allowing it to continue to train on that data. The hope is that the resulting model will pick up some Jonathan-specific ways of speaking and will know some of the things I like to talk about. Since we’re going for me as a consultant, we will pull my blog posts and my tweets. These aren’t always, but are usually, professional.
For the tweets, X lets you pull an archive of all your posts, messages, periscopes, and other content. I found the tweets.js file, which had every tweet I’d ever made. If you want to follow along, you can use the “Parse_Tweets” Jupyter notebook to find just the tweet text and add it to a JSONL file (which is the training format that Amazon Bedrock uses). When I looked through my data, I noticed that many of my tweets included links to images or to other tweets that didn’t make sense without context, so I removed anything that had a link in it. I also noticed that I had a bunch of tweets from Untappd, an app I use to track which beers I like. I removed those as well, since I don’t think they’ll help train Jonabot.
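The heart of that notebook boils down to something like the sketch below. It assumes the archive format X was using at the time (a JavaScript assignment wrapping a JSON array, with the text under tweet.full_text); file names are illustrative:

```python
import json

with open("tweets.js") as f:
    raw = f.read()

# Strip the "window.YTD.tweets.part0 = " prefix to get plain JSON.
tweets = json.loads(raw[raw.index("["):])

with open("training_data.jsonl", "w") as out:
    for item in tweets:
        text = item["tweet"]["full_text"]
        # Drop tweets with links (images, quote tweets) and Untappd check-ins.
        if "http" in text or "untappd" in text.lower():
            continue
        # Bedrock's continued-pretraining format: one {"input": ...} per line.
        out.write(json.dumps({"input": text}) + "\n")
```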
Similarly, WordPress allows you to export your entire WordPress site. In this case, the export comes as an XML file. I used the “Parse_Blog” Jupyter notebook to go through that export and store each blog post or page in the JSONL file (a condensed sketch follows the notes below). Two quick notes on this:
Amazon uses the concept of “tokens” to limit the amount of content involved in each transaction. The limits are listed here. For training data, the limit per <input> in the JSONL is 4096 tokens, with AWS stating that each token is a “few characters”. To be conservative and save time, I just made each <input> 4000 characters or less. Only a few of my longer blog posts needed to be cleaned up.
In case you’re trying to reproduce this work from what’s in the git repo: I discovered that Bedrock only accepts one file at a time for pretraining, so I pasted the two files together manually.
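Here’s the condensed sketch mentioned above. It assumes the standard WordPress WXR export layout, where each post’s body lives in a content:encoded element; file names are illustrative, and the HTML inside each post is left as-is:

```python
import json
import xml.etree.ElementTree as ET

NS = {"content": "http://purl.org/rss/1.0/modules/content/"}

tree = ET.parse("wordpress_export.xml")
with open("blog_training_data.jsonl", "w") as out:
    for item in tree.getroot().iter("item"):
        body = item.find("content:encoded", NS)
        if body is None or not (body.text or "").strip():
            continue
        # Conservative stand-in for the 4096-token limit noted above.
        out.write(json.dumps({"input": body.text[:4000]}) + "\n")
```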
Now that we have some training data, it’s time to train our model! We ended up with 1100 “<input>” statements representing things I have said! Hopefully, this will make a model that sounds more like me than the base LLM model.
Amazon Bedrock makes this training much easier than you’d expect. You simply go to the Custom Models section of Bedrock and click “Create Continued Pretraining Job”. Note: if you’re going to do this more than once and put it into an LLMOps pipeline, you can do the same via CloudFormation or boto3 in Python. I used this configuration:
If you’ve been following along closely, you will have noticed that the training job is leveraging the “Titan Text G1 – Lite” model instead of the “Titan Text G1 – Express” model I had used in the prompt testing. This swap was made because I realized how EXPENSIVE it would be to run inference against the model. Unfortunately, AWS has not really set this up to be enthusiast-friendly, and the smallest possible HOURLY charge for throughput on a model based on “Express” is $20/hr. For “Lite” it’s only $7/hr. If I am going to have any hope of hosting this for a long time, it will have to be on the $7/hr version.
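For anyone taking the boto3 route mentioned above, the call looks roughly like this. The role ARN, bucket paths, and hyperparameter values are placeholders rather than the exact values I used:

```python
import boto3

bedrock = boto3.client("bedrock")

bedrock.create_model_customization_job(
    jobName="jonabot-continued-pretraining",
    customModelName="jonabot-titan-lite",
    roleArn="arn:aws:iam::123456789012:role/BedrockCustomizationRole",
    baseModelIdentifier="amazon.titan-text-lite-v1",
    customizationType="CONTINUED_PRE_TRAINING",
    trainingDataConfig={"s3Uri": "s3://my-bucket/training_data.jsonl"},
    outputDataConfig={"s3Uri": "s3://my-bucket/output/"},
    # The knobs revisited later in this post.
    hyperParameters={
        "epochCount": "1",
        "batchSize": "1",
        "learningRateMultiplier": "1",
    },
)
```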
Now, the moment everyone has been waiting for… let’s try it out! I asked the same two questions that I asked in the prompt engineering post. Unfortunately, the answers got worse:
User: Why should I want your advice in my company’s IT Transformation?
Bot: I am an expert in IT Transformation. I have been doing this for 20 years. I have been a consultant for 15 of those years. I have been a leader of IT Transformation for 10 of those years.
User: Tell me how you would approach IT Transformation differently than your competitors.
Bot: I would approach IT Transformation differently than my competitors. I would start by looking at the company’s business strategy. I would then look at the company’s IT strategy. I would then look at the company’s IT Transformation strategy. I would then look at the company’s IT Transformation roadmap.
In addition to doing poorly on these questions, the training didn’t even seem to make the model factually aware of things I had hoped it would learn. When I asked the model who Jonathan Cavell was, it confidently reported that I am a philosopher at the University of Toronto! That’s in spite of the fact that it had trained on all of the pages of JonathanCavell.com. It also reported that Leave The House Out of It (my hobby project that appears on several of my about-me pages, blogs, and tweets) was a book!
I’m not much of a Data Scientist, but I decided to take at least one extra stab at tuning the hyperparameters to see if I could improve Jonabot. When I looked at the training metrics for the initial run, the training loss and training perplexity were all over the place; they are supposed to converge. I used a higher epoch count, a higher learning rate multiplier, and a higher batch size. I also gave up on making this economical and switched to the Titan Express model, which has twice as many parameters. The responses I got were different, and perhaps a little more focused on things I say, but the model got over-tuned and now stays very focused on a single point.
User: Why should I want your advice in my company’s IT Transformation?
Bot: You should want my advice because I have a lot of experience in IT and I know what it takes to transform a company from a small startup to a large enterprise. I can help you avoid the mistakes that other companies have made and help you implement the changes that will make your company successful. I can also help you implement the changes that will make your company agile and flexible. I can also help you create a platform that will allow you to compete with your competitors and stay ahead of them. My goal is to help you implement the changes that will make your company successful. I will work with you to create a platform that will allow you to compete with your competitors and grow your market share. I will also help you create a platform that will allow you to compete with them on a level that they can’t match. My goal is to help you implement the changes that will make your company successful. I will do everything I can to make sure you are successful. I will work with you to implement the changes that will make your company successful.
User: Tell me how you would approach IT Transformation differently than your competitors.
Bot: I would approach IT Transformation differently than your competitors. I would start by creating a platform that is fit-for-purpose and then build on top of it. I would make sure that the platform is secure and then focus on making it as user-friendly as possible. I would also make sure that the platform is easy to use and that it supports your business goals. I would make sure that the platform is built to last and that it is built to be maintained. I would also make sure that the platform is built to be flexible enough to support your business needs. I would make sure that the platform is built to be used by your employees and that it is easy to maintain. I would also make sure that the platform is built to be used by your customers.
So, did all my tweets and blog posts make Jonabot dumber? That’s probably not the case. I blame a few things for the training not improving the model:
First, this is not a lot of data. I’ve seen this kind of training succeed on large knowledge bases where patterns can emerge. There simply isn’t enough data, or enough consistency in the data, for the training to adopt my voice.
Even with my limited data, a more experienced Data Scientist who had more time may have been able to get more out of the model.
In an ideal world, I would have fine-tuning data and test data in addition to this pretraining data. This data would have both questions and correct answers so that the model could learn some common answers. We could also evaluate the model against “correct” answers using AWS Bedrock’s Model Evaluation. Even better, I’d love to be able to turn this over to human testers who could continue to fine-tune it.
Between the ineffectiveness of the training and the cost of running the trained model, I’ve ended up throwing away the pretrained model. I will use prompt engineering and (depending on the results of the next post) RAG to build Jonabot.
I run Kyndryl’s Applications, Data, and AI Consulting practice in the US. One of the things that I love about my job is that it exposes me to a wide range of interesting client opportunities: everything from helping customers move to the cloud, to re-evaluating the way they use their data warehouses and analytics, to making better use of their ERP systems. While this broad exposure offers interesting executive-level insights, there are some technologies so universally compelling to my clients that I feel I have to get some hands-on experience in order to have an informed opinion about how they’ll develop. This has happened a few times with container, serverless, and DevOps advancements in the cloud that forced me to rewrite my side project for that hands-on experience. It happened a couple of years ago when I felt I needed to get my hands dirty with Machine Learning by creating an AI gambler. Over the last few months I have spent a disproportionate amount of my time talking with clients about Generative AI, and I knew I needed to understand it better.
Unless you’ve been living under a rock for the last year, you’ve played a little bit with Generative AI, either through ChatGPT or through Google’s Bard built into your Android device and Google search results. Enterprise customers need to understand how they can leverage Generative AI, how much value it can provide now, and the extent to which it becomes a competitive differentiator in various industries. It’s clear I need to invest in some learning on the subject; the question is how to find a valuable part of the Generative AI landscape that I can focus on over the holidays.
What Should I Build?
One area where it definitely will not be is the building of Large Language Models (LLMs). Technology companies like OpenAI (in partnership with Microsoft), Google with Gemini, AWS with Titan, and Meta with Llama have dominated the training of “Foundational Models”. Enterprises that don’t have a billion dollars in R&D budget to spare are left to focus primarily on how they can leverage the LLMs provided by these tech companies. Since the budget for my holiday project is even lower than my clients’ R&D budgets, I will focus on this customization of LLMs as well. Specifically, I thought I would spend some time over the holidays customizing AWS’ Titan LLM (selected only because I plan on using AWS for the project) to build a chatbot that’s based on me! I’m hoping I won’t be so successful that it can steal my job, but I am interested to see how far this can go. I plan to name my chatbot Jonabot.
What’s the Plan?
If you haven’t been following the Generative AI tech, there are three ways to improve on foundational large language models (LLMs) like AWS Titan. I’m going to explore each of the three in a blog post as I create Jonabot, and then there will be a final blog post where I put my model behind a UI and put it to the test with my family at our Christmas celebration! So look for the following blog posts over the next couple of weeks:
Jonabot Post 2: Enhancing with Prompt Engineering – This is the easiest way to manipulate an LLM and you can try it yourself with ChatGPT or with Google. Simply request that your LLM answer questions in a certain way and it will do its best to comply. I will use this to give my LLM a little information about how I typically answer questions and some of my context.
Jonabot Post 3: Enhancing with Pre-Training – This is the most complex form of customizing Generative AI LLMs. In this case we essentially continue to train an existing model on specific content. For Jonabot, this will involve using my blog and my Twitter to augment AWS’ Titan LLM so that it is particularly customized to be like me. This is different from RAG (explained below) in that it should, in theory, actually change the way the LLM speaks and answers prompts instead of just augmenting the information it has access to. If I’m honest, I am skeptical of how valuable this will be for Jonabot because I don’t have access to a significant amount of my writing (I’m unwilling to use my emails, texts, GitHub, etc… for fear of exposing personal or customer information).
Jonabot Post 4: Enhancing with RAG – This model of enhancing LLMs is what I expect companies will want to do most often. You can think of it, essentially, as giving your LLM a context-specific cheat sheet every time it has to answer a question. In this case I’m going to give the LLM specific biographical information about myself, my company, projects I’ve completed, etc… This will all get stored in a vector database, and whenever Jonabot is asked a question, we will quickly find the few most relevant pieces of information from these facts and feed them into Titan to be used if relevant. We are already seeing RAG be really important to our clients as they work with Generative AI because it allows them to make these “Foundational Models” answer questions that are specific to their enterprise.
Jonabot Post 5: Bring it together and Lessons Learned – I’m not much of a UI developer, but I am hoping to find a framework I can use to easily expose Jonabot to the world! In this last blog post I will discuss finding/leveraging that framework as well as provide any lessons learned from the Jonabot experiment.
Of course this might all change as I discover what the wild world of Generative AI has to offer… but one way or another, grab a bit of peppermint crusted popcorn and a glass of eggnog and let’s see where this chimney takes us!
Like everyone else helping customers navigate the fast-moving waters of Data and AI, I have been following the new technologies and products springing up around Generative AI. The thing that has struck me as most profound is how the conversation is really being led by the hyperscalers. Sure, OpenAI was the first vendor to break through with the technology, but with their Microsoft partnership they quickly became part of the hyperscaler arms race. Amazon followed with Bedrock and Google with Bard and Vertex, and while there are lots of niche players, it’s clear that the cloud providers will play the pivotal role.
This struck me as interesting because it represents a shift for the hyperscalers from being infrastructure companies that occasionally sell a platform to being true platform companies where almost no one comes to them for “just” infrastructure. Relatively few firms (outside of technology companies) are trying to build their Generative AI stack from scratch without leveraging the ecosystem of one of the hyperscalers, which puts those hyperscalers in competition more with Teradata or Cloudera than with Dell or NVIDIA. While this sticks out in Generative AI because it’s new and there aren’t any established players, it’s actually a trend that has been gradually emerging across data and AI (other places as well, but that’s not my focus today).
I’ve noticed the steady trickle of releases of Microsoft Fabric, Amazon SageMaker, and the dozens and dozens of other data tools from the hyperscalers, but it wasn’t until I was preparing this article that I realized how complete the hyperscaler offerings have become. Take a look at the chart above on “Cloud Platforms are Quickly Becoming the Leading Data Platform Providers”. I looked at Gartner’s major data categories and mapped where each provider has an offering. You’ll notice that the hyperscalers actually have enough data technology that for many use cases you don’t need Cloudera or Teradata, or even niche add-ons like data masking. The only clear exception I noticed was in Data Quality.
I told you that story to get to this one. This has enormous ramifications for firms that previously shied away from getting into Big Data and AI because they couldn’t generate sufficient ROI from their use cases to offset the giant costs of specialized hardware and software. Because the cloud providers charge by the hour for what you actually use, the initial barrier to entry around hardware and software purchasing is nearly gone. You can create a project budget for an individual initiative even if you only have one use case.
I attempted to illustrate this in the chart above, “This means you can buy AI Use Cases One at a Time”. As with most things in the cloud, if you have sufficient workload and can manage it efficiently, it is often cheaper to run on premises. Where the cloud is transformative is for organizations that only have a few Big Data use cases, whether because of their size or because of their industry.
Ten years ago, everyone was talking about industries getting “Ubered.” The expectation was that every company would become a software company in short order. The question at the time was, “Could you become a software company faster than a software company becomes you?” Ten years later, we’re not all software companies; the big banks are still the big banks (despite a couple of little digital banks), Ford still makes cars, and Tesla still looks more like a car manufacturer than a software company. Do you run or work in IT for a company that never became a software company? The good news is you’re still a company, and I think the risk of getting “Ubered” is low. The bad news is that you still need a Digital Transformation.
Over the last five years, I’ve seen that successful Digital Transformations vary across industries and companies. The key to determining what will make you successful is knowing what differentiates you from your competitors and then figuring out how technology can enhance your USP (unique selling proposition). My observation is that when companies do this introspection and plan a Digital Transformation based on what works for them, they fit into three patterns (see the infographic above):
They convert to a software company.
They Infuse Innovation across the company.
They optimize the cost of technology by managing it as a cost center.
As a technology consultant, I want to believe that everyone’s Digital Transformation will be a game changer, but as an automobile enthusiast, I know better. A great example of a company whose digital transformation would likely only provide incremental value is Ferrari. They famously make their cars by hand, so improvements to the manufacturing process are likely to be small. Ferrari owners don’t expect to interact with their car or the dealership over an app. They will benefit from a digital transformation, but it will be more about cost savings and efficiency than game-changing tech. If they don’t already, they’ll want to implement an off-the-shelf ERP system to help them efficiently manage their supply chain, inventory, and distribution. I’m sure there are some software-intensive parts of the car and the race teams, but those will likely be point solutions. Overall, for their core business, IT will feel more like a support function (HR, Finance, etc…) that needs to be optimized than a differentiator, even after their digital transformation.
It is the companies in the middle that are most interesting. These companies do not become software companies, but they infuse technology into every portion of their business to create a more differentiated product. Take, for example, a bank like Capital One. As a bank with branches, tellers, and loan officers, they cannot simply “become a software company.” While I have not worked closely with Capital One, a quick perusal of their website shows that they have talented software development teams working on the mobile experience, software for the branches, and credit cards. My guess is that talented software teams can be found in every corner of Capital One’s business, but unlike AirBnB or Uber, Capital One (and almost any bank) can’t just be software. A large part of their differentiation will be in how friendly their tellers are, what types of financial products and companies they invest in, and so on. A company like Capital One should make the goal of their digital transformation to become “Innovation Infused.” Companies like this should use their Digital Transformation to accomplish three key tasks:
Invest in infrastructure and development platforms that the different technology teams throughout the company can leverage to create this innovation efficiently. (I posted an article about infrastructure platforms here based on a talk I did at the AWS Summit, and there are a lot of great articles on creating Internal Developer Platforms, like this one).
Note that this should include rolling out low-code and no-code platforms as well as best-of-breed SaaS solutions that can be configured for differentiation. Companies are increasingly finding ways to stand out with software, even if they don’t code it all.
Build data platforms and AI tools that allow Data and AI to be effectively integrated into many of these separate development teams. Too many companies have all of their data engineers and data scientists in one team that is separated from the real business problems. Similar to the infrastructure and development platforms I spoke about above, your data needs to exist in a platform where APIs, queries, and standard models for ingestion/governance are available to engineers throughout the organization.
Change the way you think about projects! Build out teams that know their customers and focus on products. These teams can be spread across the company, but to maintain a strategic vision at the company level, it is important that there is some consistency (Usually formed from an Agile CoE or Agile PMO). It is also critical that companies change their budgeting and planning mechanisms to support this product transformation. For advice on how to be successful on this item, I highly recommend the book Lean Enterprise.
Over the last couple of years, my team and I have carved out significant expertise in this middle category of companies. While the threat of getting “Ubered” never materialized in most industries, the need to “Infuse Innovation” via software development will become paramount over the next decade. I’ve worked personally with a leading airline, a leading electricity company, and several financial companies that understand this and are making tremendous progress. Helping people and companies infuse innovation into what makes them great has become my overall goal.
I got the chance to present at AWS Summit in NYC on 7/12! I’ve had several people ask me what the speech was about so I thought I’d throw together a few blog posts that walk through the talk. I’m going to break it up into three posts:
In the first post I covered the common fears that I hear from CIOs when it comes to adopting more cloud. In the second post I dug into three conceptual things you can do with your cloud transformation to address the fears that come up around security, cost, and effective transformation. In this last post, I want to talk about the high-level architecture that we’ve been putting in place with clients.
Our architecture focuses on a set of fit-for-purpose platforms.
In the previous post I talked about the importance of not seeing the cloud as a single place. That’s what this architecture is designed to address. Most organizations use the cloud for a variety of applications that can’t all be served off of the same platform… but too many still think of the cloud as a single platform, often one where they just need “a landing zone”. While every company is different, this slide describes five different types of platforms we have commonly seen deployed at our clients:
Cloud Native Accounts – These are for the applications that are being rewritten entirely and will be written and deployed by “DevOps” teams that know how to manage their own infrastructure. We use a cloud vending machine and a set of CloudFormation templates to provision these accounts (typically separate ones for dev, test, and prod); a minimal sketch of the vending flow follows this list. Typically, in test and prod, no humans have access to these accounts: all deployments must be done from the pipeline, and all infrastructure should be part of those deployments. This gives the highest level of flexibility to sophisticated teams so that they can innovate. Before leveraging this model it is important to have quality, security, and compliance scanning as part of the pipeline, and potentially chaos engineering implemented in test or prod.
SAP Accounts – I used SAP in this example slide, but this really could be anything. The critical part is that whatever is in this account is managed by an AMS vendor. For example, Kyndryl offers a Managed SAP Service and a Managed Oracle ERP Service that are completely automated and can deploy entire environments quickly and manage them extremely cost-effectively. These managed solutions are likely NOT built with the same tools that you use in the rest of your environments and may not even use the same kind of infrastructure and middleware. For this reason, we encourage customers to think of them as a black box but to put them in individual accounts where they are micro-segmented and the network traffic can be controlled. This is why they sit on top of the same account vending machine and CFT automations as the Cloud Native Accounts.
The remaining three platforms are traditional platforms that will not become multiple accounts (there are some exceptions here for subsidiaries or customer accounts) but are instead platforms that workloads can be hosted on. You will notice a lot more pink in these areas; that’s because centralized IT takes a lot more responsibility and avoids the necessity of creating true “DevOps” teams. I know some of the cloud faithful are rolling their eyes at me right now… but in the enterprise there are always going to be cases where the value of transforming is not sufficient to cover the cost of transforming (for example, if you’re planning to retire an application) or where transformation is impossible (for example, a COTS application that must be hosted on specific types of servers). The platforms we see most often are:
Centralized Container Platform – There can be a lot of value in moving an application from running on App Server VMs to running on containers in a Kubernetes cluster (cost reductions, enforced consistency, rolling updates, increased availability). This is usually not a complete rewrite of the application and the team still has databases, load balancers, file servers, etc… that are not “cloud native”. This centralized platform gives application teams that are only partially transforming to containers a place to land.
Migration Platform – This is the least transformed environment. It is for application teams that want to continue to order servers out of a service catalog and get advice on them from the infrastructure team. You can almost think of it as your “datacenter in the cloud”. There will be significant efficiencies that can be gained here with cloud automation… but the user experience will remain similar to on-prem (and consequently the team can remain similar).
Mainframe Platform – We have many customers that still have on-premise mainframes they are looking to retire (we have lots of opinions on how/whether to do this… but that’s for another blog post). One option that we have seen customers use is to port these applications to Java. These new Java apps still require services like a console service and a shared file server to function, so we recommend standing up those support services as part of a platform.
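To make the “account vending machine” from the Cloud Native Accounts item above concrete, here’s a minimal sketch using boto3. The account names, email, and stack set name are placeholders, and a real vending machine would add far more guardrails:

```python
import time
import boto3

orgs = boto3.client("organizations")
cfn = boto3.client("cloudformation")

# 1. Vend a fresh account (in practice, one each for dev, test, and prod).
status = orgs.create_account(
    Email="app-team-dev@example.com",
    AccountName="app-team-dev",
)["CreateAccountStatus"]

# Account creation is asynchronous; poll until it completes.
while status["State"] == "IN_PROGRESS":
    time.sleep(15)
    status = orgs.describe_create_account_status(
        CreateAccountRequestId=status["Id"]
    )["CreateAccountStatus"]

# 2. Stamp the baseline guardrails (networking, logging, pipeline roles)
#    onto the new account from a pre-registered CloudFormation StackSet.
cfn.create_stack_instances(
    StackSetName="cloud-native-account-baseline",
    Accounts=[status["AccountId"]],
    Regions=["us-east-1"],
)
```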
This is what we mean when we say the cloud isn’t “one place”. It needs to be a set of fit-for-purpose platforms that are aligned to your workloads. There’s a lot of art and a little science to selecting your platforms. It’s easy for some architects to end up with too many and avoid giving app teams the freedom they need, and for others to leverage too few and end up not giving those same app teams the support they need from centralized IT. We work with organizations to set up an Agile Product Management group within the infrastructure team that can define that market segmentation and the platforms to support it… but that’s another blog post altogether.
I got the chance to present at AWS Summit in NYC on 7/12! I’ve had several people ask me what the speech was about so I thought I’d throw together a few blog posts that walk through the talk. I’m going to break it up into three posts:
Part 2: A Cloud Transformation Program That Gives Confidence
Part 3: Fit-For-Purpose Platforms
In the first post I covered the common fears that I hear from CIOs when it comes to adopting more cloud. In this post I’m going to dig into three conceptual things you can do with your cloud transformation to address the fears that come up around security, cost, and effective transformation.
The cloud is not ONE place
The first point that I made is that the cloud is not ONE place. The analogy we used was being asked, “Do you want to go on a trip?” A good trip for me and a good trip for you are likely very different. Workloads are very different too, and you can’t put them all in the same place. This is particularly true in enterprises that have been running long enough to have brittle legacy machines and code that’s not worth refactoring. It’s popular to say that everything should be cloud native and there should never be tech debt, but in the enterprise we know we’re going to need environments that run workloads that can’t scale horizontally… maybe even some that know how to run CICS or COBOL. To make cloud transformations more successful, you must establish this up front and build fit-for-purpose platforms for each class of workload. These will address the varied architectures necessary to optimize security and cost in the cloud.
It’s not just the architecture, but the services that must be different.
The second point that I made is that this is not just about place; it’s also about the services. The analogy here is that we all need help writing English well, but we all need different kinds of help. If you’re a fluent English speaker, maybe all you need is spell check. If you’ve only been speaking English for six months, you may want a tutor. Similarly, some app teams will want to provision infrastructure as code from their pipelines while others will prefer to order manually via a service catalog (maybe even with some architecture consulting). Too many companies try to make a single service management “pane of glass” and end up stifling innovation in some places while failing to prevent vulnerabilities and overspend in others.
Cloud Transformation is Never Over
Finally, I counselled the architects in the room that they have to think of their cloud transformation as never over. The analogy here: imagine what would have happened if the automotive industry had stopped developing cars as soon as it had something that met the minimum definition. We’d still be driving cars with 30HP and 12mpg.
For cloud transformation, it’s best to imagine a graph with the Y axis being the value of the capabilities you provide to app teams and the X axis being time. You never expect those lines to plateau, so why would you expect your cloud transformation to be “over”?
In the next post I will break down the architecture implied by “The cloud is not one place.”
I got the chance to present at AWS Summit in NYC on 7/12! I’ve had several people ask me what the speech was about so I thought I’d throw together a few blog posts that walk through the talk. I’m going to break it up into three posts:
Part 1: My CIO Doesn’t Do Enough Cloud
Part 2: A Cloud Transformation Program That Gives Confidence
Part 3: Fit-For-Purpose Platforms
Wondering what all of the comments all over this are about? That’s my team and AWS’ team trying to agree on pictures, stats, and context that made my point about reasonable cloud fears without offending AWS… who apparently would prefer we make cloud sound so easy that workloads practically fall into it.
The scariest part of this whole ordeal was that the presentation was targeted at architects! I barely get to touch a keyboard anymore, and the place was going to be swarming with hands-on folks who actually know how to make AWS do all kinds of amazing things. I decided I would help them all out by telling them WHY the CIO in their lives is always so afraid to let them use AWS for more and more interesting things.
Quick Aside: Pretty much everything I have to say is just as true of any cloud… don’t tell AWS, but I regularly have the same conversation about Azure.
The slide above is actually my second slide. In the first one, I explained that there are millions of dollars in opportunities in the cloud: places where companies could be spending less or growing more by leveraging cloud. This slide is geared at explaining what CIOs are reading in their industry rags that makes them scared to move to the cloud more aggressively. Mostly these fears come in three categories, mapped to the stats above:
There’s no way to go to the cloud without transformation… and transformations often fail. The culture and people changes are hard. If your IT team is doing well today (or your CIO is just a couple years from retirement), it may not be worth the risk of trying to undergo this transformation.
Most CIOs that I talk to have their favorite story about how expensive the cloud can be. There’s a state government that, after the first of twelve weekend application migrations, took one look at their bill, migrated everything back, and cancelled the program. I mean, AWS is only 13% of Amazon’s revenue, but 56% of its profit! If you don’t plan for it and don’t do cloud well, it will get costly.
Security is also an issue. I hate the argument that the cloud is inherently more secure (“the CIA is on the cloud!”). The cloud is different in how it’s secured, though. You either need transformed workloads that leverage zero trust, or you need to reproduce all of your on-premise perimeter security in the cloud. Either way, there’s a fair bit of work in front of you, and any mistakes could make something more vulnerable than it was on premises.
Those are ALL good fears. Many cloud programs will fall victim to them. The important thing is that we structure our cloud programs to avoid them.
Data moving to the cloud is accelerating again. Source: Faction via Zippia
5-7 years ago it seemed obvious that most companies were going to perform huge transformations that included moving most of their applications and data to the cloud. The percentage of workloads and data in the cloud was increasing by 5-10 percent per year. However, in 2020 we started to see that slow down before it exploded out of the gate again last year. I believe this is an indicator that we’re just entering a robust second phase of cloud adoption, driven by a fundamentally different approach to cloud.
2010-2017: Cloud Mania
I think the reason for the original slowdown in cloud adoption was the workloads. The major clouds were initially constructed for greenfield development, and that’s what they attracted. Companies rewrote their e-commerce, web, and mobile applications to take advantage of what the cloud was good at (big dev teams running their own ops). They also bought SaaS platforms to replace some of the on-premise systems that had become antiquated (for example, email, CRM, and HR systems). The companies (Microsoft, Salesforce, Workday) that dominated this new world of SaaS were able to run systems that looked similar across large numbers of clients.
2018-2021: Rethinking What Workloads are Cloud Workloads
So why did the percentage of workloads start grinding to a halt? Simply, we ran out of low-hanging fruit. Moving the company’s mainframe, ERP, or even just old workloads didn’t make sense. Lift-and-shift models provided little value and often actually increased the price of infrastructure. SaaS companies were unable to master things that SAP and others had spent decades building. Companies that were more than a few years old began to see the wisdom of a hybrid cloud model where they could use the cloud for what it was good at and keep the rest on-premise.
2022-???: New Transformation Techniques
What we’re seeing now is innovation from both the cloud providers and service providers that’s making more and more workloads good candidates for cloud. While a full list of these would be difficult, I’ll zoom in on a few that my team at Kyndryl is focusing on helping clients take advantage of:
Data Platforms – Did you realize that (according to Gartner) Microsoft, Google, and AWS are now all in the “visionaries” category in “Data Science and Machine Learning”? That’s not “Cloud” specific… that’s just all of Data Science and Machine Learning. They’re quickly catching up to the capabilities of on-premise focused companies like IBM, MathWorks, and Tibco. The big difference is that when you use a cloud provider for these workloads you are paying with incremental OpEx instead of a big capital investment in software and hardware. That’s making it very attractive for companies that only have a few workloads for which they really want to use data science and machine learning.
Mainframe Workloads – Mainframes have traditionally been viewed as one big black box that was too dangerous to move to the cloud. Kyndryl has been working with companies to set up roadmaps that actually, practically get customers off of on-premise mainframes. Some workloads might get rewritten to be cloud native apps, some might get replatformed so they can run on AWS, and others might get moved to Kyndryl’s zCloud.
ERP Systems in the Cloud – The ERP providers were (for obvious reasons) not the first movers into the cloud. Their customers had extremely mission-critical workloads that had been customized to the point of being very brittle. Kyndryl has partnered with SAP and is able to help customers move those workloads now. The patterns have become more hardened, and both AWS and Microsoft have programs to help clients. See more in our whitepaper here.