I will be the one to say it: AI disappointed most of us in 2024. We started the year with chatbots that were good enough to be entertaining but not good enough to replace a human when accuracy or creativity mattered, and we ended the year with… well, the same thing. The conversations are a little more entertaining, but the world has not been changed.
In 2023 and 2024 we were so enthralled by the first couple of leaps of Generative AI that it seemed likely that eliminating hallucinations and building reasoning through agents and multi-step thinking (like OpenAI o1) would very soon propel us to another giant leap. Instead, the growth has been much more incremental.
I still believe that AI is going to change the world. This is a common pattern in technology. I’m very much reminded of Bill Gates’ quote: “Most people overestimate what they can do in one year and underestimate what they can do in ten years.” This is where the first part of my key to AI Investment comes in: patience.
Patience will be required in two ways.
First, there’s the obvious: don’t give up on AI. The underlying technologies are expanding at an unbelievable rate; AI’s ability to reason jumped forward in 2024 (I posted about OpenAI o1), Agentic AI showed potential, an AI effectively won the Nobel Prize in Chemistry, and the legal frameworks for AI are starting to come together. There may be more work required to make AI practical than we thought, but that work is being done.
The second way that patience is required is a little less obvious: you need the patience not to throw good money after bad on use cases that are impossible today. One thing that has become clear about AI is that it takes an ecosystem to achieve a use case. Unless you’re NVIDIA or Google, you’re not creating your own LLMs, building your own GPUs, writing your own vector algorithms, etc… You have to use what you can buy. I consider all of these things to be your “AI Foundations”… no matter how hard you try as a company, some use cases are just not possible with the current set of AI Foundations.
Think of it using this chart:
There are some use cases that are just above the yellow line and not practical to consider today. This is particularly true for use cases for which there isn’t “partial” value. Think about one of the most talked about use cases… replacing your software developers with AI Agents. It’s just not feasible yet with the foundations that we have. The LLMs are not creative enough, the understanding of requirements is not deep enough, the reasoning is not up to the level of critical thinking. Additionally, this problem doesn’t have interim value milestones as currently framed. Either the AI can perform as a software developer or it can’t. It doesn’t make my application better if it submits code that is only 40% right and fails all the tests.
Patience is necessary to avoid a mistake I’ve seen companies make: they pour more and more resources into these impossible use cases because they see the AI agent go from 40% right to 43% right. They make these incremental gains with painstaking analysis of prompts and patches like adversarial AI. It will never be close enough to 100% to be used without a fundamental shift in the AI Foundations that we’re building on. Unfortunately, when that shift happens, all the work you did on this set of foundations may or may not be applicable. For example, a lot of the prompts written to make GPT-4 try to reason don’t make any sense when they’re fed to OpenAI o1… so people are just throwing them away.
I know what most of you are thinking… my two recommendations seem to contradict each other. On one hand, I’m saying that AI will be important and you need to continue to invest in it. On the other hand, I’m saying that you should stop investing in many of the AI use cases you think would be most valuable because they’re infeasible. The trick to deploying your 2025 AI investments will be in “The Key to AI Investment in 2025 Part II: Preparation”
It is so good to see Michael Lewis back to writing books that don’t have an agenda. I have read every book Lewis has ever written and, while I am generally politically aligned with him, I wasn’t excited when he turned to writing about climate change, the dysfunction of the Trump administration, and Covid. Going Infinite is a triumphant return to his typical style: telling the story of a moment in time that’s difficult for most of us to understand, through the lens of an individual who’s emblematic of that time.
In this case the story is of Sam Bankman-Fried (SBF). The story is as entertaining as it is insightful; both giving us amusing anecdotes about the antics of the world’s youngest self-made billionaire and helping shed light on the moment that made his rise and fall possible. It’s a breezy day-long read and I recommend it to everyone.
The thing that struck me most about it is how naive it’s possible to be when you’ve been right so often early in your life. SBF made some brilliant (and notably lucky) moves early in his career and then rose to a point where he must have thought he could never be wrong. Imagine sitting down at a blackjack table, betting $100, winning 20 times in a row, and betting everything you had each time. Now you have over $100M and, more than likely, a belief that you could never be wrong. You’re no more likely to make good decisions than you were 20 minutes ago, but you’re climbing life’s scoreboard and it appears you’re a genius.
Anyway, that’s what I kept thinking about as I read the book.
I’ve always found Gene Kim’s work interesting, and his new book is no exception. When I read The Phoenix Project I was a relatively new Dev Manager getting my first taste of IT Operations. It allowed me to see the problem I was given in an entirely new light. Kim solved that problem and moved on to bigger challenges (along with collaborator Steven Spear), culminating in this book, which tackles a much more sophisticated problem: how to make an entire organization more effective. While the evolution of Kim’s thinking has advanced a bit faster than my career, this book is much more relevant to me (while running/leading an organization of 100+) than another DevOps book would have been. It was fun to take inspiration at this new stage of my career from the same guy I took inspiration from in the last phase.
I enjoyed the way this book was laid out. Each primary topic had a section for theory, a section for examples, and a section for a deep case study. It made it easy to navigate the book the way I wanted to. The book was also centered around two core ways of looking at an organization that can be stated and understood simply. The concepts aren’t simple, of course, and each requires another hundred pages to expand. One aspect of these expansions that I found extremely captivating was the authors’ comparisons with known management systems and an exploration of why people think they will work and then why they often don’t. This led to one of my favorite sections of the book, where they break down why using escalation and expediting is not as effective as having iterative, modularized, linear work assignments.
The first way of looking at the organization that Kim explores is thinking of the organization in Layers:
Layer One is the technical object and the people doing the “line” work.
Layer Two is the tooling and the people who maintain the tooling that allows the “line” work to proceed.
Layer Three is the social circuitry that links Layer One, Layer Two, your customers, your management, other parts of your organization, etc…
I’ve started incorporating this concept into my retrospectives. When we don’t get an outcome we want, I have the team think about which layer we think can be adjusted.
The second way the authors break down the organization is into three categories of how work can be improved: Simplification (breaking down big problems and giving teams/people something they can solve), Slowification (finding the right time/place to solve problems, for example moving an issue into a lab or testing a change with a small population), and Amplification (making sure problems are well known and discussed). Obviously there are dozens of techniques for each of these, and the book goes through many of them, but I love the simplicity of the overall framing.
I highly recommend this book for anyone who leads (even if they don’t manage) a team of more than ~15 that is managed on outcomes. It’s less of an answer key with specific tips, and more of a guide to understanding why some things work and others don’t.
If you do pick it up, make sure to read Appendix B on leadership!
In this post, we’re going to try to give Jonabot a little bit of my history so it can answer questions about my past. The concept we are using for the biographical information is Retrieval-Augmented Generation (RAG). Essentially, we augment the AI by giving it access to relevant reference information just before it answers. The best way to think of RAG is as a “cheat sheet”. Imagine asking someone, “Who was the fourth President of the United States?” You would expect them to answer in their own voice: with the right answer if they knew it and, if they didn’t, with a guess or an admission that they didn’t know. One of the problems with Generative AI is that it tends to guess without explaining that it’s guessing. This is called a hallucination, and there are plenty of well-known examples. With RAG, we not only ask, “Who was the fourth President?” but we also give the Large Language Model (LLM) the answer (or a document containing the answer). This results in an answer that’s in the “voice” of the LLM but contains the right answer. No hallucinations.
The way this is accomplished is to take all of the information that you want on the “cheat sheet” and create a vector database out of it. This allows that information to be searched semantically. Then, when the AI is asked a question, we do a quick search and augment the prompt with the results of that search before putting it to the LLM.
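To make that flow concrete, here is a minimal, self-contained sketch in Python. It substitutes a toy bag-of-words similarity for a real embedding model (the actual project uses Titan Embeddings and a real vector store), and the facts are made-up examples, but the flow is the same: index the cheat sheet, find the closest facts to the question, and prepend them to the prompt.

```python
import math
import re
from collections import Counter

# Toy "cheat sheet" facts -- in the real project these come from a
# biographical text file, one fact per line.
FACTS = [
    "James Madison was the fourth President of the United States.",
    "The capital of France is Paris.",
    "Jonathan was raised in Pittsburgh, PA.",
]

def embed(text):
    """Stand-in embedding: a bag-of-words vector. Real code would call
    an embedding model such as Titan Embeddings instead."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, k=1):
    """Return the k facts most similar to the question."""
    q = embed(question)
    return sorted(FACTS, key=lambda f: cosine(q, embed(f)), reverse=True)[:k]

def augmented_prompt(question):
    """Prepend the retrieved 'cheat sheet' facts to the question before
    it goes to the LLM."""
    context = "\n".join(retrieve(question))
    return f"Use these facts if relevant:\n{context}\n\nQuestion: {question}"

print(augmented_prompt("Who was the fourth President of the United States?"))
```

The LLM then answers in its own voice, but with the right fact sitting in front of it.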
I have seen many clients do things like ingesting their entire FAQ and “help” sections and making them the cheat sheet for their AI Chatbot. This is also useful if you need the Chatbot to be knowledgeable about things that have happened recently (since most LLMs were trained on the internet 2+ years ago). For Jonabot, we want to provide information about me and my history that it wouldn’t have learned by ingesting the internet (since I’m not famous and don’t have a Wikipedia page, the base AWS Titan LLM knows very little about me).
To enable this technically, I created a text document with a whole bunch of biographical information about Jonabot, separated into lines. I also broke my entire resume into individual lines and fed them in one at a time. I’m choosing not to put this document in my Git repo since, while none of it is private information, I don’t think it’s a good idea to put all of it out on the internet in one place. Here’s a quick example, though:
And, an example from where I was typing in my resume:
I then created a simple Jupyter Notebook (Biographical_info_import) that works with that file. The notebook does two things:
It creates the vector database. It does this by ingesting each line in the document and committing it to the database. For simplicity in this project, I am leveraging the “Titan Embeddings” model available through AWS Bedrock and the LangChain libraries to create and populate a local, in-memory vector store. This obviously wouldn’t scale for massive usage since it recreates the vector database from the text file every time it runs.
I created a simple query to test how accurately it retrieves information. Eventually, we will use the query results to augment the prompt to the LLM, but for the moment, I want to demonstrate how it works separately.
The results were pretty impressive. I was able to query “raised” and get “Pittsburgh, PA”, or “Musical Instruments” and get “Piano and Guitar”. This is, of course, just a pure semantic search. The next step is to link these embeddings to the model with the prompt we built in a previous post and see how Jonabot sounds. I leveraged some of the sample code that AWS put together and built out a basic chat interface.
I have to admit, the results were pretty impressive:
A few reflections on what Jonabot said here:
The first response may not be exactly what I would say. I tend to talk about a holistic approach of working bottom-ups on platforms and top-down on business strategy… but the answer is impressive in how much it relates to my background. In particular, I’m impressed that it knew of my focus in cloud, data, agile, etc…
The model is a little more confused by a factual question about where I worked… The response to “Interesting, where have you worked?” is virtually an exact copy of part of the mission statement in my resume but doesn’t mention any of my employers. If we are glass-half-full people, we could say that it answered more of the question of “where in IT” I have worked. Not satisfied, I asked the follow-up, “What firms hired you?” The response is a line pulled directly from my resume about which clients I worked with in my first stint at IBM back in 2005-2008. It’s still not a great answer.
Crown Alley is indeed my favorite bar (it’s around the corner from my house in NYC), but I don’t go there to get the news… it made up everything beyond that fact.
Overall, RAG greatly improves Jonabot’s performance. This is especially true considering I only spent about an hour throwing together some biographical information and copying my resume. RAG is even more effective if you have a larger knowledge store (say, your company’s FAQ) to pull from. One concern, which exists with enterprise uses of RAG as much as it does with Jonabot, is that it tends to focus on a single answer found by the search (like my clients at IBM instead of all the companies that employed me).
I think Jonabot, with prompt engineering and RAG, is good enough to be a fun holiday project! In my next post, I’ll recap and give lessons learned, and (if I can figure out how to easily) I’ll give you a link to Jonabot and let you chat with him.
Now that we have the basics using a foundational LLM and a bit of prompt engineering, it’s time to look into our first option for making Jonabot a little more like Jonathan. This will involve a technique called continued pretraining: providing a training set of additional data that the model did not have access to and allowing it to continue to train on that data. The hope is that the resulting model will pick up some Jonathan-specific ways of speaking and will know some of the things I like to talk about. Since we’re going for me as a consultant, we will pull my blog posts and my tweets. These aren’t always, but are usually, professional.
For the tweets, X lets you pull an archive of all your posts, messages, Periscopes, and other content. I found the tweets.js file, which had every tweet I’d ever made. If you want to follow along, you can use the “Parse_Tweets” Jupyter notebook to extract just the tweet text and add it to a JSONL file (which is the training format that Amazon Bedrock uses). When I looked through my data, I noticed that many of my tweets included links to either images or other tweets that didn’t make sense without context, so I removed anything that had a link in it. I also noticed that I had a bunch of tweets from Untappd, an app I use to track which beers I like. I removed those as well, since I don’t think they’ll help train Jonabot.
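For anyone following along, a rough sketch of that parsing logic is below. The inline `tweets_js` string is a made-up stand-in for the real archive file (which wraps a JSON array in a JavaScript assignment), and the filtering rules are simply the ones described above: drop tweets containing links and drop Untappd check-ins.

```python
import json

# A tiny stand-in for the tweets.js file in the X archive, which wraps a
# JSON array in a JavaScript assignment.
tweets_js = '''window.YTD.tweets.part0 = [
  {"tweet": {"full_text": "Platforms should be run as products."}},
  {"tweet": {"full_text": "Great article on DevOps https://t.co/abc123"}},
  {"tweet": {"full_text": "Drinking an IPA via @untappd"}}
]'''

# Strip the JS assignment prefix so the remainder parses as plain JSON.
tweets = json.loads(tweets_js[tweets_js.index("["):])

lines = []
for entry in tweets:
    text = entry["tweet"]["full_text"]
    # Drop tweets with links (no context) and Untappd check-ins.
    if "http" in text or "untappd" in text.lower():
        continue
    # One JSON object per line -- the JSONL shape Bedrock's training expects.
    lines.append(json.dumps({"input": text}))

print("\n".join(lines))
```

Of the three sample tweets, only the first survives the filters and lands in the JSONL output.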
Similarly, WordPress allows you to export your entire WordPress site. In this case, it comes as an XML file. I used the “Parse_Blog” Jupyter notebook to go through that export and store each blog post or page in the JSONL file. Two quick notes on this:
Amazon uses the concept of “tokens” to limit the amount of content involved in each transaction. The limits are listed here. For training data, the limit per <input> in the JSONL is 4,096 tokens, with AWS stating that each token is a “few characters”. To be conservative and save time, I just capped each <input> at 4,000 characters. Only a few of my longer blog posts needed to be split up.
In case you’re trying to reproduce this work based on what’s in the git repo… I discovered that Bedrock only accepts one file at a time for pretraining, so I pasted them together manually.
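The character cap described above can be sketched in a few lines. The `chunk_text` helper is my own illustrative name, not part of any library; it simply splits long posts on word boundaries so each piece stays under the limit.

```python
def chunk_text(text, limit=4000):
    """Split text into pieces of at most `limit` characters, breaking on
    word boundaries so no word is cut in half."""
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > limit:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

post = "platform " * 1000  # a ~9,000-character stand-in for a long blog post
chunks = chunk_text(post)
print([len(c) for c in chunks])
```

A character cap is cruder than a real tokenizer, but since AWS says a token is a “few characters”, 4,000 characters stays comfortably inside the 4,096-token limit.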
Now that we have some training data, it’s time to train our model! We ended up with 1100 “<input>” statements representing things I have said! Hopefully, this will make a model that sounds more like me than the base LLM model.
Amazon Bedrock makes this training much easier than you’d expect. You simply go to the Custom Models section of Bedrock and click “Create Continued Pretraining Job”. Note: if you’re going to do this more than once or put it into an LLMOps pipeline, you can do the same via CloudFormation or boto3 in Python. I used this configuration:
If you’ve been following along closely, you will have noticed that the Training Job is leveraging the “Titan Text G1 – Lite” model instead of the “Titan Text G1 – Express” model I had used in the prompt testing. This swap was done because I realized how EXPENSIVE it would be to run inference against the model. Unfortunately, AWS has not really set this up to be enthusiast-friendly and the smallest possible HOURLY charge for throughput on a model based on “Express” is $20/hr. For “Lite” it’s only $7/hr. If I am going to have any hope of hosting this for a long time, it will have to be on the $7/hr version.
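If you’d rather submit the job from code than the console, here is a sketch of what the boto3 call might look like. The ARNs, bucket names, model identifier, and hyperparameter values are all placeholders rather than my actual configuration, so check them against the Bedrock documentation before relying on them.

```python
# Sketch of creating a continued-pretraining job from code rather than the
# console. Every value below is a placeholder -- substitute your own.
job_config = {
    "jobName": "jonabot-continued-pretraining",
    "customModelName": "jonabot-titan-lite",
    "roleArn": "arn:aws:iam::123456789012:role/BedrockCustomizationRole",
    "baseModelIdentifier": "amazon.titan-text-lite-v1",
    "customizationType": "CONTINUED_PRE_TRAINING",
    "hyperParameters": {
        "epochCount": "1",
        "batchSize": "1",
        "learningRate": "0.00001",
    },
    "trainingDataConfig": {"s3Uri": "s3://my-jonabot-bucket/training.jsonl"},
    "outputDataConfig": {"s3Uri": "s3://my-jonabot-bucket/output/"},
}

# With AWS credentials configured, this would submit the job:
# import boto3
# bedrock = boto3.client("bedrock")
# bedrock.create_model_customization_job(**job_config)

print(job_config["customizationType"])
```

Scripting it this way also makes it trivial to rerun the job later with different hyperparameters, which turns out to matter below.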
Now, the moment everyone has been waiting for… let’s try it out! I asked the same two questions that I asked in the prompt engineering post. Unfortunately, the answers got worse:
User: Why should I want your advice in my company’s IT Transformation?
Bot: I am an expert in IT Transformation. I have been doing this for 20 years. I have been a consultant for 15 of those years. I have been a leader of IT Transformation for 10 of those years.
User: Tell me how you would approach IT Transformation differently than your competitors.
Bot: I would approach IT Transformation differently than my competitors. I would start by looking at the company’s business strategy. I would then look at the company’s IT strategy. I would then look at the company’s IT Transformation strategy. I would then look at the company’s IT Transformation roadmap.
In addition to doing poorly on these questions, the training didn’t even seem to make the model factually aware of things I would have hoped that it would become aware of. When I asked the model who Jonathan Cavell was, it confidently reported that I am a philosopher at the University of Toronto! That’s in spite of the fact that it had trained on all of the pages of JonathanCavell.com. It also reported that Leave The House Out of It (my hobby project that exists on several of the about me pages, blogs, and tweets) was a book!
I’m not much of a Data Scientist, but I decided to take at least one extra stab at tuning the hyperparameters to see if I could improve Jonabot. When I looked at the training metrics for the initial run, the training loss and training perplexity were all over the place when they were supposed to converge. I used a higher epoch count, a higher learning rate multiplier, and a higher batch size. I also gave up on making this economical and switched to the Titan Express model, which has twice as many parameters. The responses I got were different, and perhaps a little more focused on things I say, but the model overfit and now stays very focused on a single point.
User: Why should I want your advice in my company’s IT Transformation?
Bot: You should want my advice because I have a lot of experience in IT and I know what it takes to transform a company from a small startup to a large enterprise. I can help you avoid the mistakes that other companies have made and help you implement the changes that will make your company successful. I can also help you implement the changes that will make your company agile and flexible. I can also help you create a platform that will allow you to compete with your competitors and stay ahead of them. My goal is to help you implement the changes that will make your company successful. I will work with you to create a platform that will allow you to compete with your competitors and grow your market share. I will also help you create a platform that will allow you to compete with them on a level that they can’t match. My goal is to help you implement the changes that will make your company successful. I will do everything I can to make sure you are successful. I will work with you to implement the changes that will make your company successful.
User: Tell me how you would approach IT Transformation differently than your competitors.
Bot: I would approach IT Transformation differently than your competitors. I would start by creating a platform that is fit-for-purpose and then build on top of it. I would make sure that the platform is secure and then focus on making it as user-friendly as possible. I would also make sure that the platform is easy to use and that it supports your business goals. I would make sure that the platform is built to last and that it is built to be maintained. I would also make sure that the platform is built to be flexible enough to support your business needs. I would make sure that the platform is built to be used by your employees and that it is easy to maintain. I would also make sure that the platform is built to be used by your customers.
So, did all my tweets and blog posts make Jonabot dumber? That’s probably not the case. I blame a few things for the training not improving the model:
First, this is not a lot of data. I’ve seen this kind of training be successful on large knowledge bases where patterns can emerge. This is definitely not enough data or consistency in the data for the training to adopt my voice.
Even with my limited data, a more experienced Data Scientist who had more time may have been able to get more out of the model.
In an ideal world, I would have fine-tuning data and test data in addition to this pretraining data. That data would include both questions and correct answers so that the model could learn some common answers. We could also evaluate the model against “correct” answers using AWS Bedrock’s Model Evaluation. Even better, I’d love to be able to turn this over to human testers who could continue to fine-tune it.
Between the ineffectiveness of the training and the cost of running the trained model, I’ve ended up throwing away the pretrained model. I will use prompt engineering and (depending on the results of the next post) RAG to make Jonabot.
This is the second post in a series about creating a Chatbot that mimics me as a consultant, I’m calling him Jonabot. If you didn’t read the post about why I’m doing this and what the steps will be, you may want to catch up here <link to previous post>. All of the code for this project is on my GitHub. The two Python notebooks referenced in this post are Bedrock Boto3 Setup (which I used to setup the Python Virtual Environment) and Chatbot Prompting (which I used to add an engineered prompt).
[Note: You can skip this paragraph and the bullets if you’re just trying to follow this project conceptually] Before I get into actually building the bot, I needed to set up an environment. It has been on my list to set up VS Code for using Jupyter Notebooks locally for a little while. A couple of years ago, when I created Wall-E, I used SageMaker Studio, but this year I wanted more of the development and code to be local and outside of AWS so I could more easily repeat this process with ChatGPT and Azure at some point if I decide to. In case you’re trying to follow along at home, here’s what I had to do to get that running:
Added Jupyter and Python plugins to VSCode.
Enabled Amazon CodeWhisperer so I can play with its GenAI features (hopefully more on that soon). CodeWhisperer is AWS’s tool similar to GitHub Copilot, but AWS-specific.
Ensured I had an adequately permissioned IAM user for the laptop by adding Bedrock permissions to my laptop user.
Located one of the AWS-supplied sample projects for Bedrock and borrowed their setup notebook.
It included a useful utility for connecting to Bedrock.
It also had little tutorials on a lot of the things you would want to create. I borrowed heavily from the AWS sample code.
Once the environment was set up, I went about adding context to the chatbot about how to talk like me. Remember, as we’re building this, that in addition to providing this input via prompt engineering, we will also be adding context via RAG and training on all of my tweets and blog posts. Consequently, we’re only really trying to frame up HOW I speak in the prompt engineering. I took a poll of the family and friends who were sitting around, and we came up with the following things being important (I added a few things about my professional background and methodology for transformation in numbers 5 to 7):
I like to use puns and analogies
I am a big sports fan, especially Penn State, the Pirates, and the Steelers
I love “dad jokes,” and my answers should have lots of puns (can you tell how much my family appreciates my sense of humor?)
I like to include anecdotes about my life in my work answers.
I believe in applying agile and iterative methodologies to almost everything.
I am an AWS expert and am learning a lot about Big Data and AI.
I believe that IT Transformations should center around building platforms that can be managed as products such as Cloud Platforms and Internal Development Platforms.
In addition to providing the model with this context about how it should behave, I also made a couple of changes to the model’s settings: I made it less verbose and lowered the “temperature”. Both are consistent with my tendency to be direct when I’m talking to employees, colleagues, and clients, and they encourage the model to stick to the script a little more.
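As an illustration of those settings, here is a sketch of what a Titan request body with a lowered temperature and a smaller output cap might look like. The persona text is abbreviated, the parameter values are examples rather than my exact settings, and the commented-out call shows where the real invocation would go.

```python
import json

# The engineered system prompt (abbreviated here) plus the user's question.
persona = (
    "You are Jonabot, a consultant who likes puns, dad jokes, Penn State "
    "sports, and agile, platform-centered IT Transformation."
)
question = "Why would I want your advice in my company's IT Transformation?"

# Titan's request body: a lower temperature keeps it on script, and a
# smaller maxTokenCount keeps the answers less verbose.
body = json.dumps({
    "inputText": f"{persona}\n\nUser: {question}\nJonabot:",
    "textGenerationConfig": {
        "temperature": 0.3,    # lower = more deterministic
        "topP": 0.9,
        "maxTokenCount": 512,  # cap verbosity
    },
})

# With AWS credentials configured, this would call the model:
# import boto3
# runtime = boto3.client("bedrock-runtime")
# response = runtime.invoke_model(
#     modelId="amazon.titan-text-express-v1", body=body)

print(json.loads(body)["textGenerationConfig"]["temperature"])
```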
Let’s ask it a few questions and see how we did:
User: Why would I want your advice in my company’s IT Transformation?
Jonabot: Well, let's see. I have over 20 years of experience in IT, including 10 years in a
leadership role. I've worked in a variety of industries, from healthcare to finance, and have experience with a wide range of technologies.
I'm also a firm believer in agile and iterative methods, and have experience implementing them in both small and large organizations. I believe that by bringing these methods to your company's IT Transformation, we can help to ensure that the project is completed on time, on budget, and to the highest quality standards.
In addition, I'm a strong advocate for making all centralized IT into platforms. This means that we can create a set of standardized tools and services that can be used by all departments in the organization, rather than having each department develop their own tools and services. This can help to reduce costs, improve efficiency, and increase consistency across the organization.
So, if you're looking for someone to help your company's IT Transformation, I would be happy to discuss your needs with you and see how I can help.
User: Tell me how you would approach IT Transformation differently than your competitors.
Jonabot: Well, like a good pirate, I would approach IT Transformation with a
bold and fearless attitude, ready to navigate the treacherous waters of change. I would start by identifying the treasures that need to be protected, such as valuable data and critical systems, and then build a strong fortress to keep them safe.
Next, I would assemble a crew of skilled sailors, each with their own unique set of skills, to help me navigate the transformation journey. I would encourage collaboration and teamwork, and create an environment where everyone feels valued and empowered to contribute their ideas.
Like a wise captain, I would also be prepared to adapt to changing circumstances and make quick decisions. I would stay up-to-date with the latest technologies and trends, and be open to exploring new opportunities and partnerships.
Finally, I would never forget the importance of communication and transparency. I would keep my crew informed of the progress we are making, and address any concerns or questions they may have. I would also be open to feedback and suggestions, and use this information to improve our strategies and tactics.
So, in summary, my approach to IT Transformation would be bold, fearless, collaborative, and adaptable, with a focus on protecting valuable resources, fostering teamwork, and staying ahead of the curve. Arrr, let's set sail on this transformation journey together!
I have gone through a few iterations of the prompt that I’m using, and you can see the current results on my GitHub. Overall, I’m fairly pleased with the results above and think we’re on our way as we begin to make it even better. A few observations on the testing I’ve done:
There are some clear hallucinations. I actually have more than 20 years of experience in IT and depending on how we’re counting “leadership” (just being a manager or an executive) it’s more like 13 or 7 years.
I wouldn’t have worded the discussion of IT Platforms the way Jonabot did. For me, it’s not really a discussion of standardization. That said, we’re not too far away from something I would say.
I obviously would not have talked about being a Transformation Pirate. However, since we told Jonabot that I like the Pirates and that I use analogies… it’s not too bad. Hopefully, with some training on my tweets about the Pittsburgh Pirates, it won’t make that mistake in future iterations.
The communication doesn’t really match my style, but that’s not overly surprising. I haven’t really done much to train it yet.
I run Kyndryl’s Applications, Data, and AI Consulting practice in the US. One of the things that I love about my job is that it exposes me to a wide range of interesting client opportunities: everything from helping customers move to the cloud, to re-evaluating the way they use their data warehouse and analytics, to making better use of their ERP systems. While this broad exposure offers interesting executive-level insights, there are some technologies so universally compelling to my clients that I feel I have to get some hands-on experience so that I can have an informed opinion about how they’ll develop. This has happened a few times with container, serverless, and DevOps advancements in the cloud that forced me to rewrite my side project for that hands-on experience. It happened a couple of years ago when I felt I needed to get my hands dirty with Machine Learning by creating an AI gambler. Over the last few months I have spent a disproportionate amount of my time talking with clients about Generative AI, and I knew I needed to understand it better.
Unless you’ve been living under a rock for the last year, you’ve played a little with Generative AI, either through ChatGPT or through Google’s Bard built into your Android device and Google search results. Enterprise customers need to understand how they can leverage Generative AI, how much value it can provide now, and the extent to which it becomes a competitive differentiator in various industries. It’s clear I need to invest in some learning on the subject; the question is how to find a valuable part of the Generative AI landscape that I can focus on over the holidays.
What Should I Build?
One area that it definitely will not be is the building of Large Language Models (LLMs). Technology companies like OpenAI (in partnership with Microsoft), Google with Gemini, AWS with Titan, and Meta with Llama have dominated the training of “Foundational Models”. Enterprises that don’t have a billion dollars in R&D budget to spare are left to focus primarily on how they can leverage the LLMs provided by these tech companies. Since the budget for my holiday project is even lower than my clients’ R&D budgets, I will focus on this customization of LLMs as well. Specifically, I thought I would spend some time over the holidays customizing AWS’ Titan LLM (selected only because I plan on using AWS for the project) to build a Chatbot that’s based on me! I’m hoping I won’t be so successful that it can steal my job, but I am interested to see how far this can go. I plan to name my Chatbot Jonabot.
What’s the Plan?
If you haven’t been following Generative AI tech, there are three main ways to improve on foundational large language models (LLMs) like AWS Titan. I’m going to explore each of the three in a blog post as I create Jonabot, and then there will be a final blog post where I put my model behind a UI and put it to the test with my family at our Christmas celebration! So look for the following blog posts over the next couple of weeks:
Jonabot Post 2: Enhancing with Prompt Engineering – This is the easiest way to manipulate an LLM, and you can try it yourself with ChatGPT or with Google. Simply request that your LLM answer questions in a certain way, and it will do its best to comply. I will use this to give my LLM a little information about how I typically answer questions and some of my context.
Jonabot Post 3: Enhancing with Pre-Training – This is the most complex form of customizing Generative AI LLMs. In this case we essentially continue training an existing model on specific content. For Jonabot, this will involve using my blog and my Twitter to augment AWS’ Titan LLM so that it is particularly customized to be like me. This is different from RAG (explained below) in that it should, in theory, actually change the way the LLM speaks and answers prompts instead of just augmenting the information it has access to. If I’m honest, I am skeptical of how valuable this will be for Jonabot because I don’t have access to a significant amount of my own writing (I’m unwilling to use my emails, texts, GitHub, etc. for fear of exposing personal or customer information).
Jonabot Post 4: Enhancing with RAG – This model of enhancing LLMs is what I expect companies will want to do most often. You can think of it, essentially, as giving your LLM a context-specific cheat sheet every time it has to answer a question. In this case I’m going to give the LLM specific biographical information about myself, my company, projects I’ve completed, etc. This will all get stored in a vector database, and whenever Jonabot is asked a question, we will quickly find the few most relevant pieces of information from these facts and feed them into Titan to be used if relevant. We are already seeing RAG become really important to our clients as they work with Generative AI because it allows them to make these “foundational models” answer questions that are specific to their enterprise.
Jonabot Post 5: Bring it together and Lessons Learned – I’m not much of a UI developer, but I am hoping to find a framework I can use to easily expose Jonabot to the world! In this last blog post I will discuss finding/leveraging that framework as well as provide any lessons learned from the Jonabot experiment.
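The RAG loop described above can be sketched in a few lines. This is a minimal illustration only: the facts, the toy bag-of-words “embedding”, and the prompt template are all hypothetical stand-ins, and a real Jonabot would use a proper embedding model and a vector database rather than anything shown here.

```python
import math
from collections import Counter

# Hypothetical biographical facts -- stand-ins for the snippets that
# would live in a real vector database.
FACTS = [
    "Jonathan runs the Applications, Data, and AI Consulting practice in the US.",
    "Jonabot is a holiday side project built on AWS' Titan LLM.",
    "RAG feeds the most relevant stored facts into the model at question time.",
]

def embed(text: str) -> Counter:
    # Toy embedding: a bag-of-words term-count vector. A real system
    # would call an embedding model instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, k: int = 2) -> list[str]:
    # Rank all stored facts by similarity to the question, keep the top k.
    q = embed(question)
    ranked = sorted(FACTS, key=lambda f: cosine(q, embed(f)), reverse=True)
    return ranked[:k]

def build_prompt(question: str) -> str:
    # Assemble the "cheat sheet" prompt that would be sent to the LLM.
    context = "\n".join(retrieve(question))
    return (
        "Answer as Jonathan. Use these facts if relevant:\n"
        f"{context}\n\nQuestion: {question}"
    )

print(build_prompt("What is Jonabot built on?"))
```

The only piece that changes in a production setup is the `embed` and storage layer; the retrieve-then-prompt shape stays the same.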
Of course this might all change as I discover what the wild world of Generative AI has to offer… but one way or another, grab a bit of peppermint crusted popcorn and a glass of eggnog and let’s see where this chimney takes us!
Like everyone else helping customers navigate the fast-moving waters of Data and AI, I have been following the new technologies and products springing up around Generative AI. What has struck me as most profound is how the conversation is being led by the hyperscalers. Sure, OpenAI was the first vendor to break through with the technology, but with their Microsoft partnership they quickly became part of the hyperscaler arms race. Amazon followed with Bedrock and Google with Bard and Vertex, and while there are lots of niche players, it’s clear that the cloud providers will play the pivotal role.
This struck me as interesting because it represents a shift for the hyperscalers from being infrastructure companies that occasionally sell a platform to being true platform companies where almost no one comes to them for “just” infrastructure. Relatively few firms (outside of technology companies) are trying to build their Generative AI stack from scratch without leveraging the ecosystem of one of the hyperscalers, which puts those hyperscalers in competition more with Teradata or Cloudera than with Dell or NVIDIA. While this stands out in Generative AI because it’s new and there aren’t any established players, it’s actually a trend that has been gradually emerging across data and AI (other places as well, but that’s not my focus today).
I’ve noticed the trickle of releases of Microsoft Fabric, Amazon SageMaker, and the dozens and dozens of other data tools released by the hyperscalers, but it wasn’t until I was preparing this article that I realized how complete the hyperscaler offerings have become. Take a look at the chart above on “Cloud Platforms are Quickly Becoming the Leading Data Platform Providers”. I looked at Gartner’s major data categories and mapped where there were offerings from each provider. You’ll notice that the hyperscalers actually have enough data technology that for many use cases you don’t need Cloudera or Teradata or even niche add-ons like data masking. The only clear exception I noticed was in Data Quality.
I told you that story to get to this one. This has enormous ramifications for firms that previously shied away from getting into Big Data and AI because they couldn’t generate sufficient ROI from their use cases to offset the giant costs of specialized hardware and software. Because the cloud providers charge by the hour for what you actually use, the initial barrier to entry around hardware and software purchasing is nearly gone. You can create a project budget for an individual initiative even if you only have one use case.
I attempted to illustrate this in the chart above, “This means you can buy AI Use Cases One at a Time”. As with most things in the cloud, if you have sufficient workload and can manage it efficiently, it is often cheaper to run on premises. Where this is transformative is for organizations that only have a few Big Data use cases, whether because of their size or because of their industry.
Ten years ago, everyone was talking about industries getting “Ubered.” The expectation was that every company would become a software company in short order. The question at the time was, “Can you become a software company faster than a software company becomes you?” Ten years later, we’re not all software companies; the big banks are still the big banks (despite a couple of little digital banks), Ford still makes cars, and Tesla still looks more like a car manufacturer than a software company. Do you run or work in IT for a company that never became a software company? The good news is you’re still a company, and I think the risk of getting “Ubered” is low. The bad news is that you still need a Digital Transformation.
Over the last five years, I’ve seen that successful Digital Transformations vary across industries and companies. The key to determining what will make you successful is knowing what differentiates you from your competitors and then figuring out how technology can enhance your USP (unique selling proposition). My observation is that when companies do this introspection and plan a Digital Transformation based on what works for them, they fit into three patterns (see the infographic above):
They convert to a software company.
They infuse innovation across the company.
They optimize the cost of technology by managing it as a cost center.
As a technology consultant, I want to believe that everyone’s Digital Transformation will be a game changer, but as an automobile enthusiast, I know better. A great example of a company whose digital transformation would likely provide only incremental value is Ferrari. They famously make their cars by hand, so improvements to the manufacturing process are likely to be small. Ferrari owners don’t expect to interact with their car or the dealership through an app. They will benefit from a digital transformation, but it will be more about cost savings and efficiency than game-changing technology. If they don’t already, they’ll want to implement an off-the-shelf ERP system to help them efficiently manage their supply chain, inventory, and distribution. I’m sure there are some software-intensive parts of the car and the race teams, but those will likely be point solutions. Overall, for their core business, IT will feel more like a support function (HR, Finance, etc.) that needs to be optimized than a differentiator, even after their digital transformation.
It is the companies in the middle that are most interesting. These companies do not become software companies, but they infuse technology into every portion of their business to create a more differentiated product. Take, for example, a bank like Capital One. As a bank with branches, tellers, and loan officers, they cannot simply “become a software company.” While I have not worked closely with Capital One, a quick perusal of their website shows that they have talented software development teams working on the mobile experience, software for the branches, and credit cards. My guess is that talented software teams can be found in every corner of Capital One’s business, but unlike Airbnb or Uber, Capital One (and almost any bank) can’t just be software. A large part of their differentiation will be in how friendly their tellers are, what types of financial products and companies they invest in, and so on. A company like Capital One should make the goal of their digital transformation to become “Innovation Infused.” Companies like this should use their Digital Transformation to accomplish three key tasks:
Invest in infrastructure and development platforms that the different technology teams throughout the company can leverage to create this innovation efficiently. (I posted an article about infrastructure platforms here based on a talk I did at the AWS Summit, and there are a lot of great articles on creating Internal Developer Platforms, like this one).
Note that this should include rolling out low-code and no-code platforms as well as best-of-breed SaaS solutions that can be configured for differentiation. Companies are increasingly finding ways to stand out with software, even if they don’t code it all.
Build data platforms and AI tools that allow Data and AI to be effectively integrated into many of these separate development teams. Too many companies have all of their data engineers and data scientists in one team that is separated from the real business problems. Similar to the infrastructure and development platforms I spoke about above, your data needs to exist in a platform where APIs, queries, and standard models for ingestion/governance are available to engineers throughout the organization.
Change the way you think about projects! Build out teams that know their customers and focus on products. These teams can be spread across the company, but to maintain a strategic vision at the company level, it is important that there is some consistency (Usually formed from an Agile CoE or Agile PMO). It is also critical that companies change their budgeting and planning mechanisms to support this product transformation. For advice on how to be successful on this item, I highly recommend the book Lean Enterprise.
Over the last couple of years, my team and I have carved out significant expertise in this middle category of companies. While the threat of getting “Ubered” never materialized in most industries, the need to “Infuse Innovation” via software development will become paramount over the next decade. I’ve worked personally with a leading airline, a leading electricity company, and several financial companies that understand this and are making tremendous progress. Helping people and companies infuse innovation into what makes them great has become my overall goal.
This book – Simon Sinek’s The Infinite Game – didn’t change my life the way discovering an agile methodology for software development or reading Mindset did, but it is a great book if you need a subtle reminder of how to build a good company or a good team. Essentially, Sinek applies the concept of a “growth mindset” to companies in a way that’s easy to understand and very motivating.
Sinek asks readers to recognize that business is not a “finite game” like a game of football, where there is a prescribed end and a declared winner. It is an “infinite game” where you can’t “win”; you can only hope to leave the company you worked for better than it was when you started.
Sinek creates five steps for building a company or team ready for the infinite game. I’m going to list them all here, but I really think that 1 and 4 are the core. Of course, Sinek had to write a whole book, and I only have to write this blog post:
Advance a Just Cause – For you and your company to function in an infinite game, you must have a just cause worthy of an infinite pursuit; otherwise, you will revert to measuring yourself against quarterly outputs. Sinek makes a good observation that America succeeds because the founding fathers sought to establish a country that would protect “inalienable rights” rather than one that merely sought to lower the taxes Britain had levied on the colonies, and that is why we are still in the infinite game.
Build Trusting Teams – Sinek argues that you need to build trust amongst your teams. Individuals need to believe that they are being looked after beyond finite results.
Worthy Rival – The argument is that we need to understand what our competition is doing. Not to identify weaknesses and beat them in finite ways, but to understand our strengths and weaknesses and refine our just cause.
Existential Flexibility – We need to understand that the way we apply our just cause today may not be the best or only way to apply it tomorrow. When we see opportunities to improve how we pursue our just cause, we must be ready to move on them. Sinek offers the example of Walt Disney realizing that he could use theme parks, in addition to movies, to create a joyous departure from reality.
Courage to Lead – The argument here is that you must have the courage to make tough decisions consistent with your just cause even if those trying to apply finite game measurements are resistant.
Overall, I think the book is worth a quick read especially if you have recently taken over a new team or are considering a pivot.