Why Most CIOs Are Quietly Praying for Retirement — And the Few Who Aren’t Are About to Get Very Rich
I had a moment this week where I was sitting across from a Director of IT and it hit me — this poor bastard has the toughest job in the entire company. The business folks get to be full-time dreamers: “Hey, can we automate this? Can the AI just know what to do? Can it walk my dog while I’m in this meeting?”
Meanwhile he’s over there thinking about data security, system reliability, whether some employee is gonna click on an email that says “You’ve won a $1,000 Walmart gift card!”, whether Ukrainian hackers are going to steal their customer data at 2 a.m., and whether his entire team is about to get replaced by three interns and ChatGPT — all while knowing none of this stuff actually works the way the brochures promised.
And here’s the part that makes me feel for the guy — for his entire career he’s been rewarded for keeping the machines running and not getting fired. Now we’re asking him to suddenly become a profit center, to be out over his skis with AI initiatives. It’s like telling the hall monitor he’s now responsible for running the company’s underground poker game. Did I just compare our AI software to an underground poker game? Yeah, probably not the best analogy, but hang with me here, I’m rolling.
Meanwhile the C-suite is over there wondering why nothing’s happened yet, completely oblivious to the fact that they’ve spent twenty years brutally punishing IT for doing anything other than playing defense. Hell, I know CIOs who got fired because Windows 95 sucked.
The real kicker is how to even get started. Our philosophy has always been to start small — automate one workflow, prove it works, and then compound fast. Smart in theory. In practice, with a big organization, that feels like bringing a birthday candle to a forest fire.
The C-suite doesn’t get excited about incremental. They want to see something that actually moves the needle. So you’re stuck trying to bridge this ridiculous gap: build something small enough to actually work yet big enough to get noticed, get real user adoption, and make sure the vendor isn’t full of shit.
Honestly, I don’t envy that seat one bit. At Collide, we’re committed to being real partners with the folks actually doing the building. I’ve got serious scar tissue from getting fired for not being “openly collaborative” with other oil and gas companies on well spacing back in the shale days, and I’m never making that mistake again. We’re gonna share what we learn, educate when we can, and actually listen — God knows we have a lot to learn too.
Truth is, my tech guys are dying to find some partners in crime — and I really gotta stop with the crime analogies, I swear that’s not what we’re doing here — because they get all excited explaining the latest and greatest AI breakthrough and I respond with the technical sophistication of a man asking if his rotary phone has Bluetooth.
Sip slowly, my friends.
i look so tall (im not i swear)
"I swear no one listens to me like Go"
Today I am announcing META-Bench, the first pure intelligence benchmark for AI. It leverages the hit auto-battler strategy game, TFT.
I SWEAR I AM NOT TROLLING let me explain.
The industry suffers from labs overfitting and giving us models that score high despite being fundamentally low IQ.
Over the years there have been many attempts at benchmarking AI with competitive gaming. I am going to explain the failure points, and why META-Bench is truly the first of its kind.
Chess.
When picking a game to benchmark with, chess is the obvious first choice. It has clear rules, a large player base, and a well-defined elo system.
The issue with static rule games though is that the best strategies can be figured out ahead of time and baked into the model during the training process. Too easily hacked. Memorizing more strategies is not a proof of intelligence.
Dota2/ League.
We’ve all heard of OpenAI Five. The issue with benchmarking on a MOBA is that reaction speed is a meaningless metric. We do not need our highly intelligent AI to be able to respond at the speed of top human pro players.
And truth be told, we are years away from an LLM that is able to play MOBAs at the highest levels off of vision alone, even though the problem was seemingly solved years ago.
What we need is a game that:
- Has defined rules but cannot be results hacked during the training process
- Large ecosystem of human players
- Clear cut results and an elo system
- Results that are not reaction-time dependent
There is only ONE game in the world that meets all the requirements needed for this benchmark.
Teamfight Tactics.
For those unfamiliar, TFT is a strategy-based auto-battler created by Riot Games with ~100 million monthly active players worldwide. It is a highly competitive, multiplayer, turn-based game.
It’s as if Chess and League of Legends had a baby that’s born to be an AI benchmark:
- There is a new set released every 3 months.
- Time limitations in the 10-40 second range rather than the milliseconds required for MOBAs
- Skill based enough for esports yet uncertain enough to require reasoning over hard scripts
“Can’t labs just train models to be good at TFT?”
Nope, and the reason it’s unhackable comes down to how the benchmark itself is set up.
Due to the fact that the entire game is changed every 3 months and patched every 2 weeks, any data on a previous TFT set is effectively useless when it comes to raw pattern recognition.
Strategy-wise, there are core concepts that carry over from set to set. That’s why we have the same players hitting the highest elo every season even though each set is so different.
Any efforts at overfitting here can be fully negated if the benchmark harness used for all models has every core strategy built in. You are never going to beat a carefully curated harness layer with strategy training at the model layer.
By presenting every model in the harness with the same core strategic concepts, the only difference in outputs will be each model’s ability to reason across the different scenarios of each game. The luck elements of TFT already ensure that no two games will be the same in the reasoning required. Run the models against each other enough times and you will have a clear winner.
Aka, the world’s first true IQ test for AI.
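To make the "run them against each other enough times" part concrete, here is a rough sketch of how lobby placements could be turned into an elo-style ladder. Everything in it is an assumption for illustration: the function names, the K-factor of 16, the idea of scoring a lobby as a set of pairwise wins, and the toy "skill plus luck" simulator that stands in for real games. None of it is the actual META-Bench harness, and the skill numbers are made up, not results.

```python
import random
from itertools import combinations


def expected_score(rating_a, rating_b):
    """Standard Elo expected score of A against B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_from_lobby(ratings, placements, k=16.0):
    """Update ratings in place from one lobby.

    placements lists model names from first place to last. Each pair is
    scored as a win for the better-placed model (an assumption, not the
    post's method).
    """
    deltas = {name: 0.0 for name in placements}
    for i, j in combinations(range(len(placements)), 2):
        winner, loser = placements[i], placements[j]
        gain = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
        deltas[winner] += gain
        deltas[loser] -= gain
    opponents = len(placements) - 1
    for name, delta in deltas.items():
        # Average over opponents so one lobby moves a rating by at most k.
        ratings[name] += delta / opponents
    return ratings


def simulate_lobby(skill, rng, luck=1.0):
    """Toy stand-in for a real game: hidden skill plus Gaussian luck."""
    scores = {name: s + rng.gauss(0.0, luck) for name, s in skill.items()}
    return sorted(scores, key=scores.get, reverse=True)


def run_benchmark(skill, lobbies=2000, seed=0):
    """Play many simulated lobbies and return the final ratings."""
    rng = random.Random(seed)
    ratings = {name: 1500.0 for name in skill}
    for _ in range(lobbies):
        update_from_lobby(ratings, simulate_lobby(skill, rng))
    return ratings


if __name__ == "__main__":
    # Made-up skill values for illustration only, not benchmark results.
    hidden_skill = {"model_a": 2.0, "model_b": 1.0, "model_c": 0.0}
    for name, rating in sorted(run_benchmark(hidden_skill).items()):
        print(name, round(rating))
```

The luck term is the point: any single lobby can go to the weaker model, but the updates are zero-sum, so over enough lobbies the ratings separate according to whatever is consistently driving placements.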
I really, really want to know which AI model would win this. So I am going to build this. Not too sure how I’m going to fund it yet so if you would like to invest HMU.
I’m also looking to put together a small team of individuals who are both high elo in TFT and highly experienced with agentic AI.
And if you are even remotely curious about the results, like and help share this post 🫡
This is the last time I'm posting my 🍑, I swear...!
For my new moots and followers: This is a nsfw account, minors dni 🔞 bullies, akgaes and antis dni. My job on twt is to thirst for Flx. I'm OT8 and multiship. Come interact with me, i swear i don't bite 🤭
I follow back other nsfw accounts or if we interact and i know you 🙏
I swear I just be doing shit 😂
Just hit a stand-up show! The crowd laughed so hard at my bit about confusing coffee orders, I swear half the room ordered lattes mid-set.
i swear he kept saying im still here but working in silence and i said okay every time but he sounded annoyed that i was okay with it?!?!? AND THEN HE JUST HUNG UP IM SO CONFUSED I DIDNT EVEN FART
i look so tall ?? i swear im not