ChaosSearch Blog - Tips for Wrestling Your Data Chaos

Making the World's AWS Bills Less Daunting

Written by Courtney Pallotta | Jun 21, 2022

Armed with a Ph.D. from UC San Diego, our guest started off with internships at Google and Microsoft before gaining valuable experience as a VP and a highly sought-after consultant for startups and SMBs.

Now he’s one of the world’s foremost experts on wrangling vast data sets and maximizing efficiency.

 

 

Don’t miss our engaging conversation with Alex Rasmussen, Principal Cloud Economist at The Duckbill Group, where we’ll cover:

  • His fascinating work on large-scale data processing
  • Leveraging new solutions to old problems
  • Managing your costs through cloud architecture
  • Machine learning, AI, and what the future holds

Want to learn more about using data to solve big problems at scale? Let’s dive in.

 

“One of the big problems that existed back then and still exists now is that the data has to be good in order for the analysis to be good.” — Alex Rasmussen

 

Back to the beginning

As an undergrad at UC Berkeley, Alex worked as a resident computer dude, helping dormitory dwellers remove malware and get their papers done on time.

While pursuing his Ph.D., he served his time in the internship trenches before becoming a grad student researcher focused on making large-scale processing more efficient. His project achieved 60% better performance than a massive cluster of Hadoop nodes administered by Yahoo.

Leveraging his unique skillset landed him a gig as Principal Software Engineer at Trifacta, a Bay Area startup, where he led the development of their iconic desktop product.

He’s also put in time in the biotech industry and as a data engineering consultant to small and medium businesses. Along the way, Alex has seen plenty of challenges, and many promising developments, when it comes to wrangling Big Data.

 

The art of the possible

The open-source framework of Hadoop opened up a whole new world of possibilities around building scalable clusters of machines that could handle large-scale data processing.

Alex wondered how large these clusters could grow. Could they become more efficient and make better use of all that hardware? Giant on-prem systems were just sort of sitting around, collecting wide swaths of data.

Everyone’s heard the maxim “garbage in, garbage out.” Preparing mountains of data for analysis was an onerous task.

The answer? “Something something machine learning,” Alex jokes.

 

“We're seeing this return back to SQL as the lingua franca of data analysis, which has its good parts and its bad parts. But it was very interesting to see that explosion.” — Alex Rasmussen

 

Cloud data warehouses

Thanks to rapid progress and the decoupling of computing from storage, people are solving more problems without being chained to real-world tech stacks. “It’s fundamentally changed the way that the people who build these things can reason about them,” Alex points out.

We’ve been through some chaotic and interesting times, but Alex believes that SQL is coming back into common usage when it comes to data analysis.

He’s seen a migration from depending on single monolithic database engines to something more agile and fragmented.

“The flip side of that,” he acknowledges, is that “it’s very difficult for the average consumer to make any sense of the market overall.”

That’s where Duckbill comes in.

 

Cost management

Having worn many hats during the course of his career, Alex is perfectly placed at this fun-loving yet ruthlessly efficient boutique consulting firm.

AWS costs can spiral out of control, but Duckbill’s expert team helps companies get a grip and ensure they’re paying for exactly what they’re using and nothing more.

“In a cloud environment, architecture and cost are kind of the same thing,” he states. Alex works closely with his clients to evaluate their data and what they’re trying to accomplish with it, then finds the most cost-effective solution.

His impressive educational background and deep tech experience let him ask exactly the right questions.

 

“At a 10,000-foot view, we are still very much in the Wild West when it comes to machine learning.” — Alex Rasmussen

 

The hype cycle

“Machine learning” has gone from being a terror-inspiring science fiction plotline to a magical end-all-be-all buzzword for every problem on Earth, but it’s actually neither of those things.

ML and AI can help with many projects, but “a lot of the same problems are still there,” as Alex sees first-hand. It’s still very early days.

Alex offers up a crucial piece of advice for business leaders: “If you feel like you have a problem that machine learning can solve, the first question you should be asking is, ‘can I solve this problem realistically without machine learning?’”

Many companies have invested tons of money and time in super-hyped machine learning initiatives that ended up going nowhere. To avoid falling into this trap, he recommends starting by identifying the exact problems you want to solve.

Shiny new object syndrome can be quite expensive these days.

A self-confessed “data nerd,” Alex is truly excited about emerging developments and remains dedicated to democratizing access to data at scale.

 


 

To make sure you never miss an episode of Data Legends: Stories from the IT Trenches, follow on Google, Apple Podcasts, Spotify, our website, or anywhere you get podcasts.

 

 

Additional Resources

Read the Blog: Optimize Your AWS Data Lake with Data Enrichment and Smart Pipelines

Listen to the Podcast: Keeping the Chaos Searchable

Check out the Whitepaper: 2022 Cloud Data & Analytics Survey Report