
December 2021

9 Expert Tips for Using Snowflake

Snowflake is a robust data warehouse that has changed the data science game for many organizations. With its cloud-native architecture, Snowflake lets you analyze your data using the most sophisticated query engine available today. But using Snowflake is not always as simple as using other products on the market. Below are nine expert tips to help you master the Snowflake platform.

Three reasons you need modern cloud analytics now

Data is everywhere. As the sheer volume and number of data sources continue to explode, so do new opportunities for modern businesses to create and act on insights. That is, if they are equipped with the right analytics technology. Historically, many businesses have settled for “good enough” analytics tools, putting up with lackluster bundles from full-stack vendors in an attempt to minimize cost or risk.

The true cost of Kubernetes: People, Time and Productivity

While writing a comparison of Kubernetes and Koyeb, we tried to determine how much operating a Kubernetes cluster really costs. This section of our comparison took us hours to write and ended up being so long that we decided to write a dedicated post about it. Full disclaimer: At Koyeb, we're building a serverless platform and we have a purpose-built orchestration engine.

The Next Generation of Cloud Connectivity: Apache Kafka, API Gateway and Service Mesh

Let’s boldly go where no one has gone before. Get ready, Star Trek fans! Jean-Luc Picard will be representing our microservice. Once we have Jean-Luc in our ship (microservice in production), what happens on day 2? We still need to add authorization, load balancing, rate limiting, etc. With an API gateway, like Kong Gateway, you don’t have to know how to do this because a set of program components, called plugins, allow you to implement this without any problem.

Data Goes Around The World In 80 Seconds With Snowflake

See how a database named Phileas Fogg can journey around the world in 80 seconds on Snowflake in this animated short. With Snowflake, PHILEAS_FOGG can failover in the event of disruption to enable continuous business operations and be joined with local data sets for global data collaboration across clouds.

Top 3 CloudOps Priorities for 2022, from Hitachi Vantara & AWS

As an estimated 92% of enterprises have adopted hybrid and multicloud strategies, according to the 2021 State of the Cloud Report from Flexera, cloud operations (CloudOps) teams face increasing pressure to simultaneously manage costs while improving business outcomes. What levers can CloudOps teams pull to achieve operational objectives such as reducing hybrid and distributed cloud complexity, enhancing security, and automating processes?

Will cloud ecosystems finally make insight to action a reality?

For decades, the technologies and systems that deliver analytics have undergone massive change. What hasn’t changed, however, is the goal: using data-driven insights to drive actions. Insight to action has been a consistent vision for the industry. Everyone from data practitioners to technology developers has sought this elusive goal, but as Chief Data Strategy Officer Cindi Howson points out, it has remained unfulfilled until now.

How to migrate an on-premises data warehouse to BigQuery on Google Cloud

Data teams across companies face continuous challenges in consolidating data, processing it, and making it useful. They deal with a mixture of multiple ETL jobs, long ETL windows, capacity-bound on-premises data warehouses, and ever-increasing demands from users. They also need to make sure that the data processing meets the downstream requirements of ML, reporting, and analytics.

What is Amazon Redshift Spectrum?

Amazon S3 (Simple Storage Service) has been around since 2006. Most use this scalable, cloud-based service for archiving and backing up data. Within 10 years of its birth, S3 stored over 2 trillion objects, each up to 5 terabytes in size. Enterprises value their data as something worth preserving. But much of this data lies inert, in “cold” data lakes, unavailable for analysis. Also called “dark data”, it can hold key insights for enterprises.

Redshift Join: How to use Redshift's Join Clause

Redshift’s JOIN clause is perhaps the second most important clause after the SELECT clause, and it is used even more ubiquitously, considering how interconnected a typical application database’s tables are. Because of that connectivity between datasets, data developers need many joins to collect and process all the data points involved in most use cases. Unfortunately, as the number of tables you’re joining grows, so does the sloth of your query.
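To make the idea of joining interconnected tables concrete, here is a minimal sketch. It does not run against Redshift: it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in, and the `users` and `orders` tables and their rows are hypothetical. The JOIN syntax shown is standard SQL.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for a warehouse (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical tables: each order row points at a user via user_id.
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
INSERT INTO users VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 5.0), (12, 2, 40.0);
""")

# Inner join: pair every order with its user, then aggregate per user.
rows = conn.execute("""
SELECT u.name, COUNT(o.id) AS order_count, SUM(o.total) AS revenue
FROM users AS u
JOIN orders AS o ON o.user_id = u.id
GROUP BY u.name
ORDER BY u.name
""").fetchall()

for name, order_count, revenue in rows:
    print(name, order_count, revenue)
```

Each additional table in a query like this adds another matching step, which is why query time tends to grow with the number of joins.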

PostgreSQL to Amazon Redshift: 4 Ways to Replicate Your Data

PostgreSQL is the preferred platform of millions of developers around the world. The open-source tool is one of the most powerful databases on the planet, with the ability to handle sophisticated analytical workloads and high levels of concurrency. That makes PostgreSQL (also called Postgres) a popular DB for scientific research and AI/ML projects. It’s also a popular production database for data-driven companies in every industry. But no database is perfect.

Channel global decoupling for region discovery

Ably is a platform for pub/sub messaging. Publishes are done on named channels, and clients subscribed to a given channel have all messages on that channel delivered to them. The Ably pub/sub backend is multi-region: we run the production cluster in 7 AWS regions, and channel pub/sub operates seamlessly between them.
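The channel model described above can be sketched in a few lines. This is a toy, single-process illustration and not Ably's API: the `ChannelHub` class and its `subscribe` and `publish` methods are hypothetical names, and nothing here models the multi-region behavior.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class ChannelHub:
    """Hypothetical in-memory pub/sub hub keyed by channel name."""

    def __init__(self) -> None:
        # Maps a channel name to the callbacks subscribed to it.
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, callback: Callable[[str], None]) -> None:
        self._subscribers[channel].append(callback)

    def publish(self, channel: str, message: str) -> int:
        # Deliver the message to every subscriber of this channel only.
        # .get avoids creating an entry for channels nobody subscribed to.
        callbacks = self._subscribers.get(channel, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)  # number of deliveries made


hub = ChannelHub()
inbox_a: List[str] = []
inbox_b: List[str] = []

hub.subscribe("news", inbox_a.append)
hub.subscribe("news", inbox_b.append)
hub.subscribe("sports", inbox_b.append)

hub.publish("news", "hello")
hub.publish("sports", "score")
```

After these calls, `inbox_a` holds only the "news" message, while `inbox_b` holds both, because it subscribed to both channels.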

CDP on Azure: Harnessing the Power of Data Flow and Event Processing

Data is being created at an ever-increasing rate, and generating insights through event streams has become a critical function for businesses. How can we process the data flowing through the enterprise, and evaluate, enrich, and transform it in real time, to enable fast analytics that support intelligent decision making? Join us for this session, where we will look at how we can use the elastic nature of Azure to scale Data Flows and perform SQL operations in real time on streaming data from a variety of sources.

AWS Redshift Pricing: How much does Redshift cost?

While Redshift is arguably the best data warehouse on the market, it can come with a hefty price tag. We’ve created this Redshift pricing guide to help you evaluate Redshift cheaply, create a budget for full implementation, and optimize your Redshift setup so that you’re getting the most bang for your data buck. Ready to get started? Think of this blog post as a “choose your own adventure” guide.

Microsoft Azure vs Amazon Redshift

When choosing any SaaS application, you must start with a clear understanding of your business requirements. Then ask yourself the following questions: Develop a framework for data processing requirements, and you'll find a data warehouse solution that provides the right amount of power, functionality, and high performance for data analytics. Keep the answers to these questions in mind when reading through this article.

What Is Snowflake?

As a company’s data assets grow, the need for cloud computing increases in tandem. For keeping pace with this growth, Snowflake stands above the rest. What makes Snowflake so special? This cloud-agnostic platform takes the best of traditional database technology and combines it with modern cloud computing to drive the agility and innovation companies need to remain competitive. It features on-the-fly scaling, flexible clustering options, and the capability to hold several petabytes of information.

What I Learned at AWS re:Invent 2021

Last week, our friends at Amazon hosted the 10th annual AWS re:Invent conference at the always-exciting Venetian Resort in Las Vegas. The ChaosSearch team was out in full post-Covid force and felt the incredible energy and enthusiasm as the conference returned to an in-person format for the first time since 2019. Our team of Chaosians stayed busy working the buzzing expo hall and attending conference talks so we could be first to hear about every new AWS service announcement, feature launch, and innovation.

Delivering High Performance for Cloudera Data Platform Operational Database (HBase) When Using S3

CDP Operational Database (COD) is a real-time auto-scaling operational database powered by Apache HBase and Apache Phoenix. It is one of the main Data Services that runs on Cloudera Data Platform (CDP) Public Cloud. You can access COD right from your CDP console. With COD, application developers can now leverage the power of HBase and Phoenix without the overheads related to deployment and management.

Too many Cloud Testing Tools are Distracting and Addictive. Here's How to Fix It.

In this new work-from-home era, many companies have moved more and more towards online services and new tools to keep their productivity at levels similar to before. It’s harder and harder to keep track of all the tools and services you regularly use to test your business’s websites and API services. Here at LoadFocus, it gets easier and easier to make use of the integrated testing services we provide today.

IT Professionals Reveal Cloud Data Platform Highs and Lows of 2021

Wondering whether your struggles with the data lake, cloud data platform, or analytics at large are typical? Are you ahead or behind the curve? ChaosSearch recently commissioned a survey to understand the advantages and setbacks organizations face today in these areas, and we’re excited to share a sneak peek of the results. To uncover more detailed findings from our research, sign up to receive the full report once it’s available here.
Sponsored Post

Service Mocks: Scaling a SaaS Demo with Traffic Replay

Building, running and scaling SaaS demo systems that run around the clock is a big engineering challenge. Through the power of traffic replay, we scaled our demos in a huge way. A few weeks ago we launched a new demo sandbox. This is actually a second generation version of our existing demo system that I built a few months ago (codename: decoy). Because the traffic viewer page shows the most recent data by default, you need to constantly be pumping new data in there. Any type of real-time SaaS system is going to have a similar requirement. So this needs to be planned.

How Hybrid and Cloud-Based Architectures are Unlocking the Power of Data

It takes vision, purpose, and skill to unlock the power of data. It also takes the right strategy. For ExxonMobil, Ares Trading (Merck), and the University of California San Diego (UCSD), the right strategy is taking full advantage of the cloud. All three organizations have partnered with Cloudera, leveraging a hybrid or cloud-based architecture to improve the lives of the people who depend on their organizations’ data.

Keboola vs Azure Data Factory: The 8 critical differences

ETL pipelines help companies extract, transform, and load data so it is ready to provide insights and value to the company. But running a smooth data operation depends on building reliable and scalable data ingestion pipelines. SaaS vendors like Keboola and Azure Data Factory take away the heavy lifting.