Why SQL is your key to querying Kafka

If you’re an engineer exploring a streaming platform like Kafka, chances are you’ve spent some time trying to work out what’s going on with the data in there. But if you’re introducing Kafka to a team of data scientists or developers unfamiliar with its idiosyncrasies, you might have spent days, weeks, months trying to tack on self-service capabilities. We’ve been there.

Data dump to data catalog for Apache Kafka

From data stagnating in warehouses to a growing number of real-time applications, in this article we explain why we need a new class of Data Catalogs: this time for real-time data. The 2010s brought us organizations “doing big data”. Teams were encouraged to dump their data into a data lake and leave it for others to harvest. But data lakes soon became data swamps.

The reinvention of the Telco: From Pipe to Processor

The next generation of 5G networks is unlocking a mind-bending array of new use cases. Blistering speed, super low latency, and access to more powerful mobile hardware bring VR, AR and ultra high-definition experiences into sharp focus for the near future. But there’s a bigger shift being driven by 5G, and it’s not actually about speed at all. It’s about re-thinking the modern telco business model.

Building a Scalable Process Using NiFi, Kafka and HBase on CDP

Navistar is a leading global manufacturer of commercial trucks. With a fleet of 350,000 vehicles, unscheduled maintenance and vehicle breakdowns created ongoing disruption to their business. Navistar required a diagnostics platform that would help them predict when a vehicle needed maintenance to minimize downtime.

Use AI To Quickly Handle Sensitive Data Management

The growing waves of data that you’re pulling in include sensitive, personal or confidential data. This can become a compliance nightmare, especially with rules around PII such as GDPR and CCPA, and it takes too much time to manually decide what should be protected. In this session, we will show how AI-driven data catalogs can identify sensitive data and share that identification with your data security platforms to automate its discovery, identification and security. You’ll see how this dramatically reduces your time to onboard data and makes it safely available to your business communities.

Enabling high-speed Spark direct reader for Apache Hive ACID tables

Apache Hive supports transactional tables that provide ACID guarantees. A significant amount of work has gone into Hive to make these transactional tables highly performant. Apache Spark provides some capabilities to access Hive external tables, but on its own it cannot access Hive managed tables. To access Hive managed tables from Spark, the Hive Warehouse Connector needs to be used.

Data Modeling in a Post-COVID-19 World

As a result of the COVID-19 pandemic, organizations around the world have had to transform overnight. Businesses that had been delaying digital transformation, or that hadn’t been thinking about it at all, have suddenly realized that moving their data analytics to the cloud is the key to coping with and surviving the COVID-19 disruption. The next phase is about rebounding and thriving in a post-COVID-19 world.