
Multi-Raft - Boost up write performance for Apache Hadoop-Ozone

Apache Hadoop-Ozone is a new-era object storage solution for Big Data platforms. It is scalable and strongly consistent. Ozone uses the Raft protocol, implemented by Apache Ratis (Incubating), to achieve high availability in its distributed system. My team at Tencent started to introduce Ozone as a backend object store in production a few months ago, and we’re onboarding more and more data warehouse users.

Speed Up Development With Powered by Fivetran

Powered by Fivetran (PBF) provides a simple framework for developers to go beyond internal analytics projects to build data pipelines into their applications within the Fivetran platform. With no engineering overhead, you can easily access hundreds of customer accounts across countless Fivetran-supported data sources, including advertising platforms, CRM systems, databases, web events and more.

MLRun Functions DEMO: Python Jupyter (Open-Source Data Science Orchestration + Experiment Tracking)

MLRun is a generic and convenient mechanism for data scientists and software developers to build, run, and monitor machine learning (ML) tasks and pipelines on a scalable cluster while automatically tracking executed code, metadata, inputs, and outputs. It runs on-premises or on bare metal, including edge AI and analytics deployments. Customers include NetApp, Quadient, Payoneer, and many more.

How to handle errors on elastic.io

You can see if any of your flows have errors as soon as you log in to your dashboard. It shows you the number of records processed in total – in green – and the number of records with errors – in red. You can also see the same information if you click on the corresponding flow from the dashboard. In my case, you can see I have three records processed and all three returned errors.

Advanced data mapping techniques - The passthrough feature

Let’s start by naming our flow. Here we are using the webhook connector as our trigger. In case you skipped our previous short tutorial on webhooks, you need to copy this link and paste it into an empty browser tab to receive a sample. Now we are going to modify the sample or, more precisely, add it manually. These are just sample values to demonstrate how the passthrough feature works, so there is no point in looking for any deeper meaning in them.

A perfect environment to learn & develop on Apache Kafka

Apache Kafka has gained traction as one of the most widely adopted technologies for building streaming applications, but introducing it into your business, and then scaling it, can be a struggle. The problem isn’t Kafka itself so much as the different components you need to learn and the different tools required to operate it. If you are motivated enough, you can invest money, effort and long Friday nights into learning, fixing and streamlining Kafka, and you’ll get there.

Behind the Scenes of Node.js Distributions

If you are installing Node.js on Linux to use it in production, there is a good chance that you are using the NodeSource Node.js Binary Distributions. This talk covers the process by which the NodeSource Node.js Binary Distributions are updated, how new versions are supported, the human and infrastructure processes involved, and some limitations of maintaining the channel. Most importantly, it explains how the community can get involved with the project.